How to properly validate a `polars.LazyFrame`? #1776

csubhodeep · 2024-08-04T15:27:58Z

Question about pandera

Hello pandera community, I am trying out pandera to validate a normal polars.LazyFrame as described in the first example in the docs.

Now if I understood the docs correctly, by design, calling the validate method on the LazyFrame would only check the schema. I have the following questions:

1. What extra benefit does declaring a pandera.DataFrameSchema give the user when they could simply use the == operator to compare the schema with a predefined polars.Schema object?
2. To do in-depth data validation on the LazyFrame, we should call the collect method on it. But if the frame has, say, 50 columns and the pandera.DataFrameSchema declares only 3, does it make sense to pull all 50 columns into memory?

Would it make more sense to control this behaviour inside the validate method? That way pandera could add a projection selecting only the columns defined in the pandera.DataFrameSchema, run the validation checks, and finally call collect internally, instead of asking the user to call collect before validating.


butterlyn · 2024-08-06T12:08:29Z

For example (2), can't you just select the columns you want to validate before collecting?

csubhodeep · 2024-08-06T16:36:48Z

> For example (2), can't you just select the columns you want to validate before collecting?

@butterlyn do we do the same for pandas? If not, I am not sure why we need to make an exception in usage only for polars.

csubhodeep added the question (Further information is requested) label · Aug 4, 2024
