feat: reads using global ctx #982

Open: wants to merge 1 commit into base: main


ion-elgreco

Which issue does this PR close?


from datafusion.dataframe import DataFrame
from datafusion.expr import Expr
import pyarrow

Side note: it would be great to use ruff (https://stackoverflow.com/a/77876298) or isort to deterministically and programmatically sort python imports, and validate that in CI. I think isort/ruff would have a newline here between the third-party and first-party imports.

There is already a pre-commit config for the ruff linter and formatter:

- repo: https://github.com/astral-sh/ruff-pre-commit
  # Ruff version.
  rev: v0.3.0
  hooks:
    # Run the linter.
    - id: ruff
    # Run the formatter.
    - id: ruff-format

@kylebarron, Jan 7, 2025

As the SO answer above explains, import sorting isn't currently part of the default ruff-format behavior. We'd need to opt in by adding "I" (ruff's isort rule set) to the select list here:

select = ["E4", "E7", "E9", "F", "D", "W"]

@timsaucer left a comment

I'm not opposed to this addition, but there is a potential source of confusion that we can mitigate with documentation. If a new user creates a session context themselves, registers functions on it, and then creates a dataframe using one of these module-level read methods, the functions they registered will not be available on that dataframe, because it is tied to the global context rather than their own. I think that could lead to a fair amount of confusion.

I think this is easily mitigated by adding documentation to these functions that describes that it uses a default global session context and if the user needs a custom context they need to use the functions .

Successfully merging this pull request may close these issues.

Make all read methods available on DataFusion module
4 participants