[FLINK-37184][connector/filesystem] Add ZStandard to supported standard decompressors #26029

Open
wants to merge 1 commit into
base: master

JoeryH
Contributor

@JoeryH JoeryH commented Jan 20, 2025

What is the purpose of the change

  • Support ZStandard decompression for input files in the new File API. When ZStandard support was first added, it was only added to the old File API; the new File API was left out.

Brief change log

  • Added ZStandard to the list of decompressors in StandardDeCompressors
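
For readers unfamiliar with this kind of lookup, here is a minimal sketch of an extension-keyed decompressor registry. It is not the code in StandardDeCompressors: the class name `DecompressorRegistrySketch`, the `StreamFactory` interface, and the methods `register`, `forFileName`, and `open` are all hypothetical, and only JDK codecs are registered because the JDK ships no zstd implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical sketch of an extension-keyed registry; NOT Flink's StandardDeCompressors. */
final class DecompressorRegistrySketch {

    /** Wraps a raw input stream in a decompressing stream (illustrative interface). */
    interface StreamFactory {
        InputStream create(InputStream in) throws IOException;
    }

    // Maps a lower-cased file extension (without the dot) to its factory.
    private static final Map<String, StreamFactory> BY_EXTENSION = new ConcurrentHashMap<>();

    static {
        // Codecs available in the JDK; the extension names are illustrative.
        register("gz", GZIPInputStream::new);
        register("deflate", InflaterInputStream::new);
        // Assumption: a "zst" entry would be added the same way, with a factory
        // backed by a third-party zstd stream implementation.
    }

    /** Adds or replaces the factory used for the given extension. */
    static void register(String extension, StreamFactory factory) {
        BY_EXTENSION.put(extension.toLowerCase(Locale.ROOT), factory);
    }

    /** Returns the factory matching the file name's extension, or null if none matches. */
    static StreamFactory forFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension at all
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.get(extension);
    }

    /** Opens the stream decompressed if the extension is known, otherwise unchanged. */
    static InputStream open(String fileName, InputStream raw) throws IOException {
        StreamFactory factory = forFileName(fileName);
        return factory == null ? raw : factory.create(raw);
    }
}
```

In this sketch an unregistered extension falls through to the raw stream, so a missing entry does not raise an error and the compressed bytes are read as-is.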

Verifying this change

  • Added unit test for StandardDeCompressors

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@flinkbot
Collaborator

flinkbot commented Jan 20, 2025

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

Contributor

@davidradl davidradl left a comment

2 thoughts:

  • I think there should be a unit test for this
  • It would be useful to document these decompressors and how this affects the API.

Contributor

@davidradl davidradl left a comment

Please can you supply a unit test that makes use of this change.

@davidradl
Contributor

Reviewed by Chi on 23/01/2025. Go back to the submitter with review comments.

@JoeryH
Contributor Author

JoeryH commented Jan 24, 2025

@davidradl Thanks for looking at my PR. I'll add some sort of unit test tomorrow; if you have anything specific in mind, please let me know.

On the documentation, I wholeheartedly agree, but I am pretty new to Flink and I don't know how to change the documentation, or really even what to document. I'm still a little confused about the status of https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface. It's been quite a while, but the old API still has better compression support: it includes zstd and has an API that lets Flink users register more compression formats, while the new API has neither.

@JoeryH
Contributor Author

JoeryH commented Jan 25, 2025

@davidradl I added a unit test.

…rd decompressors

This was missed when ZStandard was first implemented.
3 participants