Switch Parquet default compression to zstd
Experiments showed that zstd gives the best compressed size as well
as the best write/read times, so the Parquet default compression is
switched from None to zstd.
bmcdonald3 committed Jun 18, 2024
1 parent 416cb90 commit 1aef741
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions arkouda/io.py
@@ -1163,7 +1163,7 @@ def to_parquet(
prefix_path: str,
names: Optional[List[str]] = None,
mode: str = "truncate",
- compression: Optional[str] = None,
+ compression: Optional[str] = "zstd",
convert_categoricals: bool = False,
) -> None:
"""
@@ -1182,7 +1182,7 @@ def to_parquet(
If 'append', attempt to create new dataset in existing files.
'append' is deprecated, please use the multi-column write
compression : str (Optional)
- Default None
+ Default zstd
Provide the compression type to use when writing the file.
Supported values: snappy, gzip, brotli, zstd, lz4
convert_categoricals: bool
@@ -1512,7 +1512,7 @@ def save_all(
file_format="HDF5",
mode: str = "truncate",
file_type: str = "distribute",
- compression: Optional[str] = None,
+ compression: Optional[str] = "zstd",
) -> None:
"""
DEPRECATED
@@ -1537,6 +1537,7 @@ def save_all(
Only used with HDF5
compression: str (None | "snappy" | "gzip" | "brotli" | "zstd" | "lz4")
Optional
+ Default zstd
Select the compression to use with Parquet files.
Only used with Parquet.
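
Note for callers: because the parameter is still typed `Optional[str]`, passing `compression=None` explicitly should keep the previous uncompressed behavior, while omitting the argument now selects zstd. The sketch below illustrates that default-resolution behavior in isolation. `resolve_compression` and `SUPPORTED_COMPRESSION` are hypothetical names invented for this illustration and are not part of arkouda; the list of codecs is taken from the `to_parquet` docstring above.

```python
from typing import Optional

# Codec names as listed in the to_parquet docstring in this diff.
SUPPORTED_COMPRESSION = ("snappy", "gzip", "brotli", "zstd", "lz4")


def resolve_compression(compression: Optional[str] = "zstd") -> Optional[str]:
    """Hypothetical helper (not an arkouda function) mirroring the new default.

    Returns the codec name to use, or None when compression is disabled.
    """
    if compression is None:
        # An explicit None opts out of compression, as the old default did.
        return None
    if compression not in SUPPORTED_COMPRESSION:
        raise ValueError(
            f"Unsupported compression {compression!r}; "
            f"expected one of {SUPPORTED_COMPRESSION} or None"
        )
    return compression


if __name__ == "__main__":
    print(resolve_compression())        # argument omitted: new default applies
    print(resolve_compression(None))    # explicit opt-out
    print(resolve_compression("gzip"))  # explicit codec choice
```

Code that relied on the old default and needs uncompressed Parquet output would have to pass `compression=None` at the call site after this commit.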