[BUG] Maximum allowed chunk size for gdeflate is 64KB #93
Comments
@technillogue Yes, both Deflate and GDeflate currently don't support compression with chunk sizes >64KB. This was mainly an internal implementation decision to balance compression ratio, performance, and temporary memory requirements. You can, however, use GDeflate CPU compression through […]. At a higher level, if you are trying to increase compression ratio, you might actually see more benefit from changing the compression options rather than the chunk size. Especially if you are interested in a "compress once, decompress multiple times" scenario, you can pass a non-default nvcompBatchedGdeflateOpts_t:

```c
/**
 * GDeflate compression options for the low-level API
 */
typedef struct
{
  /**
   * Compression algorithm to use. Permitted values are:
   * 0 : high-throughput, low compression ratio (default)
   * 1 : low-throughput, high compression ratio
   * 2 : highest-throughput, entropy-only compression
   *     (use for symmetric compression/decompression performance)
   */
  int algo;
} nvcompBatchedGdeflateOpts_t;

static const nvcompBatchedGdeflateOpts_t nvcompBatchedGdeflateDefaultOpts = {0};
```
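For illustration, a minimal sketch of constructing the high-level GdeflateManager with these non-default options (assuming the nvcomp 2.x HLIF header `nvcomp/gdeflate.hpp`, and mirroring the four-argument constructor form quoted later in this issue; algo 2 is shown per the entropy-only discussion):

```cpp
#include <cuda_runtime.h>

#include "nvcomp/gdeflate.hpp"  // assumed location of the HLIF GdeflateManager

using namespace nvcomp;

int main()
{
  cudaStream_t stream;
  cudaStreamCreate(&stream);

  // Non-default options: algo 2 selects the entropy-only mode described above.
  nvcompBatchedGdeflateOpts_t opts{};
  opts.algo = 2;

  // Chunk size stays at the 64KB maximum currently accepted for (G)Deflate compression.
  const size_t uncomp_chunk_size = 1 << 16;

  GdeflateManager nvcomp_manager{uncomp_chunk_size, opts, stream, NoComputeNoVerify};

  // configure_compression()/compress() calls would follow here.

  cudaStreamDestroy(stream);
  return 0;
}
```

Per the struct documentation above, algo 2 trades some compression ratio for symmetric compression/decompression throughput, which fits the "compress once, decompress multiple times" scenario only if the resulting ratio is still acceptable.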
Is there any chance of using CPU compression with the HLIF? Previously, only 0 was supported for HLIF, and in LLIF testing it seemed that 2 actually performed better than 1 for model weights, which benefit from entropy coding but not dictionary compression. Does 1 also increase entropy coding settings?
Currently, we don't have a way of using CPU compression with HLIF, unfortunately. But if algo 2 is giving you the best results, I'm assuming CPU compression won't necessarily improve the compression ratio with larger chunk sizes. The entropy coding step is the same in all algos, and the CPU compressor does not support pure entropy coding.
The blog post states […]

In my case I'm using e.g. 40 managers to process >11GB, so I already have enough parallelism and am trying to get a better compression ratio than 0.92.

Similarly, the changelog for 2.4.1 states […]

However, actually using 256KB (`GdeflateManager nvcomp_manager{ 1 << 18, nvcompBatchedGdeflateDefaultOpts, stream, NoComputeNoVerify };`) leads to this error: […]

Is >64KB only supported for deflate, not gdeflate?
Steps/Code to reproduce bug
```cpp
GdeflateManager nvcomp_manager{ 1 << 18, nvcompBatchedGdeflateDefaultOpts, stream, NoComputeNoVerify };
```
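A fuller, self-contained version of the reproduction (a sketch only; the `nvcomp/gdeflate.hpp` include and the stream setup are assumptions added for completeness):

```cpp
#include <cuda_runtime.h>

#include "nvcomp/gdeflate.hpp"  // assumed location of the HLIF GdeflateManager

using namespace nvcomp;

int main()
{
  cudaStream_t stream;
  cudaStreamCreate(&stream);

  // 256KB uncompressed chunk size (1 << 18): per this report, using this
  // configuration produces the error, since (G)Deflate compression currently
  // caps chunks at 64KB (1 << 16).
  GdeflateManager nvcomp_manager{ 1 << 18, nvcompBatchedGdeflateDefaultOpts, stream, NoComputeNoVerify };

  cudaStreamDestroy(stream);
  return 0;
}
```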
Expected behavior
256KB uncomp_chunk_size should work.
Environment details (please complete the following information):