This repository has been archived by the owner on Aug 30, 2022. It is now read-only.

Duration column should be handled by ParquetFileWriter #741

Open
bin-wang opened this issue Aug 24, 2021 · 0 comments

@bin-wang
Contributor

Currently the ParquetFileWriter cannot handle tables with Duration columns, for the following reasons:

  1. Parquet has an Interval logical type which has the same meaning as the Hillview Duration, but its format is too convoluted: it is essentially three int32 numbers representing months, days, and milliseconds. If we only use the milliseconds field, there might be a precision loss.
  2. If we save the Duration column as Double, we currently cannot guarantee that reading back a saved table yields the same format as the original table.
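
To make the first point concrete, here is a minimal sketch of the Parquet Interval layout (a 12-byte value holding three little-endian unsigned 32-bit integers: months, days, milliseconds). It assumes, purely for illustration, that a Duration is a floating-point number of milliseconds; `encode_interval` and `decode_interval` are hypothetical helper names, not part of Hillview or any Parquet library.

```python
import struct

MILLIS_PER_DAY = 86_400_000
UINT32_MAX = 2**32 - 1


def encode_interval(duration_ms: float) -> bytes:
    """Pack a duration into the 12-byte Interval layout.

    Assumption: the duration is given as floating-point milliseconds.
    The months field is left at 0 because a month has no fixed length.
    """
    if duration_ms < 0:
        # The three fields are unsigned, so negative durations do not fit.
        raise ValueError("negative durations cannot be represented")
    whole_ms = int(duration_ms)  # any sub-millisecond fraction is dropped here
    days, millis = divmod(whole_ms, MILLIS_PER_DAY)
    if days > UINT32_MAX:
        raise OverflowError("duration too large for the days field")
    return struct.pack("<III", 0, days, millis)


def decode_interval(raw: bytes) -> float:
    """Unpack the 12-byte layout back into floating-point milliseconds."""
    months, days, millis = struct.unpack("<III", raw)
    if months != 0:
        # This sketch cannot convert months to milliseconds.
        raise ValueError("months field is not supported in this sketch")
    return float(days * MILLIS_PER_DAY + millis)
```

Under these assumptions, a value such as 1500.25 ms comes back as 1500.0 ms after a round trip, and negative durations cannot be stored at all.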

One possible solution might be to save the column as Double but also save a schema file alongside it.
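
A rough sketch of that idea, using a JSON sidecar file to remember which Double columns were originally Durations. The file format, the kind names ("Duration", "Double"), and the functions `save_schema` and `restore_kinds` are all assumptions made for illustration; they are not Hillview's actual schema format or API.

```python
import json


def save_schema(path: str, column_kinds: dict) -> None:
    """Write the original column kinds next to the data file (hypothetical format)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(column_kinds, f)


def restore_kinds(path: str, columns: dict) -> dict:
    """Re-attach the original kind to each column that was read back as Double.

    Columns missing from the schema file are assumed to be plain Double.
    """
    with open(path, encoding="utf-8") as f:
        kinds = json.load(f)
    return {name: (kinds.get(name, "Double"), values)
            for name, values in columns.items()}
```

With the sidecar present, a reader could tell a Duration column apart from an ordinary Double column even though both are stored identically in the Parquet file.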

@mihaibudiu mihaibudiu added the bug label Aug 24, 2021