This repository has been archived by the owner on Jul 22, 2024. It is now read-only.

Serialization fails for some users #17

Open
calstad opened this issue Oct 23, 2020 · 0 comments
Labels
bug Something isn't working

Comments

Contributor

calstad commented Oct 23, 2020

Some users see the following error when the ETL job tries to serialize its results to Parquet:

(base) ➜  orbit_prediction git:(master) python3 orbit_prediction/spacetrack_etl.py --st_user [email protected] --st_password <mypassword> --norad_id_file sample_data/test_norad_ids.txt --past_n_days 10 --output_path outputfile
INFO:__main__:Fetching Satellite Catalog Data...
INFO:__main__:Number of TLE Batch Requests: 1
INFO:__main__:Starting to fetch TLEs from space-track.org
INFO:__main__:Processing batch 1/1
INFO:__main__:Fetching TLEs for 20 ASOs...
INFO:__main__:Parsing raw TLE data...
INFO:__main__:Finished fetching TLEs
INFO:__main__:Calculating orbital state vectors for 372 TLEs...
INFO:__main__:Serializing data...
Traceback (most recent call last):
  File "orbit_prediction/spacetrack_etl.py", line 309, in <module>
    orbit_data_df.to_parquet(args.output_path)
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pandas/util/_decorators.py", line 214, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pandas/core/frame.py", line 2109, in to_parquet
    to_parquet(
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pandas/io/parquet.py", line 260, in to_parquet
    return impl.write(
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pandas/io/parquet.py", line 112, in write
    self.api.parquet.write_table(
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 1733, in write_table
    writer.write_table(table, row_group_size=row_group_size)
  File "/usr/local/anaconda3/lib/python3.8/site-packages/pyarrow/parquet.py", line 591, in write_table
    self.writer.write_table(table, row_group_size=row_group_size)
  File "pyarrow/_parquet.pyx", line 1433, in pyarrow._parquet.ParquetWriter.write_table
  File "pyarrow/error.pxi", line 84, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Casting from timestamp[ns] to timestamp[ms] would lose data: 1602624833909568000
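
The number at the end of the error message is a timestamp in nanoseconds since the Unix epoch. As a rough illustration of why the cast is rejected, the sketch below splits that value into whole milliseconds and the nanoseconds left over. The helper name `split_ns` is made up for this example and is not part of the ETL script or of pyarrow.

```python
from typing import Tuple

NS_PER_MS = 1_000_000  # nanoseconds in one millisecond


def split_ns(ts_ns: int) -> Tuple[int, int]:
    """Split a nanosecond epoch timestamp into (whole ms, leftover ns).

    Hypothetical helper, for illustration only.
    """
    return divmod(ts_ns, NS_PER_MS)


# Value copied from the ArrowInvalid message above.
ts_ns = 1602624833909568000
whole_ms, leftover_ns = split_ns(ts_ns)
print(whole_ms, leftover_ns)  # 1602624833909 568000
```

The non-zero remainder (568000 ns) is the data that a cast to `timestamp[ms]` would drop, which is why the write is refused instead of silently truncating.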

This looks like a pyarrow issue: the write fails when nanosecond-precision timestamps are cast to milliseconds. This Stack Overflow thread may provide a solution.
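
One possible workaround, offered as an untested sketch and not as the fix adopted in this repository, is to floor every datetime column to millisecond precision before writing, so the cast no longer loses anything. The function name `floor_timestamps` is an assumption made for this example; `orbit_data_df` and `args.output_path` are the names that appear in the traceback.

```python
import pandas as pd


def floor_timestamps(df: pd.DataFrame, unit: str = "ms") -> pd.DataFrame:
    """Return a copy of `df` with every datetime column floored to `unit`.

    Hypothetical helper: sub-millisecond detail is discarded on purpose,
    so only use it if that precision is not needed downstream.
    """
    out = df.copy()
    # Select both timezone-naive and timezone-aware datetime columns.
    datetime_cols = out.select_dtypes(include=["datetime", "datetimetz"]).columns
    for col in datetime_cols:
        out[col] = out[col].dt.floor(unit)
    return out


# Possible use in spacetrack_etl.py (names taken from the traceback):
#   floor_timestamps(orbit_data_df).to_parquet(args.output_path)
#
# Alternative, assuming pandas forwards extra keyword arguments to
# pyarrow's write_table as it does in the traceback above:
#   orbit_data_df.to_parquet(args.output_path, allow_truncated_timestamps=True)
```

Flooring makes the precision loss explicit in the ETL code, whereas `allow_truncated_timestamps=True` leaves the truncation to pyarrow. Either way the stored values lose their sub-millisecond part.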

calstad added the bug label Oct 23, 2020