Fix an issue that caused the astropy table reader on Windows to behave differently to other platforms #2519

Merged: dhomeier merged 2 commits into glue-viz:main on Sep 25, 2024

astrofrog
Member

On Windows, the default encoding/locale seems to be cp1252, which will read arbitrary binary files without complaining. That is not ideal for automatic format recognition.
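To illustrate the point about cp1252, here is a minimal sketch (not code from this PR) that decodes the 8-byte PNG signature with both codecs. The helper name `decodes_as` is made up for this example. Note that Python's cp1252 codec does leave a handful of byte values undefined, so not every binary file decodes cleanly, but a PNG header does.

```python
# Standard 8-byte PNG file signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decodes_as(data, encoding):
    """Hypothetical helper: True if `data` decodes without error."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# cp1252 maps 0x89 to a printable character, so the bytes pass as "text".
cp1252_ok = decodes_as(PNG_SIGNATURE, "cp1252")

# In UTF-8, 0x89 is a continuation byte and cannot start a sequence.
utf8_ok = decodes_as(PNG_SIGNATURE, "utf-8")

print(cp1252_ok, utf8_ok)
```

A strict UTF-8 decode therefore acts as a cheap "is this plausibly text?" check, whereas a permissive single-byte codec does not.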

cc @dhomeier

…e differently to other platforms and read binary files as tables
@astrofrog astrofrog added the bug label Sep 24, 2024
Collaborator

@dhomeier dhomeier left a comment

May give it a try with encoding only set in the first pass.

glue/core/data_factories/astropy_table.py (outdated)
@dhomeier
Collaborator

That seems to resolve the test failures; I am wondering whether there might still be text formats that we are not testing here.
Seeing the same dev test installation failures over at Astropy now; the cause must be somewhere on the Anaconda side, hopefully only temporary.
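The "explicit encoding in the first pass only" idea can be sketched with the standard library alone. This is an illustration under assumptions, not the code merged here: `read_two_pass` is a hypothetical helper, and the real reader goes through astropy rather than plain `open()`.

```python
import os
import tempfile


def read_two_pass(path):
    """Hypothetical helper: strict UTF-8 text pass first, raw bytes otherwise."""
    try:
        # First pass: explicit encoding, so the outcome does not depend on
        # the platform's default locale encoding.
        with open(path, encoding="utf-8", newline="") as fh:
            return "text", fh.read()
    except UnicodeDecodeError:
        # Second pass: no encoding involved; leave the bytes to binary readers.
        with open(path, "rb") as fh:
            return "binary", fh.read()


with tempfile.TemporaryDirectory() as tmp:
    table_path = os.path.join(tmp, "table.csv")
    png_path = os.path.join(tmp, "fake.png")

    with open(table_path, "w", encoding="utf-8") as fh:
        fh.write("a,b\n1,2\n")
    with open(png_path, "wb") as fh:
        fh.write(b"\x89PNG\r\n\x1a\n")  # PNG signature only, not a full image

    table_kind, _ = read_two_pass(table_path)
    png_kind, _ = read_two_pass(png_path)

print(table_kind, png_kind)
```

Pinning the encoding only on the text attempt keeps the fallback path untouched, which is why other formats should not be affected.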

# files, which is an issue since it will start recognizing e.g. PNGs as
# valid tables.
encoding = 'utf-8'

from astropy.table import Table

# In Python 3, as of Astropy 0.4, if the format is not specified, the
Collaborator

That's been closed as of 2019 I think, but having to try ASCII first now certainly holds again ;-)

@dhomeier
Collaborator

Finally got the dev jobs through, so this seems to work at least for all tests.
I've been wondering if pandas text input might run into similar problems. On macOS, I get the following for the test data:

>>> with make_file(data, '.png') as fname:
...     for df in data_factory:
...         print(df.label, df.priority, df.identifier(fname))
...     
FITS file 100 False
HDF5 file 100 False
Numpy save file 100 False
ASCII Table 1 False
FITS table 1 False
VO table 1 False
AASTeX Table 0 False
Auto 0 True
CDS Catalog 0 False
Catalog (astropy.table parser) 0 False
DAOphot Catalog 0 False
Excel 0 False
IPAC Catalog 0 False
Image 0 True
LaTeX Table 0 False
Pandas Table 0 False
SExtractor Catalog 0 False
CASA PPV Cube -1000 False

My interpretation is that even if Pandas or LaTeX identified as True on Windows, they would not get in the way, because Image is tried before them.
Wondering if
with locale.setlocale(locale.LC_ALL, locale=(None, 'utf-8'))
could be an alternative that would keep Table.read() without a format from failing.
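The ordering argument above can be sketched as a small priority dispatch. This is a toy model, not glue's actual dispatcher: the `Factory` tuple and its `matches` field are stand-ins for `df.identifier(fname)`, the True values for Pandas and LaTeX are the hypothetical Windows false positives, and the 'Auto' entry is left out on the assumption that it is not itself a candidate. The sort key (priority descending, then label) is inferred from the listing.

```python
from typing import NamedTuple


class Factory(NamedTuple):
    label: str
    priority: int
    matches: bool  # stand-in for the result of df.identifier(fname)


factories = [
    Factory("Pandas Table", 0, True),   # hypothetical false positive
    Factory("LaTeX Table", 0, True),    # hypothetical false positive
    Factory("Image", 0, True),
    Factory("FITS file", 100, False),
    Factory("ASCII Table", 1, False),
    Factory("CASA PPV Cube", -1000, False),
]

# Highest priority first; ties broken alphabetically by label.
ordered = sorted(factories, key=lambda f: (-f.priority, f.label))

# The first factory that claims the file wins.
chosen = next((f.label for f in ordered if f.matches), None)

print(chosen)
```

Under these assumptions Image wins the tie at priority 0 simply because its label sorts ahead of LaTeX and Pandas.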

@pllim
Contributor

pllim commented Sep 25, 2024

CI is green, let's merge!

@dhomeier
Collaborator

Yes, I guess if issues with other text formats should pop up, we can always fix them later.
Otherwise this will hopefully get us through until 3.15 ;-)

@dhomeier dhomeier merged commit 39eec54 into glue-viz:main Sep 25, 2024
26 checks passed
3 participants