IMGT/HLA version 3.56.0 and onwards provides large files as zip files #133

Open
Carovanandel opened this issue Apr 26, 2024 · 3 comments

@Carovanandel

Hi,

From version 3.56.0 onwards, the IMGT/HLA reference provides large files as zip files, as stated on the IMGT/HLA GitHub page:

> As of Release 3.56.0, due April 2024, all large files (>100MB) will be provided as compressed files rather than utilise Git LFS, which was previously required. This includes the hla.dat, xml/hla.xml and xml/hla_ambigs.xml in the next release. This has been done to simplify the cloning process and also due to escalating and unpredictable costs in providing the files using Git LFS from a public repository. All compressed files will use the [ZIP format](https://en.wikipedia.org/wiki/ZIP_(file_format)). This formatting change will be applied to all branches.

This breaks arcasHLA, because files such as hla.dat are now shipped only as zip archives and cannot be found under their usual names. IMGT/HLA versions up to and including 3.55.0 seem to work fine. I have created a pull request that adds the missing IMGT/HLA versions 3.47.0-3.56.0 to the reference list in parameters.json, so that the arcasHLA reference --version command accepts them. From 3.56.0 onwards, however, the command no longer works. Could you update your code to handle the zipped files?

Thanks in advance!
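
One way the requested handling could look is sketched below. This is not arcasHLA code: `ensure_unzipped` is a hypothetical helper, and it assumes each archive stores the file under its bare name (for example `hla.dat` inside `hla.dat.zip`), which should be checked against the actual IMGT/HLA release.

```python
import zipfile
from pathlib import Path


def ensure_unzipped(path):
    """Return `path`, extracting it from `<path>.zip` first if it is missing.

    Hypothetical helper, not part of arcasHLA.
    """
    path = Path(path)
    if path.exists():
        return path
    archive = path.with_name(path.name + ".zip")
    if not archive.exists():
        raise FileNotFoundError(f"neither {path} nor {archive} exists")
    with zipfile.ZipFile(archive) as zf:
        # Assumption: the member name inside the archive equals the
        # file name without the ".zip" suffix.
        zf.extract(path.name, path=path.parent)
    return path
```

A loader could then call something like `open(ensure_unzipped("dat/IMGTHLA/hla.dat"))` in place of opening the path directly, so that releases before and after 3.56.0 are handled the same way.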

Carovanandel changed the title from "IMGT/HLA version 3.56.0 and onward provides large files as zip files" to "IMGT/HLA version 3.56.0 and onwards provides large files as zip files" on Apr 29, 2024
@kalanir

kalanir commented Jul 18, 2024

I am also running into this issue. I keep getting the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/home/arcas-hla-0.5.0-1/scripts/../dat/IMGTHLA/hla.dat'
even though hla.dat.zip exists in that directory.

@tyxdavid

I also encountered the same issue. A workaround is to download the IMGTHLA database as usual and manually unzip the zipped files under dat/IMGTHLA/. Then replace reference.py in scripts/ with the attached script and run 'arcasHLA reference --update_static' to write the necessary files for the analysis.
reference.zip
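
The manual unzip step of this workaround could be scripted roughly as follows. This is a minimal sketch, not the attached reference.py: `unzip_all` is a hypothetical helper that extracts every zip archive it finds under a directory into the folder the archive sits in.

```python
import zipfile
from pathlib import Path


def unzip_all(root):
    """Extract every *.zip under `root` into the archive's own directory.

    Hypothetical helper; returns the paths of the extracted members.
    """
    extracted = []
    # rglob also descends into subdirectories such as xml/.
    for archive in sorted(Path(root).rglob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
            extracted.extend(archive.parent / name for name in zf.namelist())
    return extracted
```

Calling `unzip_all("dat/IMGTHLA")` from the arcasHLA directory would cover the unzip step only. The replacement of reference.py and the 'arcasHLA reference --update_static' run described above are still needed.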

@bozbezbozzel

@tyxdavid thanks for this, it helped me run genotype.
