Scripts for downloading datasets? #24

Open
ir2718 opened this issue Oct 31, 2024 · 5 comments

Comments

@ir2718
Contributor

ir2718 commented Oct 31, 2024

Hi,

I think this repo would be much easier to use if it included scripts for downloading well-known benchmarks (Union14M, IIIT, IC13, IC15, ...). Ideally, users would run the script for whichever dataset they need and then modify the config to suit their training/eval needs. If you're okay with this, I would like to create a PR.
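To make the idea concrete, here is a minimal sketch of what such a download script could look like. It uses only the Python standard library. The `DATASETS` registry, its placeholder URLs, and the function names `download` and `sha256_of` are all hypothetical; none of them come from the repo.

```python
import hashlib
import urllib.request
from pathlib import Path
from typing import Optional

# Hypothetical registry: the names come from this issue, the URLs are placeholders.
DATASETS = {
    "IIIT": {"url": "https://example.com/iiit.zip", "sha256": None},
    "IC13": {"url": "https://example.com/ic13.zip", "sha256": None},
}


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download(url: str, root: str, sha256: Optional[str] = None) -> Path:
    """Download `url` into `root`, skipping the transfer if the file exists."""
    root_dir = Path(root)
    root_dir.mkdir(parents=True, exist_ok=True)
    target = root_dir / url.rsplit("/", 1)[-1]
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    # Only verify when a checksum is known for this dataset.
    if sha256 is not None and sha256_of(target) != sha256:
        raise ValueError(f"checksum mismatch for {target}")
    return target
```

A user would then call something like `download(**DATASETS["IC13"], root="data")` and point the config at the resulting path.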

Additionally, I found it annoying that the readme has no documentation, or at least more information, on using the repo. If I have time, I would like to add this as well.

@Topdu
Owner

Topdu commented Nov 1, 2024

Sure! We'd appreciate it if you created a PR with the dataset download scripts and documentation.

@ir2718
Contributor Author

ir2718 commented Nov 1, 2024

I've only got one question regarding the possible implementation: is it OK to treat torchvision as a dependency? It's not listed in the requirements.txt file, but it is used in the preprocess module.

@Topdu
Owner

Topdu commented Nov 1, 2024

Yes, torchvision should be installed.
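Since torchvision is not listed in requirements.txt, a download script could check for it up front and fail with a readable message rather than a bare traceback. The following is only a sketch: the helper name `require` and the hint text are hypothetical, not part of the repo.

```python
import importlib.util


def require(package: str, hint: str = "") -> None:
    """Raise a readable ImportError if the top-level `package` is not installed."""
    if importlib.util.find_spec(package) is None:
        msg = f"'{package}' is required but not installed."
        if hint:
            msg += f" {hint}"
        raise ImportError(msg)
```

A script would call `require("torchvision", "Try: pip install torchvision")` before importing anything that depends on it.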

@ir2718
Contributor Author

ir2718 commented Nov 1, 2024

Is there any way you can host the Union14M datasets on Google Drive or a server? The issue is that the authors set it up on Baidu and OneDrive. Baidu causes issues for people who are not based in Asia, while downloading from OneDrive would require adding a OneDrive download API to the dependencies. Alternatively, I could ask the authors to upload the dataset to Google Drive.
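If the dataset ends up hosted in more than one place, the script could try each location in turn. Below is a minimal standard-library sketch of that fallback. It assumes every entry is a direct file URL, which share links on Baidu, OneDrive, or Google Drive usually are not, and the function name `download_from_mirrors` is hypothetical.

```python
import urllib.error
import urllib.request
from pathlib import Path
from typing import Sequence


def download_from_mirrors(urls: Sequence[str], target: str) -> str:
    """Try each URL in order and return the first one that downloads."""
    Path(target).parent.mkdir(parents=True, exist_ok=True)
    errors = []
    for url in urls:
        try:
            urllib.request.urlretrieve(url, target)
            return url
        except (urllib.error.URLError, OSError) as exc:
            # Remember why this mirror failed, then move on to the next one.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all mirrors failed:\n" + "\n".join(errors))
```

Listing the most widely reachable host first keeps the common case fast, and the collected error messages show which mirrors were tried if none of them work.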

@ir2718
Contributor Author

ir2718 commented Nov 3, 2024

I made a draft PR, so when you've got time, please have a look, @Topdu.


2 participants