-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Download data #72
base: main
Are you sure you want to change the base?
Download data #72
Conversation
these are intended to be run as pipenv scripts: pipenv run download_data pipenv run delete_data they can also be run from file
Updated Pipfile to specify versions. Updated Scripts definitions. Moved scripts to process_data repository. Edited header comment in download_data.
Hullo, James. Just checking in. This branch looks like it's nearly ready for a merge. Do my remarks mostly make sense? Is there any trouble with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks good, but make sure to merge changes from main and delete your other PR.
Corrected nits and did some clean-up. Simplified the download_data function with pathlib.
generate new pipfile
Manually combined pipfile and regenerated pipfile.lock, reverted requests version to try and avoid problems
tried to fix errors in pipfile
|
||
|
||
def main(): | ||
data_dir = os.path.join(os.path.dirname(__file__), "../../data") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly this works, and you can merge it as-is. But consider calling normpath() on it, for better quality diagnostics if / when something downstream goes south. The pathlib equivalent would be .resolve(). The "../.." in the middle of what we're passing around has unambiguous meaning to the machine. But humans tend to get a little lost with all the up and down when trying to interpret a message like "bad thing happened with foo/bar/../baz/file.txt".
If you do choose to revisit this prior to merging down, consider DRYing up the two different definitions of data_dir
so there's just one. The project's usual idiom seems to be: from mediabridge.definitions import DATA_DIR
(The clean_data_folder() seems like a long-winded way of saying .rmtree() to nuke followed by mkdir(), but maybe there's some permission attribute on the data/ directory that we're trying to preserve.)
It looks like you still need to get Pipfile merged, or ask one of us to jump in on your feature branch and merge it for you. Fortunately it's small, so it's easy to eyeball the differences with main
branch, which you can conveniently view on the github site in a web browser. Keeping entries alphabetical eases that task.
Don't worry too much about the .lock file, as Pipfile is the important one. A pipenv lock
will regenerate it and we can just "use ours" to resolve any merge nonsense. Good luck! Remember to click "delete" on your feature branch once the squash merge is complete.
Converted my fork into a branch called Download-Data