Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Appends .csv to the log file string #30

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Shrinks99
Copy link
Member

This will ensure that log files are read in the proper format and that I stop opening it in my text editor! ;)

Changes

  • Appends .csv to the user input string passed through the --log argument.
  • Updates help text

- Updates help text

This will ensure that log files are read in the proper format.
Probably better to place it here if we want to use the logfile definition elsewhere without the filename appended.
@Shrinks99 Shrinks99 requested a review from ikreymer November 3, 2023 02:50
@Shrinks99 Shrinks99 self-assigned this Nov 3, 2023
@Shrinks99 Shrinks99 marked this pull request as draft November 3, 2023 03:16
@Shrinks99 Shrinks99 removed the request for review from ikreymer November 3, 2023 03:16
@Shrinks99 Shrinks99 marked this pull request as ready for review November 3, 2023 03:27
@Shrinks99 Shrinks99 requested review from ikreymer and tw4l November 3, 2023 03:27
@@ -240,7 +241,7 @@ def __init__(self, url_prefix, inputs,
self.use_mapfile = True
self.mapfile = mapfile

self.logfile = logfile
self.logfile = str(logfile + '.csv')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.logfile = str(logfile + '.csv')
self.logfile = logfile

I think we actually want to put this below in the if self.logfile: block. Otherwise could end up with a None.csv logfile when we don't want/expect one.

@@ -240,7 +241,7 @@ def __init__(self, url_prefix, inputs,
self.use_mapfile = True
self.mapfile = mapfile

self.logfile = logfile
self.logfile = str(logfile + '.csv')
self.use_logfile = False
if self.logfile:
self.use_logfile = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.use_logfile = True
self.use_logfile = True
self.logfile = f'{self.logfile}.csv'

Could also be self.logfile = '{}.csv'.format(self.logfile) if we're trying to support Python < 3.6, but I don't think we have to worry about that at this point.

@tw4l
Copy link
Member

tw4l commented Nov 13, 2023

Other thing to consider: Should we just expect the full filename with the extension to the --log argument? And if there's an issue with another tool that's using this library and isn't doing that, we could handle it upstream.

Something to think about. Sometimes a little less "magic" ends up being more flexible for a wide variety of use cases (and a little less logic to handle).

@despens
Copy link
Contributor

despens commented Nov 13, 2023

My sense is that auto-appending file extensions to provided names only makes sense if the same base name is used for several output files. For instance, if a crawler would ask for a "collection name" and then produce a filename.warc.gz, filename.cdxj, and filename.log or similar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants