Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #106 #117

Closed
Closed

Conversation

charles-cowart
Copy link
Collaborator

Addresses #106

Adding support for instrument-based configuration. Since the
configuration file is defined in mg-scripts, but not used by mg-scripts,
there is not much change required here.
@coveralls
Copy link

coveralls commented Jan 17, 2024

Pull Request Test Coverage Report for Build 7562474734

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.04%) to 82.471%

Totals Coverage Status
Change from base Build 7430826706: 0.04%
Covered Lines: 2141
Relevant Lines: 2455

💛 - Coveralls

# SN – RapidRun which is HiSeq 2500
'SN': 'RapidRun'}

@staticmethod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why staticmethod vs. a regular method? It would allow to use self vs. InstrumentUtils.types[code] ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a hedge. I know that I naturally tend to resort to creating classes more than others in the lab do. In this case I did want to encapsulate the types within a scope, but I also demonstrated that none of the methods require state to be maintained in between them.

I wouldn't recommend making these regular methods for that reason. If anything I would recommend making types and the methods global instead of making them regular methods.

IMHO my first choice would be to migrate these to metapool, since they would be potentially useful in notebooks and other code. I can go either way on either keeping it a static class or making them global.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, and I'm not sure why classes you prefer them and if you are the only one why not follow: "Vox Populi, Vox Dei".

Anyway, I think we will also need these values for other things/projects; for example, this is important information to calculate the resource allocations for these jobs - basically, calculate the actual values that will go to the config file. Following this idea and @wasade comments, maybe adding a single tsv file with the values of the dict would be a better solution so we can expand in the future, when needed. What do you think?

sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/configuration.json Outdated Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
sequence_processing_pipeline/Pipeline.py Show resolved Hide resolved
@charles-cowart
Copy link
Collaborator Author

charles-cowart commented Jan 17, 2024

@wasade I'm hesitant to change how we determine the type and ID since everyone seems to remember this method and it does appear durable. However I can start pulling examples of xml files from across the runs to see if we can pull it reliably from there if you guys think it's worth it. We don't really have a tool to parse these and that may be a good thing to have yes?

@wasade
Copy link
Member

wasade commented Jan 17, 2024

Some googling suggests the instrument lookup may not be universally correct, and presumably Illumina or IGM has authoritative information. For regex, doesn't this work?

>>> import re
>>> matcher = re.compile(r'(\d{6,8})_([A-Z0-9]+)_(\d+)_([A-Z0-9]+)')
>>> matcher.search('231201_A01535_0431_BHVKWCDSX7').groups()
('231201', 'A01535', '0431', 'BHVKWCDSX7')

@charles-cowart
Copy link
Collaborator Author

Superceded by #123

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants