Genbank parser fixes #389
… on an implicit range, standardizes parse error format for clarity
Few changes for more documentation, other than that, looking pretty good!
… with the added benefit of better cmp interop
Just a few more comments! Also, I think there are some architectural decisions to make. I think we should remove polyjson and just have a genbank JSON export function. Lemme try to message Tim and Willow.
io/genbank/multimap.go
Outdated
// defined as a simple type alias over a map of slices
// while not ideal (eg computing total number of items takes O(N))
// this has the advantage of being compatible with json.Marshal, cmp.Diff,
// pretty printing, and bracket indexing out of the box.
type MultiMap[K, V comparable] map[K][]V
I think the advantages of:
- Compatible with json.Marshal, cmp.Diff, pretty printing
- Generic types
Should be highlighted above. This little section was more convincing than O(1) lookup time, mainly because we can accomplish that with just maps. We can't get the guarantees around cmp.Diff, which are actually quite useful, and it also doesn't work on generic types - and we might want more of those in the future.
However, I think we should be more explicit about where this is used (imagine being a user reading the docs and you come across MultiMap). In this case, I think the multimap type should have very explicit docs about where it is used.
Generic types can be had with other implementations, but I agree and emphasized these benefits more. Also contextualized where the MultiMap is currently used, which I will update if we expand in the future.
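As a minimal sketch of the benefits discussed in this thread, the snippet below declares a MultiMap with the same shape as the one in the diff and shows bracket indexing, json.Marshal, and the O(N) item count. The `Put`, `Count`, and `MarshalAttrs` helpers and the attribute values are hypothetical names introduced for illustration, not taken from the PR. cmp.Diff is left out so the example needs only the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MultiMap mirrors the type in the diff: a defined type over a map of slices.
type MultiMap[K, V comparable] map[K][]V

// Put appends value under key (hypothetical helper, not from the PR).
func Put[K, V comparable](m MultiMap[K, V], key K, value V) {
	m[key] = append(m[key], value)
}

// Count returns the total number of stored values; it has to walk every key.
func Count[K, V comparable](m MultiMap[K, V]) int {
	total := 0
	for _, values := range m {
		total += len(values)
	}
	return total
}

// MarshalAttrs encodes the multimap with the stock encoding/json package.
func MarshalAttrs(m MultiMap[string, string]) (string, error) {
	out, err := json.Marshal(m)
	return string(out), err
}

func main() {
	attrs := MultiMap[string, string]{}
	Put(attrs, "db_xref", "GeneID:1")
	Put(attrs, "db_xref", "SGD:S1")
	Put(attrs, "gene", "abc")

	fmt.Println(attrs["db_xref"]) // bracket indexing works as on any map
	fmt.Println(Count(attrs))

	encoded, err := MarshalAttrs(attrs)
	if err != nil {
		panic(err)
	}
	fmt.Println(encoded)
}
```

Because the type is just a map underneath, no custom marshalling or accessor methods are needed for any of this.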
io/genbank/multimap.go
Outdated
// create a new empty multimap
func NewMultiMap[K, V comparable]() MultiMap[K, V] {
	return make(map[K][]V)
}
Pretty sure the docstrings are supposed to start with the name of the function, or am I wrong there?
You are 100% right
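For reference, the Go convention is that a doc comment on a declaration begins with the name it describes. A minimal sketch of how the comments from the diff could read under that convention (the comment wording is illustrative, not the text that ended up in the repository):

```go
package main

import "fmt"

// MultiMap is a map from keys to slices of values.
type MultiMap[K, V comparable] map[K][]V

// NewMultiMap creates a new empty MultiMap.
func NewMultiMap[K, V comparable]() MultiMap[K, V] {
	return make(MultiMap[K, V])
}

func main() {
	m := NewMultiMap[string, string]()
	fmt.Println(len(m))
}
```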
io/genbank/genbank_test.go
Outdated
// testcase for subset of unusual file
// includes implicit genome range with partial
// and origin is replaced with contig
func TestParseS288C_IX(t *testing.T) {
	_, err := ReadMulti("../../data/NC_001141.2_redux.gb")
	assert.Nil(t, err)
}
Specifically mention that this is a regression test for that particular test case.
Closed in favor of #394
Changes in this PR
Clearly and concisely summarize the changes you are making. Bullet points are completely okay. Please be specific, saying "improves X" is not enough!
- … Gff.AddFeature() code is misleading and mutates Feature state #342. I don't think it's actually necessary to change Gff.AddFeature() as this would just unnecessarily slow down the parser, but these methods should support the pattern described by the issue raiser.
- … Feature.StoreSequence to enable Genbank import and export from JSON should include feature sequences #388.
- … ParserError struct.
Why are you making these changes?
Explain why these changes are necessary. Link to GitHub issues here with the format fixes: #XXX to indicate this PR resolves the issue.
Are any changes breaking? (IMPORTANT)
Will merging this PR change poly's API in a non-backwards-compatible manner?
Yes
- … map[string]string to MultiMap[string, string], an alias for map[string][]string defined in multimap.go. This can be seen via adjustments to unit tests.
Pre-merge checklist
All of these must be satisfied before this PR is considered ready for merging. Mergeable PRs will be prioritized for review.
- … primers/primers_test.go for what this might look like.
- … CHANGELOG.md in the [Unreleased] section.
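To illustrate the breaking change described above, here is a hedged sketch of how calling code might adapt when an attribute map changes from map[string]string to MultiMap[string, string]. The `First` helper and the attribute names are hypothetical and are not part of the PR.

```go
package main

import "fmt"

// MultiMap has the same shape as the type defined in multimap.go.
type MultiMap[K, V comparable] map[K][]V

// First returns the first value stored under key and whether one exists
// (hypothetical convenience helper for callers that expect a single value).
func First[K, V comparable](m MultiMap[K, V], key K) (V, bool) {
	values := m[key]
	if len(values) == 0 {
		var zero V
		return zero, false
	}
	return values[0], true
}

func main() {
	// Before: indexing yields a single string per key.
	old := map[string]string{"gene": "abc"}
	fmt.Println(old["gene"])

	// After: indexing yields a slice, so repeated qualifiers are preserved.
	attrs := MultiMap[string, string]{"gene": {"abc"}, "db_xref": {"x", "y"}}
	if gene, ok := First(attrs, "gene"); ok {
		fmt.Println(gene)
	}
	fmt.Println(attrs["db_xref"])
}
```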