Noticed during 1:1 w/ @kimandrews that USA records have a mix of states and counties in the division and location fields.
Small subset of examples

| accession | accession_version | strain | date | region | country | division | location |
| --- | --- | --- | --- | --- | --- | --- | --- |
| AF394868 | AF394868.1 | 1265 | | North America | USA | Monterey | California |
| AF394869 | AF394869.1 | 2253 | | North America | USA | California | |
| AF394870 | AF394870.1 | 3044 | | North America | USA | Arizona | |
| AF394871 | AF394871.1 | 1566 | | North America | USA | Plumas County | California |
| AF394872 | AF394872.1 | 2847 | | North America | USA | Lewis County | Washington |
Looking at the GenBank record for one of the examples (AF394868), the geo_loc_name is "USA: Monterey, California".
This does not follow the pattern for GenBank's geo_loc_name (<country_value>[:<region>][, <locality>]) that we expect in augur curate parse-genbank-location: the locality (Monterey) comes before the state (California), so Monterey ends up in division and California in location.
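To illustrate how such a string gets mis-assigned, here is a minimal sketch of a parser for the expected pattern. It is not the actual augur implementation: `parse_geo_loc_name` is a hypothetical helper, and the mapping of region to division and locality to location is an assumption based on the example records above.

```python
def parse_geo_loc_name(geo_loc_name: str) -> dict:
    """Split '<country>[:<region>][, <locality>]' into separate fields.

    Hypothetical helper for illustration; not augur's code.
    """
    # Everything before the first ':' is the country.
    country, _, rest = geo_loc_name.partition(":")
    # Within the remainder, the first ',' separates region from locality.
    region, _, locality = rest.partition(",")
    return {
        "country": country.strip(),
        "division": region.strip(),   # assumed mapping: region -> division
        "location": locality.strip(), # assumed mapping: locality -> location
    }


record = parse_geo_loc_name("USA: Monterey, California")
print(record)
# {'country': 'USA', 'division': 'Monterey', 'location': 'California'}
```

Because the parser trusts the order of the fields, a record that lists the locality first puts it in division and the state in location.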
Possible solutions
Add geolocation rules to correct these records for rabies
augur curate parse-genbank-location (parse-genbank-location should warn about region/locality mix ups, augur#1578)
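Either solution first needs a way to spot the affected records. The sketch below is one possible check, not an existing augur feature: `is_swapped` and `US_STATES` are hypothetical names, and the state list is deliberately partial, covering only the states in the examples above.

```python
# Partial list for illustration only; a real check would need all US states.
US_STATES = {"Arizona", "California", "Washington"}


def is_swapped(division: str, location: str) -> bool:
    """Return True when a state name sits in location but not in division."""
    return location in US_STATES and division not in US_STATES


# (accession, division, location) taken from the example records.
records = [
    ("AF394868", "Monterey", "California"),
    ("AF394869", "California", ""),
    ("AF394870", "Arizona", ""),
    ("AF394871", "Plumas County", "California"),
    ("AF394872", "Lewis County", "Washington"),
]

flagged = [acc for acc, division, location in records if is_swapped(division, location)]
print(flagged)  # ['AF394868', 'AF394871', 'AF394872']
```

The flagged accessions are the ones that would need either a geolocation rule or a warning from the parser.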