Update scripts #121

acrinklaw · 2021-09-27T21:11:57Z

The only comments I have for this are related to things that we have gone back and forth on:

This script currently:

Accepts alleles > 2 fields long
Takes alleles with suffixes

Both of these can be changed very easily if this isn't the desired behavior.
Line 79 sets the number of fields:
allele = (":").join(allele.split(":")[:4])

Line 133 can be copied and changed to remove sequences with suffixes, e.g. BoLA-1*01:01N:
missing_alleles = {x for x in missing_alleles if x[-1] != 'N'}
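For anyone weighing the two options, here is a minimal sketch of both adjustments side by side. The function names and the sample alleles are hypothetical; only the two one-line expressions come from the script.

```python
def truncate_fields(allele, max_fields=4):
    # Same expression as Line 79, with the field count exposed as a parameter.
    return ":".join(allele.split(":")[:max_fields])


def drop_suffixed(alleles, suffixes=("N",)):
    # Same idea as Line 133: discard names whose last character is a suffix letter.
    return {a for a in alleles if not a.endswith(tuple(suffixes))}


# Hypothetical sample data, not taken from the database.
alleles = {"HLA-A*01:01:01:01", "HLA-A*01:01:02", "BoLA-1*01:01N"}

two_field = {truncate_fields(a, 2) for a in alleles}
kept = drop_suffixed(alleles)
```

Note that truncating to 2 fields merges the two HLA-A names into one, while the suffix filter only looks at the final character of the name as given.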

beckyjackson · 2021-10-15T16:23:47Z

When I run this, I see 11771 new terms added to MRO. Is this expected?

The script also prints a lot of things. It's printing out some dictionaries that I don't think are useful to log (maybe for debugging?), but I also get this message multiple times:

HLA-DRA not available in locus-data.json, please add it.

acrinklaw · 2021-10-15T16:29:47Z

> When I run this, I see 11771 new terms added to MRO. Is this expected?
>
> The script also prints a lot of things. It's printing out some dictionaries that I don't think are useful to log (maybe for debugging?), but I also get this message multiple times:
> HLA-DRA not available in locus-data.json, please add it.

The 11k sequences seem right, but only because I am not restricting the HLA alleles to 2 fields (meaning there are probably near-duplicates, e.g. HLA-A*01:01 -> HLA-A*01:01:01, HLA-A*01:01:02). These are technically distinct alleles, but often the mutations they refer to have no effect on binding, so they are effectively the same. We went back and forth on this a lot and I don't know what's best. We settled on 2 fields when I was working on HLA, but when the tools team needed the terms, they needed the full fields. That can be altered on Line 79.
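To get a feel for how many of the new terms would merge under a 2-field restriction, something like the following could be run over the list of added allele names. This is a rough sketch: `group_by_two_fields` and the sample names are hypothetical, and it only compares names, not sequences.

```python
from collections import defaultdict


def group_by_two_fields(alleles):
    """Map each 2-field name to the full names that would collapse into it."""
    groups = defaultdict(set)
    for allele in alleles:
        key = ":".join(allele.split(":")[:2])
        groups[key].add(allele)
    # Keep only the 2-field names shared by more than one full name.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Hypothetical sample data.
collapsed = group_by_two_fields(
    ["HLA-A*01:01:01", "HLA-A*01:01:02", "HLA-A*02:01"]
)
```

Here `collapsed` contains a single entry for HLA-A*01:01, listing the two 3-field names that share it.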

For the print statements, you can definitely remove them, as they were for debugging. I was in a bit of a rush to make this PR before I left, so my apologies :^( That message about HLA-DR also seems to be a bit off; it suggests there's something weird going on between the JSON data file and what the script finds in the database. I can debug after work if you'd like.
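One possible way to keep the debugging output available without printing it on every run is to route it through the standard `logging` module and report each missing locus once. This is only a sketch: the logger name, the `find_missing_loci` helper, and the assumption that locus-data.json is keyed by locus name are all hypothetical.

```python
import logging

logger = logging.getLogger("update_script")  # hypothetical logger name


def find_missing_loci(loci_in_db, locus_data):
    """Return loci seen in the database but absent from the JSON data, warning once each."""
    # Assumes locus_data is a dict keyed by locus name (not confirmed by the script).
    missing = sorted(set(loci_in_db) - set(locus_data))
    for locus in missing:
        logger.warning("%s not available in locus-data.json, please add it.", locus)
    return missing


logging.basicConfig(level=logging.INFO)  # DEBUG messages stay hidden unless requested
logger.debug("intermediate dictionary: %s", {"example": 1})  # former print statement

# Hypothetical sample data: the duplicate locus produces a single warning.
missing = find_missing_loci(["HLA-DRA", "HLA-DRA", "HLA-A"], {"HLA-A": {}})
```

Deduplicating with a set means the HLA-DRA message appears once rather than once per allele.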

rvita · 2021-10-15T16:31:45Z

Are these the G domains? If so, I'd like to see a list of those added, as well as the full seqs for comparison.



acrinklaw · 2021-10-15T16:35:41Z

@rvita these are the full sequences. Maybe Apurva can combine his G domain work with mine to clear up the multiple fields issue, but this still doesn't address the tools team's needs.

acrinklaw added 2 commits September 27, 2021 12:32

Add new update script

0bbaa6c

Change update script to accept HLA alleles

562d974

jamesaoverton requested a review from beckyjackson October 14, 2021 18:12

jamesaoverton self-requested a review November 29, 2021 14:02
