SmartAPI annotation + bug-chasing: ORPHANET -> orphanet #640

Closed
colleenXu opened this issue May 12, 2023 · 16 comments
Labels
external (Requires fixes to an external service), On Test (Related changes are deployed to Test server), x-bte

Comments

@colleenXu
Collaborator

colleenXu commented May 12, 2023

biolink-model folks say the prefix should be orphanet, not the ORPHANET that we've been using. See biolink/biolink-model#1198

MyDisease and BioThings RARe-SOURCE are the two APIs we have that use this ID-namespace. Earlier I tried changing MyDisease to use the all-lowercase prefix (oops, mixed into this commit), but I ran into issues when testing and reverted it.

While the issue could be the x-bte annotation, it could also be a bug in BTE or an issue with the SRI Node Normalizer response. The SRI Node Normalizer currently uses the all-CAPS prefix, but it seems to be case-agnostic for the input so I'm not sure what's going on...

@rjawesome
Contributor

rjawesome commented May 23, 2023

The PR allows lowercase orphanet (or uppercase ORPHANET) to be used in input curies. However, the output from BTE is still uppercase ORPHANET because of the Node Normalizer output.

@colleenXu
Collaborator Author

colleenXu commented Jul 15, 2023

@rjawesome @tokebe

I just tested, and the linked, merged PR and current main-branch code don't seem to address this issue.

Here's how I'm testing:

  1. Take a local copy of the SmartAPI yaml for NCATS RARe-SOURCE. Use an override pointing to the local file, like this:
contents of biothings_explorer/src/config/smartapi_overrides.json
{
  "conf": {
    "only_overrides": true
  },
  "apis": {
    "b772ebfbfa536bba37764d7fddb11d6f": "file:///Users/colleenxu/Desktop/translator-api-registry/ncats_rare_source/smartapi.yaml"
  }
}
  2. Comment out the references to the diseaseUMLS operations (~lines 314-315) so that only the orphanet operations are used. Keep the current annotation (all-caps ORPHANET).
    a. Remember to save the file and run smartapi-sync so these modified file contents are retrieved.
  3. Start local BTE and run the testExamples query for the operation diseaseOrphanet-gene (see TRAPI query below). The sub-query will run successfully using this operation (logs included below).
query for testing
{
    "message": {
        "query_graph": {
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                }
            },
            "nodes": {
                "n0": {
                    "ids": ["ORPHANET:110"],
                    "categories": ["biolink:Disease"]
                },
                "n1": {
                    "categories": ["biolink:Gene"]
                }
            }
        }
    }
}
console logs when ORPHANET is used
  bte:biothings-explorer-trapi:edge-manager (5) Executing current edge >> "e01" +0ms
  bte:biothings-explorer-trapi:batch_edge_query Node Update Start +0ms
  bte:biothings-explorer-trapi:nodeUpdateHandler Getting equivalent IDs... +0ms
  bte:biothings-explorer-trapi:nodeUpdateHandler curies: {"Disease":["ORPHANET:110"],"PhenotypicFeature":["ORPHANET:110"],"BehavioralFeature":["ORPHANET:110"],"ClinicalFinding":["ORPHANET:110"],"DiseaseOrPhenotypicFeature":["ORPHANET:110"]} +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'DiseaseOrPhenotypicFeature' doesn't match input semantic type 'Disease' for curie 'MONDO:0015229'. Adding entry for 'Disease'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'DiseaseOrPhenotypicFeature' doesn't match input semantic type 'PhenotypicFeature' for curie 'MONDO:0015229'. Adding entry for 'PhenotypicFeature'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'DiseaseOrPhenotypicFeature' doesn't match input semantic type 'BehavioralFeature' for curie 'MONDO:0015229'. Adding entry for 'BehavioralFeature'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'DiseaseOrPhenotypicFeature' doesn't match input semantic type 'ClinicalFinding' for curie 'MONDO:0015229'. Adding entry for 'ClinicalFinding'. +0ms
  bte:biothings-explorer-trapi:nodeUpdateHandler Got Edge Equivalent IDs successfully. +273ms
  bte:biothings-explorer-trapi:batch_edge_query Node Update Success +273ms
  bte:biothings-explorer-trapi:batch_edge_query Start to convert qEdges into APIEdges.... +0ms
  bte:biothings-explorer-trapi:qedge2btedge Input node is n0 +275ms
  bte:biothings-explorer-trapi:qedge2btedge Output node is n1 +0ms
  bte:biothings-explorer-trapi:qedge2btedge KG Filters: {
  bte:biothings-explorer-trapi:qedge2btedge   "input_type": [
  bte:biothings-explorer-trapi:qedge2btedge     "Disease",
  bte:biothings-explorer-trapi:qedge2btedge     "PhenotypicFeature",
  bte:biothings-explorer-trapi:qedge2btedge     "BehavioralFeature",
  bte:biothings-explorer-trapi:qedge2btedge     "ClinicalFinding",
  bte:biothings-explorer-trapi:qedge2btedge     "DiseaseOrPhenotypicFeature"
  bte:biothings-explorer-trapi:qedge2btedge   ],
  bte:biothings-explorer-trapi:qedge2btedge   "output_type": [
  bte:biothings-explorer-trapi:qedge2btedge     "Gene"
  bte:biothings-explorer-trapi:qedge2btedge   ]
  bte:biothings-explorer-trapi:qedge2btedge } +0ms
  bte:biothings-explorer-trapi:qedge2btedge 1 APIs being used: ["BioThings RARe-SOURCE API"] +1ms
  bte:biothings-explorer-trapi:qedge2btedge 1 SmartAPI edges are retrieved.... +0ms
  bte:biothings-explorer-trapi:qedge2btedge Input prefix: ORPHANET +0ms
  bte:biothings-explorer-trapi:qedge2btedge 1 metaKG are created.... +0ms
  bte:biothings-explorer-trapi:qedge2btedge BTE found 1 metaKG for this batch. +0ms
  bte:biothings-explorer-trapi:batch_edge_query qEdges are successfully converted into 1 APIEdges.... +3ms
  bte:biothings-explorer-trapi:batch_edge_query Start to query APIEdges.... +0ms
  bte:call-apis:query Resolving ID feature is turned on +0ms
  bte:call-apis:query call-apis: 1 planned queries for edge e01 +0ms
  bte:call-apis:query using template builder +0ms
  bte:call-apis:query {
  bte:call-apis:query   url: 'https://biothings.ncats.io/rare_source/query',
  bte:call-apis:query   params: { with_total: true, fields: 'entrezgene,symbol', size: 1000 },
  bte:call-apis:query   data: 'q=110&scopes=raresource.disease.orphanet',
  bte:call-apis:query   method: 'post',
  bte:call-apis:query   timeout: 50000,
  bte:call-apis:query   headers: { 'User-Agent': 'BTE/dev Node/v18.16.1 darwin' }
  bte:call-apis:query } +6ms
  bte:call-apis:query query success, transforming hits->records... +280ms
  bte:api-response-transform:index api name BioThings RARe-SOURCE API +0ms
  bte:api-response-transform:index api tags: gene,disease,annotation,query,translator,biothings +0ms
  bte:call-apis:query Successful POST https://biothings.ncats.io/rare_source (1 ID): Disease > condition_associated_with_gene > Gene (obtained 26 records, took 278ms) +12ms
  bte:call-apis:query query completes. +0ms
  bte:call-apis:query Total number of records returned for this query is 26 +0ms
  4. Then replace ORPHANET -> orphanet in the SmartAPI yaml (match case!). Save the file and run smartapi-sync. Run the same query (you can replace ORPHANET -> orphanet in the query too, but it doesn't matter). The NodeNorm step seems to work fine, but no sub-query is generated, so nothing gets queried. This is the bug I'm referring to.
console logs when orphanet is used: sub-query not generated properly
  bte:biothings-explorer-trapi:edge-manager (5) Executing current edge >> "e01" +1ms
  bte:biothings-explorer-trapi:batch_edge_query Node Update Start +0ms
  bte:biothings-explorer-trapi:nodeUpdateHandler Getting equivalent IDs... +0ms
  bte:biothings-explorer-trapi:nodeUpdateHandler curies: {"Disease":["orphanet:110"],"PhenotypicFeature":["orphanet:110"],"BehavioralFeature":["orphanet:110"],"ClinicalFinding":["orphanet:110"],"DiseaseOrPhenotypicFeature":["orphanet:110"]} +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'DiseaseOrPhenotypicFeature' doesn't match input semantic type 'Disease' for curie 'MONDO:0015229'. Adding entry for 'Disease'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'DiseaseOrPhenotypicFeature' doesn't match input semantic type 'PhenotypicFeature' for curie 'MONDO:0015229'. Adding entry for 'PhenotypicFeature'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'DiseaseOrPhenotypicFeature' doesn't match input semantic type 'BehavioralFeature' for curie 'MONDO:0015229'. Adding entry for 'BehavioralFeature'. +0ms
  bte:biomedical-id-resolver:SRI SRI resolved type 'DiseaseOrPhenotypicFeature' doesn't match input semantic type 'ClinicalFinding' for curie 'MONDO:0015229'. Adding entry for 'ClinicalFinding'. +0ms
  bte:biothings-explorer-trapi:nodeUpdateHandler Got Edge Equivalent IDs successfully. +272ms
  bte:biothings-explorer-trapi:batch_edge_query Node Update Success +272ms
  bte:biothings-explorer-trapi:batch_edge_query Start to convert qEdges into APIEdges.... +1ms
  bte:biothings-explorer-trapi:qedge2btedge Input node is n0 +275ms
  bte:biothings-explorer-trapi:qedge2btedge Output node is n1 +0ms
  bte:biothings-explorer-trapi:qedge2btedge KG Filters: {
  bte:biothings-explorer-trapi:qedge2btedge   "input_type": [
  bte:biothings-explorer-trapi:qedge2btedge     "Disease",
  bte:biothings-explorer-trapi:qedge2btedge     "PhenotypicFeature",
  bte:biothings-explorer-trapi:qedge2btedge     "BehavioralFeature",
  bte:biothings-explorer-trapi:qedge2btedge     "ClinicalFinding",
  bte:biothings-explorer-trapi:qedge2btedge     "DiseaseOrPhenotypicFeature"
  bte:biothings-explorer-trapi:qedge2btedge   ],
  bte:biothings-explorer-trapi:qedge2btedge   "output_type": [
  bte:biothings-explorer-trapi:qedge2btedge     "Gene"
  bte:biothings-explorer-trapi:qedge2btedge   ]
  bte:biothings-explorer-trapi:qedge2btedge } +1ms
  bte:biothings-explorer-trapi:qedge2btedge 1 APIs being used: ["BioThings RARe-SOURCE API"] +0ms
  bte:biothings-explorer-trapi:qedge2btedge 1 SmartAPI edges are retrieved.... +0ms
  bte:biothings-explorer-trapi:qedge2btedge Input prefix: orphanet +0ms
  bte:biothings-explorer-trapi:qedge2btedge 0 metaKG are created.... +1ms
  bte:biothings-explorer-trapi:qedge2btedge No metaKG found for this query batch. +0ms
  bte:biothings-explorer-trapi:batch_edge_query qEdges are successfully converted into 0 APIEdges.... +2ms
  bte:biothings-explorer-trapi:edge-manager (X) Terminating..."e01" got 0 records. +275ms

@rjawesome
Contributor

See new PR

@colleenXu
Collaborator Author

Still doesn't seem to work: when I test, records are dropped during the "edge-management" step.

@rjawesome I'd like to pause the PR / coding work, and discuss first (see next post).

console logs
  bte:biothings-explorer-trapi:QEdge Collected entity ids in records: ["BiologicalEntity","Gene"] +1ms
  bte:biothings-explorer-trapi:QNode Node "n1" saving (26) curies... +1s
  bte:biothings-explorer-trapi:QEdge (7) Updating Entities in "e01" +0ms
  bte:biothings-explorer-trapi:QEdge (7) Collecting Types: "["Disease","PhenotypicFeature","BehavioralFeature","ClinicalFinding","DiseaseOrPhenotypicFeature"]" +0ms
  bte:biothings-explorer-trapi:QEdge Collected entity ids in records: [] +0ms
  bte:biothings-explorer-trapi:QNode Node "n0" intersecting (1)/(0) curies... +0ms
  bte:biothings-explorer-trapi:QNode Node "n0" kept (0) curies... +0ms
  bte:biothings-explorer-trapi:edge-manager 'e01' Reversed[false] (0)--(26) entities / (26) records. +1s
  bte:biothings-explorer-trapi:edge-manager 'e01' dropped (26) records. +0ms
  bte:biothings-explorer-trapi:QEdge (6) Storing records... +0ms
  bte:biothings-explorer-trapi:QEdge (6) Applying Node Constraints to 0 records. +0ms

@colleenXu
Collaborator Author

colleenXu commented Jul 17, 2023

@rjawesome

Would you say this is mainly happening because of the NodeNorm output for the ID being ORPHANET? And that our tool relies on ID-namespace/prefixes matching exactly (spelling and case) between NodeNorm output and x-bte annotation?

(I may not have understood your earlier post >.<)

If "yes, the main issue is NodeNorm output", then I can raise this issue to NodeNorm / biolink-model folks. It may be more an issue of their output than a bug in our tool's behavior...

@rjawesome
Contributor

rjawesome commented Jul 17, 2023

> Would you say this is mainly happening because of the NodeNorm output for the ID being ORPHANET? And that our tool relies on ID-namespace/prefixes matching exactly (spelling and case) between NodeNorm output and x-bte annotation?

From what I'm seeing, it seems like that is the main issue, and it should be fixed if NodeNorm fixes their capitalization. However, I still think it would be better for BTE to be case-insensitive so that it is easier to use overall.
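
To make the exact-match dependency concrete, here is a minimal sketch. It is not BTE's actual code: `MetaEdge`, `split_curie`, and `plan_subqueries` are hypothetical names, and the data shapes are simplified. It only illustrates how a sub-query can disappear when the prefix in the x-bte annotation and the prefix in the resolver output differ only by case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaEdge:
    """Simplified stand-in for one x-bte operation (hypothetical shape)."""
    api: str
    input_prefix: str
    output_type: str


def split_curie(curie):
    """Split 'PREFIX:local_id' on the first colon."""
    prefix, _, local_id = curie.partition(":")
    return prefix, local_id


def plan_subqueries(edges, resolved_ids, case_sensitive=True):
    """Pair each edge with the local IDs whose prefix matches its annotation."""
    planned = []
    for edge in edges:
        wanted = edge.input_prefix if case_sensitive else edge.input_prefix.lower()
        local_ids = []
        for curie in resolved_ids:
            prefix, local_id = split_curie(curie)
            candidate = prefix if case_sensitive else prefix.lower()
            if candidate == wanted:
                local_ids.append(local_id)
        # An edge with no usable input IDs produces no sub-query at all.
        if local_ids:
            planned.append((edge.api, local_ids))
    return planned


# Equivalent IDs as a resolver that emits the all-caps prefix would return them.
resolved = ["MONDO:0015229", "ORPHANET:110"]

caps = [MetaEdge("BioThings RARe-SOURCE API", "ORPHANET", "Gene")]
lower = [MetaEdge("BioThings RARe-SOURCE API", "orphanet", "Gene")]

print(plan_subqueries(caps, resolved))   # [('BioThings RARe-SOURCE API', ['110'])]
print(plan_subqueries(lower, resolved))  # []
print(plan_subqueries(lower, resolved, case_sensitive=False))
```

With the lowercase annotation and exact matching, nothing is planned, which mirrors the "0 metaKG are created" line in the logs above. The `case_sensitive=False` call shows what a case-insensitive comparison would recover.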

@colleenXu
Collaborator Author

colleenXu commented Jul 18, 2023

Okay, I'll raise this as an issue for Node Norm / biolink-model tomorrow.

On your second point on "case insensitive"...it seems like there are multiple ways to define this:

  • it seems like this is easy to do for user-provided QNode IDs (aka BTE can handle if someone queries with "NCBIGene" or "NCBIGENE" or "ncbigene"). I'm okay with adding this behavior, but I dunno how useful it'll be...
  • this seems to be buggy / tricky for this issue, which has to do with NodeNorm output and x-bte annotation. I'm okay with NOT being "case insensitive" here...since our "case insensitive" behavior ultimately matches the biolink-model data model we're using and eventually we could create tools that make it easier to write x-bte annotation...
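
For the first bullet (user-provided QNode IDs), a rough sketch of what "case insensitive" could mean is below. This is an assumption-laden illustration, not existing BTE behavior: `CANONICAL_PREFIXES` is a short made-up list rather than the real biolink-model prefix registry, and `canonicalize_curie` is a hypothetical helper.

```python
# Canonical spellings. Illustrative only, not the full biolink-model registry.
CANONICAL_PREFIXES = ["NCBIGene", "MONDO", "orphanet"]

# Lowercased prefix -> canonical spelling, built once.
_LOOKUP = {prefix.lower(): prefix for prefix in CANONICAL_PREFIXES}


def canonicalize_curie(curie):
    """Rewrite a CURIE's prefix to its canonical case, leaving the local ID alone."""
    prefix, sep, local_id = curie.partition(":")
    if not sep:
        # Not a CURIE (no colon): return unchanged.
        return curie
    # Unknown prefixes are passed through untouched.
    canonical = _LOOKUP.get(prefix.lower(), prefix)
    return f"{canonical}:{local_id}"


for user_input in ["NCBIGene:1017", "NCBIGENE:1017", "ncbigene:1017", "ORPHANET:110"]:
    print(user_input, "->", canonicalize_curie(user_input))
```

This only covers the user-input side. It deliberately does not touch the second bullet, where NodeNorm output and the x-bte annotation have to agree.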

@tokebe added the "external" label (Requires fixes to an external service) Aug 2, 2023
@colleenXu
Collaborator Author

colleenXu commented Aug 2, 2023

Note that this discussion on "case insensitive" hasn't happened yet... Discussion done on this issue's status. See post in PR biothings/bte_trapi_query_graph_handler#160 (comment)

@tokebe
Member

tokebe commented Oct 25, 2023

Related to #591

@colleenXu
Collaborator Author

colleenXu commented Dec 5, 2023

Update! NodeNorm is rolling out an update that will change ORPHANET -> orphanet in their responses.

It looks like we haven't addressed #731 yet, so all instances of BTE are still using NodeNorm Prod. So I think we shouldn't deploy x-bte changes for ORPHANET -> orphanet until after NodeNorm Prod is updated. EDIT: see next comment

EDIT, NOTE: I'm not sure if the NodeNorm/prefix change will break any of BTE's tests. I see some test info in bte-server that has ORPHANET text-matches. @tokebe

@colleenXu
Collaborator Author

colleenXu commented Dec 6, 2023

We've decided to use overrides to implement the x-bte changes as the NodeNorm update rolls out.

Jackson said he plans to work on the "BTE using instance-specific NodeNorm" feature

  • Then if this code is live on an instance + the corresponding NodeNorm has updated to use orphanet, we'll want that instance to also use these overrides.
    • example: right now, NodeNorm dev is updated. So we'd need to use these overrides if BTE dev is using NodeNorm dev.

@colleenXu added the "x-bte" label Dec 6, 2023
@tokebe added the "On Dev" label (Related changes are deployed to Dev server) Dec 12, 2023
@colleenXu
Collaborator Author

Update: we're using overrides for the 3 KPs that have orphanet IDs (mydisease, biothings rare-source, ComplexPortal) -> see this commit

I think we can close this issue once:

  • the overrides are deployed to Prod
  • I merge the yaml PR

We'll then have a separate process to remove the overrides (not needed once the yaml PRs are all merged / registrations refreshed).

@colleenXu
Collaborator Author

colleenXu commented Dec 16, 2023

@tokebe

I double-checked and it's not working on CI, probably because of the larger cache-update issues (recent lab Slack convo)

My test

POST to MyDisease through BTE CI https://bte.ci.transltr.io/v1/smartapi/671b45c0301c8624abbd26ae78449ca2/query (from this testExample)

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": ["biolink:Disease"],
                    "ids": ["orphanet:881"]
                },
                "n1": {
                    "categories": ["biolink:PhenotypicFeature"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:has_phenotype"]
                }
            }
        }
    }
}

Right now, it seems like the MetaEdges aren't successfully turned into sub-queries. This could be because NodeNorm CI is using orphanet but BTE CI is using the registered yaml (ORPHANET) rather than the overrides (orphanet).

I only see the logs in the TRAPI response

       {
            "timestamp": "2023-12-16T06:00:10.748Z",
            "level": "DEBUG",
            "message": "BTE is trying to find metaKG edges (smartAPI registry, x-bte annotation) connecting from BehavioralFeature,ClinicalFinding,Disease,DiseaseOrPhenotypicFeature,PhenotypicFeature to BehavioralFeature,ClinicalFinding,PhenotypicFeature with predicate has_phenotype",
            "code": null
        },
        {
            "timestamp": "2023-12-16T06:00:10.749Z",
            "level": "DEBUG",
            "message": "BTE found 2 metaKG edges corresponding to e01. These metaKG edges comes from 1 unique APIs. They are MyDisease.info API",
            "code": null
        },
        {
            "timestamp": "2023-12-16T06:00:10.749Z",
            "level": "WARNING",
            "message": "BTE didn't find any metaKG for this batch. Your query terminates.",
            "code": null
        },
        {
            "timestamp": "2023-12-16T06:00:10.749Z",
            "level": "INFO",
            "message": "e01 execution: 0 queries (0 success/0 fail) and (0) cached qEdges return (0) records",
            "code": null
        },
        {
            "timestamp": "2023-12-16T06:00:10.749Z",
            "level": "WARNING",
            "message": "qEdge (e01) got 0 records. Your query terminates.",
            "code": null
        }

@tokebe
Member

tokebe commented Dec 18, 2023

Issue should now be addressed by 3019cec, please test again

@colleenXu
Collaborator Author

colleenXu commented Dec 18, 2023

Now it's working on BTE CI! Yay!

EDIT: And while we are deploying this to Test soon, it may not work until NodeNorm Test gets the orphanet prefix update...

@colleenXu added the "On CI" label (Related changes are deployed to CI server) and removed the "On Dev" label (Related changes are deployed to Dev server) Dec 18, 2023
@tokebe added the "On Test" label (Related changes are deployed to Test server) and removed the "On CI" label (Related changes are deployed to CI server) Dec 21, 2023
@colleenXu
Collaborator Author

colleenXu commented Feb 21, 2024

I've confirmed that things work as expected after the Prod deployment. Closing issue, updating the registered yamls and registrations, and opening another issue for removing the overrides.

Example: a POST to https://bte.transltr.io/v1/smartapi/671b45c0301c8624abbd26ae78449ca2/query with the query below gets a response with results.

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": ["biolink:Disease"],
                    "ids": ["orphanet:881"]
                },
                "n1": {
                    "categories": ["biolink:PhenotypicFeature"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:has_phenotype"]
                }
            }
        }
    }
}
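
For anyone who wants to script this check, here is a minimal sketch using only the Python standard library. The helper names (`build_query`, `build_request`, `count_results`) are my own and not part of BTE. The only response structure it relies on is the standard TRAPI `message.results` list. The block builds the request but does not send it; call `count_results` yourself to hit the server.

```python
import json
import urllib.request

# Endpoint from the comment above (MyDisease through BTE Prod).
BTE_URL = "https://bte.transltr.io/v1/smartapi/671b45c0301c8624abbd26ae78449ca2/query"


def build_query(disease_curie):
    """Build the one-hop Disease -> PhenotypicFeature TRAPI query used above."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"categories": ["biolink:Disease"], "ids": [disease_curie]},
                    "n1": {"categories": ["biolink:PhenotypicFeature"]},
                },
                "edges": {
                    "e01": {
                        "subject": "n0",
                        "object": "n1",
                        "predicates": ["biolink:has_phenotype"],
                    }
                },
            }
        }
    }


def build_request(url, query):
    """Wrap the query in a JSON POST request without sending it."""
    body = json.dumps(query).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def count_results(url, query, timeout=60.0):
    """Send the query and count message.results. Performs a real network call."""
    with urllib.request.urlopen(build_request(url, query), timeout=timeout) as resp:
        payload = json.load(resp)
    return len((payload.get("message") or {}).get("results") or [])


request = build_request(BTE_URL, build_query("orphanet:881"))
print(request.get_method(), request.full_url)
```

A non-zero count from `count_results(BTE_URL, build_query("orphanet:881"))` would correspond to the "response with results" described above.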
