annotate Complex Portal API #631

Closed
andrewsu opened this issue Apr 21, 2023 · 10 comments
Labels: data source, On Test (related changes are deployed to Test server), x-bte

Comments

@andrewsu
Member

Website: https://www.ebi.ac.uk/complexportal/home
Publication: https://academic.oup.com/nar/article/50/D1/D578/6414048
Description: The Complex Portal is a manually curated, encyclopaedic resource of macromolecular complexes from a number of key model organisms. The majority of complexes are made up of proteins but may also include nucleic acids or small molecules.

The API is described at https://www.ebi.ac.uk/intact/complex-ws/search/. Example API calls:

Note also that there are Complex - Disease annotations:

"diseases": [
    "Heinz body anemia [Orphanet:178330]: a form of nonspherocytic hemolytic anemia of Dacie type I.",
    "Hereditary persistence of fetal hemoglobin - beta-thalassemia (HPFH) [Orphanet:46532]: Characterized by high hemoglobin (Hb) F levels and an increased number of fetal-Hb-containing-cells.",
    "Beta-thalassemia [Orphanet:848]: Beta-thalassemia (BT) is characterized by deficiency (Beta+) or absence (Beta0) of synthesis of the beta globin chains of hemoglobin (Hb). Three main types of BT have been described (minor, intermedia [Orphanet:231222] and major [Orphanet:231214]). 1) Thalassemia minor (BT-minor, BT trait) is the heterozygous form and is usually asymptomatic. 2) Thalassemia major (Cooley anemia; BT-major) is the homozygous form and associates splenomegaly and microcytic and hypochromic anemia resulting from dyserythropoiesis and hemolysis. Onset generally occurs from 6-24 months of age. Patients require regular transfusions. 3) Thalassemia intermedia (BTI) in which the anemia is less severe and diagnosed later in life compared to BT-major. Patients with BTI may or may not require occasional transfusions. Rare autosomal dominant forms have also been described (dominant beta-thalassemia [Orphanet:231226]) resulting in moderate to severe anemia. Transmission is autosomal recessive and around 200 mutations (B0 or B+) have been identified.",
    "Sickle cell anemia [Orphanet:232]: a chronic hemolytic diseases that may induce three types of acute accidents: severe anemia, severe bacterial infections, and ischemic vasoocclusive accidents (VOA) caused by sickle-shaped red blood cells obstructing small blood vessels and capillaries. The presence of fetal hemoglobin means that the disease doesn't manifest until after 3 months. In addition to anemia and bacterial infections, VOAs cause hyperalgic focal ischemia (and sometimes infarction) when they occur in the muscles or skeleton. Over the course of time, VOAs may compromise the integrity of tissues or organs. Transmission is autosomal recessive. Sickle cell anemia is determined by combinations of two abnormal alleles of the beta globin gene among which at least one carries the beta 6 glu-val mutation (Hb S). Sickle cell anemia reduces the susceptibility to malaria infection.",
    "Alpha-thalassemia [Orphanet:846]: Alpha-thalassemia is an inherited hemoglobinopathy characterized by impaired synthesis of alpha-globin chains leading to a variable clinical picture depending on the number of affected alleles. The disease can be classified into clinical subtypes of increasing severity: silent alpha thalassemia, alpha thalassemia trait (or alpha thalassemia minor), hemoglobin H disease (HbH, [Orphanet:93616]), and Hb Bart's hydrops fetalis [Orphanet:163596]. A rare form called alpha-thalassemia-intellectual deficit syndrome linked to chromosome 16 (16p13.3) has also been identified. Alpha thalassemia trait causes microcytosis and hypochromia with absent or mild anemia (often detected on routine blood tests), generally with no other symptoms. HbH patients develop moderate hemolytic anemia with variable amounts of HbH along with occasionally severe splenomegaly, sometimes complicated by hypersplenism. Hb Bart's hydrops fetalis involves a severe deficiency in alpha-globin with serious developmental implications. Alpha-thalassemia-intellectual deficit syndrome is characterized by very mild to severe anemia associated with developmental abnormalities."
],
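Each entry in the `diseases` list packs a name, an Orphanet ID, and a free-text description into one string. A minimal sketch of how a client might pull those apart, assuming every entry follows the `<name> [Orphanet:<id>]: <description>` pattern seen in the examples above (the function name `parse_disease` is hypothetical, and the pattern is inferred only from these five strings):

```python
import re
from typing import Optional

# Assumed layout, inferred from the example strings only:
#   "<name> [Orphanet:<digits>]: <description>"
# The non-greedy name group stops at the FIRST bracketed Orphanet ID, so
# additional IDs mentioned later in the description are left in the text.
_DISEASE_RE = re.compile(
    r"^(?P<name>.+?)\s*\[Orphanet:(?P<id>\d+)\]:\s*(?P<description>.*)$",
    re.DOTALL,
)


def parse_disease(entry: str) -> Optional[dict]:
    """Split one `diseases` string into name, Orphanet ID, and description.

    Returns None when the string does not match the assumed layout.
    """
    match = _DISEASE_RE.match(entry)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "orphanet_id": match.group("id"),
        "description": match.group("description"),
    }
```

Entries that do not fit the pattern return `None` rather than raising, so a caller can log and skip them.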

Note that the website also shows links to related pathways. For example, https://www.ebi.ac.uk/complexportal/complex/CPX-2158 shows the content below.

image

However, these mappings are not currently available through the API. I've emailed the Complex Portal folks to see if they are willing/able to modify/extend the API around diseases and pathways to make it more easily accessible to BTE.

@rjawesome
Contributor

I am working on a yaml. It seems like it may need some jq post-processing to differentiate between proteins and chemicals.
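For illustration, here is the kind of split that post-processing step would have to make, written in Python rather than jq. This is only a sketch: the participant field names (`identifier`, `interactorType`) and the type values (`protein`, `small molecule`) are assumptions about the response shape and should be checked against a real response, and `split_participants` is a hypothetical helper.

```python
from typing import Iterable, List, Tuple


def split_participants(participants: Iterable[dict]) -> Tuple[List[str], List[str]]:
    """Separate complex participants into protein IDs and chemical IDs.

    Assumes each participant is a dict with an `identifier` and an
    `interactorType` field (assumed names, not confirmed in this thread).
    Participants of any other type are ignored.
    """
    proteins: List[str] = []
    chemicals: List[str] = []
    for participant in participants:
        kind = participant.get("interactorType", "").lower()
        if kind == "protein":
            proteins.append(participant["identifier"])
        elif kind == "small molecule":
            chemicals.append(participant["identifier"])
    return proteins, chemicals


# Illustrative input only; not copied from an actual API response.
example = [
    {"identifier": "P69905", "interactorType": "protein"},
    {"identifier": "CHEBI:30413", "interactorType": "small molecule"},
    {"identifier": "URS0000000001", "interactorType": "ribonucleic acid"},
]
proteins, chemicals = split_participants(example)
```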

@rjawesome
Contributor

ComplexPortal SmartAPI Yaml (uses jq processing): https://gist.github.com/rjawesome/020f3013a648f42e8326ba8df5a4f637
Supports Complex -> Chemical/Disease/Protein, and Chemical/Disease/Protein -> Complex
(when I was testing, I needed to specify the category "biolink:MacromolecularComplex" on the complex)

@colleenXu
Collaborator

colleenXu commented Dec 8, 2023

Related infores stuff is ready but not deployed yet:

colleenXu referenced this issue in NCATS-Tangerine/translator-api-registry Dec 13, 2023
@colleenXu
Collaborator

colleenXu commented Dec 13, 2023

Current status

Biolink-model v3.5.3 mapping notes

  • complex ID namespace is in biolink-model as ComplexPortal (in yaml and prefix-map)
  • complex's biolink-category is MacromolecularComplex, as pointed out by Rohan earlier
  • Using part_of/has_part predicates for Complex <-> Protein/Chemical relationships
  • Using related_to predicate for Complex <-> Disease relationships since the relationship between the complexes and the diseases isn't clear ("complex is linked to a specific disease condition" according to data-documentation)

To test this API locally, add it to your local config file

It's best to test this way, since then we can include the primarySource: true info and the TRAPI edge-sources info will display as intended.

In your local copy of https://github.com/biothings/bte-server/blob/main/src/config/apis.js, add the following item to the include list (I add it after the CTD API entry):

    {
      id: "326eb1e437303bee27d3cef29227125d",
      name: "Complex Portal Web Service",
      primarySource: true
    },

Then update your local copy of BTE's smartapi specs (pnpm build, then pnpm run smartapi_sync).

Then you can send a POST request to the API-specific endpoint (Complex Portal only), for example http://localhost:3000/v1/smartapi/326eb1e437303bee27d3cef29227125d/query

Put this in the request body. It queries with the protein hemoglobin subunit alpha (human):

{
    "message": {
        "query_graph": {
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1"
                }
            },
            "nodes": {
                "n0": {
                    "ids": ["UniProtKB:P69905"],
                    "categories": ["biolink:Protein"]
                },
                "n1": {
                    "categories": ["biolink:MacromolecularComplex"]
                }
            }
        }
    }
}
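The same request can be sent from a script. A minimal sketch using only the Python standard library, assuming a local BTE instance is listening on port 3000 as in the URL above (`build_query` and `post_query` are hypothetical helper names):

```python
import json
import urllib.request

# Local, API-specific endpoint from the instructions above.
BTE_URL = "http://localhost:3000/v1/smartapi/326eb1e437303bee27d3cef29227125d/query"


def build_query(protein_curie: str) -> dict:
    """Build the one-hop TRAPI query graph: Protein -> MacromolecularComplex."""
    return {
        "message": {
            "query_graph": {
                "edges": {"e01": {"subject": "n0", "object": "n1"}},
                "nodes": {
                    "n0": {
                        "ids": [protein_curie],
                        "categories": ["biolink:Protein"],
                    },
                    "n1": {"categories": ["biolink:MacromolecularComplex"]},
                },
            }
        }
    }


def post_query(body: dict, url: str = BTE_URL) -> dict:
    """POST the query as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (requires the local server to be running):
# response = post_query(build_query("UniProtKB:P69905"))
```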

You'll get a response with this node and edge:

                "ComplexPortal:CPX-2158": {
                    "categories": [
                        "biolink:MacromolecularComplex"
                    ],
                    "name": "Hemoglobin HbA complex",
                    "attributes": [
                        {
                            "attribute_type_id": "biolink:xref",
                            "value": [
                                "ComplexPortal:CPX-2158"
                            ]
                        },
                        {
                            "attribute_type_id": "biolink:synonym",
                            "value": [
                                "ComplexPortal:CPX-2158"
                            ]
                        }
                    ]
                }
                "6e42ea498eace1f667853945cee0b3ef": {
                    "predicate": "biolink:part_of",
                    "subject": "UniProtKB:P69905",
                    "object": "ComplexPortal:CPX-2158",
                    "attributes": [],
                    "sources": [
                        {
                            "resource_id": "infores:complex-portal",
                            "resource_role": "primary_knowledge_source"
                        },
                        {
                            "resource_id": "infores:service-provider-trapi",
                            "resource_role": "aggregator_knowledge_source",
                            "upstream_resource_ids": [
                                "infores:complex-portal"
                            ]
                        }
                    ]
                }
            }
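To confirm that the primarySource: true setting took effect, one can check which resource each returned edge reports as its primary knowledge source. A small sketch, assuming edges shaped like the excerpt above (`primary_sources` is a hypothetical helper):

```python
from typing import List


def primary_sources(edge: dict) -> List[str]:
    """Return the resource IDs an edge lists as primary knowledge sources."""
    return [
        source["resource_id"]
        for source in edge.get("sources", [])
        if source.get("resource_role") == "primary_knowledge_source"
    ]


# Edge copied from the response excerpt above.
edge = {
    "predicate": "biolink:part_of",
    "subject": "UniProtKB:P69905",
    "object": "ComplexPortal:CPX-2158",
    "attributes": [],
    "sources": [
        {
            "resource_id": "infores:complex-portal",
            "resource_role": "primary_knowledge_source",
        },
        {
            "resource_id": "infores:service-provider-trapi",
            "resource_role": "aggregator_knowledge_source",
            "upstream_resource_ids": ["infores:complex-portal"],
        },
    ],
}
found = primary_sources(edge)
```

For this edge, `found` contains only `infores:complex-portal`; the Service Provider entry is an aggregator and is filtered out.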

@colleenXu
Collaborator

colleenXu commented Dec 13, 2023

Discussed

I think this resource/yaml is ready to incorporate into BTE.
However, I'm waiting for the decision on infores catalog changes (whether changes made now can be used in this release cycle).

If we do want to incorporate this data-resource during this release cycle:

  • PR to add this to BTE's config list (set primarySource: true)
  • because this API uses ORPHANET IDs, we'd need to add it to the overrides. I already have a yaml for the override ready in the orphanet yaml PR

Added

I added operations for GO biological process -> Complex and GO molecular function -> Complex. But I didn't add the opposite operations (Complex -> GO terms) because it would require custom JQ-processing.

Here are my notes on the data, with example API queries

This info may be specific to the complex and not its parts: according to the data-documentation, "Annotation to Gene Ontology terms indicates the function, process, location and component of the complex as a whole"

Complex -> GO terms: using the same example as above, in the crossReferences field:

  • the GO terms are the objects where database = gene ontology
  • GO biological process terms are when qualifier = biological process
  • GO molecular function terms are when qualifier = molecular function
  • I'm not interested in the cellular component entries because it seems the same as the Complex entity...
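The filtering described in the bullets above could look roughly like this in Python. It is a sketch, not the custom JQ that BTE would need: the `database` and `qualifier` keys and their values come from the notes above, while the `identifier` key, the helper name `go_terms`, and the sample entries are assumptions for illustration.

```python
from typing import Iterable, List


def go_terms(cross_references: Iterable[dict], qualifier: str) -> List[str]:
    """Pick GO term IDs out of a complex's crossReferences list.

    Keeps entries where database == "gene ontology" and the qualifier
    matches, e.g. "biological process" or "molecular function".
    The `identifier` key is an assumed field name.
    """
    return [
        xref["identifier"]
        for xref in cross_references
        if xref.get("database") == "gene ontology"
        and xref.get("qualifier") == qualifier
    ]


# Illustrative entries only; not copied from an actual API response.
xrefs = [
    {"database": "gene ontology", "qualifier": "biological process", "identifier": "GO:0015671"},
    {"database": "gene ontology", "qualifier": "molecular function", "identifier": "GO:0019825"},
    {"database": "gene ontology", "qualifier": "cellular component", "identifier": "GO:0005833"},
    {"database": "pubmed", "qualifier": "see-also", "identifier": "12345"},
]
processes = go_terms(xrefs, "biological process")
functions = go_terms(xrefs, "molecular function")
```

Cellular component entries are simply never requested, which matches the decision above to skip them.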

GO terms -> Complex: example from the API documentation for /search/ endpoint

  • structure of the response looks the same as the other /search/ queries, so it wouldn't need JQ post-processing


Not done yet (for another issue?)

All operations starting from Complex ID require JQ

All operations starting from the Complex ID depend on custom JQ post-processing, which we need to add to BTE. Jackson @tokebe and I agreed to leave this for later.

  • Rohan @rjawesome wrote operations w/ JQ strings for Complex -> Chemical/Disease/Protein in the yaml, which I commented out and haven't tested
  • Nothing is written for Complex -> GO biological-process or GO molecular-function yet

@colleenXu
Collaborator

The infores stuff is being deployed for this release cycle!

So we are incorporating this resource into BTE/Service Provider during this release cycle. biothings/bte-server#9

We are using an override as well, because this resource uses ORPHANET IDs and we're in the ORPHANET -> orphanet transition.


I think we can close this issue once:

  • the config-PR / override is deployed to Prod
  • the UI is using the updated infores catalog in all instances
  • I merge the yaml PR for the orphanet transition

We'll then have a separate process to remove the overrides (not needed once the yaml PRs are all merged / registrations refreshed).

@colleenXu colleenXu added the On CI Related changes are deployed to CI server label Dec 15, 2023
@colleenXu
Collaborator

colleenXu commented Dec 16, 2023

@tokebe

I double-checked and it's not working on CI, probably because of the larger cache-update issues (recent lab Slack convo)

My test

POST to https://bte.ci.transltr.io/v1/smartapi/326eb1e437303bee27d3cef29227125d/query (from this testExample)

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "categories": ["biolink:Protein"],
                    "ids": ["UniProtKB:P69905"]
                },
                "n1": {
                    "categories": ["biolink:MacromolecularComplex"]
                }
            },
            "edges": {
                "e01": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": ["biolink:part_of"]
                }
            }
        }
    }
}

Right now, BTE CI doesn't recognize this SmartAPI registration ID, which shows that the smartapi-spec cron job didn't run successfully.

Note: I should also be able to get a response through https://bte.ci.transltr.io/v1/team/Service Provider/query. But right now, no matching MetaEdges are found.
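The team endpoint has a space in its path ("Service Provider"). Some HTTP clients encode this automatically, but when building the URL by hand it is safer to percent-encode the team name. A small sketch (`team_query_url` is a hypothetical helper):

```python
from urllib.parse import quote


def team_query_url(base: str, team: str) -> str:
    """Build the by-team query URL, percent-encoding the team name."""
    # safe="" so that every reserved character in the name, including "/",
    # is encoded; a space becomes %20.
    return f"{base}/v1/team/{quote(team, safe='')}/query"


url = team_query_url("https://bte.ci.transltr.io", "Service Provider")
# url == "https://bte.ci.transltr.io/v1/team/Service%20Provider/query"
```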

@tokebe
Member

tokebe commented Dec 18, 2023

Issue should now be addressed by 3019cec, please test again

@colleenXu
Collaborator

Now it's working on BTE CI! Yay!

@colleenXu colleenXu added On Test Related changes are deployed to Test server and removed On CI Related changes are deployed to CI server labels Dec 20, 2023
@colleenXu
Collaborator

colleenXu commented Feb 21, 2024

Closing this issue since the changes have been deployed to Prod with the Feb 2024 release.

I've confirmed that I can query ComplexPortal through BTE prod https://bte.transltr.io/v1/team/Service Provider/query with the example in #631 (comment) and get the expected response.
