-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement edge attribute constraints #795
Comments
From my perspective, there's 3 issues at play here: 1: edge-attribute value typeBTE is currently returning this edge-attribute in Dev/CI instances (ref: commit). However, the value type is currently an array of ints (click for examples).
These are from this example BTE response show-edge-attribute-issue.json, which runs the POST version of this query to MyChem Example of a 1-element array:
Example of a multi-element array: the 733 count from Depression and 42 count from "Depressed mood" were put in the same edge/edge-attribute since both meddra IDs mapped to the "MONDO:0002050 (depressive disorder)" entity.
I suggest flattening these into ints, because the array will probably cause validation issues (biolink-model says attribute values should be int) and it'll make the edge-attribute constraint easier to implement. But we'll need to decide what to do with the multi-element arrays. These are happening because MyChem has separate meddra indication IDs, but BTE/NodeNorm maps them to the same entity. BTE then merges those records into the same edge, and concatenates the counts in the edge-attribute value. I think we could either:
(Note: I'm not sure about flattening all 1-element arrays in edge-attributes. 2: what to do with the previous effort - a default, hard-coded count limitEDIT: SEE UPDATE BELOW - we've implemented this. We've been trying to add a hard-coded count limit of 20 to our MyChem queries #727 (comment), similar to what we do with SEMMEDDB. I was able to add it to the Old notes on reverse operation
But this hasn't been done for the reverse operation
But if we want to implement TRAPI-query edge-attribute constraints, it's not clear if we want to go forward with this. An edge-attribute constraint < 20 would conflict with this hard-coded limit. 3: how to implement this issue's ask: TRAPI-query edge-attribute constraintsThis is still up for discussion:
Idea: if an edge-attribute constraint is specified...
I had another idea of transforming the constraint into part of the BioThings API query using the x-bte annotation templating and info in the response-mapping, but this would be complex and more effort to think through and implement. |
During today's group meeting, we made decisions on issues (1) and (3) above: 3: how to implement TRAPI-query edge-attribute constraints
1: how to flatten multi-element arrays
Jackson @tokebe estimated that this would take ~2 days of work. But as part of this effort, they'll review the TRAPI yaml/spec docs for QEdge.attribute_constraints expectations and requirements - and they'll decide how full/robust of an implementation to do for this issue. |
Note that I've made a PR to ask about this template TranslatorSRI/CQS#9:
|
Update! The template PR TranslatorSRI/CQS#9 has been merged. So the current template is https://github.com/TranslatorSRI/CQS/blob/main/templates/mvp1-templates/mvp1-template4-bte-aeolus/mvp1-template5-service-provider-aeolus.json Changes:
|
Update on issue 2 aboveThe hard-coded/default/MyChem-query-level limit is now live in Dev/CI! See the details in #727 (comment) |
And noting that Issue 1 was also addressed last month as part of #727 (comment) Leaving just Issue 3 - the edge attribute constraint implementation itself (first post, later decision) |
Going to add this to our agenda for next week to discuss. We tabled this earlier this year, and I'd like to revisit where we stand w.r.t. current priorities. For updated info/context, BioLink Model undertook the "treats refactor", which separates out different types of evidence into different predicates (e.g. "in_clinical_trials_for", "beneficial_in_models_for"). CQS then created one hop templates that capture how to query translator with each of these predicates, and these queries often include edge constraints. (Though we could review those templates in more detail to look at what those edge constraints actually do and how common they are.) Right now, our first MVP1 template queries the root of the treats predicate hierarchy ( |
Andrew said we don't need to get this done in BTE by the end of this phase. We will want to be able to do edge-attribute constraints in Retriever (or Shepherd?) (CC @tokebe) |
We'll want support for edge-attribute constraints in Retriever |
We originally proposed edge attribute constraints in the context of TRAPI 1.3 and #482, but breaking this out to its own ticket.
We have a solid use case for edge constraints proposed in this query template for the CQS:
https://github.com/TranslatorSRI/CQS/blob/main/templates/mvp1-templates/mvp1-template4-bte-aeolus/mvp1-template5-service-provider-aeolus.json
The key bit is here, attempting to apply a minimum threshold on the
biolink:evidence_count
from AEOLUS.There are (at least) two issues that need to be done/checked:
evidence_count
is provided as a multi-element array (example below). In this case, I think it is reasonable to apply the constraint to the sum of the evidence_counts.The text was updated successfully, but these errors were encountered: