adjust MVP1 templates based on treats refactor #881

andrewsu · 2024-10-04T21:13:52Z

Right now, our first MVP1 template queries the root of the treats predicate hierarchy (biolink:treats_or_applied_or_studied_to_treat). We should break each treats predicate out into its own template because they can have very different levels of confidence -- for example, treats should be more convincing than in_clinical_trials_for, which in turn should be more convincing than beneficial_in_models_for. The predicate-specific treats templates should be interleaved with the multi-hop templates in our MVP1 creative mode template groups.

colleenXu · 2024-10-07T19:37:16Z

UPDATE: discussed during our 1-on-1, going to discuss more during group meeting Wednesday

Andrew's overall point: right now, we use a general predicate and don't tease apart or specially score the different predicates based on how "confident" we are in them.

I have 4 concerns:

(1) we have multiple "treats" predicates that are used in our KG data BUT also expand to the more specific predicates you mentioned (see light blue in screenshot). I'm not sure how we want to handle these (if we exclude them, we miss a LOT of data; if we include them, we do a lot of redundant sub-query work). ➡️ Andrew: okay to include + have redundant sub-queries.

treats_or_applied_or_studied_to_treat
studied_to_treat
in_preclinical_trials_for (this is probably fine, just use this rather than child beneficial_in_models_for).

Screenshot

(2) I'm not sure this will improve scores/ranking of desired results. We don't incorporate template order/"confidence in the template" into our scoring... ➡️ Andrew: BTE may fill up on earlier templates that use predicates we like more (aka depends on sequential template execution?)

(3) TopAnswers are actually dependent on text-mined edges/general predicates ➡️ Andrew: good point. will need testing

(4) I'm not sure what "interleaved with the multi-hop templates" means. ➡️ Andrew: meaning not clear yet. there are diff ideas for how to arrange the templates w/ diff predicates

One idea:

"treats": direct, pheno, gene
"in clinical trials for": direct, pheno, gene
etc

Another idea:

"treats" direct
"in clinical trials for" direct
etc direct
"treats" pheno
"in clinical trials for" pheno
etc pheno
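
A minimal sketch of the two orderings above, to make the difference concrete. The predicate and shape lists are illustrative only (the thread leaves both open-ended with "etc"), and the function names are hypothetical; nothing here reflects how BTE actually stores or runs its templates.

```python
from itertools import product

# Illustrative values only: the thread leaves both lists open-ended ("etc").
PREDICATES = ["treats", "in_clinical_trials_for", "beneficial_in_models_for"]
SHAPES = ["direct", "pheno", "gene"]


def predicate_major(predicates, shapes):
    """First idea: run every shape for one predicate before moving on."""
    return list(product(predicates, shapes))


def shape_major(predicates, shapes):
    """Second idea: run every predicate for one shape before moving on."""
    return [(p, s) for s, p in product(shapes, predicates)]


if __name__ == "__main__":
    orderings = [("predicate-major", predicate_major), ("shape-major", shape_major)]
    for name, order in orderings:
        print(name)
        for predicate, shape in order(PREDICATES, SHAPES):
            print(f"  {predicate}: {shape}")
```

Both orderings contain the same (predicate, shape) pairs. They differ only in which pairs run first, which matters if results fill up from earlier templates.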

andrewsu · 2024-11-04T23:03:19Z

In our 2024-10-10 meeting, we floated the idea of incorporating a scoring factor based on KL/AT to essentially down-weight text-mined edges. In thinking about this issue again now, I realize that this solution doesn't address the additional filtering rules established by the CQS for certain predicates. For example, in the aeolus template, CQS determined that evidence count should be > 20. Similarly, for the CTKP template, elevate_to_prediction should be true, and for TMKP, evidence_count should be > 5. Simply incorporating a new KL/AT-based scoring parameter doesn't take advantage of these edge-level constraints.
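
A rough sketch of what those edge-level constraints amount to, using the three thresholds quoted above. The edge layout (a dict with "source" and "attributes"), the source keys, and the use of "evidence_count" as the attribute name for aeolus are all assumptions for illustration; this is not the CQS or BTE implementation, and it does not use the TRAPI attribute-constraint format.

```python
# Per-source checks taken from the thresholds mentioned above.
# Keys and attribute names are assumed for this sketch.
CONSTRAINTS = {
    "aeolus": lambda attrs: attrs.get("evidence_count", 0) > 20,
    "ctkp": lambda attrs: attrs.get("elevate_to_prediction") is True,
    "tmkp": lambda attrs: attrs.get("evidence_count", 0) > 5,
}


def passes_constraints(edge):
    """Return True if the edge meets its source's constraint.

    Edges from sources with no listed constraint pass unchanged.
    """
    check = CONSTRAINTS.get(edge["source"])
    return True if check is None else check(edge["attributes"])


def filter_edges(edges):
    """Keep only the edges that satisfy their source's constraint."""
    return [edge for edge in edges if passes_constraints(edge)]


if __name__ == "__main__":
    sample = [
        {"source": "aeolus", "attributes": {"evidence_count": 25}},
        {"source": "aeolus", "attributes": {"evidence_count": 20}},
        {"source": "ctkp", "attributes": {"elevate_to_prediction": False}},
        {"source": "tmkp", "attributes": {"evidence_count": 6}},
    ]
    for edge in filter_edges(sample):
        print(edge)
```

A filter like this drops edges outright, whereas a KL/AT-based scoring factor would only re-rank them, which is why the two approaches are not interchangeable.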

andrewsu mentioned this issue Oct 4, 2024

implement edge attribute constraints #795

Open

andrewsu mentioned this issue Nov 5, 2024

incorporate CQS edge constraints into templates biothings/bte_trapi_query_graph_handler#227

Open
