Supporting the Mapping Integration workflow #334

Open
2 tasks
matentzn opened this issue Jan 31, 2023 · 1 comment

matentzn commented Jan 31, 2023

The mapping integration workflow (as opposed to the QC workflow, #333) is about integrating new mappings into an ontology effectively while maintaining consistency. The goal is to be able to rapidly slurp up existing mappings with (almost) no need for human review.

Workflow:

  • Input Ontology O (e.g. Mondo)
  • Input M: merged set of mappings
    • Internal (existing, verified mappings):
      • Reviewed: confidence 0.99
      • Not reviewed: confidence 0.95
    • External (OAK lexmatch, existing mapping sets):
      • Confidence set on a case-by-case basis, configured as part of mapping commons
  • PT=sssom-py:ptable(M)
  • {best-guess.sssom.tsv, results.json, cluster-X.png, cluster-X.md} = boomer(PT, O)
  • EDIT: I thought we would do a proper human review of questionable clusters here, but maybe we leave this to the QC workflow (Supporting Mapping QC workflow #333) instead, to make this workflow more scalable.
  • difference.sssom.tsv = sssom-py:diff(M, best-guess.sssom.tsv)
  • Cursory human review of difference.sssom.tsv (eyeballing); no semapv:MappingReview justification is added. Links from the SSSOM file to the related cluster make the review more effective by giving the reviewer a nice image to look at (this could be an app one day).
  • Mappings rejected from difference.sssom.tsv should be recorded by the curators in a "negative.sssom.tsv" mapping file.
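
A minimal sketch of the confidence assignment and diff steps above, written in plain Python rather than through sssom-py. The helper names (`assign_confidence`, `diff_mappings`), the `reviewed` flag, and the dict-based rows are assumptions for illustration only, and the sketch reads the 0.99 and 0.95 values as probabilities. It is not how sssom-py implements `ptable` or `diff`.

```python
from typing import Dict, Iterable, List, Tuple

Mapping = Dict[str, object]
Key = Tuple[str, str, str]

# Assumption: the workflow's values for internal mappings, read as probabilities.
REVIEWED_CONFIDENCE = 0.99
UNREVIEWED_CONFIDENCE = 0.95


def mapping_key(m: Mapping) -> Key:
    """Identify a mapping by its subject, predicate and object columns."""
    return (str(m["subject_id"]), str(m["predicate_id"]), str(m["object_id"]))


def assign_confidence(internal: Iterable[Mapping]) -> List[Mapping]:
    """Copy internal mappings and set confidence from a hypothetical 'reviewed' flag."""
    out: List[Mapping] = []
    for m in internal:
        row = dict(m)
        row["confidence"] = (
            REVIEWED_CONFIDENCE if row.get("reviewed") else UNREVIEWED_CONFIDENCE
        )
        out.append(row)
    return out


def diff_mappings(
    merged: Iterable[Mapping], best_guess: Iterable[Mapping]
) -> Tuple[List[Mapping], List[Mapping]]:
    """Return (only in merged input, only in best guess), compared by mapping_key."""
    merged = list(merged)
    best_guess = list(best_guess)
    merged_keys = {mapping_key(m) for m in merged}
    best_keys = {mapping_key(m) for m in best_guess}
    only_in_merged = [m for m in merged if mapping_key(m) not in best_keys]
    only_in_best = [m for m in best_guess if mapping_key(m) not in merged_keys]
    return only_in_merged, only_in_best
```

Both directions are returned because the resolved output may drop input mappings and may also contain rows that were not in the input as stated; the curator would eyeball both lists.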

New boomer requirements

  • The output best-guess.sssom.tsv should be in SSSOM format (see accept SSSOM format #47) and should also include a notion of mapping confidence (I didn't get 100% how cluster and mapping confidence should relate in our meeting, but I think you did) and a link to the associated mapping cluster. If there is other metadata you think can help with the review, you can add it to the comment section.
  • Most of the stuff in Supporting Mapping QC workflow #333

Comments:

  • "low prior property mapping will be rejected in a high probability clique" (@cmungall)
  • boomer does not necessarily create a globally coherent outcome model (@balhoff)

matentzn added this to the Mondo workflow milestone Jan 31, 2023
cmungall added a commit to INCATools/ontology-access-kit that referenced this issue Jan 31, 2023

balhoff commented May 3, 2023

Discussion with @matentzn: implement output of cliques in order of increasing confidence, where confidence is the joint posterior probability of the clique's most likely outcome divided by that of the next most likely one.
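
A rough sketch of that ordering in Python, assuming each clique comes with a list of joint posterior probabilities, one per candidate outcome. The function names and the dict input are hypothetical, and this is not boomer's implementation.

```python
from typing import Dict, List, Sequence


def clique_confidence(posteriors: Sequence[float]) -> float:
    """Ratio of the most likely outcome's joint posterior to the next most likely."""
    ranked = sorted(posteriors, reverse=True)
    if not ranked:
        raise ValueError("a clique needs at least one candidate outcome")
    # Assumption: a single candidate, or a zero runner-up, counts as maximally confident.
    if len(ranked) == 1 or ranked[1] == 0.0:
        return float("inf")
    return ranked[0] / ranked[1]


def order_cliques(cliques: Dict[str, Sequence[float]]) -> List[str]:
    """Return clique ids sorted by increasing confidence (least certain first)."""
    return sorted(cliques, key=lambda cid: clique_confidence(cliques[cid]))
```

Putting the least confident cliques first means a reviewer with limited time sees the most ambiguous clusters before the clear-cut ones.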
