Workflow 5: Green/Gamma Team Implementation of Modules 1-4 & 5-8 #36
Comments
Question 1: ICEES functionality 4 Input parameters: Output (from full output): Output includes counts of patients by bin, adjusted Chi Square Statistics, and P values Exposures that significantly differ between the two groups of patients will be used as input for module five, but separate streams of operations will be maintained with annotation indicating which group was "higher" and which group was "lower". |
Question 2: ICEES functionality 4 Input parameters: *The US Census Bureau identifies two types of urban areas: The ICEES patient population is largely rural, so the Census Bureau definitions may not work/apply with this use case.* Output (from full output): Output includes counts of patients by bin, adjusted Chi Square Statistics, and P values Exposures that significantly differ between the two groups of patients will be used as input for module five, but separate streams of operations will be maintained with annotation indicating which group was "higher" and which group was "lower". |
Note that Green/Gamma is working with the BioLink folks to develop high-level concepts for ICEES feature variables, in order to properly incorporate ICEES data into the BioLink data model. |
US Census Bureau rural, urban definitions The Census Bureau's urban-rural classification is fundamentally a delineation of geographical areas, identifying both individual urban areas and the rural areas of the nation. The Census Bureau's urban areas represent densely developed territory, and encompass residential, commercial, and other non-residential urban land uses. For the 2010 Census, an urban area will comprise a densely settled core of census tracts and/or census blocks that meet minimum population density requirements, along with adjacent territory containing non-residential urban land uses as well as territory with low population density included to link outlying densely settled territory with the densely settled core. To qualify as an urban area, the territory identified according to criteria must encompass at least 2,500 people, at least 1,500 of which reside outside institutional group quarters. The Census Bureau identifies two types of urban areas: Urbanized Areas (UAs) of 50,000 or more people; The specific criteria used to define urban areas for the 2010 Census were published in the Federal Register of August 24, 2011. |
@xu-hao : Let's bin EstResidentialDensity as defined above by the US Census Bureau. |
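A minimal sketch of what Census-style binning could look like, assuming the value being binned is an area population count. The 50,000 threshold for Urbanized Areas and the 2,500 minimum come from the definition quoted above; the "Urban Cluster" label for the 2,500 to 49,999 range is taken from the 2010 Census criteria and is not spelled out in the excerpt. The function name and bin labels are hypothetical, and how EstResidentialDensity maps onto these population thresholds in ICEES is not specified here.

```python
def census_area_type(population: int) -> str:
    """Classify an area by population, loosely following the 2010 Census
    urban/rural thresholds. Assumption: the input is a population count;
    the additional requirement that at least 1,500 people reside outside
    institutional group quarters is not checked in this sketch."""
    if population < 0:
        raise ValueError("population must be non-negative")
    if population >= 50_000:
        return "Urbanized Area"
    if population >= 2_500:
        return "Urban Cluster"
    return "Rural"


if __name__ == "__main__":
    for pop in (1_200, 2_500, 49_999, 50_000):
        print(pop, census_area_type(pop))
```

Given the concern raised earlier that the ICEES population is largely rural, most patients might fall into a single bin under these thresholds, which is worth checking before committing to this scheme.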
cc-ing @diatomsRcool who is working on ECTO exposures |
Thanks, Chris! @balhoff @stevencox : Let's coordinate with @diatomsRcool and perhaps loop in Sarav and Alex Valencia (his student). |
Do we need a meeting? I'm really not up to speed on translator stuff. |
Don't worry, I think it's premature to have a meeting; at most an ECTO ticket request
… |
Agreed! My intent was simply to make sure that we coordinate (and not duplicate) efforts. |
Updated plan for implementation of Workflow 5:
Note re ICEES: We will need to capture directionality as part of the output for the workflow. By "directionality", I mean that we need to capture which stratum is "enriched" for a given phenotype (i.e., has a higher percentage of patients with XXX). The Chi Square statistic that ICEES provides tells us whether groups or bins differ, but it does not provide any information on the direction of the difference. Relative risks and odds ratios may suffice. Notes re (1)-(5) above:
A. ICEES example query
Input: Feature variables: AvgDailyPM2.5Exposures < 3, TotalEDInpatientVisits < 2
Output: (table not preserved)
B. COHD example queries
Input: Asthma (ID #317009) and Black or African American (ID #8516)
Output:
Input: Asthma (ID #317009) and White (ID #8527)
Output:
C. Clinical Profiles links |
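As a rough illustration of the directionality point in the note re ICEES, the sketch below derives a relative risk and an odds ratio from a 2x2 table of counts and labels the first group as "higher" or "lower". The class, function names, and example counts are hypothetical and not part of ICEES; cells with zero counts are not handled.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for two strata (A, B) against a binary phenotype."""
    a_with: int      # stratum A, phenotype present
    a_without: int   # stratum A, phenotype absent
    b_with: int      # stratum B, phenotype present
    b_without: int   # stratum B, phenotype absent


def relative_risk(t: TwoByTwo) -> float:
    """Risk of the phenotype in stratum A divided by the risk in stratum B."""
    risk_a = t.a_with / (t.a_with + t.a_without)
    risk_b = t.b_with / (t.b_with + t.b_without)
    return risk_a / risk_b


def odds_ratio(t: TwoByTwo) -> float:
    """Odds of the phenotype in stratum A divided by the odds in stratum B."""
    return (t.a_with * t.b_without) / (t.a_without * t.b_with)


def direction(t: TwoByTwo) -> str:
    """Annotate stratum A relative to stratum B using integer cross-products,
    which avoids floating-point comparison issues."""
    lhs = t.a_with * (t.b_with + t.b_without)
    rhs = t.b_with * (t.a_with + t.a_without)
    if lhs > rhs:
        return "higher"
    if lhs < rhs:
        return "lower"
    return "equal"


if __name__ == "__main__":
    table = TwoByTwo(a_with=30, a_without=70, b_with=10, b_without=90)
    print(relative_risk(table), odds_ratio(table), direction(table))
```

The "higher"/"lower" labels mirror the annotation described for the module five input streams; a significance filter (for example, the Chi Square P value) would still be applied separately.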
Hi Kara, just curious: is there any reason COHD is only run by Sex, instead of running the same experiments as ICEES so that we can compare or cross-validate afterwards? Is the plan proposed for the hackathon? Thanks! Qian |
Hi Qian. The variables defined in (1) and (2) above are specific to ICEES and not available in COHD. (3) and (4) are intended to cross-validate output, as you noted. I'm hoping to do something similar for Green Team's Implementation of Workflow 4. WRT the hackathon, I'm hoping that we can extend the plan above to include additional teams. |
@stevencox @colinkcurtis @xu-hao : I'm wondering where we stand with (1) above, in terms of modules 1-4 and modules 5-8. I realize you all shifted your focus to (2), but I think (1) might serve best as a use case for SME evaluation (Dave Peden) during the hackathon. Plus, I'm developing a second ICEES manuscript that follows the first one and will focus on the outcome variable 'TotalEDInpatientVisits', so execution of (1) would align nicely with those efforts. |
@karafecho I will pivot towards (1) again. My focus on (2) was incidental to what I had been working on. I'll update when I have an executable CWL/Ros WF5 for (1). Tentatively, before Monday. |
@webyrd @dkoslicki : Please take a look at the above Green/Gamma action plan for execution of Workflow 5, Modules 1-4, as well as the action plan for execution of Workflow 5, Modules 5-8 (#37). If you're interested, I'd be happy to discuss approaches for Alpha and X-Ray to contribute to this workflow. |
See TranQL implementation of Workflow 5, which is related to Workflow 4, here. |
WORKFLOW INPUT: See ICEES_FeatureVariables and ICEES_Identifiers here for chemicals and medications. Note that these docs are updated as new variables are added to the ICEES integrated feature tables. WORKFLOW (Gamma) QUESTION TEMPLATE: { |
ROBOKOP queries and RTX queries are being pre-computed for this workflow using all available ICEES chemicals and medications. Example ICEES queries are included below as an FYI:
curl -k -XPOST https://localhost:8080/1.0.0/patient/2010/cohort/COHORT:22/associations_to_all_features -H "Content-Type: application/json" -d '{"feature":{"TotalEDInpatientVisits":{"operator":"<", "value":2}},"maximum_p_value":0.1}' -H "Accept: application/json"
curl -k -XPOST https://localhost:8080/1.0.0/patient/2010/cohort/COHORT:22/associations_to_all_features -H "Content-Type: application/json" -d '{"feature":{"ur":{"operator":"=", "value":"U"}},"maximum_p_value":0.1}' -H "Accept: application/json"
curl -k -XPOST https://localhost:8080/1.0.0/patient/2010/cohort/COHORT:22/associations_to_all_features -H "Content-Type: application/json" -d '{"feature":{"Sex2":{"operator":"=", "value":"Male"}},"maximum_p_value":0.1}' -H "Accept: application/json" |
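For use from a notebook, the curl calls above could be expressed in Python roughly as follows. This is a sketch using only the standard library: the URL path, headers, and JSON body are copied from the curl examples, while the function names (`build_request`, `send`) and their default arguments are hypothetical. The `insecure` flag mirrors curl's `-k` and should only be used against a local test instance.

```python
import json
import ssl
import urllib.request

BASE_URL = "https://localhost:8080/1.0.0"  # host and version from the curl examples


def build_request(feature, operator, value, maximum_p_value=0.1,
                  table="patient", year=2010, cohort="COHORT:22"):
    """Build (but do not send) a POST to associations_to_all_features."""
    url = f"{BASE_URL}/{table}/{year}/cohort/{cohort}/associations_to_all_features"
    body = {
        "feature": {feature: {"operator": operator, "value": value}},
        "maximum_p_value": maximum_p_value,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def send(request, insecure=False):
    """Send the request and decode the JSON response.
    insecure=True skips certificate checks, like curl -k."""
    context = ssl.create_default_context()
    if insecure:
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


if __name__ == "__main__":
    # The three example queries; nothing is sent unless send() is called.
    for args in (("TotalEDInpatientVisits", "<", 2), ("ur", "=", "U"), ("Sex2", "=", "Male")):
        req = build_request(*args)
        print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Separating request construction from sending makes it easy to inspect or log the exact payload before any call is made to the ICEES service.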
Green/Gamma initial plan is to refine end-to-end execution of WF5 using TranQL, with ICEES/COHD/Clinical Profiles for execution of modules 1-4 input and ROBOKOP/RTX/mediKanren for execution of modules 5-8. |
Mini-hackathon was held on Friday, April 12, 12-4 pm ET. Topic: Unified Translator-compliant Clinical Knowledge Source API. Attendees: Hao Xu, Richard Zhu, Casey Ta, Steve Cox, and Kara Fecho. The event was successful: the team developed a plan of action and is moving forward with its execution. The unified Translator Clinical Knowledge Source API will foster efforts on Workflows 4 and 5, as well as any efforts related to COHD, Clinical Profiles, and ICEES. |
Scroll below to find updates to plan
Overview
Green/Gamma Team is approaching Workflow 5 using ICEES as the source of clinical data. In consideration of the design of ICEES, the team has decided to collapse Modules one through four of the workflow. In addition, a Jupyter notebook will be used to call ICEES and integrate with Gamma for subsequent modules.
Two questions will be asked:
This question will be fully evaluated by a SME (D. Peden) and serve as the basis of a TIDBIT.
This question will allow us to begin to more thoroughly explore the ACS data available through our Socioenvironmental Exposures API in the context of a workflow. In particular, the question will allow us to "stress test" our binning strategy.
Note that the output of modules one through four will be of the same entity type for both Question 1 and Question 2; thus, subsequent modules for workflow 5 will be identical for both questions.