Skip to content

termMed/sct-statistics-qa

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

48 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SNOMED CT Pattern and Changes Evaluation Toolkit

SNOMED CT Quality Assurance (QA) processes are an integral part of content authoring, maintenance and publishing activities through the lifecycle of each International release.

As part of daily authoring activities, editors follow a formal content management process including revision workflows, computer-assisted real time rule-based enforcement of editorial guidelines and testing for conformance to SNOMED CT design and technical specifications. Daily builds of the development database in RF2 format are tested in batch, and detected anomalies are queued for issue resolution.

Architecture

The Quality Evaluation toolkit is a piece of software that executes quality evaluation queries over SNOMED CT release files and creates structured outputs that can be integrated into other quality revision processes. Components:

  • Data generator: reads release files, executes queries, creates outputs.
  • Command line tool: packages the generator in a format that enables the execution from the command line

All these components are included in a single multi-module Maven Java project, that facilitates distribution and deployment.

Build dependencies

This project has a dependency on the HTML5 Viewer for rendenring the results in an HTML site. It's necessary to build that project before running a full build of the SNOMED CT Pattern and Changes Evaluation Toolkit, to ensure that the dependency is available in the maven repository.

This dependency is already included in the binary releases packages, so this step does not affect the execution toolkit.

Running the statistics

Unpacking the binaries

Download and unpack the binaries from the releases section of this project.

The executable scripts are:

  • Windows: Stats.bat, runs the statistics generation using config/intConfig.xml to locate the RF2 sources.
  • Mac or Linux: Stats.sh, runs the statistics generation using config/intConfig.xml to locate the RF2 sources.
  • Windows: Extension_Stats.bat, runs the statistics generation using config/extConfig.xml to locate the RF2 sources.
  • Mac or Linux: Extension_Stats.sh, runs the statistics generation using config/extConfig.xml to locate the RF2 sources.

Configuring SNOMED CT Release data

The “config” folder in the binaries package contains all the necessary configurations for locating the release files that will be subject of the evaluation in your hard drive.

The configuration files contain several parameters, this guide will focus on the important ones for the execution, the rest are part of the configuration to extend the toolkit with new tasks.

There are 2 configuration files, intConfig.xml is used for single package releases, like the International Release, extConfig.xml is used for releases with dependencies with other packages, like extension releases.

Adding values to the configuration files:

  • In config/intConfig.xml and config/extConfig.xml
  • releaseFullFolder: path to a folder that contains a RF2 Full SNOMED CT Release.
  • releaseDate: effective time for the release that is the subject of the analysis.
  • previousReleaseDate: effective time for the previous release for the purpose of changes reports.
  • Example:
    • /MyFiles/SnomedCT_RF2Release_INT_20160131/Full
    • 20160131
    • 20150731
  • Advanced parameters: the effective time needs to be replaced in this configurarion line:
    • releaseDate20160215
  • Only in config/extConfig.xml
    • releaseDependencies: includes a list of “releaseFullFolder” elements that direct to the location of other RF2 Full packages that represent dependencies of the focus release.
    • Example:
    <releaseDependencies>
      <releaseFullFolder/MyData/RF2_Core_Release_20150731_Beta_version_4/Full</releaseFullFolder>
      <releaseFullFolder>/MyData/ES_Release/RF2/SnomedCT_SpanishRelease-es_INT_20151031/RF2Release/Full</releaseFullFolder>
    </releaseDependencies>

Running queries and creating output files

After the configuration file was edited to include valid path to the source RF2 Full data, the execution script should be launched. In some environments permissions for execution should be granted before, for example in Mac OSX or Linux:

chmod Stats.sh 777
./Stats.sh

This will launch the process, details from intermediate tasks will be displayed in the output, time taken will depend on the release size and available resources.

After the process finishes the following folders will contain the results:

  • toolkit/stats_output: contains the statistics results, as CSV files.
  • toolkit/patterns_output: contains the output of the patterns results, as JSON files.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

No packages published

Languages

  • HTML 84.0%
  • Java 15.7%
  • Other 0.3%