Analogy

Automated Analytics for Epidemiology (ANALOGY) is an open-source Python CLI application for running incidence and prevalence analyses.

Getting Started

These instructions will get you a copy of the project up and running on your local machine.

Prerequisites

The project requires Python 3.9 or above. We recommend Anaconda as an open-source Python distribution. See https://www.anaconda.com/ for installation instructions.

Installing

Clone the repository to your local system.

git clone https://github.com/aditya02acharya/analogy_publication.git

Move to the project directory.

cd analogy_publication

Run the command below to install the package and all its dependencies.

python -m pip install .

Running the application

To see all available arguments, use the command below.

analogy incprev --help

Run the command below to start the analysis.

analogy incprev ./src/analogy/data/sample.csv . "2001-01-01 00:00:00.0" "2021-12-31 00:00:00.0" "%Y-%m-%d %H:%M:%S.%f" START_DATE END_DATE 1000 12

After you run the command above, the application asks for the following inputs.

Enter the list of conditions columns to analyse (col1, col2, ...): CONDITION
Enter the list of demography columns for subgroup analyse or leave empty if none (col1, col2, ...): SEX, ETHNICITY

The command expects the following:

Path to the .csv dataset: ./src/analogy/data/sample.csv
Destination folder for the outputs: . (current directory)
Study start date: 2001-01-01 00:00:00.0
Study end date: 2021-12-31 00:00:00.0
Date format used in the dataset and in the dates provided by the user: %Y-%m-%d %H:%M:%S.%f
Patient follow-up start date column in dataset: START_DATE
Patient follow-up end date column in dataset: END_DATE
Person-years scale for reporting the results: 1000
Regular interval at which incidence and prevalence are calculated: 12
List of conditions to calculate incidence and prevalence for: CONDITION
List of demography variables for subgroup analysis: SEX, ETHNICITY
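
As a rough illustration of how the positional values in the example command line up with the list above, the sketch below splits the command the way a POSIX shell would and pairs each value with a descriptive label. The labels and the helper label_arguments are hypothetical names chosen for this example. They are not the application's own parameter names, which analogy incprev --help reports.

```python
import shlex

# The example command from this README, as it would be typed in a terminal.
command = (
    "analogy incprev ./src/analogy/data/sample.csv . "
    '"2001-01-01 00:00:00.0" "2021-12-31 00:00:00.0" '
    '"%Y-%m-%d %H:%M:%S.%f" START_DATE END_DATE 1000 12'
)

# Descriptive labels only (hypothetical, not the CLI's real parameter names).
LABELS = [
    "dataset path",
    "output folder",
    "study start date",
    "study end date",
    "date format",
    "follow-up start column",
    "follow-up end column",
    "person-years scale",
    "interval",
]


def label_arguments(command_line):
    """Split a command line like a POSIX shell and label each positional value."""
    tokens = shlex.split(command_line)
    values = tokens[2:]  # skip the program name and the subcommand
    if len(values) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} values, got {len(values)}")
    return dict(zip(LABELS, values))


arguments = label_arguments(command)
for name, value in arguments.items():
    print(f"{name}: {value}")
```

If the quotes around the dates or the format are dropped, the split produces more than nine values and the helper raises an error, which is the same reason the quotes are needed in the terminal.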

CSV File Format

The command takes CSV files as input and expects the following format:

Each row should correspond to one observation
Each column should represent one variable
The first row should be a list of column names
Each observation must have a start and end date
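
The requirements above can be checked before running an analysis. The following is a minimal sketch using only the Python standard library. The function check_csv_layout and the example rows are made up for illustration and are not part of the package. The default column names START_DATE and END_DATE are taken from the example command and should be replaced with the names used in your dataset.

```python
import csv
import io


def check_csv_layout(text, start_col="START_DATE", end_col="END_DATE"):
    """Return a list of problems; an empty list means the basic layout looks fine."""
    reader = csv.DictReader(io.StringIO(text))  # first row is read as column names
    header = reader.fieldnames or []
    problems = [
        f"missing column: {col}" for col in (start_col, end_col) if col not in header
    ]
    if problems:
        return problems
    # Data rows start at line 2 because line 1 holds the column names.
    for row_number, row in enumerate(reader, start=2):
        for col in (start_col, end_col):
            if not (row[col] or "").strip():
                problems.append(f"row {row_number}: empty {col}")
    return problems


# Made-up rows for illustration only; the second observation has no end date.
example = (
    "START_DATE,END_DATE,SEX\n"
    "2001-01-01 00:00:00.0,2010-06-30 00:00:00.0,F\n"
    "2005-03-15 00:00:00.0,,M\n"
)
print(check_csv_layout(example))
```

To check a file on disk, read its contents first, for example with open(path, newline="").read(), and pass the text to the function.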

FAQ

Q: Where should I run the code blocks in the readme?

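These should be run in a terminal (command line) where Python is available, for example the Anaconda Prompt. They are shell commands, so they should not be typed inside the Python interpreter itself.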

Q: Are column names necessary or do they need to be excluded?

Column names are necessary. The command line takes column names as inputs to identify the observation start and end dates, the conditions, and any demography variables to use for subgroup (stratified) analysis.

Q: Do the values need to be quoted in case there are spaces in them?

Values that contain spaces need to be quoted so that each one reaches the application as a single string. Without quote marks, the command line splits the value at the space and passes the parts as separate inputs. The dates should be provided in the same format as in the data, and this format should also be given on the command line, e.g. "%Y-%m-%d %H:%M:%S.%f". The 9th character in that format string is a space, which corresponds to the space between the date and the time in "2001-01-01 00:00:00.0".
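
One way to confirm that a date string and a format string agree is to parse the date with Python's datetime.strptime, which uses the same % format codes. This is a small sketch. The helper matches_format is a hypothetical name, and the application itself may parse dates differently.

```python
from datetime import datetime

# Format string from the example command in this README.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def matches_format(value, fmt=DATE_FORMAT):
    """Return True if the date string can be parsed with the given format."""
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True


print(matches_format("2001-01-01 00:00:00.0"))  # True
print(matches_format("2001-01-01"))             # False: the time part is missing
```

Running the same check on a few date values copied from the dataset is a quick way to catch a format mismatch before starting the analysis.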
