Refactor tests #157

Open
wants to merge 46 commits into
base: master
1e384e8
initial commit
jdr0887 Oct 24, 2022
5f7c08e
initial commit
jdr0887 Oct 24, 2022
7cca582
moving validate to util
jdr0887 Nov 7, 2022
a9b478a
refactor
jdr0887 Nov 11, 2022
6b2a371
removing
jdr0887 Nov 29, 2022
ba91101
removing
jdr0887 Nov 29, 2022
64a4949
initial commit
jdr0887 Nov 29, 2022
980846c
adding independent log config
jdr0887 Nov 29, 2022
190efbd
posterity
jdr0887 Nov 29, 2022
f1073eb
initial commit
jdr0887 Nov 29, 2022
7fe4490
formatting
jdr0887 Nov 29, 2022
0b3a56e
formatting
jdr0887 Nov 29, 2022
11e589d
adding redis databases to match yaml config
jdr0887 Nov 29, 2022
15068f2
formatting, changing gene_protein_db to conflation_db
jdr0887 Nov 29, 2022
507e372
renaming variables
jdr0887 Nov 29, 2022
a6be46c
reducing size
jdr0887 Nov 29, 2022
20a6b98
adding tests from test_norm.py
jdr0887 Nov 29, 2022
f07527a
formatting, fixing renamed service
jdr0887 Nov 29, 2022
e820ad5
adding conflation_type
jdr0887 Nov 29, 2022
2d8e935
adding asyncclick
jdr0887 Nov 29, 2022
00c2591
adding validate_compendium
jdr0887 Nov 29, 2022
2a95ccc
adding conflation enum, formatting
jdr0887 Nov 29, 2022
c06a4e3
formatting
jdr0887 Nov 29, 2022
3614cd4
formatting, adding conflation type
jdr0887 Nov 29, 2022
ba7630a
adding conflation type
jdr0887 Nov 29, 2022
cc21083
adding conflation type
jdr0887 Nov 29, 2022
fb44db7
adding more logging statements
jdr0887 Nov 29, 2022
df9aece
moving validate to util.py
jdr0887 Nov 29, 2022
4cc6fa6
removing
jdr0887 Nov 29, 2022
f268092
removing config & load.py
jdr0887 Nov 29, 2022
7daeeac
removing config.json, redis_config.yaml, and load.py
jdr0887 Nov 29, 2022
3d6b00c
adding redis instances to match redis_config.yaml
jdr0887 Nov 29, 2022
137e936
renaming
jdr0887 Nov 29, 2022
3ede74a
initial commit
jdr0887 Nov 29, 2022
90dfe3b
setting build & pull to true
jdr0887 Nov 29, 2022
dfeca95
removing unused env vars
jdr0887 Nov 29, 2022
1c0c689
initial commit
jdr0887 Nov 30, 2022
55ae81a
initial commit
jdr0887 Nov 30, 2022
b98b1a3
ignoring Cargo lock
jdr0887 Nov 30, 2022
bf78fab
conflation file has new format
jdr0887 Nov 30, 2022
372ebf7
adding get_semantic_types test
jdr0887 Nov 30, 2022
2e56b37
renamed
jdr0887 Nov 30, 2022
0d8815d
close too many on shutdown event
jdr0887 Dec 1, 2022
1f859e1
updating
jdr0887 Dec 1, 2022
8130347
checking for gene_protein
jdr0887 Dec 1, 2022
a48c817
moving contents of readme to wiki page
jdr0887 Dec 1, 2022
2 changes: 2 additions & 0 deletions .gitignore
@@ -130,3 +130,5 @@ dmypy.json

# intellij
.idea

Cargo.lock
2 changes: 0 additions & 2 deletions Dockerfile
@@ -7,9 +7,7 @@ WORKDIR /code
COPY ./requirements.txt requirements.txt
COPY ./setup.py setup.py
COPY ./node_normalizer node_normalizer
COPY ./config.json config.json
COPY ./redis_config.yaml redis_config.yaml
COPY ./load.py load.py

# install requirements
RUN pip install -r requirements.txt
4 changes: 1 addition & 3 deletions Dockerfile-test
@@ -7,9 +7,7 @@ WORKDIR /code
COPY ./requirements.txt requirements.txt
COPY ./setup.py setup.py
COPY ./node_normalizer node_normalizer
COPY ./tests/config.json config.json
COPY ./tests/redis_config.yaml redis_config.yaml
COPY ./load.py load.py
COPY ./redis_config.yaml redis_config.yaml
COPY ./tests tests

# install requirements
20 changes: 0 additions & 20 deletions KGX_converter.py

This file was deleted.

132 changes: 2 additions & 130 deletions README.md
@@ -1,133 +1,5 @@
[![Build Status](https://travis-ci.com/TranslatorIIPrototypes/NodeNormalization.svg?branch=master)](https://travis-ci.com/TranslatorIIPrototypes/NodeNormalization)

# NodeNormalization
# Welcome to the NodeNormalization project

## Introduction

Node normalization takes a CURIE, and returns:

* The preferred CURIE for this entity
* All other known equivalent identifiers for the entity
* Semantic types for the entity as defined by the [Biolink Model](https://biolink.github.io/biolink-model/)

The data currently served by Node Normalization is created by the prototype project
[Babel](https://github.com/TranslatorSRI/Babel), which attempts to find identifier equivalences,
and makes sure that CURIE prefixes are BioLink Model compliant. The NodeNormalization service, however,
is independent of Babel and as improved identifier equivalence tools are developed, their results
can be easily incorporated.

To determine whether Node Normalization is likely to be useful, check /get_semantic_types, which lists the BioLink
semantic types for which normalization has been attempted, and /get_curie_prefixes,
which lists the number of times each prefix is used for a semantic type.
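
As a minimal sketch, the two endpoints named above can be addressed by joining their paths onto the public service root. The base URL is an assumption (the service link in this README with the `/docs` suffix dropped), the helper name `endpoint_url` is hypothetical, and no request is actually sent.

```python
from urllib.parse import urljoin

# Assumption: the API root is the public service URL from this README without "/docs".
BASE_URL = "https://nodenormalization-sri.renci.org/"


def endpoint_url(path: str, base: str = BASE_URL) -> str:
    """Join an endpoint path such as '/get_semantic_types' onto the service root."""
    return urljoin(base, path.lstrip("/"))


if __name__ == "__main__":
    # Print the URLs for the two discovery endpoints mentioned in the text.
    for path in ("/get_semantic_types", "/get_curie_prefixes"):
        print(endpoint_url(path))
```

The resulting URLs can be opened in a browser or passed to any HTTP client; the response format is not described here, so it is left unparsed.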

For examples of service usage, see the example [notebook](documentation/NodeNormalization.ipynb).

The Node normalization website leverages the [R3 (Redis-REST with referencing)](https://github.com/TranslatorSRI/r3) Redis data design and configuration.

Users can find the publicly available website at [service](https://nodenormalization-sri.renci.org/docs).

## Installation

Create a virtual environment
```
python -m venv nodeNormalization-env
```
Activate the virtual environment
```
source nodeNormalization-env/bin/activate
```
Install requirements
```
> pip install -r requirements.txt
```
## Generating equivalence data

The equivalence data can be generated by running [Babel](https://github.com/TranslatorSRI/Babel). An example of the contents of a compendia file is shown below:
```
{"id": {"identifier": "PUBCHEM:50986940"}, "equivalent_identifiers": [{"identifier": "PUBCHEM:50986940"}, {"identifier": "INCHIKEY:CYMOSKLLKPIPCD-UHFFFAOYSA-N"}], "type": ["chemical_substance", "named_thing", "biological_entity", "molecular_entity"]}
{"id": {"identifier": "CHEMBL.COMPOUND:CHEMBL1546789", "label": "CHEMBL1546789"}, "equivalent_identifiers": [{"identifier": "CHEMBL.COMPOUND:CHEMBL1546789", "label": "CHEMBL1546789"}, {"identifier": "PUBCHEM:4879549"}, {"identifier": "INCHIKEY:FUIYIXDZTPMQEH-UHFFFAOYSA-N"}], "type": ["chemical_substance", "named_thing", "biological_entity", "molecular_entity"]}
```
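
Each line of a compendia file is a standalone JSON object, so it can be read with the standard `json` module. The sketch below parses the first record from the example above. It assumes that the `id` entry holds the preferred identifier, and the function name `parse_compendium_line` is hypothetical rather than part of this project.

```python
import json

# First record from the compendia example above, copied verbatim.
LINE = (
    '{"id": {"identifier": "PUBCHEM:50986940"}, '
    '"equivalent_identifiers": [{"identifier": "PUBCHEM:50986940"}, '
    '{"identifier": "INCHIKEY:CYMOSKLLKPIPCD-UHFFFAOYSA-N"}], '
    '"type": ["chemical_substance", "named_thing", "biological_entity", "molecular_entity"]}'
)


def parse_compendium_line(line: str) -> dict:
    """Return the preferred CURIE, its equivalent CURIEs, and its types."""
    record = json.loads(line)
    return {
        # Assumption: "id" carries the preferred identifier for the entity.
        "preferred": record["id"]["identifier"],
        "equivalents": [e["identifier"] for e in record["equivalent_identifiers"]],
        "types": record["type"],
    }


if __name__ == "__main__":
    print(parse_compendium_line(LINE))
```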
## Creating and loading a Redis container with data

A running instance of Redis is needed to house the node normalization data. A Redis Docker container image can be downloaded from [Docker hub](https://hub.docker.com/_/redis). The Redis container can be started with the following docker command:
```
docker run --name node-norm-redis -p 6379:6379 -d redis redis-server --appendonly yes
```
Note that the dataset for Node normalization is quite large, and 256 GB of memory and disk space should be made available to the Redis instance to ensure proper loading of the complete compendia.
### Configuration
Ensure that the `./config.json` file is created and contains the parameters for the node normalization load specific to your environment.

The configuration parameters `compendium_directory` and `data_files` specify the location of the compendia files. An example of the configuration file's contents
is shown below:
```
{
"compendium_directory": "<path to files>",
"data_files": "anatomy.txt,BiologicalProcess.txt,cell.txt,cellular_component.txt,disease.txt,gene_compendium.txt,gene_family_compendium.txt,MolecularActivity.txt,pathways.txt,phenotypes.txt,taxon_compendium.txt",
"redis_host": "<Redis host server name>",
"redis_port": <Redis connection port>,
"redis_password": "<Redis password>",
"test_mode": 1,
"debug_messages": 0
}
```
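
To illustrate how `compendium_directory` and `data_files` combine into file locations, the sketch below reads a filled-in configuration and expands the comma-separated file list into full paths. The values in `EXAMPLE_CONFIG` are hypothetical stand-ins for the placeholders in the example above, and `compendium_paths` is an illustrative helper, not a description of what `load.py` does.

```python
import json
from pathlib import Path

# Hypothetical values standing in for the placeholders in the example above.
EXAMPLE_CONFIG = """
{
  "compendium_directory": "/data/compendia",
  "data_files": "anatomy.txt,BiologicalProcess.txt,cell.txt",
  "redis_host": "localhost",
  "redis_port": 6379,
  "redis_password": "",
  "test_mode": 1,
  "debug_messages": 0
}
"""


def compendium_paths(config_text: str) -> list:
    """Expand the comma-separated data_files entry into paths under compendium_directory."""
    config = json.loads(config_text)
    directory = Path(config["compendium_directory"])
    # Split on commas and ignore empty entries or stray whitespace.
    names = [n.strip() for n in config["data_files"].split(",") if n.strip()]
    return [directory / name for name in names]


if __name__ == "__main__":
    for path in compendium_paths(EXAMPLE_CONFIG):
        print(path)
```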
### Loading of the Redis server with compendia data

The load.py script reads the configuration file for load parameters and then loads the compendia data into the Redis instance.

#### The redis command line can be used to monitor various aspects of the load.

It is possible to observe the progress of the load by opening a command line _within the container_ and issuing Redis commands.

_View the number of keys loaded so far._
```
redis-cli info keyspace
```
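
The keyspace report lists one line per database in the form `db0:keys=...,expires=...`. As a rough sketch, the key counts can be extracted from that text as follows. The `SAMPLE` string is illustrative output, not a real measurement, and `parse_keyspace` is a hypothetical helper name.

```python
def parse_keyspace(info_text: str) -> dict:
    """Map each database name (e.g. 'db0') to its reported key count."""
    counts = {}
    for line in info_text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Keyspace".
        if not line or line.startswith("#"):
            continue
        name, _, stats = line.partition(":")
        fields = dict(item.split("=", 1) for item in stats.split(",") if "=" in item)
        if "keys" in fields:
            counts[name] = int(fields["keys"])
    return counts


# Illustrative output only; real values depend on the state of the load.
SAMPLE = "# Keyspace\ndb0:keys=1200,expires=0,avg_ttl=0\n"

if __name__ == "__main__":
    print(parse_keyspace(SAMPLE))
```

Running the command repeatedly and comparing the counts gives a simple indication of whether the load is still progressing.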

_Once the database has completed loading it is recommended that the Redis database be persisted to disk._
```
redis-cli save
```

_Monitor the database to determine if the save has completed._
```
redis-cli info persistence
```

### Starting the FASTAPI webserver from the command line

The web server can be started after successful completion of the load.

```
cd <Node normalization code root>

pip install -r requirements.txt

uvicorn --host 0.0.0.0 --port 8000 --workers 1 node_normalizer.server:app
```

Then navigate to http://localhost:8000/docs to run the application

### Webserver Docker container creation and execution
Much like the Redis Docker container noted above, a Docker container can also be created and executed to run the webserver.

#### Build the webserver Docker image
```
cd <Node normalization code root>

docker build --tag <image_tag> .
```

#### Start the container:

_Note the Dockerfile specifies port 6380 for the webservice container._
```
docker run --name Node-normalization -p 8000:6380 node-norm
```

Then navigate to: http://localhost:8000/docs to run the application

### Kubernetes configurations
Kubernetes configurations and helm charts for this project can be found at:
```
https://github.com/helxplatform/translator-devops/helm/r3
```
Please visit the [WIKI](https://github.com/TranslatorSRI/NodeNormalization/wiki) page for this project for more documentation
17 changes: 0 additions & 17 deletions config.json

This file was deleted.

78 changes: 49 additions & 29 deletions docker-compose-test.yml
@@ -1,34 +1,41 @@
version: "3"
services:

r3:
container_name: r3
image: r3
build:
context: .
dockerfile: Dockerfile-test
env_file: .env
redis_eq_id_to_id_db:
container_name: redis_eq_id_to_id_db
image: redis:alpine
expose:
- 8080
ports:
- "8080:8080"
# links:
# - callback-app
# networks:
# - nn_network
depends_on:
- redis
- callback-app
- 6379

redis:
container_name: redis
redis_id_to_eqids_db:
container_name: redis_id_to_eqids_db
image: redis:alpine
expose:
- 6379

redis_id_to_type_db:
container_name: redis_id_to_type_db
image: redis:alpine
expose:
- 6379

redis_curie_to_bl_type_db:
container_name: redis_curie_to_bl_type_db
image: redis:alpine
expose:
- 6379

redis_info_content_db:
container_name: redis_info_content_db
image: redis:alpine
expose:
- 6379

redis_conflation_db:
container_name: redis_conflation_db
image: redis:alpine
ports:
- "6379:6379"
expose:
- 6379
# networks:
# - nn_network

callback-app:
container_name: callback-app
@@ -39,10 +46,23 @@ services:
- 8008
ports:
- "8008:8008"
# networks:
# - nn_network

#networks:
# nn_network:
# name: nn
# external: false
node-norm:
container_name: node-norm
image: node-norm
build:
context: .
dockerfile: Dockerfile-test
env_file: .env
expose:
- 8080
ports:
- "8080:8080"
depends_on:
- redis_eq_id_to_id_db
- redis_id_to_eqids_db
- redis_id_to_type_db
- redis_curie_to_bl_type_db
- redis_info_content_db
- redis_conflation_db
- callback-app
59 changes: 49 additions & 10 deletions docker-compose.yml
@@ -1,18 +1,57 @@
version: "3"
services:
redis:
container_name: redis
redis_eq_id_to_id_db:
container_name: redis_eq_id_to_id_db
image: redis:alpine
expose:
- 6379

r3:
container_name: r3
image: r3
build: .
env_file:
- .env

redis_id_to_eqids_db:
container_name: redis_id_to_eqids_db
image: redis:alpine
expose:
- 6379

redis_id_to_type_db:
container_name: redis_id_to_type_db
image: redis:alpine
expose:
- 6379

redis_curie_to_bl_type_db:
container_name: redis_curie_to_bl_type_db
image: redis:alpine
expose:
- 6379

redis_info_content_db:
container_name: redis_info_content_db
image: redis:alpine
expose:
- 6379

redis_conflation_db:
container_name: redis_conflation_db
image: redis:alpine
expose:
- 6379

node-norm:
container_name: node-norm
image: node-norm
build:
context: .
dockerfile: Dockerfile
env_file: .env
expose:
- 8080
ports:
- "8080:8080"
depends_on:
- redis
- redis_eq_id_to_id_db
- redis_id_to_eqids_db
- redis_id_to_type_db
- redis_curie_to_bl_type_db
- redis_info_content_db
- redis_conflation_db

20 changes: 0 additions & 20 deletions load.py

This file was deleted.
