
Code Layout And Major Features

Josh Miller edited this page Apr 19, 2019 · 10 revisions

Code Layout

Moltimate's backend follows a standard Spring Boot directory layout, roughly in accordance with Apache Maven's Standard Directory Layout. A visual directory tree is available at the bottom of this page.

This backend exposes a REST API which is called by the moltimate-frontend. The API documentation can be viewed here.

Major Features

The major features of Moltimate are:

  1. Motif Finder
  2. Motif Maker
  3. Database & Caching

Motif Finder

Motif Finder is the heart of Moltimate: it turns an active site alignment request into a list of "good" alignments. The feature is built around an algorithm that takes a motif and a list of protein PDB ids (sent from the frontend). BioJava is used to fetch the structure objects that represent these proteins. The alignment algorithm then checks whether each protein contains the motif's active site residues in very similar locations. Any proteins which align well are returned in the results list. A more thorough explanation of the algorithm can be found here.
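To make the idea of "the same residues in very similar locations" concrete, here is a simplified sketch. It is not Moltimate's actual algorithm (see the linked explanation for that) and it does not use BioJava. The `Residue` type, the `containsMotif` method, and the distance tolerance are all hypothetical names introduced for illustration. The sketch assumes that a match means: for every pair of motif residues, the protein has same-named residues whose distance from each other differs by no more than the tolerance.

```java
import java.util.ArrayList;
import java.util.List;

class MotifMatchSketch {

    /** Hypothetical residue: a name (e.g. "HIS") and one representative 3D coordinate. */
    static final class Residue {
        final String name;
        final double x, y, z;

        Residue(String name, double x, double y, double z) {
            this.name = name;
            this.x = x;
            this.y = y;
            this.z = z;
        }

        double distanceTo(Residue other) {
            double dx = x - other.x, dy = y - other.y, dz = z - other.z;
            return Math.sqrt(dx * dx + dy * dy + dz * dz);
        }
    }

    /** True if the protein has residues matching every motif residue within the tolerance. */
    static boolean containsMotif(List<Residue> motif, List<Residue> protein, double tolerance) {
        return extend(motif, protein, new ArrayList<>(), tolerance);
    }

    // Backtracking search: pick a protein residue for each motif residue in turn.
    private static boolean extend(List<Residue> motif, List<Residue> protein,
                                  List<Residue> chosen, double tolerance) {
        if (chosen.size() == motif.size()) {
            return true; // every motif residue has a counterpart
        }
        Residue target = motif.get(chosen.size());
        for (Residue candidate : protein) {
            if (!candidate.name.equals(target.name) || chosen.contains(candidate)) {
                continue; // wrong residue type, or already used
            }
            if (distancesAgree(motif, chosen, target, candidate, tolerance)) {
                chosen.add(candidate);
                if (extend(motif, protein, chosen, tolerance)) {
                    return true;
                }
                chosen.remove(chosen.size() - 1); // undo and try the next candidate
            }
        }
        return false;
    }

    // Compare distances to the already chosen residues with the distances in the motif.
    private static boolean distancesAgree(List<Residue> motif, List<Residue> chosen,
                                          Residue target, Residue candidate, double tolerance) {
        for (int i = 0; i < chosen.size(); i++) {
            double expected = motif.get(i).distanceTo(target);
            double actual = chosen.get(i).distanceTo(candidate);
            if (Math.abs(expected - actual) > tolerance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Residue> motif = List.of(
                new Residue("HIS", 0, 0, 0), new Residue("SER", 3, 0, 0), new Residue("ASP", 0, 4, 0));
        // Same geometry, translated by (10, 10, 10), plus an unrelated SER far away.
        List<Residue> protein = List.of(
                new Residue("SER", 20, 20, 20), new Residue("HIS", 10, 10, 10),
                new Residue("SER", 13, 10, 10), new Residue("ASP", 10, 14, 10));
        System.out.println(containsMotif(motif, protein, 0.5)); // prints true
    }
}
```

Comparing pairwise distances makes the check independent of where the protein sits in space, which is why the translated copy still matches. A real alignment would also score how well the match fits, rather than returning only a yes/no answer.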

Motif Maker

Motif Maker is a crucial part of Moltimate which allows users to manually define the active site of an existing PDB protein entry or of a custom uploaded PDB file. This feature generates a .motif file which contains the original protein file (.pdb, .cif, ...) and the residues which make up that protein's active site.

Database & Caching

Motifs are currently stored in a database inside Moltimate. When running locally, this is an H2 database stored in the file moltimate.mv.db. For the cloud implementation, we use a cloud MySQL database which is shared across all running cloud instances of Moltimate.

In addition, a two-layer cache is implemented for protein structure alignments. The first layer is an in-memory cache, found under the CacheService. It is currently used only to store the results of protein structure alignments, which are costly to recompute. The cache can be tuned by maximum size and a time-based eviction policy: elements are evicted automatically when the cache grows too large or once they have been in the cache for 24 hours.

When an alignment is evicted from the in-memory cache, it is placed into a database of alignments, which acts as our second layer of cache. Because alignments are stored in this database, the same alignment never has to be computed twice. This was implemented to improve the speed of alignment calls. Currently, a new alignment which has never been cached takes ~90 seconds to compute on average. A cached alignment in memory takes about 0.1 seconds to return on average, and an alignment in the database takes about 1 second.
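The lookup order described above (memory, then database, then recompute) can be sketched as follows. This is a minimal illustration and not the code in CacheService: the class name `TwoLayerCacheSketch`, its methods, and the plain `HashMap` standing in for the alignment database are all assumptions. The sketch also assumes that a database hit is copied back into memory, which the description above does not state.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

class TwoLayerCacheSketch<K, V> {

    /** A cached value together with the time it entered the in-memory layer. */
    private static final class Timestamped<T> {
        final T value;
        final long createdAtMillis;

        Timestamped(T value, long createdAtMillis) {
            this.value = value;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final int maxSize;
    private final long maxAgeMillis;
    private final LongSupplier clock; // injectable so eviction can be tested
    // Layer 1: insertion-ordered, so the oldest entry is always first.
    private final LinkedHashMap<K, Timestamped<V>> memory = new LinkedHashMap<>();
    // Layer 2: hypothetical stand-in for the database of alignments.
    private final Map<K, V> database = new HashMap<>();

    TwoLayerCacheSketch(int maxSize, Duration maxAge, LongSupplier clock) {
        this.maxSize = maxSize;
        this.maxAgeMillis = maxAge.toMillis();
        this.clock = clock;
    }

    /** Returns the cached value, falling back to the database and finally to compute. */
    V get(K key, Function<K, V> compute) {
        long now = clock.getAsLong();
        evict(now);
        Timestamped<V> hit = memory.get(key);
        if (hit != null) {
            return hit.value; // fastest path: in-memory hit
        }
        V value = database.get(key); // second layer
        if (value == null) {
            value = compute.apply(key); // slowest path: never cached before
        }
        memory.put(key, new Timestamped<>(value, now));
        evict(now);
        return value;
    }

    /** Moves expired entries, and the oldest entries beyond maxSize, into the database. */
    private void evict(long now) {
        Iterator<Map.Entry<K, Timestamped<V>>> it = memory.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, Timestamped<V>> oldest = it.next();
            boolean expired = now - oldest.getValue().createdAtMillis >= maxAgeMillis;
            if (!expired && memory.size() <= maxSize) {
                break; // remaining entries are newer and the cache is small enough
            }
            database.put(oldest.getKey(), oldest.getValue().value);
            it.remove();
        }
    }

    boolean inMemory(K key) {
        return memory.containsKey(key);
    }

    boolean inDatabase(K key) {
        return database.containsKey(key);
    }

    public static void main(String[] args) {
        TwoLayerCacheSketch<String, String> cache =
                new TwoLayerCacheSketch<>(2, Duration.ofHours(24), System::currentTimeMillis);
        Function<String, String> slowAlignment = id -> "alignment for " + id;
        cache.get("a", slowAlignment);
        cache.get("b", slowAlignment);
        cache.get("c", slowAlignment); // "a" is the oldest entry and moves to the database
        System.out.println(cache.inMemory("a") + " " + cache.inDatabase("a")); // prints false true
    }
}
```

Writing to the second layer only on eviction mirrors the description above. A production implementation would more likely rely on a caching library and a JPA repository than on hand-rolled maps.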

Populating the Database

Follow these steps to delete and populate the database.
Note: You will see many NullPointerExceptions in the logs; this is known behavior. The motifs that trigger them fail to save.

Local Database

  1. Delete moltimate.mv.db
  2. Run the application locally
  3. Send a GET request to localhost:8080/tasks/updatemotifs (done easily with an internet browser)
  4. Look at the logs and wait for the log message Finished saving # motifs to the database
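If you would rather script step 3 than use a browser, the request can be sent with the JDK's built-in HTTP client (Java 11+). This is a small sketch: the class name `UpdateMotifsTrigger` is hypothetical, and the `http://` scheme for the local server is an assumption. Only the path /tasks/updatemotifs comes from the steps above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class UpdateMotifsTrigger {

    /** Builds the GET request for the update task on the given server. */
    static HttpRequest buildRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/tasks/updatemotifs"))
                .GET()
                .build();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Defaults to a locally running backend; pass another base URL as the first argument.
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8080";
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(buildRequest(baseUrl), HttpResponse.BodyHandlers.ofString());
        System.out.println("HTTP " + response.statusCode());
    }
}
```

This page does not say whether the endpoint responds before or after the motifs are saved, so keep watching the logs for the completion message either way.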

Cloud Database

(easy way)

  1. Go to GCP console
  2. Go to SQL
  3. Click backend-db-sql
  4. Click Databases
  5. Click the trash icon next to the moltimate database
  6. Create a new database named moltimate
  7. Send a GET request to moltimate.appspot.com/tasks/updatemotifs
  8. Look at the logs and wait for the log message Finished saving # motifs to the database

(hard way, no down time)

  1. Go to GCP console
  2. Go to SQL
  3. Click backend-db-sql
  4. Open a console session to ssh into the database
  5. Use SQL commands to delete all tables under moltimate except for the ones with h2 names

Directory Tree

For compactness, many files have been omitted from this directory tree.

├── LICENSE (GNU General Public License v2.0)
├── README.md
├── WEB-INF (GCP App Engine configurations)
├── pom.xml
├── secrets (contains encrypted GCP credentials for automated deployments)
└── src
    ├── main
    │   ├── appengine (GCP App Engine configurations)
    │   ├── java
    │   │   └── org
    │   │       └── moltimate
    │   │           └── moltimatebackend
    │   │               ├── Application.java
    │   │               ├── config (profile-specific configurations)
    │   │               ├── constant (global constant values)
    │   │               ├── controller (REST controllers)
    │   │               ├── dto (general dtos)
    │   │               │   ├── request (REST request dtos)
    │   │               │   └── response (REST response dtos)
    │   │               ├── exception (custom exceptions)
    │   │               ├── model (database models)
    │   │               ├── parser (data file parsers)
    │   │               ├── repository (JPA / Hibernate repositories)
    │   │               ├── service (business logic)
    │   │               ├── util (stateless business logic w/ no database interaction)
    │   │               └── validation (small static validate methods)
    │   └── resources (spring profiles)
    │       ├── motifdata (version-locked sources of active site data)
    │       └── static (version-locked bundled moltimate-frontend)
    └── test
        └── java
            └── org
                └── moltimate
                    └── moltimatebackend
                        └── service
                            └── AlignmentTest.java