krystalzeng/Classfiy-Protein-Sequence

Eukaryotic protein sequence classification

The aim of this project is to develop a method that classifies eukaryotic protein sequences into four categories: Cytosolic, Secreted, Nuclear, and Mitochondrial.
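To illustrate the kind of input such a classifier needs, the sketch below turns a raw sequence into a fixed-length amino-acid composition vector. This featurisation is an assumption made for illustration only: this README does not state which features Demo.py actually uses, and the helper name `composition_features` is hypothetical.

```python
from collections import Counter

# The four target classes named in this README.
CLASSES = ["Cytosolic", "Secreted", "Nuclear", "Mitochondrial"]

# The 20 standard amino acids, in a fixed order so every vector lines up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical helper: non-standard letters (e.g. X, B, Z) are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # Empty or fully non-standard sequence: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence for illustration; the trailing X is skipped.
features = composition_features("MKKLAX")
```

Every sequence maps to a vector of length 20 regardless of its own length, which is what models such as logistic regression or a random forest expect.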

Getting Started

Prerequisites

numpy==1.14.1
pandas==0.22.0
Keras==2.0.8
matplotlib==2.2.0
scikit_learn==0.19.1
biopython

Running

Demo.py is the main demo script. It saves the confusion matrices for the Uniform, Logistic Regression, and Random Forest models, and generates a 'result.txt' file containing the predictions on the blind test set.
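As a rough illustration of what a uniform baseline and its confusion matrix look like, here is a minimal standard-library sketch. It is not the code in Demo.py: the function names are hypothetical, the labels are made up, and it assumes the Uniform Model picks one of the four classes at random with equal probability.

```python
import random

CLASSES = ["Cytosolic", "Secreted", "Nuclear", "Mitochondrial"]


def uniform_predict(n_samples, rng):
    """Baseline: pick one of the four classes uniformly at random per sample."""
    return [rng.choice(CLASSES) for _ in range(n_samples)]


def confusion_matrix(y_true, y_pred, labels):
    """Count predictions; rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


# Made-up labels purely for illustration; not data from this project.
y_true = ["Cytosolic", "Secreted", "Nuclear", "Mitochondrial", "Nuclear"]
y_pred = uniform_predict(len(y_true), random.Random(0))

matrix = confusion_matrix(y_true, y_pred, CLASSES)
# Correct predictions sit on the diagonal of the matrix.
accuracy = sum(matrix[i][i] for i in range(len(CLASSES))) / len(y_true)
```

With four balanced classes, a uniform baseline is expected to reach about 25% accuracy, which gives the other two models a floor to beat. scikit-learn, listed under Prerequisites, offers `sklearn.metrics.confusion_matrix` and `sklearn.dummy.DummyClassifier(strategy="uniform")` for the same purpose.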

Author

Acknowledgments

  • BioPython

About

For bioinformatics
