Lean Mean Protein Machine (Learning)

Project Overview

Structure-based protein design is a promising approach in synthetic biology research, with applications such as the development of nanoparticle vaccines. However, one of the major hurdles to the function and use of designed nanoparticles is sub-optimal secretion. In this project, we will analyze native, characterized proteins in the human proteome to understand which features contribute to protein secretion (such as isoelectric point, amino acid length, and protein size). We will then develop a model that predicts whether a designed protein will be secreted out of the cell, become a transmembrane protein, or remain a soluble or intracellular protein.

Use cases

Standardizing amino acid sequence through UniRep.
- Protein designer/researcher provides amino acid sequence in single-letter format.
- System implicitly returns a 1900-element vector based on sequence.
Predict secretion score with UniRep features.
- Protein designer/researcher provides amino acid sequence in single-letter format.
- System feeds the sequence-based vector into a model (possibly a CNN; the architecture is not yet decided) and returns a secretion score.
Optimize secretion score accuracy with additional features.
- Researcher provides additional protein data, such as amino acid length and theoretical/experimental isoelectric point, in addition to amino acid sequence.
- Model returns an improved secretion score based on additional features.
Comparison of models.
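
The input-preparation steps shared by the use cases above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the function names (`clean_sequence`, `extra_features`, `build_model_input`) are hypothetical, the all-zero vector is a placeholder for the 1900-element UniRep representation, and the charge-based features are simple stand-ins for properties such as isoelectric point, which is not computed here.

```python
from collections import Counter

# The 20 standard amino acids in single-letter format.
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw):
    """Uppercase a single-letter sequence, drop whitespace, and validate it."""
    sequence = "".join(raw.split()).upper()
    if not sequence:
        raise ValueError("empty sequence")
    invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"non-standard residue codes: {invalid}")
    return sequence


def extra_features(sequence):
    """Hypothetical hand-crafted features to append to a sequence vector."""
    counts = Counter(sequence)
    length = len(sequence)
    positive = counts["K"] + counts["R"]  # lysine and arginine
    negative = counts["D"] + counts["E"]  # aspartate and glutamate
    return [
        float(length),               # amino acid length
        positive / length,           # fraction of positively charged residues
        negative / length,           # fraction of negatively charged residues
        float(positive - negative),  # crude net charge, not an isoelectric point
    ]


def build_model_input(embedding, sequence):
    """Concatenate a sequence vector with the additional features."""
    return list(embedding) + extra_features(sequence)


sequence = clean_sequence("mkt ayiak qrd e")
placeholder_embedding = [0.0] * 1900  # stands in for the UniRep vector
model_input = build_model_input(placeholder_embedding, sequence)
print(len(model_input))
```

Keeping the additional features in a separate function makes it straightforward to compare a model trained on the sequence vector alone against one trained on the extended input.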

Dependencies

Pandas
- Data-organization framework
NumPy
- Numerical operations
JAX
- High-efficiency math library, optimized for accelerated linear algebra (XLA)
Jax-UniRep
- Library for applying the protein sequence-encoding paradigm, UniRep, implemented using JAX instead of TensorFlow

