Skip to content

A generative latent variable model for biological sequence families.

License

Notifications You must be signed in to change notification settings

debbiemarkslab/DeepSequence

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

DeepSequence

DeepSequence is a generative, unsupervised latent variable model for biological sequences. Given a multiple sequence alignment as an input, it can be used to predict accessible mutations, extract quantitative features for supervised learning, and generate libraries of new sequences satisfying apparent constraints. It models higher-order dependencies in sequences as a nonlinear combination of constraints between subsets of residues. For more information, check out the paper on biorxiv and the examples below.

For ease of analysis, we advise that alignments be generated with the EVcouplings package, though any sequence alignment can be used.

Codebase is compatible with Python 2.7 and Theano 1.0.1. For GPU-enabled computation, CUDA will have to be installed separately. See INSTALL for more details.

Examples

For reasonable training time, we advise training DeepSequence on a GPU:

THEANO_FLAGS='floatX=float32,device=cuda' python run_svi.py

However, it can be run on the CPU with:

python run_svi.py

Other usage examples and features of the analysis are available in iPython notebooks in the examples subfolder.

About

A generative latent variable model for biological sequence families.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages