SAMPDI-3Dv2: Predicting protein-DNA binding free energy change upon mutations

Emil Alexov Group

About SAMPDI-3Dv2

SAMPDI-3Dv2 is an updated version of SAMPDI-3D, trained on a larger dataset of mutations in proteins and DNA within protein-DNA complexes. It incorporates an extended feature set and employs a gradient boosting decision tree machine learning algorithm to predict changes in binding free energy caused by mutations in proteins (single amino acid) or DNA (single base-pair).

This method uses two distinct models:

  1. A model trained on single amino acid mutations in proteins and their associated free energy changes, using features derived from protein-DNA complex structures.
  2. A model trained on single base-pair mutations in DNA and their associated free energy changes, using features derived from protein-DNA complex structures.

The dataset used for model development and the standalone code are both available for download.

 
SAMPDI-3Dv2 schema
 
Protein-DNA complex structure
Mutation detail
Mutation list file *

Plain text, one mutation per line: CHAIN WTYPE RESID MTYPE. For example, A W 45 A denotes chain A, wild-type residue W (tryptophan) at position 45, mutated to A (alanine).
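
The protein mutation list format can be checked before upload. The following is a minimal sketch, not part of the SAMPDI-3Dv2 code: the function names are hypothetical, and it assumes that WTYPE and MTYPE are one-letter codes for the 20 standard amino acids and that RESID is a plain integer. The second input line (A R 102 K) is made up for illustration.

```python
# Illustrative helper, not part of the SAMPDI-3Dv2 distribution.
# Assumption: WTYPE and MTYPE are one-letter codes for the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_protein_mutation(line):
    """Parse one 'CHAIN WTYPE RESID MTYPE' line into a (chain, wt, resid, mt) tuple."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    chain, wtype, resid, mtype = fields
    if wtype not in AMINO_ACIDS or mtype not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid code in: {line!r}")
    if wtype == mtype:
        raise ValueError(f"wild type and mutant are identical in: {line!r}")
    # Assumption: RESID is a plain integer (no insertion codes).
    return chain, wtype, int(resid), mtype


def parse_protein_mutation_list(text):
    """Parse a whole mutation list, skipping blank lines."""
    return [parse_protein_mutation(row) for row in text.splitlines() if row.strip()]


# "A W 45 A" is the example from the page; "A R 102 K" is a made-up second entry.
mutations = parse_protein_mutation_list("A W 45 A\nA R 102 K\n")
print(mutations)
```

Raising on the first malformed line, rather than skipping it, makes a typo visible before the job is submitted.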

Protein-DNA complex structure
DNA mutation detail
Mutation list file *

Plain text, one mutation per line: CHAIN WTYPE RESID MTYPE. For example, C AT 12 GC denotes chain C, wild-type base pair AT at position 12, mutated to GC.
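
A DNA mutation list can be checked in the same way. This is again only a sketch with hypothetical names, not the SAMPDI-3Dv2 implementation. It assumes that WTYPE and MTYPE are two-letter Watson-Crick base pairs (AT, TA, GC, CG), as the example on the page suggests, and that RESID is a plain integer.

```python
# Illustrative helper, not part of the SAMPDI-3Dv2 distribution.
# Assumption: base pairs are written as two-letter Watson-Crick pairs.
BASE_PAIRS = {"AT", "TA", "GC", "CG"}


def parse_dna_mutation(line):
    """Parse one 'CHAIN WTYPE RESID MTYPE' line describing a base-pair mutation."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    chain, wtype, resid, mtype = fields
    if wtype not in BASE_PAIRS or mtype not in BASE_PAIRS:
        raise ValueError(f"unknown base pair in: {line!r}")
    if wtype == mtype:
        raise ValueError(f"wild type and mutant are identical in: {line!r}")
    # Assumption: RESID is a plain integer position on the given chain.
    return chain, wtype, int(resid), mtype


# "C AT 12 GC" is the example given on the page.
print(parse_dna_mutation("C AT 12 GC"))
```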

 
Copyright © Computational Biophysics and Bioinformatics - Emil Alexov Group.