Monday, April 13th, 8:30-10:00am
Algorithms to detect sequence motifs and predict post-translational modification sites
Abstract: Post-translational modifications (PTMs) are an important step in the formation of a mature protein. While some PTMs, such as phosphorylation, are well characterized, others, such as cysteine S-sulfenylation, are still under active study. For this project, an algorithm, Bit-Motif, was developed to analyze PTM sites by discovering statistically significant motifs around them. These motifs were then used as features for a support vector machine (SVM) to build a PTM site prediction model called Prediction Using Motifs for PTMs (PUMP).
Bit-Motif was applied to a novel dataset that had not previously been analyzed for motifs. Compared with currently available motif-identifying algorithms, Bit-Motif ran 1.3 to 2.4 times faster on sequences of length greater than 13 and 1.4 to 3.8 times faster at lower foreground frequency thresholds. PUMP, evaluated with 10-fold cross-validation, achieved area under the curve (AUC) scores comparable to those of current predictive methods.
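To illustrate how motifs can serve as classifier features, here is a minimal Python sketch that encodes each sequence window around a candidate site as a 0/1 vector, with one entry per motif. It is not the Bit-Motif or PUMP implementation: the motif syntax (`.` for any residue), the function names, and the example motifs and windows are all hypothetical, chosen only for illustration.

```python
# Illustrative sketch only; not the authors' code.
# Assumption: a motif is a string the same length as the window,
# where '.' matches any residue and a letter must match exactly.

def matches(motif, window):
    """Return True if every non-wildcard motif position equals the window residue."""
    if len(motif) != len(window):
        return False
    return all(m == "." or m == w for m, w in zip(motif, window))


def motif_features(window, motifs):
    """Encode a window as a 0/1 list: entry i is 1 if motifs[i] matches the window."""
    return [1 if matches(m, window) else 0 for m in motifs]


# Hypothetical 7-residue windows centered on a candidate cysteine (index 3).
motifs = ["...C..K", "E..C...", "..LC..."]
windows = ["AELCGGK", "EAACTTT", "GGGCGGG"]

# One feature vector per window.
features = [motif_features(w, motifs) for w in windows]
print(features)  # [[1, 0, 1], [0, 1, 0], [0, 0, 0]]
```

Vectors of this kind, paired with labels marking modified and unmodified sites, could then be passed to an SVM and scored with cross-validated AUC, as the abstract describes.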