Comparative Exploratory Analysis of Intrinsically Disordered Protein Dynamics Using Machine Learning and Network Analytic Methods

Bibliographic Details
Published in: Frontiers in Molecular Biosciences
Main Authors: Grazioli, Gianmarc; Martin, Rachel W.; Butts, Carter T.
Format: Text
Language: English
Published: Frontiers Media S.A. 2019
Online Access: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6581705/
https://doi.org/10.3389/fmolb.2019.00042
Description
Summary: Simulations of intrinsically disordered proteins (IDPs) pose numerous challenges to comparative analysis, prominently including highly dynamic conformational states and a lack of well-defined secondary structure. Machine learning (ML) algorithms are especially effective at discriminating among high-dimensional inputs whose differences are extremely subtle, making them well suited to the study of IDPs. In this work, we apply various ML techniques, including support vector machines (SVM) and clustering, as well as related methods such as principal component analysis (PCA) and protein structure network (PSN) analysis, to the problem of uncovering differences between configurational data from molecular dynamics simulations of two variants of the same IDP. We examine molecular dynamics (MD) trajectories of wild-type amyloid beta (Aβ(1−40)) and its “Arctic” variant (E22G), systems that play a central role in the etiology of Alzheimer's disease. Our analyses demonstrate ways in which ML and related approaches can be used to elucidate subtle differences between these proteins, including transient structure that is poorly captured by conventional metrics.
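
As a rough illustration of the kind of comparison the summary describes, the sketch below pools per-frame feature vectors from two simulated variants, projects them onto their leading principal components, and measures how far apart the two groups sit in that reduced space. It is not the authors' pipeline: the data are synthetic random numbers rather than Aβ trajectories, the feature count and the shift applied to the "variant" are arbitrary assumptions chosen to make the effect visible, and the helper name `pca_project` is hypothetical.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project the rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal axes,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
n_frames, n_features = 200, 30  # assumed sizes, for illustration only

# Synthetic stand-ins for per-frame structural features (NOT real MD data).
wild_type = rng.normal(0.0, 1.0, size=(n_frames, n_features))
variant = rng.normal(0.0, 1.0, size=(n_frames, n_features))
variant[:, :3] += 3.0  # hypothetical shift in the first three features

# Pool both ensembles so they share a single set of principal axes.
pooled = np.vstack([wild_type, variant])
labels = np.array([0] * n_frames + [1] * n_frames)
scores = pca_project(pooled, n_components=2)

# Distance between the group centroids in the reduced space.
centroid_wt = scores[labels == 0].mean(axis=0)
centroid_var = scores[labels == 1].mean(axis=0)
separation = np.linalg.norm(centroid_wt - centroid_var)
print("centroid separation in PC space:", separation)
```

With real trajectories, the synthetic arrays would be replaced by features computed from each frame (for example inter-residue distances), and a classifier such as an SVM could then be trained on the same pooled matrix to test whether the two variants are separable.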