The Metaproteomic Analysis of Arctic Soils with Novel Bioinformatic Methods

Bibliographic Details
Main Author: Miller, Samuel
Format: Text
Language: English
Published: The University of Chicago 2019
Online Access: https://doi.org/10.6082/uchicago.1419
http://knowledge.uchicago.edu/record/1419
Description
Summary: Microbes control the decomposition of soil organic matter, a key biogeochemical process significant to global climate. The complex chemistry of soils and the great diversity of microbial strains with flexible metabolic capabilities have impeded the elucidation of degradation pathways from plant tissues to greenhouse gases. A mechanistic understanding of soil processes can improve models used to predict the fate of vast quantities of carbon stored in Arctic soils. Arctic warming is accelerating microbial decomposition but is also increasing plant biomass, which counteracts carbon loss. Floras with a significant nonvascular component are being replaced by floras dominated by larger and woodier plants. The changing vegetation may mediate the effects of warming on soil microbial activity through interactions with roots and the composition of plant detritus.
Metaproteomics is a promising approach for studying soil processes, since proteins catalyze key biogeochemical transformations. I collected soil cores from major floral ecotypes in the area of Toolik Field Station, Alaska, and extracted proteins for metaproteomic analysis. To overcome impediments to the routine application of proteomics to complex samples, I developed novel bioinformatic methods to analyze protein mass spectrometry data. The standard database search method of assigning amino acid sequences to peptide mass spectra requires a tailored reference database of sequences that may be present in the proteomic dataset. Environmental metaproteomes may lack appropriate reference databases, especially in the absence of paired metagenomes. As an alternative to database search, sequences can be deduced directly from mass spectra, a computationally challenging approach known as de novo sequencing.
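The core idea behind de novo sequencing can be illustrated with a toy example: in an idealized fragment ion series, the mass gap between consecutive peaks equals the mass of one amino acid residue, so the sequence can be read directly off the spectrum. The Python sketch below is only an illustration under strong assumptions (a complete, noise-free, singly charged ion ladder and a partial residue table); the function names `synthetic_ladder` and `read_ladder` are hypothetical, and this is not the algorithm used by Postnovo or by any de novo sequencing tool named in the thesis.

```python
# Monoisotopic residue masses (Da) for a subset of amino acids.
# Leucine and isoleucine are isobaric, so only L is listed here.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "F": 147.06841,
}
PROTON_MASS = 1.00728  # Da; offset of a singly charged ion series


def synthetic_ladder(sequence):
    """Build an idealized peak list (assumption: every fragment is observed)."""
    peaks = [PROTON_MASS]
    for residue in sequence:
        peaks.append(peaks[-1] + RESIDUE_MASS[residue])
    return peaks


def read_ladder(peaks, tol=0.01):
    """Infer residues from the mass gaps between consecutive peaks.

    A gap that matches no residue, or more than one, is reported as '?'.
    """
    ordered = sorted(peaks)
    residues = []
    for low, high in zip(ordered, ordered[1:]):
        gap = high - low
        matches = [aa for aa, mass in RESIDUE_MASS.items()
                   if abs(mass - gap) <= tol]
        residues.append(matches[0] if len(matches) == 1 else "?")
    return "".join(residues)


if __name__ == "__main__":
    print(read_ladder(synthetic_ladder("PEAK")))
```

Real spectra have missing peaks, noise, and overlapping ion series, which is part of why the approach is computationally challenging and why predicted sequences often contain errors.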
To improve the low accuracy of de novo sequences predicted by existing algorithms, I created post-processing software called Postnovo, which rescores and reranks sequences from multiple input algorithms using newly calculated metrics. I demonstrated that Postnovo improves the ...