Decoding Phenotypes via Transcriptomics and Proteomics: Cancer and beyond
While genomics approaches are important in studying host phenotype alterations in response to environmental changes or disease, proteomics approaches offer a complementary perspective by providing a direct readout of expressed functional pathways. Proteogenomic strategies utilizing RNA-sequencing da...
Main Author: | Lin, Miin Sophia |
---|---|
Other Authors: | Bafna, Vineet |
Format: | Software |
Language: | English |
Published: | eScholarship, University of California, 2022 |
Subjects: | Bioinformatics |
Online Access: | https://escholarship.org/uc/item/1s98z674 |
id | ftcdlib:oai:escholarship.org:ark:/13030/qt1s98z674 |
---|---|
record_format | openpolar |
institution | Open Polar |
collection | University of California: eScholarship |
op_collection_id | ftcdlib |
language | English |
topic | Bioinformatics |
description | While genomics approaches are important in studying host phenotype alterations in response to environmental changes or disease, proteomics approaches offer a complementary perspective by providing a direct readout of expressed functional pathways. Proteogenomic strategies that use RNA-sequencing data to construct splice graph databases have been applied in a variety of settings to identify novel splice junctions and mutated peptides. The work in this dissertation begins with the integration of splice databases into a proteogenomic pipeline for the validation of the recently released annotation of the Atlantic salmon genome, and the validation of primary hepatocytes as in vitro models for salmon toxicity studies. Searching in-house-generated LC-MS/MS datasets against splice databases constructed from publicly available and in-house-generated salmon transcriptomics data, our proteogenomic pipeline identified 183 events in support of 71 transcript predictions. These included novel genes, corrections to current annotations, and support for Ensembl transcripts. In addition to host-expressed proteins, microbially expressed proteins can also alter host phenotype. In the absence of prior taxonomic information, tandem mass spectra must be searched against large pan-microbial databases, which imposes a heavy computational workload and reduces sensitivity. Using both software and algorithmic methods, we developed ProteoStorm, an efficient database search framework for large-scale metaproteomics studies that reduced runtime from 22 weeks to 9.7 hours while retaining 96% of peptide identifications when compared to MSGF+. A reanalysis of a urinary tract infection dataset revealed a complex pattern of polymicrobial expression, including previously identified microbes. In the final chapter, we used transcriptomics data from TCGA to identify a set of genes that may be involved in the maintenance of ecDNA amplicons in cancer. Specifically, we applied the Boruta algorithm, which incorporates the Random Forest classifier ... |
author2 | Bafna, Vineet |
format | Software |
author | Lin, Miin Sophia |
title | Decoding Phenotypes via Transcriptomics and Proteomics: Cancer and beyond |
publisher | eScholarship, University of California |
publishDate | 2022 |
url | https://escholarship.org/uc/item/1s98z674 |
genre | Atlantic salmon |
op_relation | qt1s98z674 https://escholarship.org/uc/item/1s98z674 |
op_rights | public |
_version_ | 1810432732963012608 |