Development of CAbase and an Exon Analysis Pipeline for Visual Assessment of Predicted Genes for the Carbonic Anhydrases

Bibliographic Details
Main Author: Isokangas, Lydia
Other Authors: BioMediTech - BioMediTech, University of Tampere
Format: Master Thesis
Language: English
Published: 2016
Subjects:
ca
Online Access: https://trepo.tuni.fi/handle/10024/99277
Description
Summary: Background and Aims. Humans can quickly recognize and evaluate visual patterns; this thesis aims to apply that ability to the analysis of aspects of the conservation of carbonic anhydrase proteins. This was facilitated through the creation of two pipelines:
* One to create a publicly available specialized database, named CAbase, to serve the CA research community, and
* One to create a visual display of the aligned exons of the cDNA transcripts contained in CAbase, with indicators showing the positions of the start and stop codons and the locations of the predicted signal and mitochondrial targeting peptides. This pipeline was named Exon_Analysis.

Carbonic anhydrases (CAs) are ubiquitous proteins that catalyse the reversible conversion of carbon dioxide into carbonic acid. Through gene duplication events, the CAs exist in at least 16, and potentially up to 17, different isoforms.

Methods. The pipelines were created using freely available tools, including Python, MySQL and bioinformatic tools such as Clustal Omega, PRANK, BLAST and PAL2NAL. The data for CAbase were extracted from Ensembl, NCBI, UniProt, RCSB PDB, UniGene and FlyBase. Predictions calculated with SignalP and TargetP were also included. CAbase is hosted on the Amazon Web Server and can be accessed from any computer that has Internet access and MySQL installed. Exon_Analysis draws a scaled exon MSA schematic based on a PRANK MSA of the cDNA transcripts for a CA isoform. The exons and the other indicators (the start and stop codons and the signal and targeting peptides) are drawn in different colours at their scaled locations. This makes it possible to see, for each CA isoform, how well the exons are conserved within the coding regions and how the start codons, stop codons and peptides align.

Results. CAbase is now publicly available for anyone to use. However, it is still somewhat user-unfriendly because users must be familiar with SQL. CAbase facilitated the use of Exon_Analysis. This ...
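
To illustrate the kind of coordinate mapping a scaled exon schematic over an MSA depends on, the following is a minimal Python sketch. It is not the thesis's Exon_Analysis code. It assumes an aligned transcript is available as a string with "-" as the gap character and that exon lengths are known in ungapped nucleotides; the function names `ungapped_to_column` and `project_exons` are hypothetical.

```python
def ungapped_to_column(aligned_seq, gap="-"):
    """Map each ungapped residue index to its 0-based alignment column."""
    return [col for col, ch in enumerate(aligned_seq) if ch != gap]


def project_exons(aligned_seq, exon_lengths, gap="-"):
    """Project consecutive exons onto alignment columns.

    exon_lengths holds the ungapped length of each exon, in transcript order
    (an assumed input format). Returns one (first_column, last_column) pair
    per exon, both inclusive.
    """
    columns = ungapped_to_column(aligned_seq, gap)
    if any(length <= 0 for length in exon_lengths):
        raise ValueError("exon lengths must be positive")
    if sum(exon_lengths) != len(columns):
        raise ValueError("exon lengths do not add up to the ungapped sequence length")

    spans = []
    start = 0
    for length in exon_lengths:
        end = start + length
        # Gaps inside an exon widen its span in alignment coordinates.
        spans.append((columns[start], columns[end - 1]))
        start = end
    return spans


if __name__ == "__main__":
    # Toy aligned transcript with two exons of 3 and 5 nucleotides.
    print(project_exons("ATG--CCT-GA", [3, 5]))  # [(0, 2), (5, 10)]
```

Once every transcript in the alignment is expressed in column coordinates like this, the exons, codons and peptide markers of different sequences can be drawn on a shared horizontal scale.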