Autopolyploidy Genome Duplication Preserves Other Ancient Genome Duplications In Atlantic Salmon (Salmo salar) Supplementary Datasets

Bibliographic Details
Main Authors: Christensen, Kris; Davidson, William
Format: Dataset
Language: unknown
Published: Zenodo 2017
Online Access: https://dx.doi.org/10.5281/zenodo.270028
https://zenodo.org/record/270028
Description
Summary: For each species, alignments were identified between a protein database derived from zebrafish and the sequenced genome of that species. Perl scripts then used these alignments to identify gene models in each species from the protein sequences. The gene models are provided in the .gff3 files; ribosomal proteins have been removed from some of these files. Homeologous regions were then identified with Perl scripts and are also provided as .gff3 files, which have Homeologous_Regions.gff3 in their names. Homeologous genes in these regions were counted (files named XX_XX_Homeologous_Regions.txt) and compared with the count of all genes in the same regions, homeologous or not (files named Gene_Count_Homeolgous_XX_XX_XX.txt), to obtain the gene density. Homeologous gene sequences were compared with each other using the program SNAP to obtain the Ps values between them (files ending in _Homeologous_region_analysis_version_1.2.txt). The analyses of these files are summarized in "Pn_Ps_Values_Vertebrate_Homeologous_Regions.ods". Synteny between species is recorded in the files with the .seg extension, which can be opened in IGV. A comparison of gene density and Ps value for each homeologous region is given in "Gene_Density_Compared_to_Ps_Values.ods". Also included are an extended readme file and Perl scripts (.pl extension) in a compressed file (Final_Scripts.tar.gz).
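
The gene-density comparison described in the summary can be illustrated with a minimal sketch. It is written in Python for illustration and is not one of the dataset's Perl scripts. It assumes only the standard nine-column, tab-separated GFF3 layout, and it assumes that "density" means the number of homeologous genes divided by the number of all genes in a region; the record does not state the exact formula. The region coordinates, the toy GFF3 lines, the homeologous count, and all function names are hypothetical.

```python
from collections import namedtuple

# Hypothetical container for one homeologous region (1-based, inclusive).
Region = namedtuple("Region", "seqid start end")


def parse_gff3_genes(lines, feature_type="gene"):
    """Yield (seqid, start, end) for each GFF3 feature of the given type."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue  # skip malformed lines and other feature types
        yield cols[0], int(cols[3]), int(cols[4])


def count_genes_in_region(genes, region):
    """Count genes that lie entirely inside the region."""
    return sum(
        1
        for seqid, start, end in genes
        if seqid == region.seqid and start >= region.start and end <= region.end
    )


def homeolog_density(homeologous_count, total_count):
    """Assumed definition: homeologous genes / all genes in the region."""
    return homeologous_count / total_count if total_count else 0.0


# Toy GFF3 content (made up for this example, not taken from the dataset).
GFF3_TEXT = """##gff-version 3
chr1\tdemo\tgene\t100\t500\t.\t+\t.\tID=g1
chr1\tdemo\tmRNA\t100\t500\t.\t+\t.\tID=g1.t1;Parent=g1
chr1\tdemo\tgene\t800\t1200\t.\t-\t.\tID=g2
chr1\tdemo\tgene\t5000\t5600\t.\t+\t.\tID=g3
chr2\tdemo\tgene\t100\t400\t.\t+\t.\tID=g4
"""

genes = list(parse_gff3_genes(GFF3_TEXT.splitlines()))
region = Region("chr1", 1, 2000)             # hypothetical homeologous region
total = count_genes_in_region(genes, region)
density = homeolog_density(1, total)         # 1 = hypothetical homeologous gene count
print(total, density)
```

With the real files, the homeologous count would come from the dataset's own count files rather than a hard-coded value; their internal layout is not described in this record, so the sketch does not attempt to parse them.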