Supporting Data for: The genome of the pygmy right whale illuminates the evolution of rorquals ...

Bibliographic Details
Main Authors: Wolf, Magnus, Zapf, Konstantin, Gupta, Deepak Kumar, Hiller, Michael, Árnason, Ulfur, Janke, Axel
Format: Software
Language: unknown
Published: Zenodo 2023
Subjects:
Online Access: https://dx.doi.org/10.5281/zenodo.7740015
https://zenodo.org/record/7740015
Description
Summary: Background: Baleen whales are a clade of gigantic and highly specialized marine mammals. Their genomes have been used to investigate their complex evolutionary history and to decipher the molecular mechanisms that allowed them to reach these dimensions. However, many unanswered questions remain, especially about the early radiation of rorquals and how cancer resistance interplays with their huge number of cells. The pygmy right whale is the smallest and most elusive of the baleen whales. It reaches only a fraction of the body length of its relatives, and it is the only living member of an otherwise extinct family. This placement makes the pygmy right whale genome an interesting target for updating the complex phylogenetic past of baleen whales, because it splits up an otherwise long branch that leads to the radiation of rorquals. Apart from that, genomic data of this species might help to investigate cancer resistance in large whales, since these mechanisms are not as important for the pygmy right ... :

General Usage: Many files containing sequence data are compressed with gzip; use "gunzip" to decompress them. Directories containing many sub-files are bundled into tarballs; use "tar -xzvf" to unpack the directory first.

Usage Annotation Data: The assembly as well as the CDS and amino acid sequences are in standard FASTA format and can be viewed in any text editor. The gene IDs in all these files are named after the best hit in one of the reference annotations used for homology-based annotation.

Usage Phylogenomics Data: All alignments, including the WGA, WGA fragments, and SCOSs, are in FASTA alignment format and can likewise be opened in any text editor. To better assess their quality, however, we recommend alignment-viewing software such as AliView (http://genocat.tools/tools/aliview.html). SCOS raw sequences are in regular FASTA format and can be opened with any text editor.
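For readers who prefer scripting over a text editor, the unpacking and viewing steps described above could be sketched in Python roughly as follows. This is a minimal sketch using only the standard library, not a tool shipped with the dataset; the function names (open_text, read_fasta, extract_tarball) are hypothetical, and the only assumption about file names is that gzip-compressed files end in ".gz".

```python
import gzip
import tarfile
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading.

    Assumption: compressed files are recognizable by a ".gz" suffix.
    """
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "r")


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file.

    The leading ">" is stripped from each header, and sequence lines
    belonging to one record are joined into a single string.
    """
    header, chunks = None, []
    with open_text(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_tarball(archive, dest="."):
    """Unpack a gzip-compressed tarball, similar to `tar -xzvf`."""
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
```

Calling read_fasta on a compressed file avoids a separate "gunzip" step, which may be convenient for the larger sequence files; for visual inspection of alignments, a dedicated viewer remains the better choice.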
Within WGA sequences, each header is a short, 6-character species identifier derived from the species' scientific name. ...
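As a quick sanity check before opening an alignment in a viewer, the header convention above could be verified with a sketch like the following. It assumes only what the description states (WGA headers are 6 characters long) plus the general property that all sequences in an alignment have the same length. The function name check_alignment and the identifiers in the example are hypothetical placeholders, not identifiers taken from the dataset.

```python
def check_alignment(records, id_length=6):
    """Return a list of problems found in (header, sequence) records.

    Checks that every header has the expected identifier length and
    that all aligned sequences share the same length.
    """
    problems = []
    lengths = set()
    for header, seq in records:
        if len(header) != id_length:
            problems.append(
                f"header {header!r} is not {id_length} characters long"
            )
        lengths.add(len(seq))
    if len(lengths) > 1:
        problems.append(f"sequences differ in length: {sorted(lengths)}")
    return problems


# Toy records with placeholder identifiers (not from the dataset).
toy = [("spcAaa", "ATG-CC"), ("spcBbb", "ATGACC")]
print(check_alignment(toy))  # prints []
```

An empty list means no problems were detected; any header of a different length or any ragged sequence is reported as a readable message.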