De novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species

Abstract Background Salmonid fishes exhibit high levels of phenotypic and ecological variation and are thus ideal model systems for studying evolutionary processes of adaptive divergence and speciation. Furthermore, salmonids are of major interest in fisheries, aquaculture, and conservation research...

Bibliographic Details
Main Authors: Carruthers, Madeleine, Yurchenko, Andrey, Augley, Julian, Adams, Colin, Herzyk, Pawel, Elmer, Kathryn
Format: Article in Journal/Newspaper
Language: unknown
Published: Figshare 2018
Subjects:
Online Access: https://dx.doi.org/10.6084/m9.figshare.c.3971733
https://figshare.com/collections/De_novo_transcriptome_assembly_annotation_and_comparison_of_four_ecological_and_evolutionary_model_salmonid_fish_species/3971733
id ftdatacite:10.6084/m9.figshare.c.3971733
record_format openpolar
spelling ftdatacite:10.6084/m9.figshare.c.3971733 2023-05-15T14:30:09+02:00 De novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species Carruthers, Madeleine Yurchenko, Andrey Augley, Julian Adams, Colin Herzyk, Pawel Elmer, Kathryn 2018 https://dx.doi.org/10.6084/m9.figshare.c.3971733 https://figshare.com/collections/De_novo_transcriptome_assembly_annotation_and_comparison_of_four_ecological_and_evolutionary_model_salmonid_fish_species/3971733 unknown Figshare https://dx.doi.org/10.1186/s12864-017-4379-x CC BY 4.0 https://creativecommons.org/licenses/by/4.0 CC-BY Genetics FOS Biological sciences Evolutionary Biology 59999 Environmental Sciences not elsewhere classified FOS Earth and related environmental sciences Ecology 69999 Biological Sciences not elsewhere classified Marine Biology Inorganic Chemistry FOS Chemical sciences Collection article 2018 ftdatacite https://doi.org/10.6084/m9.figshare.c.3971733 https://doi.org/10.1186/s12864-017-4379-x 2021-11-05T12:55:41Z Abstract Background Salmonid fishes exhibit high levels of phenotypic and ecological variation and are thus ideal model systems for studying evolutionary processes of adaptive divergence and speciation. Furthermore, salmonids are of major interest in fisheries, aquaculture, and conservation research. Improving understanding of the genetic mechanisms underlying traits in these species would significantly progress research in these fields. Here we generate high quality de novo transcriptomes for four salmonid species: Atlantic salmon (Salmo salar), brown trout (Salmo trutta), Arctic charr (Salvelinus alpinus), and European whitefish (Coregonus lavaretus). All species except Atlantic salmon have no reference genome publicly available and few if any genomic studies to date. Results We used paired-end RNA-seq on Illumina to generate high coverage sequencing of multiple individuals, yielding between 180 and 210 M reads per species. After initial assembly, strict filtering was used to remove duplicated, redundant, and low confidence transcripts. The final assemblies consisted of 36,505 protein-coding transcripts for Atlantic salmon, 35,736 for brown trout, 33,126 for Arctic charr, and 33,697 for European whitefish and are made publicly available. Assembly completeness was assessed using three approaches, all of which supported high quality of the assemblies: 1) ~78% of Actinopterygian single-copy orthologs were successfully captured in our assemblies, 2) orthogroup inference identified high overlap in the protein sequences present across all four species (40% shared across all four and 84% shared by at least two), and 3) comparison with the published Atlantic salmon genome suggests that our assemblies represent well covered (~98%) protein-coding transcriptomes. Thorough comparison of the generated assemblies found that 84-90% of transcripts in each assembly were orthologous with at least one of the other three species. We also identified 34-37% of transcripts in each assembly as paralogs. We further compare completeness and annotation statistics of our new assemblies to available related species. Conclusion New, high-confidence protein-coding transcriptomes were generated for four ecologically and economically important species of salmonids. This offers a high quality pipeline for such complex genomes, represents a valuable contribution to the existing genomic resources for these species and provides robust tools for future investigation of gene expression and sequence evolution in these and other salmonid species. Article in Journal/Newspaper Arctic charr Arctic Atlantic salmon Salmo salar Salvelinus alpinus DataCite Metadata Store (German National Library of Science and Technology) Arctic
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Genetics
FOS Biological sciences
Evolutionary Biology
59999 Environmental Sciences not elsewhere classified
FOS Earth and related environmental sciences
Ecology
69999 Biological Sciences not elsewhere classified
Marine Biology
Inorganic Chemistry
FOS Chemical sciences
spellingShingle Genetics
FOS Biological sciences
Evolutionary Biology
59999 Environmental Sciences not elsewhere classified
FOS Earth and related environmental sciences
Ecology
69999 Biological Sciences not elsewhere classified
Marine Biology
Inorganic Chemistry
FOS Chemical sciences
Carruthers, Madeleine
Yurchenko, Andrey
Augley, Julian
Adams, Colin
Herzyk, Pawel
Elmer, Kathryn
De novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species
topic_facet Genetics
FOS Biological sciences
Evolutionary Biology
59999 Environmental Sciences not elsewhere classified
FOS Earth and related environmental sciences
Ecology
69999 Biological Sciences not elsewhere classified
Marine Biology
Inorganic Chemistry
FOS Chemical sciences
description Abstract Background Salmonid fishes exhibit high levels of phenotypic and ecological variation and are thus ideal model systems for studying evolutionary processes of adaptive divergence and speciation. Furthermore, salmonids are of major interest in fisheries, aquaculture, and conservation research. Improving understanding of the genetic mechanisms underlying traits in these species would significantly progress research in these fields. Here we generate high quality de novo transcriptomes for four salmonid species: Atlantic salmon (Salmo salar), brown trout (Salmo trutta), Arctic charr (Salvelinus alpinus), and European whitefish (Coregonus lavaretus). All species except Atlantic salmon have no reference genome publicly available and few if any genomic studies to date. Results We used paired-end RNA-seq on Illumina to generate high coverage sequencing of multiple individuals, yielding between 180 and 210 M reads per species. After initial assembly, strict filtering was used to remove duplicated, redundant, and low confidence transcripts. The final assemblies consisted of 36,505 protein-coding transcripts for Atlantic salmon, 35,736 for brown trout, 33,126 for Arctic charr, and 33,697 for European whitefish and are made publicly available. Assembly completeness was assessed using three approaches, all of which supported high quality of the assemblies: 1) ~78% of Actinopterygian single-copy orthologs were successfully captured in our assemblies, 2) orthogroup inference identified high overlap in the protein sequences present across all four species (40% shared across all four and 84% shared by at least two), and 3) comparison with the published Atlantic salmon genome suggests that our assemblies represent well covered (~98%) protein-coding transcriptomes. Thorough comparison of the generated assemblies found that 84-90% of transcripts in each assembly were orthologous with at least one of the other three species. We also identified 34-37% of transcripts in each assembly as paralogs. We further compare completeness and annotation statistics of our new assemblies to available related species. Conclusion New, high-confidence protein-coding transcriptomes were generated for four ecologically and economically important species of salmonids. This offers a high quality pipeline for such complex genomes, represents a valuable contribution to the existing genomic resources for these species and provides robust tools for future investigation of gene expression and sequence evolution in these and other salmonid species.
format Article in Journal/Newspaper
author Carruthers, Madeleine
Yurchenko, Andrey
Augley, Julian
Adams, Colin
Herzyk, Pawel
Elmer, Kathryn
author_facet Carruthers, Madeleine
Yurchenko, Andrey
Augley, Julian
Adams, Colin
Herzyk, Pawel
Elmer, Kathryn
author_sort Carruthers, Madeleine
title De novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species
title_short De novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species
title_full De novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species
title_fullStr De novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species
title_full_unstemmed De novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species
title_sort de novo transcriptome assembly, annotation and comparison of four ecological and evolutionary model salmonid fish species
publisher Figshare
publishDate 2018
url https://dx.doi.org/10.6084/m9.figshare.c.3971733
https://figshare.com/collections/De_novo_transcriptome_assembly_annotation_and_comparison_of_four_ecological_and_evolutionary_model_salmonid_fish_species/3971733
geographic Arctic
geographic_facet Arctic
genre Arctic charr
Arctic
Atlantic salmon
Salmo salar
Salvelinus alpinus
genre_facet Arctic charr
Arctic
Atlantic salmon
Salmo salar
Salvelinus alpinus
op_relation https://dx.doi.org/10.1186/s12864-017-4379-x
op_rights CC BY 4.0
https://creativecommons.org/licenses/by/4.0
op_rightsnorm CC-BY
op_doi https://doi.org/10.6084/m9.figshare.c.3971733
https://doi.org/10.1186/s12864-017-4379-x
_version_ 1766304061530832896