Data_Sheet_1_Identification of High-Confidence Structural Variants in Domesticated Rainbow Trout Using Whole-Genome Sequencing.FASTA

Genomic structural variants (SVs) are a major source of genetic and phenotypic variation but have not been investigated systematically in rainbow trout (Oncorhynchus mykiss), an important aquaculture species of cold freshwater. The objectives of this study were 1) to identify and validate high-confi...

Bibliographic Details
Main Authors: Sixin Liu (166382), Guangtu Gao (524136), Ryan M. Layer (8522268), Gary H. Thorgaard (10192691), Gregory D. Wiens (6680213), Timothy D. Leeds (10192694), Kyle E. Martin (10192697), Yniv Palti (166374)
Format: Dataset
Language: unknown
Published: 2021
Subjects:
Online Access: https://doi.org/10.3389/fgene.2021.639355.s001
id ftsmithonian:oai:figshare.com:article/14112029
record_format openpolar
institution Open Polar
collection Unknown
op_collection_id ftsmithonian
language unknown
topic Genetics
Genetic Engineering
Biomarkers
Developmental Genetics (incl. Sex Determination)
Epigenetics (incl. Genome Methylation and Epigenomics)
Gene Expression (incl. Microarray and other genome-wide approaches)
Genome Structure and Regulation
Genomics
Genetically Modified Animals
Livestock Cloning
Gene and Molecular Therapy
rainbow trout
structural variants
copy number variants
transposable elements
repetitive sequence
whole-genome sequencing
topic_facet Genetics
Genetic Engineering
Biomarkers
Developmental Genetics (incl. Sex Determination)
Epigenetics (incl. Genome Methylation and Epigenomics)
Gene Expression (incl. Microarray and other genome-wide approaches)
Genome Structure and Regulation
Genomics
Genetically Modified Animals
Livestock Cloning
Gene and Molecular Therapy
rainbow trout
structural variants
copy number variants
transposable elements
repetitive sequence
whole-genome sequencing
description Genomic structural variants (SVs) are a major source of genetic and phenotypic variation but have not been investigated systematically in rainbow trout (Oncorhynchus mykiss), an important aquaculture species of cold freshwater. The objectives of this study were 1) to identify and validate high-confidence SVs in rainbow trout using whole-genome re-sequencing; and 2) to examine the contribution of transposable elements (TEs) to SVs in rainbow trout. A total of 96 rainbow trout, including 11 homozygous lines and 85 outbred fish from three breeding populations, were whole-genome sequenced with an average genome coverage of 17.2×. Putative SVs were identified using the program Smoove, which integrates LUMPY and other associated tools into one package. After rigorous filtering, 13,863 high-confidence SVs were identified. Pacific Biosciences long reads of Arlee, one of the homozygous lines used for SV detection, validated 98% (3,948 of 4,030) of the high-confidence SVs identified in the Arlee homozygous line. Based on principal component analysis, the 85 outbred fish clustered into three groups consistent with their populations of origin, further indicating that the high-confidence SVs identified in this study are robust. The repetitive DNA content of the high-confidence SV sequences was 86.5%, which is much higher than the 57.1% repetitive DNA content of the reference genome, and is also higher than the repetitive DNA content of Atlantic salmon SVs reported previously. TEs thus contribute substantially to SVs in rainbow trout, as TEs make up the majority of repetitive sequences. Hundreds of the high-confidence SVs were annotated as exon-loss or gene-fusion variants, and may have phenotypic effects. The high-confidence SVs reported in this study provide a foundation for further rainbow trout SV studies.
format Dataset
author Sixin Liu (166382)
Guangtu Gao (524136)
Ryan M. Layer (8522268)
Gary H. Thorgaard (10192691)
Gregory D. Wiens (6680213)
Timothy D. Leeds (10192694)
Kyle E. Martin (10192697)
Yniv Palti (166374)
author_facet Sixin Liu (166382)
Guangtu Gao (524136)
Ryan M. Layer (8522268)
Gary H. Thorgaard (10192691)
Gregory D. Wiens (6680213)
Timothy D. Leeds (10192694)
Kyle E. Martin (10192697)
Yniv Palti (166374)
author_sort Sixin Liu (166382)
title Data_Sheet_1_Identification of High-Confidence Structural Variants in Domesticated Rainbow Trout Using Whole-Genome Sequencing.FASTA
title_sort data_sheet_1_identification of high-confidence structural variants in domesticated rainbow trout using whole-genome sequencing.fasta
publishDate 2021
url https://doi.org/10.3389/fgene.2021.639355.s001
geographic Pacific
geographic_facet Pacific
genre Atlantic salmon
genre_facet Atlantic salmon
op_relation https://figshare.com/articles/dataset/Data_Sheet_1_Identification_of_High-Confidence_Structural_Variants_in_Domesticated_Rainbow_Trout_Using_Whole-Genome_Sequencing_FASTA/14112029
doi:10.3389/fgene.2021.639355.s001
op_rights CC BY 4.0
op_rightsnorm CC-BY
op_doi https://doi.org/10.3389/fgene.2021.639355.s001
_version_ 1766363489054490624