Data_Sheet_1_Identification of High-Confidence Structural Variants in Domesticated Rainbow Trout Using Whole-Genome Sequencing.FASTA
Genomic structural variants (SVs) are a major source of genetic and phenotypic variation but have not been investigated systematically in rainbow trout (Oncorhynchus mykiss), an important aquaculture species of cold freshwater. The objectives of this study were 1) to identify and validate high-confi...
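Main Authors: | Sixin Liu, Guangtu Gao, Ryan M. Layer, Gary H. Thorgaard, Gregory D. Wiens, Timothy D. Leeds, Kyle E. Martin, Yniv Palti |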
---|---|
Format: | Dataset |
Language: | unknown |
Published: | 2021 |
Subjects: | |
Online Access: | https://doi.org/10.3389/fgene.2021.639355.s001 |
id |
ftsmithonian:oai:figshare.com:article/14112029 |
---|---|
record_format |
openpolar |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
ftsmithonian |
language |
unknown |
topic |
Genetics Genetic Engineering Biomarkers Developmental Genetics (incl. Sex Determination) Epigenetics (incl. Genome Methylation and Epigenomics) Gene Expression (incl. Microarray and other genome-wide approaches) Genome Structure and Regulation Genomics Genetically Modified Animals Livestock Cloning Gene and Molecular Therapy rainbow trout structural variants copy number variants transposable elements repetitive sequence whole-genome sequencing |
topic_facet |
Genetics Genetic Engineering Biomarkers Developmental Genetics (incl. Sex Determination) Epigenetics (incl. Genome Methylation and Epigenomics) Gene Expression (incl. Microarray and other genome-wide approaches) Genome Structure and Regulation Genomics Genetically Modified Animals Livestock Cloning Gene and Molecular Therapy rainbow trout structural variants copy number variants transposable elements repetitive sequence whole-genome sequencing |
description |
Genomic structural variants (SVs) are a major source of genetic and phenotypic variation but have not been investigated systematically in rainbow trout (Oncorhynchus mykiss), an important cold-freshwater aquaculture species. The objectives of this study were 1) to identify and validate high-confidence SVs in rainbow trout using whole-genome re-sequencing; and 2) to examine the contribution of transposable elements (TEs) to SVs in rainbow trout. A total of 96 rainbow trout, including 11 homozygous lines and 85 outbred fish from three breeding populations, were whole-genome sequenced with an average genome coverage of 17.2×. Putative SVs were identified using the program Smoove, which integrates LUMPY and other associated tools into one package. After rigorous filtering, 13,863 high-confidence SVs were identified. Pacific Biosciences long reads of Arlee, one of the homozygous lines used for SV detection, validated 98% (3,948 of 4,030) of the high-confidence SVs identified in the Arlee homozygous line. Based on principal component analysis, the 85 outbred fish clustered into three groups consistent with their populations of origin, further indicating that the high-confidence SVs identified in this study are robust. The repetitive DNA content of the high-confidence SV sequences was 86.5%, which is much higher than the 57.1% repetitive DNA content of the reference genome and also higher than the repetitive DNA content of Atlantic salmon SVs reported previously. Because TEs make up the majority of repetitive sequences, TEs contribute substantially to SVs in rainbow trout. Hundreds of the high-confidence SVs were annotated as exon-loss or gene-fusion variants and may have phenotypic effects. The high-confidence SVs reported in this study provide a foundation for further rainbow trout SV studies. |
format |
Dataset |
author |
Sixin Liu (166382) Guangtu Gao (524136) Ryan M. Layer (8522268) Gary H. Thorgaard (10192691) Gregory D. Wiens (6680213) Timothy D. Leeds (10192694) Kyle E. Martin (10192697) Yniv Palti (166374) |
author_facet |
Sixin Liu (166382) Guangtu Gao (524136) Ryan M. Layer (8522268) Gary H. Thorgaard (10192691) Gregory D. Wiens (6680213) Timothy D. Leeds (10192694) Kyle E. Martin (10192697) Yniv Palti (166374) |
author_sort |
Sixin Liu (166382) |
title |
Data_Sheet_1_Identification of High-Confidence Structural Variants in Domesticated Rainbow Trout Using Whole-Genome Sequencing.FASTA |
title_short |
Data_Sheet_1_Identification of High-Confidence Structural Variants in Domesticated Rainbow Trout Using Whole-Genome Sequencing.FASTA |
title_full |
Data_Sheet_1_Identification of High-Confidence Structural Variants in Domesticated Rainbow Trout Using Whole-Genome Sequencing.FASTA |
title_fullStr |
Data_Sheet_1_Identification of High-Confidence Structural Variants in Domesticated Rainbow Trout Using Whole-Genome Sequencing.FASTA |
title_full_unstemmed |
Data_Sheet_1_Identification of High-Confidence Structural Variants in Domesticated Rainbow Trout Using Whole-Genome Sequencing.FASTA |
title_sort |
data_sheet_1_identification of high-confidence structural variants in domesticated rainbow trout using whole-genome sequencing.fasta |
publishDate |
2021 |
url |
https://doi.org/10.3389/fgene.2021.639355.s001 |
geographic |
Pacific |
geographic_facet |
Pacific |
genre |
Atlantic salmon |
genre_facet |
Atlantic salmon |
op_relation |
https://figshare.com/articles/dataset/Data_Sheet_1_Identification_of_High-Confidence_Structural_Variants_in_Domesticated_Rainbow_Trout_Using_Whole-Genome_Sequencing_FASTA/14112029 doi:10.3389/fgene.2021.639355.s001 |
op_rights |
CC BY 4.0 |
op_rightsnorm |
CC-BY |
op_doi |
https://doi.org/10.3389/fgene.2021.639355.s001 |
_version_ |
1766363489054490624 |