Capture of hemoglobin clusters

Combining high-throughput sequencing with targeted sequence capture has become an attractive tool to study specific genomic regions of interest. Most studies have so far focused on the exome using short-read technology. These approaches are not designed to capture intergenic regions needed to reconstru...


Bibliographic Details
Main Authors: Tørresen, Ole K., Hoff, Siv Nam Khang, Baalsrud, Helle Tessand
Format: Dataset
Language: unknown
Published: figshare 2018
Subjects:
Online Access: https://dx.doi.org/10.6084/m9.figshare.5875842
https://figshare.com/articles/Capture_of_hemoglobin_clusters/5875842
id ftdatacite:10.6084/m9.figshare.5875842
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic 60408 Genomics
FOS Biological sciences
spellingShingle 60408 Genomics
FOS Biological sciences
Tørresen, Ole K.
Hoff, Siv Nam Khang
Baalsrud, Helle Tessand
Capture of hemoglobin clusters
topic_facet 60408 Genomics
FOS Biological sciences
description Combining high-throughput sequencing with targeted sequence capture has become an attractive tool to study specific genomic regions of interest. Most studies have so far focused on the exome using short-read technology. These approaches are not designed to capture intergenic regions needed to reconstruct genomic organization, including regulatory regions and gene synteny. Here, we demonstrate the power of combining targeted sequence capture with long-read sequencing technology for comparative genomic analyses of the hemoglobin (Hb) gene clusters across eight species separated by up to 70 million years. Guided by the reference genome assembly of the Atlantic cod (Gadus morhua) together with genome information from draft assemblies of selected codfishes, we designed probes covering the two Hb gene clusters. Use of custom-made barcodes combined with PacBio RSII sequencing led to highly continuous assemblies of the LA (~100 kb) and MN (~200 kb) clusters, which include syntenic regions of coding and intergenic sequences. Our results revealed an overall conserved genetic organization and synteny of the Hb genes within this lineage, yet with several lineage-specific gene duplications. Moreover, for some of the species examined, we identified amino acid substitutions at two sites in the Hbb1 gene as well as length polymorphisms in its regulatory region, which have previously been linked to temperature adaptation in Atlantic cod populations. This study highlights the use of targeted long-read capture as a versatile approach for comparative genomic studies by generation of a cross-species genomic resource elucidating the evolutionary history of the Hb gene family across the highly divergent group of codfishes.
160111_Gmorhua_capture_probes.zip Sequences of the different probes.
Hb_target_region_gadmor2.fasta.gz The target regions from the gadMor2 genome assembly.
brosme_brosme.rawreads.fastq.gz The raw reads from capture for Brosme brosme.
brosme_brosme_hb_assembly.fasta.gz The assembled hemoglobin regions of Brosme brosme.
gadiculus_argenteus.rawreads.fastq.gz The raw reads from capture for Gadiculus argenteus.
gadiculus_argenteus_hb_assembly.fasta.gz The assembled hemoglobin regions of Gadiculus argenteus.
gadus_morhua.rawreads.fastq.gz The raw reads from capture for Gadus morhua.
gadus_morhua_hb_assembly.fasta.gz The assembled hemoglobin regions of Gadus morhua.
lota_lota.rawreads.fastq.gz The raw reads from capture for Lota lota.
lota_lota_hb_assembly.fasta.gz The assembled hemoglobin regions of Lota lota.
macrourus_berglax.rawreads.fastq.gz The raw reads from capture for Macrourus berglax.
macrourus_berglax_hb_assembly.fasta.gz The assembled hemoglobin regions of Macrourus berglax.
melanogrammus_aeglefinus.rawreads.fastq.gz The raw reads from capture for Melanogrammus aeglefinus.
melanogrammus_aeglefinus_hb_assembly.fasta.gz The assembled hemoglobin regions of Melanogrammus aeglefinus.
merluccius_merluccius.rawreads.fastq.gz The raw reads from capture for Merluccius merluccius.
merluccius_merluccius_hb_assembly.fasta.gz The assembled hemoglobin regions of Merluccius merluccius.
muraenolepsis_marmoratus.rawreads.fastq.gz The raw reads from capture for Muraenolepsis marmoratus.
muraenolepsis_marmoratus_hb_assembly.fasta.gz The assembled hemoglobin regions of Muraenolepsis marmoratus.
Boreogadus_saida_fish_2LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Boreogadus_saida_fish_2MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Gadiculus_argentus_fish_8LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Gadiculus_argentus_fish_8MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Lota_lota_fish_11LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Lota_lota_fish_11MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Macrourus_berglax_fish_17LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Macrourus_berglax_fish_17MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Melanogrammus_aeglefinus_fish_5LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Melanogrammus_aeglefinus_fish_5MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Merluccius_merluccius_fish_13LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Merluccius_merluccius_fish_13MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Muraenolepis_marmoratus_fish_20LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Muraenolepis_marmoratus_fish_20MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Theragra_chalcogramma_fish_7LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Theragra_chalcogramma_fish_7MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Trachyrincus_scabrus_fish_36LA_scf.fasta.gz From low-coverage genome assembly, used for probe design.
Trachyrincus_scabrus_fish_36MN_scf.fasta.gz From low-coverage genome assembly, used for probe design.
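The FASTA archives listed above are gzip-compressed. As a minimal sketch, assuming one of them (for example gadus_morhua_hb_assembly.fasta.gz) has been downloaded to the working directory, the Python function below tallies the length of every sequence in such a file. That is one way to compare an assembly against the approximate LA (~100 kb) and MN (~200 kb) cluster sizes given in the description. The name fasta_lengths is a hypothetical helper introduced here for illustration and is not part of the dataset.

```python
import gzip
from typing import Dict, Optional


def fasta_lengths(path: str) -> Dict[str, int]:
    """Return {sequence name: length in bases} for a gzip-compressed FASTA file.

    Hypothetical helper for illustration; it is not shipped with the dataset.
    """
    lengths: Dict[str, int] = {}
    name: Optional[str] = None
    # "rt" opens the gzip stream in text mode, so each line arrives as str.
    with gzip.open(path, "rt") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the name.
                fields = line[1:].split()
                name = fields[0] if fields else ""
                lengths[name] = 0
            elif name is not None:
                # Sequence lines may be wrapped, so lengths accumulate per record.
                lengths[name] += len(line)
    return lengths


# Example call (assumes the file is present locally):
# for seq_name, n_bases in fasta_lengths("gadus_morhua_hb_assembly.fasta.gz").items():
#     print(seq_name, n_bases)
```

The function reads the file line by line, so it does not need to hold a whole assembly in memory. The same approach does not apply to the rawreads.fastq.gz files, which use the four-line FASTQ record layout instead.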
format Dataset
author Tørresen, Ole K.
Hoff, Siv Nam Khang
Baalsrud, Helle Tessand
author_facet Tørresen, Ole K.
Hoff, Siv Nam Khang
Baalsrud, Helle Tessand
author_sort Tørresen, Ole K.
title Capture of hemoglobin clusters
title_short Capture of hemoglobin clusters
title_full Capture of hemoglobin clusters
title_fullStr Capture of hemoglobin clusters
title_full_unstemmed Capture of hemoglobin clusters
title_sort capture of hemoglobin clusters
publisher figshare
publishDate 2018
url https://dx.doi.org/10.6084/m9.figshare.5875842
https://figshare.com/articles/Capture_of_hemoglobin_clusters/5875842
genre atlantic cod
Gadus morhua
lota
genre_facet atlantic cod
Gadus morhua
lota
op_rights Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
cc-by-4.0
op_rightsnorm CC-BY
op_doi https://doi.org/10.6084/m9.figshare.5875842
_version_ 1766358282158473216