Capture of hemoglobin clusters


Bibliographic Details
Main Authors: Tørresen, Ole K.; Hoff, Siv Nam Khang; Baalsrud, Helle Tessand
Format: Dataset
Language: unknown
Published: figshare 2018
Online Access: https://dx.doi.org/10.6084/m9.figshare.5875842
https://figshare.com/articles/Capture_of_hemoglobin_clusters/5875842
Description
Summary: Combining high-throughput sequencing with targeted sequence capture has become an attractive tool to study specific genomic regions of interest. Most studies have so far focused on the exome using short-read technology. These approaches are not designed to capture intergenic regions needed to reconstruct genomic organization, including regulatory regions and gene synteny. Here, we demonstrate the power of combining targeted sequence capture with long-read sequencing technology for comparative genomic analyses of the hemoglobin (Hb) gene clusters across eight species separated by up to 70 million years. Guided by the reference genome assembly of the Atlantic cod (Gadus morhua) together with genome information from draft assemblies of selected codfishes, we designed probes covering the two Hb gene clusters. Use of custom-made barcodes combined with PacBio RSII sequencing led to highly continuous assemblies of the LA (~100 kb) and MN (~200 kb) clusters, which include syntenic regions of coding and intergenic sequences. Our results revealed an overall conserved genetic organization and synteny of the Hb genes within this lineage, yet with several lineage-specific gene duplications. Moreover, for some of the species examined, we identified amino acid substitutions at two sites in the Hbb1 gene as well as length polymorphisms in its regulatory region, which have previously been linked to temperature adaptation in Atlantic cod populations. This study highlights the use of targeted long-read capture as a versatile approach for comparative genomic studies by generation of a cross-species genomic resource elucidating the evolutionary history of the Hb gene family across the highly divergent group of codfishes.
160111_Gmorhua_capture_probes.zip: Sequences of the different probes.
Hb_target_region_gadmor2.fasta.gz: The target regions from the gadMor2 genome assembly.
brosme_brosme.rawreads.fastq.gz: The raw reads from capture for Brosme brosme.
brosme_brosme_hb_assembly.fasta.gz: The assembled hemoglobin regions of Brosme brosme.
gadiculus_argenteus.rawreads.fastq.gz: The raw reads from capture for Gadiculus argenteus.
gadiculus_argenteus_hb_assembly.fasta.gz: The assembled hemoglobin regions of Gadiculus argenteus.
gadus_morhua.rawreads.fastq.gz: The raw reads from capture for Gadus morhua.
gadus_morhua_hb_assembly.fasta.gz: The assembled hemoglobin regions of Gadus morhua.
lota_lota.rawreads.fastq.gz: The raw reads from capture for Lota lota.
lota_lota_hb_assembly.fasta.gz: The assembled hemoglobin regions of Lota lota.
macrourus_berglax.rawreads.fastq.gz: The raw reads from capture for Macrourus berglax.
macrourus_berglax_hb_assembly.fasta.gz: The assembled hemoglobin regions of Macrourus berglax.
melanogrammus_aeglefinus.rawreads.fastq.gz: The raw reads from capture for Melanogrammus aeglefinus.
melanogrammus_aeglefinus_hb_assembly.fasta.gz: The assembled hemoglobin regions of Melanogrammus aeglefinus.
merluccius_merluccius.rawreads.fastq.gz: The raw reads from capture for Merluccius merluccius.
merluccius_merluccius_hb_assembly.fasta.gz: The assembled hemoglobin regions of Merluccius merluccius.
muraenolepsis_marmoratus.rawreads.fastq.gz: The raw reads from capture for Muraenolepsis marmoratus.
muraenolepsis_marmoratus_hb_assembly.fasta.gz: The assembled hemoglobin regions of Muraenolepsis marmoratus.
Boreogadus_saida_fish_2LA_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Boreogadus_saida_fish_2MN_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Gadiculus_argentus_fish_8LA_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Gadiculus_argentus_fish_8MN_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Lota_lota_fish_11LA_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Lota_lota_fish_11MN_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Macrourus_berglax_fish_17LA_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Macrourus_berglax_fish_17MN_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Melanogrammus_aeglefinus_fish_5LA_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Melanogrammus_aeglefinus_fish_5MN_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Merluccius_merluccius_fish_13LA_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Merluccius_merluccius_fish_13MN_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Muraenolepis_marmoratus_fish_20LA_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Muraenolepis_marmoratus_fish_20MN_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Theragra_chalcogramma_fish_7LA_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Theragra_chalcogramma_fish_7MN_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Trachyrincus_scabrus_fish_36LA_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
Trachyrincus_scabrus_fish_36MN_scf.fasta.gz: From low-coverage genome assembly, used for probe design.
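
The assembly and scaffold files in this dataset are gzip-compressed FASTA. As a minimal sketch of how one might inspect a downloaded file, the following Python lists each sequence ID with its length. The function names `fasta_lengths` and `gz_fasta_lengths` are hypothetical helpers written for this illustration and are not part of the dataset; the sketch assumes standard FASTA layout (a `>` header line followed by sequence lines) and uses only the Python standard library.

```python
import gzip
from typing import Dict, Iterable


def fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_id: length} for FASTA-formatted text lines.

    Hypothetical helper for illustration; assumes standard FASTA layout.
    """
    lengths: Dict[str, int] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token after '>'.
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            lengths[current] = 0
        elif current is not None:
            # Sequence line: add its length to the current record.
            lengths[current] += len(line)
    return lengths


def gz_fasta_lengths(path: str) -> Dict[str, int]:
    """Open a gzip-compressed FASTA file in text mode and measure its records."""
    with gzip.open(path, "rt") as handle:
        return fasta_lengths(handle)


# Example call, assuming the file has been downloaded to the working directory:
# for seq_id, length in gz_fasta_lengths("gadus_morhua_hb_assembly.fasta.gz").items():
#     print(seq_id, length)
```

Comparing the reported lengths against the approximate cluster sizes given in the summary (~100 kb for LA, ~200 kb for MN) is one quick sanity check on a download.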