De novo assembly and comparative genomics of teleosts

Bibliographic Details
Main Author: Tørresen, Ole Kristian
Format: Doctoral or Postdoctoral Thesis
Language: English
Published: 2017
Subjects:
Online Access: http://hdl.handle.net/10852/57728
http://urn.nb.no/URN:NBN:no-60406
id ftoslouniv:oai:www.duo.uio.no:10852/57728
record_format openpolar
institution Open Polar
collection Universitet i Oslo: Digitale utgivelser ved UiO (DUO)
op_collection_id ftoslouniv
language English
description During the last 20 years, genome sequencing and assembly projects have changed from requiring large international collaborations to being tasks that a handful of people can plan and conduct. This has been driven by improvements in sequencing technology and computational methods. More and more sequencing and assembly projects are being conducted, with older assemblies being updated and improved, resulting in a deeper understanding of the biology of a large and steadily growing number of species. The projects described in this thesis focus on genome assemblies created from species of the order Gadiformes, an order containing commercially and ecologically important fishes. Here, these assemblies are investigated in detail and compared to other teleost genome assemblies, with special attention to immune genes and short tandem repeats. We have updated and substantially improved the Atlantic cod (Gadus morhua) genome assembly using different sequencing technologies and computational approaches. A major finding was that the presence of short tandem repeats (STRs) was the main factor behind the fragmentation of the previous assembly. STRs are hypermutating loci that occur at high frequency (loci/Mbp) and high density (bp/Mbp) in the cod genome, surpassing the values found in other published genome assemblies. The STRs likely contribute to substantial genetic variation in natural cod populations. The Atlantic cod lacks genes involved in the major histocompatibility complex (MHC) II pathway, which normally detects bacterial pathogens and initiates a response against them, and thus is a crucial part of the adaptive immune system. To infer when in the ancestry of cod these genes were lost, we sequenced and assembled the genomes of 66 teleost species. We found that the loss is shared by all species in the order Gadiformes, and that there is an expanded repertoire of MHCI genes in the Gadiformes, which is likely connected with the large number of species in this order. Since the 66 new teleost (including gadiform) genome assemblies are fragmented, the properties of STRs and multi-copy immune genes are not easily investigated. To further elucidate their role in Gadiformes, we sequenced and assembled the genome of haddock (Melanogrammus aeglefinus), a relative of cod. Our results show that the high density and frequency of STRs is a feature likely shared by all codfishes (a family inside Gadiformes), and possibly all Gadiformes. Cod and haddock share a similar repertoire of the innate immune Toll-like receptor (TLR) genes, with both losses and expansions. The expansions might be part of a compensatory mechanism for the absence of MHCII. Another class of genes, the NOD-like receptors (NLRs), has been reported in large numbers in species without an adaptive immune system. We find that cod and haddock, as well as most other teleosts, generally have a high number of NLRs, with a likely expansion at the root of this clade. Thus, a high number of NLRs in teleosts does not seem to be connected with the presence or absence of MHCII. This thesis shows what kinds of questions genome assemblies created for different purposes can answer. Ideally, genome assemblies for all kinds of species should be created, upgraded and updated based on the best available technologies, but this is costly. With the right planning and set-up, assemblies based on low-coverage sequencing can be very powerful for topics such as the presence or absence of genes and for phylogeny. Also, even with moderate amounts of long-read PacBio sequencing, it is possible to create highly contiguous genome assemblies that address issues impossible to elucidate with fragmented assemblies, such as the number of multi-copy immune genes.
genre atlantic cod
Gadus morhua
op_relation Work I Ole K. Tørresen, Bastiaan Star, Sissel Jentoft, Kjetill S. Jakobsen, Alexander J. Nederbragt. The new era of genome sequencing using high-throughput sequencing technology: generation of the first version of the Atlantic cod genome. In Genomics in Aquaculture, edited by Simon MacKenzie and Sissel Jentoft. Cambridge, Massachusetts: Academic Press, 2016. The paper is not available in DUO due to publisher restrictions. The published version is available at: https://doi.org/10.1016/B978-0-12-801418-9.00001-9
Work II Ole K. Tørresen, Bastiaan Star, Sissel Jentoft, William B. Reinar, Harald Grove, Jason R. Miller, Brian P. Walenz, James Knight, Jenny M. Ekholm, Paul Peluso, Rolf B. Edvardsen, Ave Tooming-Klunderud, Morten Skage, Sigbjørn Lien, Kjetill S. Jakobsen, Alexander J. Nederbragt. 2017. An improved genome assembly uncovers prolific tandem repeats in Atlantic cod. BMC Genomics. 18:95. The paper is available in DUO: http://urn.nb.no/URN:NBN:no-56741
Work III Martin Malmstrøm, Michael Matschiner, Ole K. Tørresen, Kjetill S. Jakobsen, Sissel Jentoft. 2017. Whole genome sequencing data and de novo draft assemblies for 66 teleost species. Scientific Data. 4:160132. The paper is available in DUO: http://urn.nb.no/URN:NBN:no-60401
Work IV Martin Malmstrøm, Michael Matschiner, Ole K. Tørresen, Bastiaan Star, Lars G. Snipen, Thomas F. Hansen, Helle T. Baalsrud, Alexander J. Nederbragt, Reinhold Hanel, Walter Salzburger, Nils C. Stenseth, Kjetill S. Jakobsen, Sissel Jentoft. 2016. Evolution of the immune system influences speciation rates in teleost fishes. Nature Genetics. 48, 1204–1210. The paper is available in DUO: http://urn.nb.no/URN:NBN:no-60404
Work V Ole K. Tørresen, Marine S. O. Brieuc, Monica H. Solbakken, Tomasz Furmanek, Elin Sørhus, Alexander J. Nederbragt, Kjetill S. Jakobsen, Sonnich Meier, Rolf B. Edvardsen, Sissel Jentoft. Genomic architecture of codfishes featured by expansions of innate immune genes and short tandem repeats. Manuscript. To be published. The paper is not available in DUO, as it is awaiting publication.
Fulltext https://www.duo.uio.no/bitstream/handle/10852/57728/4/PhD-Torresen-DUO.pdf
op_doi https://doi.org/10.1016/B978-0-12-801418-9.00001-9
container_start_page 1
op_container_end_page 20