Detection and genotyping of Atlantic salmon structural variants with genome graphs

Structural variants (SVs) are defined as genomic rearrangements of 50 base pairs (bp) or larger. Although they are less frequent in the genome, they can account for ten times more variable base pairs than the widely studied single nucleotide polymorphisms (SNPs). SVs have been hard to detect by short...


Bibliographic Details
Main Author: Kjelstrup, Anna Sofie
Other Authors: Lien, Sigbjørn
Format: Master Thesis
Language: English
Published: Norwegian University of Life Sciences, Ås 2022
Subjects:
Online Access: https://hdl.handle.net/11250/3030212
id ftunivmob:oai:nmbu.brage.unit.no:11250/3030212
record_format openpolar
spelling ftunivmob:oai:nmbu.brage.unit.no:11250/3030212 2023-05-15T15:26:17+02:00 Detection and genotyping of Atlantic salmon structural variants with genome graphs Kjelstrup, Anna Sofie Lien, Sigbjørn 2022 application/pdf https://hdl.handle.net/11250/3030212 eng eng Norwegian University of Life Sciences, Ås https://hdl.handle.net/11250/3030212 Attribution-NonCommercial-NoDerivatives 4.0 Internasjonal http://creativecommons.org/licenses/by-nc-nd/4.0/deed.no CC-BY-NC-ND VDP::Landbruks- og Fiskerifag: 900 Master thesis 2022 ftunivmob 2022-11-09T23:43:19Z Master Thesis Atlanterhavslaks Atlantic salmon Open archive Norwegian University of Life Sciences: Brage NMBU
institution Open Polar
collection Open archive Norwegian University of Life Sciences: Brage NMBU
op_collection_id ftunivmob
language English
topic VDP::Landbruks- og Fiskerifag: 900
spellingShingle VDP::Landbruks- og Fiskerifag: 900
Kjelstrup, Anna Sofie
Detection and genotyping of Atlantic salmon structural variants with genome graphs
topic_facet VDP::Landbruks- og Fiskerifag: 900
description Structural variants (SVs) are defined as genomic rearrangements of 50 base pairs (bp) or larger. Although they are less frequent in the genome, they can account for ten times more variable base pairs than the widely studied single nucleotide polymorphisms (SNPs). SVs have been hard to detect by short-read sequencing, especially in repeat-rich regions. The recent addition of a new reference genome (GCA_905237065.2) and long-read sequencing data for eleven Atlantic salmon individuals has allowed for a more extensive characterization of SVs, revealing a significantly higher count than previously reported. By constructing a genome graph from new high-quality assemblies based on long reads, we aim to genotype, in short-read data, salmon SVs that are not detectable by traditional methods. We demonstrate how genome graphs, generated with the bioinformatic pipeline PGGB, can be used to detect and accurately represent SVs in Atlantic salmon genomes. We also present two pipelines for graph-based genotyping using short reads and discuss alternative metrics for genome graph quality improvement. Eventually, this work will contribute to building a whole-genome graph for Atlantic salmon, enabling population-scale SV calling based on already available short-read data. Structural variants (SVs) are defined as genomic changes of 50 base pairs or more. Although they are in the minority in the genome, SVs account for many times more variable base pairs than the widely studied single nucleotide polymorphisms (SNPs). Structural variants have previously been challenging to detect with older technology such as short-read sequencing, especially in regions with a high content of repetitive DNA. A new reference genome for Atlantic salmon (GCA_905237065.2), together with long-read sequencing data for eleven individuals, has opened the way for expanded characterization and detection of structural variants. This has revealed higher occurrences than previously reported.
By constructing a genome graph from new high-quality assemblies based on long-read ...
author2 Lien, Sigbjørn
format Master Thesis
author Kjelstrup, Anna Sofie
author_facet Kjelstrup, Anna Sofie
author_sort Kjelstrup, Anna Sofie
title Detection and genotyping of Atlantic salmon structural variants with genome graphs
title_short Detection and genotyping of Atlantic salmon structural variants with genome graphs
title_full Detection and genotyping of Atlantic salmon structural variants with genome graphs
title_fullStr Detection and genotyping of Atlantic salmon structural variants with genome graphs
title_full_unstemmed Detection and genotyping of Atlantic salmon structural variants with genome graphs
title_sort detection and genotyping of atlantic salmon structural variants with genome graphs
publisher Norwegian University of Life Sciences, Ås
publishDate 2022
url https://hdl.handle.net/11250/3030212
genre Atlanterhavslaks
Atlantic salmon
genre_facet Atlanterhavslaks
Atlantic salmon
op_relation https://hdl.handle.net/11250/3030212
op_rights Attribution-NonCommercial-NoDerivatives 4.0 International
http://creativecommons.org/licenses/by-nc-nd/4.0/deed.no
op_rightsnorm CC-BY-NC-ND
_version_ 1766356796164800512