Supporting data for "Canfam_GSD: De novo chromosome-length genome assembly of the German Shepherd Dog (Canis lupus familiaris) using a combination of long reads, optical mapping and Hi-C"

The German Shepherd Dog (GSD) is one of the most common breeds on earth and has been bred for its utility and intelligence. It is often the first choice for police and military work, as well as protection, disability assistance and search-and-rescue. Yet GSDs are well known to be afflicted with a rang...

Bibliographic Details
Main Authors: Field, Matt A; Rosen, Benjamin D; Dudchenko, Olga; Chan, Eva K.F.; Minoche, Andre E.; Edwards, Richard J.; Barton, Kirston; Lyons, Ruth J.; Enosi Tuipulotu, Daniel; Hayes, Vanessa M.; Omer, Arina; Colaric, Zane; Keilwagen, Jens; Skvortsova, Ksenia; Bogdanovic, Ozren; Smith, A M; Lieberman Aiden, Erez; Smith, Timothy P.L.; Zammit, Robert A.; Ballard, J. William O.
Format: Dataset
Language: English
Published: GigaScience Database 2020
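Subjects: Genomic; Epigenomic; hi-c; long read sequencing; optical mapping; de novo genome assembly; canine hip dysplasia; dna zoo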
Online Access: https://dx.doi.org/10.5524/100712
http://gigadb.org/dataset/100712
id ftdatacite:10.5524/100712
record_format openpolar
spelling ftdatacite:10.5524/100712 2023-05-15T15:51:22+02:00 Supporting data for "Canfam_GSD: De novo chromosome-length genome assembly of the German Shepherd Dog (Canis lupus familiaris) using a combination of long reads, optical mapping and Hi-C" Field, Matt A Rosen, Benjamin D Dudchenko, Olga Chan, Eva K.F. Minoche, Andre E. Edwards, Richard J. Barton, Kirston Lyons, Ruth J. Enosi Tuipulotu, Daniel Hayes, Vanessa M. Omer, Arina Colaric, Zane Keilwagen, Jens Skvortsova, Ksenia Bogdanovic, Ozren Smith, A M Lieberman Aiden, Erez Smith, Timothy P.L. Zammit, Robert A. Ballard, J. William O. 2020 https://dx.doi.org/10.5524/100712 http://gigadb.org/dataset/100712 en eng GigaScience Database CC0 1.0 Universal http://creativecommons.org/publicdomain/zero/1.0 CC0 Genomic Epigenomic hi-c long read sequencing optical mapping de novo genome assembly canine hip dysplasia dna zoo dataset Dataset GigaDB Dataset 2020 ftdatacite https://doi.org/10.5524/100712 2021-11-05T12:55:41Z The German Shepherd Dog (GSD) is one of the most common breeds on earth and has been bred for its utility and intelligence. It is often the first choice for police and military work, as well as protection, disability assistance and search-and-rescue. Yet GSDs are well known to be afflicted with a range of genetic diseases that can interfere with their training. Such diseases are of particular concern when they occur later in life and fully trained animals are no longer able to continue their duties. Here, we provide the draft genome sequence of a healthy German Shepherd female as a reference for future disease and evolutionary studies. We generated this improved canid reference genome (CanFam_GSD) utilising a combination of Pacific Biosciences, Oxford Nanopore, 10X Genomics, Bionano, and Hi-C technologies.
The GSD assembly is approximately 80 times as contiguous as the current canid reference genome, CanFam v3.1 (20.9 Mb vs 0.267 Mb contig N50), and contains far fewer gaps (306 vs 23,876) and fewer scaffolds (429 vs 3,310). Two chromosomes (4 and 35) are assembled into single scaffolds with no gaps. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis of the genome assembly shows that 93.0% of the conserved single-copy genes are complete in the GSD assembly, compared to 92.2% for CanFam v3.1. Homology-based gene annotation increases this value to about 99%. Detailed examination of the evolutionarily important pancreatic amylase region reveals that there are most likely seven copies of the gene, indicative of a duplication of four ancestral copies and the disruption of one copy. The GSD genome assembly and annotation were produced with major improvements in completeness, continuity and quality over the existing canid reference. This resource will enable further research related to canine diseases, the evolutionary relationships of canids, and other aspects of canid biology. Dataset Canis lupus DataCite Metadata Store (German National Library of Science and Technology) Pacific
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language English
topic Genomic
Epigenomic
hi-c
long read sequencing
optical mapping
de novo genome assembly
canine hip dysplasia
dna zoo
description The German Shepherd Dog (GSD) is one of the most common breeds on earth and has been bred for its utility and intelligence. It is often the first choice for police and military work, as well as protection, disability assistance and search-and-rescue. Yet GSDs are well known to be afflicted with a range of genetic diseases that can interfere with their training. Such diseases are of particular concern when they occur later in life and fully trained animals are no longer able to continue their duties. Here, we provide the draft genome sequence of a healthy German Shepherd female as a reference for future disease and evolutionary studies. We generated this improved canid reference genome (CanFam_GSD) utilising a combination of Pacific Biosciences, Oxford Nanopore, 10X Genomics, Bionano, and Hi-C technologies. The GSD assembly is approximately 80 times as contiguous as the current canid reference genome, CanFam v3.1 (20.9 Mb vs 0.267 Mb contig N50), and contains far fewer gaps (306 vs 23,876) and fewer scaffolds (429 vs 3,310). Two chromosomes (4 and 35) are assembled into single scaffolds with no gaps. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis of the genome assembly shows that 93.0% of the conserved single-copy genes are complete in the GSD assembly, compared to 92.2% for CanFam v3.1. Homology-based gene annotation increases this value to about 99%. Detailed examination of the evolutionarily important pancreatic amylase region reveals that there are most likely seven copies of the gene, indicative of a duplication of four ancestral copies and the disruption of one copy. The GSD genome assembly and annotation were produced with major improvements in completeness, continuity and quality over the existing canid reference. This resource will enable further research related to canine diseases, the evolutionary relationships of canids, and other aspects of canid biology.
format Dataset
author Field, Matt A
Rosen, Benjamin D
Dudchenko, Olga
Chan, Eva K.F.
Minoche, Andre E.
Edwards, Richard J.
Barton, Kirston
Lyons, Ruth J.
Enosi Tuipulotu, Daniel
Hayes, Vanessa M.
Omer, Arina
Colaric, Zane
Keilwagen, Jens
Skvortsova, Ksenia
Bogdanovic, Ozren
Smith, A M
Lieberman Aiden, Erez
Smith, Timothy P.L.
Zammit, Robert A.
Ballard, J. William O.
author_sort Field, Matt A
title Supporting data for "Canfam_GSD: De novo chromosome-length genome assembly of the German Shepherd Dog (Canis lupus familiaris) using a combination of long reads, optical mapping and Hi-C"
publisher GigaScience Database
publishDate 2020
url https://dx.doi.org/10.5524/100712
http://gigadb.org/dataset/100712
geographic Pacific
genre Canis lupus
op_rights CC0 1.0 Universal
http://creativecommons.org/publicdomain/zero/1.0
op_rightsnorm CC0
op_doi https://doi.org/10.5524/100712
_version_ 1766386552467881984
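
The contiguity figures quoted in the description can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' pipeline: the `n50` helper and the toy contig lengths are illustrative assumptions, and only the six reported numbers (20.9 Mb and 0.267 Mb contig N50, 306 and 23,876 gaps, 429 and 3,310 scaffolds) come from the record. It shows how a contig N50 is conventionally computed and how the "approximately 80 times" comparison follows from the reported values.

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L of the contig at which, going from longest to
    shortest, the running total first reaches half the assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths (hypothetical, NOT taken from the dataset).
toy_lengths = [8, 5, 4, 3, 2, 2]
toy_n50 = n50(toy_lengths)

# Values reported in the record description.
gsd_contig_n50_mb, canfam31_contig_n50_mb = 20.9, 0.267
gsd_gaps, canfam31_gaps = 306, 23_876
gsd_scaffolds, canfam31_scaffolds = 429, 3_310

# Fold differences between CanFam_GSD and CanFam v3.1.
contig_fold = gsd_contig_n50_mb / canfam31_contig_n50_mb
gap_fold = canfam31_gaps / gsd_gaps
scaffold_fold = canfam31_scaffolds / gsd_scaffolds

print(f"toy N50: {toy_n50}")
print(f"contig N50 fold increase: {contig_fold:.1f}")
print(f"gap count fold reduction: {gap_fold:.1f}")
print(f"scaffold count fold reduction: {scaffold_fold:.1f}")
```

The contig N50 ratio and the gap ratio both come out at roughly 78, consistent with the "approximately 80 times" wording, while the scaffold count falls by a factor of a little under 8.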