Genomic data from the domestic pigeon (Columba livia)

Bibliographic Details
Main Authors: Li, C, Zhang, G, Gilbert, T, Wang, T
Format: Dataset
Language: English
Published: GigaScience 2011
Online Access: https://dx.doi.org/10.5524/100007
http://gigadb.org/dataset/100007
Description
Summary: The domestic pigeon (Columba livia domestica) is one of the most common birds on Earth, found on every continent except Antarctica. The bird sequenced was a Danish Tumbler, a show pigeon breed with distinct color markings. The domestic pigeon genome sequence provides a better understanding of such a widespread creature, including mechanisms that scientists still do not fully understand, such as magnetosensitivity. The sequencing data also offer insight into the species' similarities to and differences from other birds, and into how breeding might have shaped its genome, as this sub-species was taken from Asian colonies to Denmark 400 years ago and selectively bred. In 2010, BGI used whole-genome shotgun sequencing on the Illumina HiSeq 2000 system to generate short reads at 98X coverage for a Danish Tumbler. The raw data were then assembled with SOAPdenovo to produce a draft assembly of 1.1 Gb with a scaffold N50 of 3.1 Mb and a contig N50 of 22.4 kb. Based on the k-mer distribution of the sequencing data, the genome size of Columba livia is estimated to be 1.3 Gb, suggesting that the current assembly is about 84% complete. The GC content (41.5%) and the repetitive content (8.7%) of the pigeon assembly are similar to those of three other avian genomes (chicken, zebra finch, turkey); the uncovered regions of the genome appear to be enriched in repeats. A total of 17,300 protein-coding genes are predicted in the assembly.
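
The two assembly statistics quoted in the summary, N50 and percent completeness, can be illustrated with a minimal Python sketch. This is not the dataset authors' pipeline: the function names `n50` and `assembly_completeness` and the toy scaffold lengths are hypothetical, and the only figures taken from the record are the rounded 1.1 Gb assembly size and 1.3 Gb estimated genome size.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def assembly_completeness(assembled_gb, estimated_gb):
    """Fraction of the estimated genome size covered by the assembly."""
    return assembled_gb / estimated_gb


# Hypothetical toy scaffold lengths, for illustration only (not pigeon data).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # total is 300; 80 + 70 reaches 150, so N50 is 70

# Rounded figures from the summary: 1.1 Gb assembled, 1.3 Gb estimated.
print(round(100 * assembly_completeness(1.1, 1.3), 1))  # 84.6
```

With the rounded inputs the ratio comes out near 84.6%, in line with the "about 84%" stated in the summary; the published figure was presumably computed from unrounded sizes.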