Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb)

Background De novo assembly of large genomes, such as those of conifers (~12–30 Gbp), which also consist of ~80% repetitive DNA, is a very complex and computationally intensive endeavor. One of the main problems in assembling such genomes lies in the computing limitations of nucleotide sequence assembly...


Bibliographic Details
Published in: BMC Bioinformatics
Main Authors: Dmitry A. Kuzmin, Sergey I. Feranchuk, Vadim V. Sharov, Alexander N. Cybin, Stepan V. Makolov, Yuliya A. Putintseva, Natalya V. Oreshkova, Konstantin V. Krutovsky
Other Authors: Institute of Space and Information Technologies, Institute of Fundamental Biology and Biotechnology, Department of High-Performance Computing, Base Department of Forest Protection and Modern Forest Monitoring Technologies
Format: Article in Journal/Newspaper
Language: unknown
Published: 2019
Subjects:
Online Access: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2570-y
http://elib.sfu-kras.ru/handle/2311/129305
https://doi.org/10.1186/s12859-018-2570-y
id ftsiberianfuniv:oai:elib.sfu-kras.ru:2311/129305
record_format openpolar
institution Open Polar
collection Siberian Federal University: Archiv Elektronnych SFU
op_collection_id ftsiberianfuniv
language unknown
topic 34.03.23
topic_facet 34.03.23
description Background: De novo assembly of large genomes, such as those of conifers (~12–30 Gbp), which also consist of ~80% repetitive DNA, is a very complex and computationally intensive endeavor. One of the main problems in assembling such genomes lies in the computing limitations of nucleotide sequence assembly programs (DNA assemblers). Modern assemblers are usually designed to assemble genomes no longer than the human genome (3.24 Gbp). Most assemblers cannot handle the amount of input sequence data required to provide the coverage needed for a high-quality assembly.
Results: This paper presents an original stepwise method of de novo assembly by parts (sets), which bypasses the limitations of modern assemblers associated with the huge amount of data being processed. Numerical assembly experiments conducted on the model plant Arabidopsis thaliana and on Prunus persica (peach) with four of the most popular assemblers, ABySS, SOAPdenovo, SPAdes, and CLC Assembly Cell, showed the validity and effectiveness of the proposed stepwise assembly method.
Conclusion: Using the new stepwise de novo assembly method presented in the paper, the genome of Siberian larch, Larix sibirica Ledeb. (12.34 Gbp), was completely assembled de novo with the CLC Assembly Cell assembler. It is the first genome assembly for a larch species and joins only five other conifer genomes sequenced and assembled: Picea abies, Picea glauca, Pinus taeda, Pinus lambertiana, and Pseudotsuga menziesii var. menziesii.
author2 Institute of Space and Information Technologies
Institute of Fundamental Biology and Biotechnology
Department of High-Performance Computing
Base Department of Forest Protection and Modern Forest Monitoring Technologies
format Article in Journal/Newspaper
author Dmitry A. Kuzmin
Sergey I. Feranchuk
Vadim V. Sharov
Alexander N. Cybin
Stepan V. Makolov
Yuliya A. Putintseva
Natalya V. Oreshkova
Konstantin V. Krutovsky
author_sort Dmitry A. Kuzmin
title Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb)
publishDate 2019
url https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2570-y
http://elib.sfu-kras.ru/handle/2311/129305
https://doi.org/10.1186/s12859-018-2570-y
genre Sibirica
op_relation BMC Bioinformatics
Q1
Dmitry A. Kuzmin. Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb) [Text] / Dmitry A. Kuzmin, Sergey I. Feranchuk, Vadim V. Sharov, Alexander N. Cybin, Stepan V. Makolov, Yuliya A. Putintseva, Natalya V. Oreshkova, Konstantin V. Krutovsky // BMC Bioinformatics. 2019. Vol. 20.
14712105
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2570-y
http://elib.sfu-kras.ru/handle/2311/129305
doi:10.1186/s12859-018-2570-y
op_doi https://doi.org/10.1186/s12859-018-2570-y
container_title BMC Bioinformatics
container_volume 20
container_issue S1