Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb)
Background: De novo assembly of large genomes, such as those of conifers (~12–30 Gbp), which also consist of ~80% repetitive DNA, is a very complex and computationally intensive endeavor. One of the main problems in assembling such genomes lies in the computational limitations of nucleotide sequence assembly...
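To give a sense of the data volumes the abstract refers to, the following minimal sketch estimates how much read data a given average coverage implies. The genome sizes (3.24 Gbp for human, 12.34 Gbp for Siberian larch) are taken from the abstract; the 50x coverage depth and the 100 bp read length are illustrative assumptions, not figures reported by the authors, and the function names are hypothetical.

```python
# Rough estimate of sequencing data volume for a target average coverage.
# Genome sizes come from the abstract; coverage depth and read length
# below are illustrative assumptions only.

HUMAN_GENOME_BP = 3_240_000_000    # 3.24 Gbp, as stated in the abstract
LARCH_GENOME_BP = 12_340_000_000   # 12.34 Gbp, as stated in the abstract


def required_read_bases(genome_size_bp: int, coverage: int) -> int:
    """Total number of sequenced bases needed for the given average coverage."""
    return genome_size_bp * coverage


def number_of_reads(genome_size_bp: int, coverage: int, read_length_bp: int) -> int:
    """Number of reads of a fixed length needed to reach the coverage (rounded up)."""
    total_bases = required_read_bases(genome_size_bp, coverage)
    # Integer ceiling division avoids floating-point rounding on large counts.
    return -(-total_bases // read_length_bp)


if __name__ == "__main__":
    assumed_coverage = 50       # assumption: 50x average depth
    assumed_read_length = 100   # assumption: 100 bp reads
    for name, size in (("human", HUMAN_GENOME_BP), ("Siberian larch", LARCH_GENOME_BP)):
        bases = required_read_bases(size, assumed_coverage)
        reads = number_of_reads(size, assumed_coverage, assumed_read_length)
        print(f"{name}: {bases / 1e9:.0f} Gbp of reads, about {reads / 1e9:.2f} billion reads")
```

Under these assumptions the larch genome requires roughly 3.8 times as much input data as a human-sized genome, which illustrates why assemblers designed around human-scale inputs can run into limits.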
Published in: | BMC Bioinformatics |
---|---|
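Main Authors: | Dmitry A. Kuzmin, Sergey I. Feranchuk, Vadim V. Sharov, Alexander N. Cybin, Stepan V. Makolov, Yuliya A. Putintseva, Natalya V. Oreshkova, Konstantin V. Krutovsky |
Other Authors: | Institute of Space and Information Technologies, Institute of Fundamental Biology and Biotechnology, Department of High-Performance Computing, Basic Department of Protection and Modern Technologies of Forest Monitoring |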
Format: | Article in Journal/Newspaper |
Language: | unknown |
Published: | 2019 |
Subjects: | |
Online Access: | https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2570-y http://elib.sfu-kras.ru/handle/2311/129305 https://doi.org/10.1186/s12859-018-2570-y |
id |
ftsiberianfuniv:oai:elib.sfu-kras.ru:2311/129305 |
---|---|
record_format |
openpolar |
institution |
Open Polar |
collection |
Siberian Federal University: Archiv Elektronnych SFU |
op_collection_id |
ftsiberianfuniv |
language |
unknown |
topic |
34.03.23 |
description |
Background: De novo assembly of large genomes, such as those of conifers (~12–30 Gbp), which also consist of ~80% repetitive DNA, is a very complex and computationally intensive endeavor. One of the main problems in assembling such genomes lies in the computational limitations of nucleotide sequence assembly programs (DNA assemblers). Modern assemblers are usually designed to assemble genomes no longer than the human genome (3.24 Gbp). Most assemblers cannot handle the amount of input sequence data required to provide the coverage needed for a high-quality assembly. Results: This paper presents an original stepwise method of de novo assembly by parts (sets), which makes it possible to bypass the limitations of modern assemblers associated with the huge amount of data being processed. Numerical assembly experiments conducted using the model plant Arabidopsis thaliana, Prunus persica (peach), and four of the most popular assemblers (ABySS, SOAPdenovo, SPAdes, and CLC Assembly Cell) showed the validity and effectiveness of the proposed stepwise assembly method. Conclusion: Using the new stepwise de novo assembly method presented in the paper, the genome of Siberian larch, Larix sibirica Ledeb. (12.34 Gbp), was completely assembled de novo by the CLC Assembly Cell assembler. It is the first genome assembly for a larch species, in addition to only five other conifer genomes sequenced and assembled: Picea abies, Picea glauca, Pinus taeda, Pinus lambertiana, and Pseudotsuga menziesii var. menziesii. |
author2 |
Institute of Space and Information Technologies; Institute of Fundamental Biology and Biotechnology; Department of High-Performance Computing; Basic Department of Protection and Modern Technologies of Forest Monitoring |
format |
Article in Journal/Newspaper |
author |
Dmitry A. Kuzmin Sergey I. Feranchuk Vadim V. Sharov Alexander N. Cybin Stepan V. Makolov Yuliya A. Putintseva Natalya V. Oreshkova Konstantin V. Krutovsky |
author_sort |
Dmitry A. Kuzmin |
title |
Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb) |
publishDate |
2019 |
url |
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2570-y http://elib.sfu-kras.ru/handle/2311/129305 https://doi.org/10.1186/s12859-018-2570-y |
long_lat |
ENVELOPE(161.983,161.983,-78.000,-78.000) |
geographic |
Handle The |
genre |
Sibirica |
op_relation |
BMC Bioinformatics Q1 Dmitry A. Kuzmin. Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb) [Text] / Dmitry A. Kuzmin, Sergey I. Feranchuk, Vadim V. Sharov, Alexander N. Cybin, Stepan V. Makolov, Yuliya A. Putintseva, Natalya V. Oreshkova, Konstantin V. Krutovsky // BMC Bioinformatics. 2019. Vol. 20. 14712105 https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2570-y http://elib.sfu-kras.ru/handle/2311/129305 doi:10.1186/s12859-018-2570-y |
op_doi |
https://doi.org/10.1186/s12859-018-2570-y |
container_title |
BMC Bioinformatics |
container_volume |
20 |
container_issue |
S1 |
_version_ |
1766196870190727168 |