A unified STR profiling system across multiple species with whole genome sequencing data

Abstract Background Short tandem repeats (STRs) serve as genetic markers in forensic scenes due to their high polymorphism in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, cattle, etc. Maintaining these systems simultaneously can b...


Bibliographic Details
Published in: BMC Bioinformatics
Main Authors: Yilin Liu, Jiao Xu, Miaoxia Chen, Changfa Wang, Shuaicheng Li
Format: Article in Journal/Newspaper
Language: English
Published: BMC 2019
Online Access: https://doi.org/10.1186/s12859-019-3246-y
https://doaj.org/article/c72b94420038409d95bd2bf32e90f98f
id ftdoajarticles:oai:doaj.org/article:c72b94420038409d95bd2bf32e90f98f
record_format openpolar
institution Open Polar
collection Directory of Open Access Journals: DOAJ Articles
op_collection_id ftdoajarticles
language English
topic Short tandem repeats
Whole genome sequencing
Individual identification
Computer applications to medicine. Medical informatics
R858-859.7
Biology (General)
QH301-705.5
description Abstract. Background: Short tandem repeats (STRs) serve as genetic markers in forensic scenarios because they are highly polymorphic in eukaryotic genomes. STR profiling systems have been developed for a variety of species, including human, dog, cat, and cattle. Maintaining these systems simultaneously can be costly. These mammals share many highly similar regions along their genomes. With the large amount of whole-genome data now available for these species, it is possible to develop a unified STR profiling system. In this study, our objective is to propose and develop a unified set of STR loci that can be applied simultaneously to multiple species. Results: To find a unified STR set, we collected whole genome sequencing data for the species concerned and mapped them to the human reference genome. We then extracted the STR loci across the species. From these loci, we proposed an algorithm that selects a subset of loci by optimizing the combined power of discrimination. Our results show that the unified set of loci has a high combined power of discrimination, >1 − 10⁻⁹, for both the individual species and the mixed population, and a random-match probability <10⁻⁷ for all the species involved, indicating that the identified set of STR loci can be applied to multiple species. Conclusions: We identified a set of STR loci that is shared by multiple species. This implies that a unified STR profiling system is feasible for these species in forensic scenarios. The system can be applied to individual identification or paternity testing in each of ten common species: Sus scrofa (pig), Bos taurus (cattle), Capra hircus (goat), Equus caballus (horse), Canis lupus familiaris (dog), Felis catus (cat), Ovis aries (sheep), Oryctolagus cuniculus (rabbit), Bos grunniens (yak), and Homo sapiens (human). Our loci selection algorithm employs a greedy approach. The algorithm can generate loci sets under different forensic parameters and for a specific combination of species.
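The greedy selection mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the textbook definitions (per-locus power of discrimination PD = 1 − Σ g², where g are genotype frequencies, and combined PD = 1 − Π(1 − PDᵢ) over independent loci) and an assumed greedy criterion that, at each step, adds the locus that most lowers the worst per-species match probability. The function names, the `pd_table` layout, and the `max_match_prob` parameter are hypothetical, and the paper's exact definitions and stopping rule may differ.

```python
from math import prod


def power_of_discrimination(genotype_freqs):
    """PD of one locus: 1 minus the sum of squared genotype frequencies."""
    return 1.0 - sum(f * f for f in genotype_freqs)


def combined_pd(pds):
    """Combined PD over independent loci: 1 - prod(1 - PD_i)."""
    return 1.0 - prod(1.0 - pd for pd in pds)


def greedy_select(pd_table, max_match_prob):
    """Greedily pick loci until every species is below max_match_prob.

    pd_table: {locus: {species: PD}} (hypothetical layout; a species
    missing from a locus is treated as PD = 0 at that locus).
    Returns (chosen loci in pick order, {species: combined PD}).
    """
    species = sorted({s for per_locus in pd_table.values() for s in per_locus})
    # match[s] is the running product of (1 - PD) over the chosen loci.
    match = {s: 1.0 for s in species}
    chosen = []
    remaining = set(pd_table)

    def worst_after(locus):
        # Worst (largest) per-species match probability if locus were added.
        return max(match[s] * (1.0 - pd_table[locus].get(s, 0.0)) for s in species)

    while remaining and max(match.values()) > max_match_prob:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = min(sorted(remaining), key=worst_after)
        chosen.append(best)
        remaining.remove(best)
        for s in species:
            match[s] *= 1.0 - pd_table[best].get(s, 0.0)

    return chosen, {s: 1.0 - m for s, m in match.items()}


if __name__ == "__main__":
    # Toy PD values, invented purely for illustration.
    toy = {
        "A": {"dog": 0.9, "cat": 0.5},
        "B": {"dog": 0.5, "cat": 0.9},
        "C": {"dog": 0.8, "cat": 0.8},
    }
    print(greedy_select(toy, max_match_prob=0.15))
```

The sketch tracks the product of (1 − PD) rather than the combined PD itself, which avoids comparing numbers very close to 1 when the target is on the order of 1 − 10⁻⁹.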
format Article in Journal/Newspaper
author Yilin Liu
Jiao Xu
Miaoxia Chen
Changfa Wang
Shuaicheng Li
title A unified STR profiling system across multiple species with whole genome sequencing data
publisher BMC
publishDate 2019
url https://doi.org/10.1186/s12859-019-3246-y
https://doaj.org/article/c72b94420038409d95bd2bf32e90f98f
genre Canis lupus
op_source BMC Bioinformatics, Vol 20, Iss S24, Pp 1-10 (2019)
op_relation https://doi.org/10.1186/s12859-019-3246-y
https://doaj.org/toc/1471-2105
doi:10.1186/s12859-019-3246-y
1471-2105
https://doaj.org/article/c72b94420038409d95bd2bf32e90f98f
op_doi https://doi.org/10.1186/s12859-019-3246-y
container_title BMC Bioinformatics
container_volume 20
container_issue S24
_version_ 1766386531120971776