A unified STR profiling system across multiple species with whole genome sequencing data
Abstract Background: Short tandem repeats (STRs) serve as genetic markers in forensic scenarios due to their high polymorphism in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, cattle, etc. Maintaining these systems simultaneously can b...
Published in: | BMC Bioinformatics |
---|---|
Main Authors: | Yilin Liu, Jiao Xu, Miaoxia Chen, Changfa Wang, Shuaicheng Li |
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | BMC 2019 |
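Subjects: | Short tandem repeats; Whole genome sequencing; Individual identification; Computer applications to medicine. Medical informatics; Biology (General) |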
Online Access: | https://doi.org/10.1186/s12859-019-3246-y https://doaj.org/article/c72b94420038409d95bd2bf32e90f98f |
id |
ftdoajarticles:oai:doaj.org/article:c72b94420038409d95bd2bf32e90f98f |
---|---|
record_format |
openpolar |
spelling |
ftdoajarticles:oai:doaj.org/article:c72b94420038409d95bd2bf32e90f98f 2023-05-15T15:51:21+02:00 A unified STR profiling system across multiple species with whole genome sequencing data Yilin Liu Jiao Xu Miaoxia Chen Changfa Wang Shuaicheng Li 2019-12-01T00:00:00Z https://doi.org/10.1186/s12859-019-3246-y https://doaj.org/article/c72b94420038409d95bd2bf32e90f98f EN eng BMC https://doi.org/10.1186/s12859-019-3246-y https://doaj.org/toc/1471-2105 doi:10.1186/s12859-019-3246-y 1471-2105 https://doaj.org/article/c72b94420038409d95bd2bf32e90f98f BMC Bioinformatics, Vol 20, Iss S24, Pp 1-10 (2019) Short tandem repeats Whole genome sequencing Individual identification Computer applications to medicine. Medical informatics R858-859.7 Biology (General) QH301-705.5 article 2019 ftdoajarticles https://doi.org/10.1186/s12859-019-3246-y 2022-12-31T12:10:36Z Abstract Background: Short tandem repeats (STRs) serve as genetic markers in forensic scenarios due to their high polymorphism in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, cattle, etc. Maintaining these systems simultaneously can be costly. These mammals share many highly similar regions along their genomes. With the availability of massive amounts of whole-genome data for these species, it is possible to develop a unified STR profiling system. In this study, our objective is to propose and develop a unified set of STR loci that could be simultaneously applied to multiple species. Result: To find a unified STR set, we collected the whole genome sequence data of the concerned species and mapped them to the human genome reference. Then we extracted the STR loci across the species. From these loci, we proposed an algorithm which selected a subset of loci by incorporating the optimized combined power of discrimination. Our results show that the unified set of loci has a high combined power of discrimination, $>1-10^{-9}$, for both individual species and the mixed population, as well as a low random-match probability, $<10^{-7}$, for all the involved species, indicating that the identified set of STR loci could be applied to multiple species. Conclusions: We identified a set of STR loci that is shared by multiple species. This implies that a unified STR profiling system is possible for these species in forensic scenarios. The system can be applied to individual identification or paternity testing for each of ten common species: Sus scrofa (pig), Bos taurus (cattle), Capra hircus (goat), Equus caballus (horse), Canis lupus familiaris (dog), Felis catus (cat), Ovis aries (sheep), Oryctolagus cuniculus (rabbit), Bos grunniens (yak), and Homo sapiens (human). Our loci selection algorithm employs a greedy approach. The algorithm can generate the loci under different forensic parameters and for a specific combination of species. Article in Journal/Newspaper Canis lupus Directory of Open Access Journals: DOAJ Articles BMC Bioinformatics 20 S24 |
institution |
Open Polar |
collection |
Directory of Open Access Journals: DOAJ Articles |
op_collection_id |
ftdoajarticles |
language |
English |
topic |
Short tandem repeats Whole genome sequencing Individual identification Computer applications to medicine. Medical informatics R858-859.7 Biology (General) QH301-705.5 |
spellingShingle |
Short tandem repeats Whole genome sequencing Individual identification Computer applications to medicine. Medical informatics R858-859.7 Biology (General) QH301-705.5 Yilin Liu Jiao Xu Miaoxia Chen Changfa Wang Shuaicheng Li A unified STR profiling system across multiple species with whole genome sequencing data |
topic_facet |
Short tandem repeats Whole genome sequencing Individual identification Computer applications to medicine. Medical informatics R858-859.7 Biology (General) QH301-705.5 |
description |
Abstract Background: Short tandem repeats (STRs) serve as genetic markers in forensic scenarios due to their high polymorphism in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, cattle, etc. Maintaining these systems simultaneously can be costly. These mammals share many highly similar regions along their genomes. With the availability of massive amounts of whole-genome data for these species, it is possible to develop a unified STR profiling system. In this study, our objective is to propose and develop a unified set of STR loci that could be simultaneously applied to multiple species. Result: To find a unified STR set, we collected the whole genome sequence data of the concerned species and mapped them to the human genome reference. Then we extracted the STR loci across the species. From these loci, we proposed an algorithm which selected a subset of loci by incorporating the optimized combined power of discrimination. Our results show that the unified set of loci has a high combined power of discrimination, $>1-10^{-9}$, for both individual species and the mixed population, as well as a low random-match probability, $<10^{-7}$, for all the involved species, indicating that the identified set of STR loci could be applied to multiple species. Conclusions: We identified a set of STR loci that is shared by multiple species. This implies that a unified STR profiling system is possible for these species in forensic scenarios. The system can be applied to individual identification or paternity testing for each of ten common species: Sus scrofa (pig), Bos taurus (cattle), Capra hircus (goat), Equus caballus (horse), Canis lupus familiaris (dog), Felis catus (cat), Ovis aries (sheep), Oryctolagus cuniculus (rabbit), Bos grunniens (yak), and Homo sapiens (human). Our loci selection algorithm employs a greedy approach. The algorithm can generate the loci under different forensic parameters and for a specific combination of species. |
format |
Article in Journal/Newspaper |
author |
Yilin Liu Jiao Xu Miaoxia Chen Changfa Wang Shuaicheng Li |
author_facet |
Yilin Liu Jiao Xu Miaoxia Chen Changfa Wang Shuaicheng Li |
author_sort |
Yilin Liu |
title |
A unified STR profiling system across multiple species with whole genome sequencing data |
title_short |
A unified STR profiling system across multiple species with whole genome sequencing data |
title_full |
A unified STR profiling system across multiple species with whole genome sequencing data |
title_fullStr |
A unified STR profiling system across multiple species with whole genome sequencing data |
title_full_unstemmed |
A unified STR profiling system across multiple species with whole genome sequencing data |
title_sort |
unified str profiling system across multiple species with whole genome sequencing data |
publisher |
BMC |
publishDate |
2019 |
url |
https://doi.org/10.1186/s12859-019-3246-y https://doaj.org/article/c72b94420038409d95bd2bf32e90f98f |
genre |
Canis lupus |
genre_facet |
Canis lupus |
op_source |
BMC Bioinformatics, Vol 20, Iss S24, Pp 1-10 (2019) |
op_relation |
https://doi.org/10.1186/s12859-019-3246-y https://doaj.org/toc/1471-2105 doi:10.1186/s12859-019-3246-y 1471-2105 https://doaj.org/article/c72b94420038409d95bd2bf32e90f98f |
op_doi |
https://doi.org/10.1186/s12859-019-3246-y |
container_title |
BMC Bioinformatics |
container_volume |
20 |
container_issue |
S24 |
_version_ |
1766386531120971776 |
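
Note on the description field: the abstract mentions a greedy loci selection driven by the combined power of discrimination and the random-match probability, but this record does not give the algorithm itself. The following is a minimal Python sketch of how such a selection could look, not the authors' implementation. It assumes the textbook definitions (per-locus match probability as the sum of squared genotype frequencies, and a combined match probability that is the product over loci treated as independent, so that the combined power of discrimination is one minus that product). The function names, the worst-case-species selection rule, the alphabetical tie-break, and the toy numbers are all hypothetical.

```python
from collections import Counter


def match_probability(genotypes):
    """Per-locus match probability: sum of squared genotype frequencies.

    `genotypes` is a list of observed genotype labels at one locus,
    e.g. ["12/14", "13/15"]. The power of discrimination is 1 minus this.
    """
    counts = Counter(genotypes)
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


def greedy_select(pm, max_match_prob, max_loci):
    """Hypothetical greedy selection of loci shared across species.

    pm: {locus: {species: per-locus match probability}}.
    A locus missing for a species is treated as uninformative (1.0).
    At each step, add the locus that minimizes the worst (largest)
    combined match probability over all species. Stop once every
    species is at or below `max_match_prob`, or `max_loci` is reached.
    Returns (chosen loci, combined match probability per species).
    """
    species = sorted({s for per_locus in pm.values() for s in per_locus})
    running = {s: 1.0 for s in species}  # product of match probabilities so far
    remaining = set(pm)
    chosen = []
    while remaining and len(chosen) < max_loci:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = min(
            sorted(remaining),
            key=lambda loc: max(running[s] * pm[loc].get(s, 1.0) for s in species),
        )
        chosen.append(best)
        remaining.remove(best)
        for s in species:
            running[s] *= pm[best].get(s, 1.0)
        if all(running[s] <= max_match_prob for s in species):
            break
    return chosen, running


# Toy data, invented purely for illustration.
toy_pm = {
    "locusA": {"human": 0.1, "dog": 0.5},
    "locusB": {"human": 0.5, "dog": 0.1},
    "locusC": {"human": 0.2, "dog": 0.2},
}
chosen, combined = greedy_select(toy_pm, max_match_prob=0.05, max_loci=3)
combined_pd = {s: 1.0 - p for s, p in combined.items()}
```

The sketch tracks the combined match probability rather than the combined power of discrimination directly, because a threshold such as $1-10^{-9}$ is easier to compare reliably as "combined match probability at most $10^{-9}$" in floating point.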