Collaborative Mining of Whole Genome Sequences for Intelligent HIV-1 Sub-Strain(s) Discovery

Background: Effective global antiretroviral vaccines and therapeutic strategies depend on the diversity, evolution, and epidemiology of their various strains as well as their transmission and pathogenesis. Most viral disease-causing particles are clustered into a taxonomy of subtypes to suggest poin...

Bibliographic Details
Published in: Current HIV Research
Main Authors: Ekpenyong, Moses E., Adegoke, Anthony A., Edoho, Mercy E., Inyang, Udoinyang G., Udo, Ifiok J., Ekaidem, Itemobong S., Osang, Francis, Uto, Nseobong P., Geoffery, Joseph I.
Format: Article in Journal/Newspaper
Language: English
Published: Bentham Science Publishers Ltd. 2022
Subjects: Virology, Infectious Diseases
Online Access: http://dx.doi.org/10.2174/1570162x20666220210142209
https://www.eurekaselect.com/article/download?doi=10.2174/1570162X20666220210142209
https://www.eurekaselect.com/201020/article
id crbenthamsciepub:10.2174/1570162x20666220210142209
record_format openpolar
institution Open Polar
collection Bentham Science Publishers (via Crossref)
op_collection_id crbenthamsciepub
language English
topic Virology
Infectious Diseases
description Background: Effective global antiretroviral vaccines and therapeutic strategies depend on the diversity, evolution, and epidemiology of their various strains as well as their transmission and pathogenesis. Most viral disease-causing particles are clustered into a taxonomy of subtypes to suggest pointers toward nucleotide-specific vaccines or therapeutic applications of clinical significance sufficient for sequence-specific diagnosis and homologous viral studies. These are very useful to formulate predictors to induce cross-resistance to some retroviral control drugs being used across study areas.
Objective: This research proposed a collaborative framework of hybridized (Machine Learning and Natural Language Processing) techniques to discover hidden genome patterns and feature predictors for HIV-1 genome sequences mining.
Method: 630 human HIV-1 genome sequences above 8500 bps were excavated from the National Center for Biotechnology Information (NCBI) database (https://www.ncbi.nlm.nih.gov) for 21 countries across different continents, except for Antarctica. These sequences were transformed and learned using a self-organizing map (SOM). To discriminate emerging/new sub-strain(s), the HIV-1 reference genome was included as part of the input isolates/samples during the training. After training the SOM, component planes defining pattern clusters of the input datasets were generated for cognitive knowledge mining and subsequent labeling of the datasets. Additional genome features, including dinucleotide transmission recurrences, codon recurrences, and mutation recurrences, were finally extracted from the raw genomes to construct output classification targets for supervised learning.
Results: SOM training explains the inherent pattern diversity of HIV-1 genomes as well as inter- and intra-country transmissions in which mobility might play an active role, as corroborated by the literature. Nine sub-strains were discovered after disassembling the SOM correlation hunting matrix space attributed to disparate clusters. ...
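Illustrative note (not part of the source record): the abstract mentions extracting dinucleotide and codon recurrences from raw genomes but does not say how they were computed. The sketch below shows one plausible reading, assuming "recurrence" means relative frequency, that dinucleotides are counted with an overlapping window, and that codons are read in frame 0 without gene annotation. The function names `kmer_frequencies` and `feature_vector` and the toy sequence are hypothetical and do not come from the article.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def kmer_frequencies(sequence, k, step=1):
    """Relative frequency of every k-mer over A/C/G/T, sampled every `step` bases."""
    seq = sequence.upper()
    all_kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = Counter()
    for i in range(0, len(seq) - k + 1, step):
        kmer = seq[i:i + k]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) <= set(BASES):
            counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}

def feature_vector(sequence):
    """Concatenate 16 dinucleotide and 64 codon frequencies in sorted key order."""
    # Assumption: overlapping dinucleotides, non-overlapping codons in frame 0.
    dinuc = kmer_frequencies(sequence, k=2, step=1)
    codon = kmer_frequencies(sequence, k=3, step=3)
    return [dinuc[key] for key in sorted(dinuc)] + [codon[key] for key in sorted(codon)]

# Toy sequence for illustration only; real inputs would be genomes above 8500 bp.
toy = "ATGGCGATGNATG"
vec = feature_vector(toy)
print(len(vec))  # 16 dinucleotide + 64 codon features = 80
```

Fixed-length vectors of this kind could serve as inputs to a self-organizing map or a supervised classifier; whether the authors used this encoding is not stated in the record.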
format Article in Journal/Newspaper
author Ekpenyong, Moses E.
Adegoke, Anthony A.
Edoho, Mercy E.
Inyang, Udoinyang G.
Udo, Ifiok J.
Ekaidem, Itemobong S.
Osang, Francis
Uto, Nseobong P.
Geoffery, Joseph I.
author_sort Ekpenyong, Moses E.
title Collaborative Mining of Whole Genome Sequences for Intelligent HIV-1 Sub-Strain(s) Discovery
publisher Bentham Science Publishers Ltd.
publishDate 2022
url http://dx.doi.org/10.2174/1570162x20666220210142209
https://www.eurekaselect.com/article/download?doi=10.2174/1570162X20666220210142209
https://www.eurekaselect.com/201020/article
genre Antarc*
Antarctica
op_source Current HIV Research
volume 20, issue 2, page 163-183
ISSN 1570-162X
op_doi https://doi.org/10.2174/1570162x20666220210142209
container_title Current HIV Research
container_volume 20
_version_ 1776205395902070784