Distance Measures in Bioinformatics
Many bioinformatics applications rely on the computation of similarities between objects. Distance and similarity measures applied to vectors of characteristics are essential to problems such as classification, clustering and information retrieval. This study explores the usefulness of distance and...
Main Author: | Xiong, Feiyu |
---|---|
Other Authors: | Kam, Moshe; Hrebien, Leonid, 1949- |
Format: | Thesis |
Language: | English |
Published: | Drexel University, 2015 |
Subjects: | Electrical engineering; Bioinformatics; Computer science |
Online Access: | http://hdl.handle.net/1860/idea:6403 |
id |
ftdrexeluniv:oai:idea.library.drexel.edu:idea_6403 |
---|---|
record_format |
openpolar |
spelling |
ftdrexeluniv:oai:idea.library.drexel.edu:idea_6403 2023-05-15T16:01:36+02:00 Distance Measures in Bioinformatics Xiong, Feiyu Kam, Moshe Hrebien, Leonid, 1949- 2015-01-01- http://hdl.handle.net/1860/idea:6403 eng eng Drexel University idea:6403 http://hdl.handle.net/1860/idea:6403 Electrical engineering Bioinformatics Computer science Thesis Text 2015 ftdrexeluniv 2019-03-23T23:52:39Z Many bioinformatics applications rely on the computation of similarities between objects. Distance and similarity measures applied to vectors of characteristics are essential to problems such as classification, clustering and information retrieval. This study explores the usefulness of distance and similarity measures in several bioinformatics applications. These applications are in two categories. (1) Estimation of the adverse reaction severity of unknown pharmaceutical treatments, based on the severity of known treatments, in order to provide guidance for testing of the unknown treatments in clinical trials. (2) Classification of cancer tissue types and estimation of cancer stages, based on high-dimensional microarray data, in order to support clinical decision making. To address the first category, we studied several clustering and classification approaches for binary severity estimation of Cytokine Release Syndrome (CRS). We developed a Severity Estimation using Distance Metric Learning (SE-DML) approach to obtain graded severity estimates. With binary estimation, we were able to identify treatments that caused the most severe response and then built prediction models for CRS. Using the SE-DML approach, we evaluated four known data sets and showed that SE-DML outperformed other widely used methods on these data sets. For the second category, we presented Kernelized Information-Theoretic Metric Learning (KITML) algorithms that optimize distance metrics and effectively handle high-dimensional data. The metric learned by KITML is used to improve the performance of $k$-nearest neighbor classification for cancer tissue microarray data. We evaluated our approach on fourteen (14) cancer microarray data sets and compared our results with other state-of-the-art approaches. We achieved the best overall performance for the classification task. In addition, we tested the KITML algorithm in estimating the severity stages of cancer samples, with accurate results. Ph.D., Electrical Engineering -- Drexel University, 2015 Thesis DML Drexel University: iDEA - Drexel Libraries E-Repository And Archives |
institution |
Open Polar |
collection |
Drexel University: iDEA - Drexel Libraries E-Repository And Archives |
op_collection_id |
ftdrexeluniv |
language |
English |
topic |
Electrical engineering Bioinformatics Computer science |
spellingShingle |
Electrical engineering Bioinformatics Computer science Xiong, Feiyu Distance Measures in Bioinformatics |
topic_facet |
Electrical engineering Bioinformatics Computer science |
description |
Many bioinformatics applications rely on the computation of similarities between objects. Distance and similarity measures applied to vectors of characteristics are essential to problems such as classification, clustering and information retrieval. This study explores the usefulness of distance and similarity measures in several bioinformatics applications. These applications are in two categories. (1) Estimation of the adverse reaction severity of unknown pharmaceutical treatments, based on the severity of known treatments, in order to provide guidance for testing of the unknown treatments in clinical trials. (2) Classification of cancer tissue types and estimation of cancer stages, based on high-dimensional microarray data, in order to support clinical decision making. To address the first category, we studied several clustering and classification approaches for binary severity estimation of Cytokine Release Syndrome (CRS). We developed a Severity Estimation using Distance Metric Learning (SE-DML) approach to obtain graded severity estimates. With binary estimation, we were able to identify treatments that caused the most severe response and then built prediction models for CRS. Using the SE-DML approach, we evaluated four known data sets and showed that SE-DML outperformed other widely used methods on these data sets. For the second category, we presented Kernelized Information-Theoretic Metric Learning (KITML) algorithms that optimize distance metrics and effectively handle high-dimensional data. The metric learned by KITML is used to improve the performance of $k$-nearest neighbor classification for cancer tissue microarray data. We evaluated our approach on fourteen (14) cancer microarray data sets and compared our results with other state-of-the-art approaches. We achieved the best overall performance for the classification task. In addition, we tested the KITML algorithm in estimating the severity stages of cancer samples, with accurate results.
Ph.D., Electrical Engineering -- Drexel University, 2015 |
author2 |
Kam, Moshe Hrebien, Leonid, 1949- |
format |
Thesis |
author |
Xiong, Feiyu |
author_facet |
Xiong, Feiyu |
author_sort |
Xiong, Feiyu |
title |
Distance Measures in Bioinformatics |
title_short |
Distance Measures in Bioinformatics |
title_full |
Distance Measures in Bioinformatics |
title_fullStr |
Distance Measures in Bioinformatics |
title_full_unstemmed |
Distance Measures in Bioinformatics |
title_sort |
distance measures in bioinformatics |
publisher |
Drexel University |
publishDate |
2015 |
url |
http://hdl.handle.net/1860/idea:6403 |
genre |
DML |
genre_facet |
DML |
op_relation |
idea:6403 http://hdl.handle.net/1860/idea:6403 |
_version_ |
1766397387513790464 |