A systematic literature review on meta-heuristic based feature selection techniques for text classification

Feature selection (FS) is a critical step in many data science applications, especially text classification, because it involves selecting the relevant and important features from an original feature set. This process can improve learning accuracy, shorten learning time, and simplify outcomes...


Bibliographic Details
Published in: PeerJ Computer Science
Main Authors: Al-shalif, Sarah Abdulkarem, Senan, Norhalina, Saeed, Faisal, Ghaban, Wad, Ibrahim, Noraini, Aamir, Muhammad, Sharif, Wareesa
Other Authors: Research Management Center at Universiti Teknologi Malaysia, Data Analytics and Artificial Intelligence (DAAI) Research Group in Birmingham City University, UK
Format: Article in Journal/Newspaper
Language: English
Published: PeerJ 2024
Online Access: http://dx.doi.org/10.7717/peerj-cs.2084
https://peerj.com/articles/cs-2084.pdf
https://peerj.com/articles/cs-2084.xml
https://peerj.com/articles/cs-2084.html
id crpeerj:10.7717/peerj-cs.2084
record_format openpolar
institution Open Polar
collection PeerJ Publishing
op_collection_id crpeerj
language English
description Feature selection (FS) is a critical step in many data science applications, especially text classification, because it involves selecting the relevant and important features from an original feature set. This process can improve learning accuracy, shorten learning time, and simplify outcomes. Text classification data often contain many superfluous and irrelevant features that degrade the performance of the applied classifiers. Various techniques have been proposed to tackle this problem; they are categorized as traditional techniques and meta-heuristic (MH) techniques. To find the optimal subset of features, FS requires a search strategy, and MH techniques use various strategies to balance exploration and exploitation. This article systematically analyzes the MH techniques used for FS between 2015 and 2022, focusing on 108 primary studies from three databases (Scopus, ScienceDirect, and Google Scholar), to identify the techniques used as well as their strengths and weaknesses. The findings indicate that MH techniques are efficient and outperform traditional techniques, with potential for further exploration of MH techniques such as Ringed Seal Search (RSS) to improve FS in several applications.
author2 Research Management Center at Universiti Teknologi Malaysia
Data Analytics and Artificial Intelligence (DAAI) Research Group in Birmingham City University, UK
format Article in Journal/Newspaper
author Al-shalif, Sarah Abdulkarem
Senan, Norhalina
Saeed, Faisal
Ghaban, Wad
Ibrahim, Noraini
Aamir, Muhammad
Sharif, Wareesa
title A systematic literature review on meta-heuristic based feature selection techniques for text classification
publisher PeerJ
publishDate 2024
url http://dx.doi.org/10.7717/peerj-cs.2084
https://peerj.com/articles/cs-2084.pdf
https://peerj.com/articles/cs-2084.xml
https://peerj.com/articles/cs-2084.html
genre ringed seal
genre_facet ringed seal
op_source PeerJ Computer Science
volume 10, page e2084
ISSN 2376-5992
op_rights https://creativecommons.org/licenses/by/4.0/
op_doi https://doi.org/10.7717/peerj-cs.2084