Generic placeholder image

Current Bioinformatics


ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

Gene Selection in Multi-class Imbalanced Microarray Datasets Using Dynamic Length Particle Swarm Optimization

Author(s): R. Devi Priya* and R. Sivaraj

Volume 16, Issue 5, 2021

Published on: 02 October, 2020

Page: [734 - 748] Pages: 15

DOI: 10.2174/1574893615999201002093834

Price: $65


Background: Microarray gene expression datasets usually contain a large number of genes that complicate further operations like classification, clustering and other kinds of analysis. During the classification process, the identification of salient genes is a brainstorming task and needs a careful selection.

Methods: The classification of multi-class datasets is more critical when compared with binary classification. When there are multiple class labels, chances are more likely that the datasets are imbalanced. Large variations can be seen in the number of samples belonging to each class, and hence the classification process may go biased with incorrect samples chosen for training. There is no sufficient research work available to address all these three scenarios together in microarray datasets.

Results and Discussion: The paper fills this gap with the following contributions: i) Selects salient genes for classification using multiSURF algorithm ii) Identifies right instances from imbalanced datasets using Retained Tomek Link algorithm and iii) Performs gene selection for multi-class classification using Dynamic Length Particle Swarm Optimization (DPSO).

Conclusion: The proposed method is implemented on multi-class imbalanced microarray datasets, and the final classification performance is seen to be encouraging and better than other compared methods.

Keywords: Feature weighing, retained tomek link, dynamic PSO, apriori algorithm, microarray datasets.

Graphical Abstract

Rights & Permissions Print Cite
© 2024 Bentham Science Publishers | Privacy Policy