Lung Cancer Subtyping from Gene Expression Data using General and
Enhanced Fuzzy Min-Max Neural Networks
Abstract
Cancer diagnosis from gene expression data is an important research area
because it facilitates early treatment and prevention of cancer.
Classifying gene expression data is challenging because of its high
dimensionality and small number of samples. Creating well-defined class
boundaries is a central aim of classification algorithms. The Fuzzy
min-max (FMM) neural
network classifier is known to create good decision boundaries using
hyperboxes constructed for each class. In this paper, we explore the
General Fuzzy min-max (GFMM) and Enhanced Fuzzy min-max (EFMM) neural
network architectures for the classification of lung cancer subtypes
from microarray gene expression data. Both GFMM and EFMM are advanced
versions of Simpson’s FMM neural network classifier. GFMM is
computationally efficient because it uses only simple operations for
hyperbox manipulation, and it can handle both labeled and unlabeled data.
EFMM, on the other hand, introduces three heuristic rules for hyperbox
expansion, contraction and the overlap test, which enhance the
learning algorithm. We classify gene expression
data using these two algorithms, analyze their performance by
visualizing the hyperboxes obtained after training, and compare the
accuracies of the classifiers. LASSO is used to select the
important genes from the high-dimensional gene expression data. Our
results show that EFMM with LASSO outperforms GFMM, FMM and the other
machine learning algorithms evaluated.
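
To illustrate how a hyperbox-based classifier assigns a class, the
following is a minimal Python sketch, not the implementation used in
this paper. It assumes features scaled to [0, 1], a GFMM-style
membership function built from a clipped ramp, and a sensitivity
parameter gamma. The function names, the toy hyperboxes and the gamma
value are hypothetical and chosen only for illustration.

```python
def ramp(r, gamma):
    """Clip gamma * r to the interval [0, 1]."""
    return max(0.0, min(1.0, gamma * r))


def membership(x, v, w, gamma=4.0):
    """Membership of point x in the hyperbox with min point v and max point w.

    Returns 1.0 when x lies inside the box and decays toward 0.0 as x
    moves away from it; gamma controls how fast the value decays.
    """
    return min(
        min(1.0 - ramp(xi - wi, gamma), 1.0 - ramp(vi - xi, gamma))
        for xi, vi, wi in zip(x, v, w)
    )


def classify(x, boxes, gamma=4.0):
    """Return the label of the hyperbox with the highest membership for x.

    boxes is a list of (v, w, label) tuples (an assumed layout).
    """
    best = max(boxes, key=lambda b: membership(x, b[0], b[1], gamma))
    return best[2]


if __name__ == "__main__":
    # Two toy hyperboxes in a 2-D feature space (illustrative values).
    toy_boxes = [
        ([0.1, 0.1], [0.3, 0.3], "A"),
        ([0.6, 0.6], [0.9, 0.9], "B"),
    ]
    print(classify([0.7, 0.8], toy_boxes))
```

In this sketch a sample is assigned to the class of the hyperbox that
yields the highest membership value; the expansion, overlap test and
contraction steps that build the hyperboxes during training are
omitted.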