Unsupervised Feature Selection by a Genetic Algorithm for Mid-Infrared Spectral Data.

Publication record


Publication date

November 2022

Journal

Analytical Chemistry

Authors

Identified members of the Cancéropôle Est:
Dr GOBINET Cyril, Dr GUENOT Dominique, Prof. PIOT Olivier


All authors:
Boutegrabet W, Piot O, Guenot D, Gobinet C

Abstract

Dimensional reduction of highly multidimensional datasets such as those acquired by Fourier transform infrared spectroscopy (FTIR) is a critical step in the data analysis workflow. To achieve this goal, numerous feature selection methods have been developed and applied in a supervised context, i.e., using a priori knowledge about data usually in the form of labels for classification or quantitative values for regression. For this, genetic algorithms have been largely exploited due to their flexibility and global optimization principle. However, few applications in an unsupervised context have been reported in infrared spectroscopy. The aim of this article is to propose a new unsupervised feature selection method based on a genetic algorithm using a validity index computed from KMeans partitions as a fitness function. Evaluated on a simulated dataset and validated and tested on three real-world infrared spectroscopic datasets, our developed algorithm is able to find the spectral descriptors improving clustering accuracy and simplifying the spectral interpretation of results.
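The general idea described in the abstract, a genetic algorithm whose fitness function is a cluster validity index computed on a k-means partition, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say which validity index, genetic operators, or parameters are used, so everything below is an assumption made for illustration: the silhouette coefficient as the index, tournament selection, uniform crossover, bit-flip mutation, two-individual elitism, and all names (`select_features`, `fitness`, `make_demo`) and default values. The toy data are synthetic, not spectra.

```python
import random

def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(X, k, rng, iters=15):
    """Plain Lloyd's k-means; returns one cluster label per row of X."""
    centers = [list(x) for x in rng.sample(X, k)]
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sqdist(x, centers[c])) for x in X]
        for c in range(k):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return [min(range(k), key=lambda c: sqdist(x, centers[c])) for x in X]

def silhouette(X, labels):
    """Mean silhouette coefficient (assumed stand-in for the paper's index)."""
    total = 0.0
    for i in range(len(X)):
        sums, counts = {}, {}
        for j in range(len(X)):
            if i != j:
                lab = labels[j]
                sums[lab] = sums.get(lab, 0.0) + sqdist(X[i], X[j]) ** 0.5
                counts[lab] = counts.get(lab, 0) + 1
        own = labels[i]
        if own not in counts or len(counts) < 2:
            continue  # singleton cluster or single partition contributes 0
        a = sums[own] / counts[own]
        b = min(sums[c] / counts[c] for c in counts if c != own)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(X)

def fitness(X, mask, k=2):
    """Validity index of a k-means partition built on the selected columns."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return -1.0  # an empty feature subset gets the worst possible score
    sub = [[row[j] for j in cols] for row in X]
    # Fixed seed: the same mask always receives the same score.
    return silhouette(sub, kmeans(sub, k, random.Random(0)))

def select_features(X, k=2, pop_size=10, generations=8, p_mut=0.1, seed=0):
    """Hypothetical GA over binary feature masks; returns (best_mask, score)."""
    rng = random.Random(seed)
    d = len(X[0])
    cache = {}

    def score(mask):
        if mask not in cache:
            cache[mask] = fitness(X, mask, k)
        return cache[mask]

    # Seed the population with the all-features mask plus random masks.
    pop = [tuple([1] * d)]
    pop += [tuple(rng.randint(0, 1) for _ in range(d)) for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = pop[:2]  # elitism: the two best masks survive unchanged
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=score)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=score)
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            nxt.append(tuple(bit ^ 1 if rng.random() < p_mut else bit for bit in child))
        pop = nxt
    best = max(pop, key=score)
    return best, score(best)

def make_demo(n=40, d=6, seed=1):
    """Toy data: columns 0 and 1 separate two groups, the rest are noise."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        shift = 6.0 if i % 2 else 0.0
        informative = [rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)]
        rows.append(informative + [rng.gauss(0.0, 1.0) for _ in range(d - 2)])
    return rows

X = make_demo()
best_mask, best_score = select_features(X)
print(best_mask, round(best_score, 3))
```

Because the all-features mask is placed in the initial population and the best individuals are carried over unchanged, the returned subset never scores below the full feature set under this index. One caveat of this sketch: distance-based indices such as the silhouette are not directly comparable across subsets of different sizes, so a practical fitness function may need a correction for subset dimensionality.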

Reference

Anal Chem. 2022 Nov 8.