Optimal feature selection with Random Forest Classifier for Lymph Diseases Prediction using Particle Swarm Optimization
Abstract
Machine learning (ML)-based approaches to data classification can serve as decision-making tools in areas such as healthcare and disease prediction. Most medical data today are high-dimensional. Feature selection (FS) methods are often used to improve classification results, particularly on high-dimensional problems, by extracting the most discriminative features from the training instances. This paper studies the effect of FS approaches on classifier performance. A genetic algorithm (GA) and the particle swarm optimization (PSO) algorithm are used for FS, while a random forest (RF) classifier is used to classify a lymph disease dataset. In the first stage, GA and PSO reduce the feature subset; in the second stage, the RF classifier is applied. The experiments show that PSO-based FS improves the classifier's results more than GA-based FS. They also show that the FS process significantly improves the classifier's outcomes across various performance measures.
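The first stage of the pipeline described above can be illustrated with a minimal sketch of binary PSO searching over 0/1 feature masks. This is not the implementation used in the paper: the function names, the parameter values (`w`, `c1`, `c2`, `v_max`, swarm size, iteration count), and the feature count are assumptions chosen for illustration, and `toy_fitness` is a hypothetical stand-in for the second stage. In the setting of this paper, the fitness of a mask would instead be a performance measure of the RF classifier trained on the selected features.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask).

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Random initial masks and small random velocities.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
           for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    # Global best is the best personal best so far.
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                v = max(-v_max, min(v_max, v))
                vel[i][d] = v
                # Sigmoid transfer: the bit is set to 1 with probability sigmoid(v).
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def toy_fitness(mask):
    """Hypothetical stand-in for RF performance on the selected features.

    Rewards three arbitrarily chosen 'informative' features and
    penalizes the size of the subset.
    """
    informative = {0, 2, 5}
    hits = sum(1 for j in informative if mask[j])
    return hits - 0.1 * sum(mask)


# n_features=10 is an arbitrary illustrative value, not the dataset's size.
best_mask, best_score = binary_pso(toy_fitness, n_features=10)
print(best_mask, best_score)
```

A GA-based selector could be compared under the same interface by passing it the same `fitness` callable, which keeps the second-stage classifier identical across both FS methods.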