| Abstract |
In the era of big data, domains such as the Internet of Things, smart cities, healthcare, and social media rely heavily on advanced data analytics. In medical data, certain critical diseases are significantly underrepresented compared with more prevalent conditions, creating a class imbalance that can bias models toward majority-class predictions. This imbalance reduces the accuracy and reliability of predictions for the minority class, which is often essential for early diagnosis and intervention in rare but severe diseases. The problem is particularly acute in cancer classification, which must also contend with high dimensionality and feature redundancy. To address these challenges, this paper proposes a novel framework that integrates a Relevance Vector Machine classifier with an Incremental Ensemble framework to manage data imbalance effectively. It employs a Gaussian Mixture Model-based combined resampling algorithm to balance the dataset, and Mutual Information Gain Maximization to improve feature selection. To further enhance performance, an Adaptive Weighted Broad Learning System is incorporated, with a density-based weight generation mechanism that uses prior distribution information. Additionally, an Incremental Dynamic Learning Policy-based Relevance Vector Machine classifier is incorporated to adapt to new data and maintain high accuracy. The proposed model achieves superior performance, with an Accuracy of 99%, a Kappa value of 98%, an F1-Score of 99%, and an MCC of 96.9%. These results underscore the model's effectiveness in addressing class imbalance, enhancing predictive accuracy for minority classes, and offering a robust solution for complex medical datasets, which is essential for improved healthcare outcomes. © 2024 |
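The abstract reports Accuracy alongside Kappa, F1-Score, and MCC. As a minimal sketch of why the chance-corrected metrics matter under class imbalance, the following computes all four from a binary confusion matrix using their standard definitions. The function name `binary_metrics` and the example counts are hypothetical illustrations, not taken from the paper, and the paper's own evaluation may be multi-class.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1, Cohen's kappa, and MCC from binary confusion counts.

    The positive class is assumed to be the minority (rare disease) class.
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n

    # F1 for the positive class: harmonic mean of precision and recall.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {"accuracy": accuracy, "f1": f1, "kappa": kappa, "mcc": mcc}


# Hypothetical data: 10 positive cases among 1000 samples.
# A classifier that always predicts the majority class finds none of them.
always_negative = binary_metrics(tp=0, fp=0, fn=10, tn=990)
```

With these hypothetical counts the accuracy is 0.99 even though no minority case is detected, while F1, Kappa, and MCC all evaluate to 0. This is why a high accuracy on imbalanced data is only informative when the other three metrics are also high.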