# Enhancing Stroke Prediction with Logistic Regression and Support Vector Machine Using Oversampling Techniques

> Risal S.

- Canonical URL: https://discover.unhas.ac.id/publications/enhancing-stroke-prediction-with-logistic-regression-and-support-vector-machine
- Journal / Conference: Jurnal Resti
- Year of publication: 2025
- DOI: https://doi.org/10.29207/resti.v9i3.6431
- ISSN: 25800760
- Citations: 0

## Authors

- Risal S.

## Abstract

Stroke is a significant health concern that can result in both death and disability, making the early identification of risk factors crucial. Previous studies on stroke prediction have been limited by inadequate handling of class imbalance and a lack of comprehensive feature selection and parameter optimization, with accuracy rates usually below 80%. This study compares the performance of Logistic Regression (LR) and Support Vector Machine (SVM) algorithms on a stroke prediction dataset, each combined with one of five resampling methods: SMOTE, Borderline-SMOTE, ADASYN, Random Over Sampling (ROS), and Random Under Sampling (RUS). Correlation-based feature selection identified age, hypertension, and heart disease as significant predictors. GridSearchCV with 10-fold cross-validation was used for hyperparameter optimization, and performance was evaluated using precision, recall, accuracy, and ROC curves. The results showed that SVM significantly outperformed LR across all sampling methods. SVM + ROS achieved the highest performance, with perfect recall (100%), precision of 97.18%, and accuracy of 98.56% (AUC: 0.9857), whereas SVM + Borderline-SMOTE offered balanced performance, with a recall of 94.99%, precision of 95.06%, and accuracy of 95.17% (AUC: 0.9512). Among the LR configurations, LR + Borderline-SMOTE performed best, with an accuracy of 84.98% (AUC: 0.8503), significantly better than previous studies.
This improved accuracy could bring significant clinical benefits, potentially reducing missed stroke diagnoses by identifying thousands of additional at-risk patients in large-scale screening programs. Healthcare providers should consider implementing SVM with ROS in critical care settings, where missed stroke cases have severe consequences. SVM with Borderline-SMOTE, by contrast, may be more appropriate for resource-constrained environments.

## Keywords

- Support vector machine
- Oversampling
- Logistic regression
- Random forest
- Feature selection
- Artificial intelligence
- Machine learning
- Computer science
- Hyperparameter
- Precision and recall
- Computer network
- Bandwidth (computing)

---

Source: Discover Unhas, RIMS Universitas Hasanuddin. When citing, use the DOI if available, or the canonical URL above.
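
As a rough illustration of two ideas the abstract relies on, the sketch below shows Random Over Sampling (ROS) and the precision, recall, and accuracy metrics using only the Python standard library. It is not the authors' implementation: the function names `random_oversample` and `precision_recall_accuracy`, the seed, and the toy rows are all assumptions made up for this example, and no values here come from the study's dataset.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Hypothetical helper: duplicate randomly chosen rows of each smaller
    class until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class that are available for duplication.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def precision_recall_accuracy(y_true, y_pred, positive=1):
    """Hypothetical helper: compute the three metrics from raw predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, accuracy


# Made-up toy rows: (age, hypertension, heart_disease) with a stroke label.
rows = [(67, 1, 1), (80, 1, 0), (45, 0, 0), (30, 0, 0), (52, 0, 0), (38, 0, 0)]
labels = [1, 1, 0, 0, 0, 0]

# After oversampling, both classes have four rows each.
balanced_rows, balanced_labels = random_oversample(rows, labels, seed=42)

# Made-up predictions, only to exercise the metric helper.
metrics = precision_recall_accuracy([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

Resampling of this kind is normally applied to the training split only, so that duplicated rows do not also appear in the data used for evaluation.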