Universitas Hasanuddin
Research output: Contribution to journal › Article › peer-review

Integrating Machine Learning and Molecular Docking for Natural Compounds Discovery

Rasyak M.R.

2025 5th International Conference on Intelligent Cybernetics Technology and Applications (ICICyTA 2025)

Published: 2025

Abstract

Breast cancer is a leading cause of death among women worldwide and has a high incidence in Indonesia. This study applied a computational pipeline integrating machine learning and molecular docking to identify bioactive compounds from three traditional West Sulawesi medicinal plants (Strobilanthes crispa, Basella alba, and Ficus septica) with potential inhibitory activity against the epidermal growth factor receptor (EGFR), a key therapeutic target in breast cancer. A total of 236 phytochemicals were collected and encoded as 881-bit PubChem substructure fingerprints. A Support Vector Classifier (SVC) was trained on balanced active-decoy datasets and optimized through a grid search over kernel type, regularization (C), and gamma; the optimized model achieved an AUC of 0.98 and an accuracy of 0.99 for EGFR inhibition prediction. From this analysis, 36 valid compounds were identified, of which 34 (94%) showed strong binding affinity for EGFR. Among them, chlorogenic acid (-10.93 kcal/mol) and kaempferitrin (-10.81 kcal/mol) exhibited stronger theoretical binding affinity than the reference inhibitor TAK-285 (-10.17 kcal/mol). ADME-toxicity profiling confirmed the drug-likeness and safety potential of the selected compounds. Overall, this integrative approach demonstrates that combining machine learning prediction with structure-based modeling can accelerate the discovery of natural compounds for targeted breast cancer therapy.
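The classifier step described in the abstract (an SVC tuned by grid search over kernel, C, and gamma on a balanced active/decoy set) might look roughly like the sketch below. This is not the authors' code: the fingerprints are synthetic random bit vectors, and the function names `make_toy_fingerprints` and `tune_svc`, the grid values, the 5-fold cross-validation, and the 75/25 split are all assumptions made for illustration. Only the 881-bit fingerprint length and the three tuned hyperparameters come from the abstract.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

N_BITS = 881  # length of a PubChem substructure fingerprint


def make_toy_fingerprints(n_per_class=100, seed=0):
    """Synthetic stand-in for a balanced active/decoy fingerprint set."""
    rng = np.random.default_rng(seed)
    # Assumption for the toy data: actives set the first 40 bits more often.
    p_active = np.full(N_BITS, 0.10)
    p_active[:40] = 0.60
    p_decoy = np.full(N_BITS, 0.10)
    actives = rng.random((n_per_class, N_BITS)) < p_active
    decoys = rng.random((n_per_class, N_BITS)) < p_decoy
    X = np.vstack([actives, decoys]).astype(np.uint8)
    y = np.array([1] * n_per_class + [0] * n_per_class)  # 1 = active, 0 = decoy
    return X, y


def tune_svc(X, y, seed=0):
    """Grid-search an SVC and score the refit best model on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Hypothetical grid; the abstract names the parameters but not the values.
    param_grid = {
        "kernel": ["linear", "rbf"],
        "C": [0.1, 1.0, 10.0],
        "gamma": ["scale", 0.01],  # ignored by the linear kernel
    }
    search = GridSearchCV(SVC(), param_grid, scoring="roc_auc", cv=5)
    search.fit(X_train, y_train)
    # AUC uses the signed distance to the margin; accuracy uses hard labels.
    auc = roc_auc_score(y_test, search.decision_function(X_test))
    acc = accuracy_score(y_test, search.predict(X_test))
    return search.best_params_, auc, acc


if __name__ == "__main__":
    X, y = make_toy_fingerprints()
    best_params, auc, acc = tune_svc(X, y)
    print(best_params, round(auc, 3), round(acc, 3))
```

In a real run, `X` would hold fingerprints computed from known EGFR actives and decoys, and the fitted model would then score the 236 plant compounds before docking. The numbers printed here reflect only the synthetic data and say nothing about the reported AUC of 0.98.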

Fingerprint

Machine learning | Sciences
Artificial intelligence | Sciences
PubChem | Sciences
Support vector machine | Sciences
Computer science | Sciences
Breast cancer | Sciences
Docking (animal) | Sciences
Classifier (UML) | Sciences
Naive Bayes classifier | Sciences
Chlorogenic acid | Sciences
Drug discovery | Sciences
Quantitative structure–activity relationship | Sciences
Epidermal growth factor receptor | Sciences
Computational biology | Sciences
Interpretability | Sciences
ADME | Sciences
Profiling (computer programming) | Sciences
Perceptron | Sciences
Chemistry | Sciences