ANALYSIS OF KNN ALGORITHM AND APPLICATION OF SMOTE IN EARLY DETECTION OF LUNG CANCER
DOI:
https://doi.org/10.35457/quateknika.v15i02.4603
Keywords:
Lung Cancer, KNN, SMOTE, Early Detection, Classification, Machine Learning
Abstract
Lung cancer is one of the deadliest diseases and a major global health issue. Early detection is crucial to improving survival rates; however, prediction accuracy remains a challenge because medical datasets are often class-imbalanced. This study analyzes the K-Nearest Neighbors (KNN) algorithm combined with the Synthetic Minority Oversampling Technique (SMOTE) for early detection of lung cancer. The dataset was obtained from Kaggle.com and consists of 1,000 patient records with 26 clinical and demographic features. The research followed the CRISP-DM methodology, which comprises the business understanding, data understanding, data preparation, modeling, evaluation, and deployment stages. In the modeling phase, SMOTE was first applied to balance the class distribution, and KNN was then trained with k=3. Evaluation showed an accuracy of 99.50%, with near-perfect precision, recall, and F1-score values. The combination of KNN and SMOTE was therefore effective at predicting lung cancer severity levels on this dataset, indicating its potential for development into a medical decision support system.
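The modeling and evaluation steps described in the abstract (SMOTE to balance the classes, then KNN with k=3, then accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-feature rows, the "Low"/"High" labels, and all function names are assumptions made for illustration, and the SMOTE routine is a simplified pure-Python version of the technique rather than a library call.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, pool, k):
    """Return the k (features, label) rows in pool closest to point."""
    return sorted(pool, key=lambda row: euclidean(point, row[0]))[:k]


def smote(data, k=3, seed=0):
    """Oversample each minority class up to the majority class size.

    Every synthetic row is a random interpolation between a minority row
    and one of its k nearest neighbours of the same class.
    """
    rng = random.Random(seed)
    counts = Counter(label for _, label in data)
    target = max(counts.values())
    balanced = list(data)
    for label, count in counts.items():
        members = [row for row in data if row[1] == label]
        for _ in range(target - count):
            base = rng.choice(members)
            # Fall back to the row itself if the class has a single member.
            others = [m for m in members if m is not base] or [base]
            neighbour = rng.choice(nearest(base[0], others, k))
            gap = rng.random()
            synthetic = [b + gap * (n - b) for b, n in zip(base[0], neighbour[0])]
            balanced.append((synthetic, label))
    return balanced


def knn_predict(train, point, k=3):
    """Majority vote among the k nearest training rows."""
    votes = Counter(label for _, label in nearest(point, train, k))
    return votes.most_common(1)[0][0]


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1-score for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical, imbalanced toy training data: six "Low" rows, two "High" rows.
train = [
    ([1.0, 1.2], "Low"), ([0.8, 1.0], "Low"), ([1.1, 0.9], "Low"),
    ([1.3, 1.1], "Low"), ([0.9, 1.3], "Low"), ([1.2, 1.4], "Low"),
    ([5.0, 5.2], "High"), ([5.3, 4.9], "High"),
]
test = [([1.0, 1.0], "Low"), ([5.1, 5.0], "High")]

balanced = smote(train, k=3)  # balance the training rows only
y_true = [label for _, label in test]
y_pred = [knn_predict(balanced, features, k=3) for features, _ in test]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="High")
print(Counter(label for _, label in balanced))
print(accuracy, precision, recall, f1)
```

In this sketch SMOTE is applied only to the training rows, so that no synthetic point ends up in the test set. The abstract does not state how the study ordered oversampling and the train/test split, so that ordering is an assumption here.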
License
Copyright (c) 2025 Jurnal Qua Teknika

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Copyright on any article is retained by the author(s).
- Authors grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
- The article and any associated published material are distributed under the Creative Commons Attribution-ShareAlike 4.0 International License.