ANALYSIS OF KNN ALGORITHM AND APPLICATION OF SMOTE IN EARLY DETECTION OF LUNG CANCER

Authors

  • Bifadhlillah Marsheila Islami, Universitas Nusantara PGRI Kediri
  • Sucipto, Universitas Nusantara PGRI Kediri
  • Arie Nugroho, Universitas Nusantara PGRI Kediri

DOI:

https://doi.org/10.35457/quateknika.v15i02.4603

Keywords:

Lung Cancer, KNN, SMOTE, Early Detection, Classification, Machine Learning

Abstract

Lung cancer is among the deadliest diseases and a major global health problem. Early detection is crucial to improving survival rates, but prediction accuracy suffers from class imbalance in medical datasets. This study analyzes the K-Nearest Neighbors (KNN) algorithm combined with the Synthetic Minority Oversampling Technique (SMOTE) for early detection of lung cancer. The dataset, obtained from Kaggle, consists of 1,000 patient records with 26 clinical and demographic features. The research followed the CRISP-DM methodology: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. In the modeling phase, SMOTE was applied to balance the class distribution, and KNN was then run with k=3. The model achieved an accuracy of 99.50%, with near-perfect precision, recall, and F1-score. These results indicate that combining KNN with SMOTE is effective for predicting lung cancer severity levels and has potential for development into a medical decision support system.
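
The modeling step summarized above (oversample the minority class with SMOTE, then classify with KNN at k=3) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the two-feature toy data, the "low" and "high" labels, and the helper names nearest, smote, and knn_predict are all hypothetical, and the sketch assumes every class has at least two samples.

```python
import math
import random
from collections import Counter


def nearest(point, candidates, k):
    """Return the k candidates closest to point (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def smote(X, y, k=5, seed=0):
    """Oversample every minority class up to the majority class size.

    Each synthetic sample lies on the line segment between a randomly chosen
    minority sample and one of its k nearest same-class neighbours.
    Assumes each class has at least two samples.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            base = rng.choice(members)
            others = [m for m in members if m is not base]
            neighbour = rng.choice(nearest(base, others, k))
            gap = rng.random()  # interpolation factor in [0, 1)
            X_out.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


def knn_predict(X_train, y_train, point, k=3):
    """Predict the majority label among the k nearest training samples."""
    pairs = sorted(zip(X_train, y_train), key=lambda p: math.dist(point, p[0]))
    votes = Counter(label for _, label in pairs[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data (not the Kaggle dataset): 20 "low" and 4 "high" samples.
rng = random.Random(42)
X_train = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(20)]
X_train += [[rng.uniform(4, 6), rng.uniform(4, 6)] for _ in range(4)]
y_train = ["low"] * 20 + ["high"] * 4

# Balance the training data, then classify two unseen points with k=3.
X_bal, y_bal = smote(X_train, y_train, k=3)
X_test = [[0.2, -0.1], [4.8, 5.1]]
y_test = ["low", "high"]
y_pred = [knn_predict(X_bal, y_bal, p, k=3) for p in X_test]
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)
print(Counter(y_bal), y_pred, accuracy)
```

In this sketch only the training data is oversampled; the test points remain untouched so that the reported accuracy reflects real rather than synthetic samples. Whether the study split the data before or after applying SMOTE is not stated in the abstract.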

Published

2025-09-25

How to Cite

ANALYSIS OF KNN ALGORITHM AND APPLICATION OF SMOTE IN EARLY DETECTION OF LUNG CANCER. (2025). Jurnal Qua Teknika, 15(02), 38-50. https://doi.org/10.35457/quateknika.v15i02.4603