


Breast Cancer, KNN-Distance Measurements, Recurrence Prediction


Breast cancer is a prevalent cancer that primarily affects women. It has been studied extensively, and advances in artificial intelligence and machine learning now make early detection of the disease possible. This study assesses how accurately the k-Nearest Neighbor (k-NN) algorithm predicts the recurrence of breast cancer. The k-NN classifier is a simple and versatile classification method that often performs comparably to more intricate machine learning algorithms. Its effectiveness depends closely on the choice of a suitable distance or similarity measure, so it is important to investigate how different distance measures affect the analysis of biomedical data. The findings indicate that the k-NN algorithm, applied with a variety of distance measures, achieves favorable accuracy in predicting breast cancer recurrence.
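To make the role of the distance measure concrete, the following is a minimal sketch of a k-NN classifier with a pluggable distance function. It is not the implementation used in the study: the function names, the three distance measures chosen (Euclidean, Manhattan, Chebyshev), the two numeric features, and the toy records are all assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line (L2) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """L-infinity distance: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, distance):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy records (NOT data from the study): two made-up,
# already-scaled numeric features per patient.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
           (6.0, 6.5), (7.0, 6.0), (6.5, 7.5)]
train_y = ["no-recurrence", "no-recurrence", "no-recurrence",
           "recurrence", "recurrence", "recurrence"]

query = (2.0, 2.0)
for name, dist in [("euclidean", euclidean),
                   ("manhattan", manhattan),
                   ("chebyshev", chebyshev)]:
    print(name, knn_predict(train_X, train_y, query, k=3, distance=dist))
```

On well-separated data like this, the three measures agree. They can disagree near class boundaries: measured from the origin, the point (3, 3) is closer than (0, 4) under Chebyshev distance (3 versus 4) but farther under Euclidean distance (about 4.24 versus 4), so a 1-NN vote between those two points flips with the measure.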


