Empirical Analysis of a New Immunohistochemical Breast Cancer Images Dataset

  • Hasanain H. Razzaq, Rozaida Ghazali, Loay E. George, Salama A. Mostafa, Asaad A. Al-Janabi, Ali Hussein Fadel, Zaid Rajih Mohammed, Maadh Hmosze, Hussein Bahaa Abdulrazzaq, Taif Raoof Hamza

Abstract

In the medical domain, image analysis plays an essential role in automating the diagnosis process. The attributes of a medical dataset are intended to improve breast cancer surveillance, including diagnosis and prognosis. A breast cancer immunohistochemical dataset specification has not yet been introduced in practice as a diagnostic tool in the pathology field. Accordingly, this paper develops a new Independent Breast Cancer Immunohistochemical Images (IBCII) dataset for breast cancer diagnosis and prognosis tasks. The images were produced from Iraqi breast cancer patients under the supervision of specialist physicians. The dataset is intended for assessing the immunostaining expression of estrogen and progesterone receptors in both benign and malignant breast tumors. The automated classification of these images is treated as the main computational machine learning task and is performed by three different algorithms: support vector machine (SVM), k-nearest neighbors (k-NN), and random forest (RF). The classification is performed in training and testing phases, based on the relation between receptor expression and image characteristics. The collected dataset was tested and compared with previous works in order to validate its usefulness as a standard dataset for the immunohistochemical assessment of estrogen receptor (ER) and progesterone receptor (PR). The test results show that the RF classifier achieves the best scores on average: accuracy of 95.4%, precision of 95.5%, recall of 95.5%, specificity of 95.4%, and ROC score of 99.5%. Furthermore, it has the lowest variation of results in the 10-fold testing, which makes it more robust and stable than the other two classifiers. The results thus confirm the suitability of the dataset for immunohistochemical breast cancer diagnosis, given the high evaluation performance presented in the results tables.
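As an illustration of the reported measures, the following minimal Python sketch shows how accuracy, precision, recall, and specificity can be computed from binary confusion counts, and how their mean and spread across folds can be summarized. It is not the authors' implementation: the function names `binary_metrics` and `summarize_folds` are hypothetical, the fold counts are made-up placeholders rather than values from the paper, and the use of the population standard deviation as the measure of variation is an assumption.

```python
from statistics import mean, pstdev


def _ratio(numerator, denominator):
    """Divide safely; return 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    tp/fp/tn/fn are the counts of true positives, false positives,
    true negatives, and false negatives for one test fold.
    """
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),        # also called sensitivity
        "specificity": _ratio(tn, tn + fp),
    }


def summarize_folds(fold_counts):
    """Return {metric: (mean, population std)} across all folds.

    fold_counts is a list of (tp, fp, tn, fn) tuples, one per fold.
    A smaller std indicates more stable results across folds.
    """
    per_fold = [binary_metrics(*counts) for counts in fold_counts]
    summary = {}
    for name in per_fold[0]:
        values = [fold[name] for fold in per_fold]
        summary[name] = (mean(values), pstdev(values))
    return summary


if __name__ == "__main__":
    # Placeholder counts for three folds (illustrative only, not paper data).
    example_folds = [(45, 5, 40, 10), (47, 3, 42, 8), (44, 6, 41, 9)]
    for name, (avg, std) in summarize_folds(example_folds).items():
        print(f"{name}: mean={avg:.3f}, std={std:.3f}")
```

In a 10-fold comparison of several classifiers, one such summary per classifier would allow both the average scores and the fold-to-fold variation to be compared side by side.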

Published
2021-06-09
How to Cite
Hasanain H. Razzaq, Rozaida Ghazali, Loay E. George, Salama A. Mostafa, Asaad A. Al-Janabi, Ali Hussein Fadel, Zaid Rajih Mohammed, Maadh Hmosze, Hussein Bahaa Abdulrazzaq, Taif Raoof Hamza. (2021). Empirical Analysis of a New Immunohistochemical Breast Cancer Images Dataset. Design Engineering, 21 - 36. Retrieved from http://www.thedesignengineering.com/index.php/DE/article/view/1937
Section
Articles