Guofan Shao, Hao Zhang, Jinyuan Shao, Keith Woeste and Lina Tang
Adv. Artif. Intell. Mach. Learn., 2 (4):471-476
Guofan Shao : Purdue University
Hao Zhang : Purdue University
Jinyuan Shao : Purdue University
Keith Woeste : US Department of Agriculture
Lina Tang : Chinese Academy of Sciences
DOI: 10.54364/AAIML.2022.1132
Article History: Received on: 31-Oct-22, Accepted on: 01-Nov-22, Published on: 08-Nov-22
Corresponding Author: Guofan Shao
Email: shao@purdue.edu
Citation: Guofan Shao (2022). Strengthening Machine Learning Reproducibility for Image Classification. Adv. Artif. Intell. Mach. Learn., 2 (4):471-476
Machine learning (ML) reproducibility needs to be assured with reliable evaluation measures. However, image classification is routinely evaluated with metrics that are highly sensitive to class prevalence. Consequently, the reproducibility of ML models remains unclear because of the noise induced by class imbalance. We suggest routinely using evaluation metrics that are resistant to class imbalance, including balanced accuracy, area under the precision-recall curve, and image classification efficacy, to evaluate the reproducibility of ML models. These metrics are conceptually consistent with and logically complement one another, and their joint use can help explain different aspects of classification performance at both the whole-class level and the individual-class level. They can be used for the validation, testing, and/or transfer of ML classifiers. Comprehensive analysis with these metrics as a routine approach strengthens the reproducibility of ML models.
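The sensitivity to class prevalence described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a hypothetical binary classifier whose per-class recall is fixed (0.6 for the positive class, 0.9 for the negative class) and compares overall accuracy with balanced accuracy, taken here in its standard form as the mean of per-class recall, on a balanced and an imbalanced test set. The helper `make_dataset` and all sample counts are invented for illustration. Area under the precision-recall curve and image classification efficacy are not shown, since their formulas are not given in this section.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples that are classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (standard definition of balanced accuracy)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def make_dataset(n_pos, n_neg, recall_pos=0.6, recall_neg=0.9):
    """Hypothetical labels/predictions for a classifier with fixed per-class recall."""
    tp = round(n_pos * recall_pos)  # positives predicted correctly
    tn = round(n_neg * recall_neg)  # negatives predicted correctly
    y_true = [1] * n_pos + [0] * n_neg
    y_pred = [1] * tp + [0] * (n_pos - tp) + [0] * tn + [1] * (n_neg - tn)
    return y_true, y_pred


# Same classifier behaviour, two different class prevalences (assumed values).
balanced_set = make_dataset(n_pos=500, n_neg=500)
imbalanced_set = make_dataset(n_pos=50, n_neg=950)

for name, (y_true, y_pred) in [("balanced", balanced_set),
                               ("imbalanced", imbalanced_set)]:
    oa = overall_accuracy(y_true, y_pred)
    ba = balanced_accuracy(y_true, y_pred)
    print(f"{name:>10}: overall accuracy = {oa:.3f}, balanced accuracy = {ba:.3f}")
```

Under these assumptions, overall accuracy rises from 0.750 to 0.885 simply because the better-recognized class becomes more prevalent, while balanced accuracy stays at 0.750 on both test sets.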