TY - JOUR
T1 - Reducing Uncertainty and Increasing Confidence in Unsupervised Learning
AU - Christakis, Nicholas
AU - Drikakis, Dimitris
N1 - Publisher Copyright: © 2023 by the authors.
PY - 2023/7
Y1 - 2023/7
AB - This paper presents the development of a novel algorithm for unsupervised learning called RUN-ICON (Reduce UNcertainty and Increase CONfidence). The primary objective of the algorithm is to enhance the reliability and confidence of unsupervised clustering. RUN-ICON leverages the K-means++ method to identify the most frequently occurring dominant centres through multiple repetitions. It distinguishes itself from existing K-means variants by introducing novel metrics, such as the Clustering Dominance Index and Uncertainty, instead of relying solely on the Sum of Squared Errors, for identifying the most dominant clusters. The algorithm exhibits notable characteristics such as robustness, high-quality clustering, automation, and flexibility. Extensive testing on diverse data sets with varying characteristics demonstrates its capability to determine the optimal number of clusters under different scenarios. The algorithm will soon be deployed in real-world scenarios, where it will undergo rigorous testing against data sets based on measurements and simulations, further proving its effectiveness.
KW - artificial intelligence
KW - machine learning
KW - uncertainty
KW - unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85175113735&partnerID=8YFLogxK
U2 - 10.3390/math11143063
DO - 10.3390/math11143063
M3 - Article
AN - SCOPUS:85175113735
SN - 2227-7390
VL - 11
JO - Mathematics
JF - Mathematics
IS - 14
M1 - 3063
ER -