TY - JOUR
T1 - SUN
T2 - Stochastic UNsupervised Learning for Data Noise and Uncertainty Reduction
AU - Christakis, Nicholas
AU - Drikakis, Dimitris
N1 - Publisher Copyright: © 2025 by the authors.
PY - 2025/12
Y1 - 2025/12
N2 - Unsupervised learning methods significantly benefit various practical applications by effectively identifying intrinsic patterns within unlabelled data. However, inherent data noise and uncertainties often compromise model reliability, result interpretability, and the overall effectiveness of unsupervised learning strategies, particularly in complex fields such as biomedical, engineering, and physics research. To address these critical challenges, this study proposes SUN (Stochastic UNsupervised learning), a novel approach that integrates probabilistic unsupervised techniques—specifically Gaussian Mixture Models—into the RUN-ICON unsupervised learning algorithm to achieve optimal clustering, systematically reduce data noise, and quantify inherent uncertainties. The SUN methodology strategically leverages probabilistic modelling for robust classification and detection tasks, explicitly targeting particle dispersion scenarios related to environmental pollution and airborne viral transmission, with implications for minimising public health risks. By combining advanced uncertainty quantification methods and innovative unsupervised denoising techniques, the proposed study aims to provide more reliable and interpretable insights than conventional methods while alleviating issues such as computational complexity and reproducibility that limit traditional mathematical modelling. This research contributes to enhanced trustworthiness and interpretability of unsupervised learning systems, offering a robust methodological framework for handling significant uncertainty in complex real-world data environments.
KW - artificial intelligence
KW - stochastic modelling
KW - uncertainty
KW - unsupervised learning
UR - https://www.scopus.com/pages/publications/105025815304
U2 - 10.3390/app152412954
DO - 10.3390/app152412954
M3 - Article
AN - SCOPUS:105025815304
SN - 2076-3417
VL - 15
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 24
M1 - 12954
ER -