TY - JOUR
T1 - A Framework for Efficient N-Way Interaction Testing in Case/Control Studies with Categorical Data
AU - Aristodimou, Aristos
AU - Antoniades, Athos
AU - Dardiotis, Efthimios
AU - Loizidou, Eleni
AU - Spyrou, George
AU - Votsi, Christina
AU - Christodoulou, Kyproula
AU - Pantzaris, Marios
AU - Grigoriadis, Nikolaos
AU - Hadjigeorgiou, Georgios
AU - Kyriakides, Theodoros
AU - Pattichis, Constantinos
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2021
Y1 - 2021
AB - Goal: Most common diseases are influenced by multiple gene interactions and interactions with the environment. Performing an exhaustive search to identify such interactions is computationally expensive and needs to address the multiple testing problem. A four-step framework is proposed for the efficient identification of n-Way interactions. Methods: The framework was applied on a Multiple Sclerosis dataset with 725 subjects and 147 tagging SNPs. The first two steps of the framework are quality control and feature selection. The next step uses clustering and binary encodes the features. The final step performs the n-Way interaction testing. Results: The feature space was reduced to 7 SNPs and using the proposed binary encoding, more 2-SNP and 3-SNP interactions were identified compared to using the initial encoding. Conclusions: The framework selects informative features and with the proposed binary encoding it is able to identify more n-way interactions by increasing the power of the statistical analysis.
KW - Clustering
KW - Epistasis
KW - Feature Selection
KW - Interaction Testing
KW - Machine Learning
UR - https://www.scopus.com/pages/publications/85121061051
U2 - 10.1109/OJEMB.2021.3100416
DO - 10.1109/OJEMB.2021.3100416
M3 - Article
AN - SCOPUS:85121061051
SN - 2644-1276
VL - 2
SP - 256
EP - 262
JO - IEEE Open Journal of Engineering in Medicine and Biology
JF - IEEE Open Journal of Engineering in Medicine and Biology
M1 - 9497700
ER -