TY - JOUR
T1 - Differentiating Between Human-Written and AI-Generated Texts Using Automatically Extracted Linguistic Features
AU - Georgiou, Georgios P.
N1 - Publisher Copyright: © 2025 by the author.
PY - 2025/11
Y1 - 2025/11
AB - While extensive research has focused on ChatGPT in recent years, very few studies have systematically quantified and compared linguistic features between human-written and artificial intelligence (AI)-generated language. This exploratory study aims to investigate how various linguistic components are represented in both types of texts, assessing AI’s ability to emulate human writing. Using human-authored essays as a benchmark, we prompted ChatGPT to generate essays of equivalent length. These texts were analyzed using Open Brain AI, an online computational tool, to extract measures of phonological, morphological, syntactic, and lexical constituents. Despite AI-generated texts appearing to mimic human speech, the results revealed significant differences across multiple linguistic features such as specific types of consonants, nouns, adjectives, pronouns, adjectival/prepositional modifiers, and use of difficult words, among others. These findings underscore the importance of integrating automated tools for efficient language assessment, reducing time and effort in data analysis. Moreover, they emphasize the necessity for enhanced training methodologies to improve AI’s engineering capacity for producing more human-like text.
KW - ChatGPT
KW - essays
KW - linguistic features
KW - Open Brain AI
UR - https://www.scopus.com/pages/publications/105022886860
U2 - 10.3390/info16110979
DO - 10.3390/info16110979
M3 - Article
AN - SCOPUS:105022886860
SN - 2078-2489
VL - 16
JO - Information (Switzerland)
JF - Information (Switzerland)
IS - 11
M1 - 979
ER -