This is my dissertation project.
Generative AI systems such as ChatGPT, GPT-4, and Google Bard (PaLM 2) are
reshaping various sectors by producing human-like responses, blurring the line
between AI-generated and human-written content.
This shift has led to worldwide reliance on these models for tasks such as
academic writing, fake reviews, misleading news, and social media posts. To
tackle this, multilingual models have emerged to distinguish between
human-written and AI-generated text. However, most prior studies focused
primarily on English, with limited testing on other languages such as Japanese,
German, and Hindi.
This project analyzed seven models and a perplexity-based method; five of the
models were tested consistently across multiple languages. Because no
comprehensive datasets existed for these languages, new datasets were
developed. In general, the models performed well when trained and tested on
diverse topics but struggled on single-topic datasets, RoBERTa in particular.
Some machine-generated texts were misclassified as human-written. BERT showed
better overall performance in German and English, while XLM-RoBERTa and
DistilBERT-Multilingual excelled on Hindi texts. Perplexity-based methods such
as GPTZero effectively differentiated between human-written and
machine-generated English texts, suggesting the use of watermarking algorithms
by language models. Language-specific pretrained models such as HindiBERT and
BERTJapanese accurately classified human-written and machine-generated text in
their respective languages.
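
To make the perplexity-based idea concrete, here is a minimal Python sketch. It is not the code used in this project and not GPTZero's implementation. It assumes you already have per-token natural-log probabilities from some language model (obtaining them is not shown), and the function names, the threshold value, and the sample numbers are all hypothetical. Perplexity is the exponential of the negative mean log-probability per token; text that a language model finds very predictable (low perplexity) is treated as more likely machine-generated.

```python
import math


def perplexity(token_logprobs):
    """Return exp(-mean(log p)) for a list of per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def label_by_perplexity(token_logprobs, threshold=20.0):
    """Label text as 'machine' if its perplexity falls below the threshold.

    The default threshold is a hypothetical placeholder; in practice it would
    have to be tuned per language and per scoring model on labeled data.
    """
    return "machine" if perplexity(token_logprobs) < threshold else "human"


# Made-up log-probabilities, for illustration only (not real model output).
predictable = [math.log(0.5)] * 10    # every token assigned probability 0.5
surprising = [math.log(0.01)] * 10    # every token assigned probability 0.01

print(label_by_perplexity(predictable))
print(label_by_perplexity(surprising))
```

A single fixed threshold is the simplest possible decision rule; it is shown here only to illustrate how a perplexity score can be turned into a human/machine label.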
For more details visit:
https://github.com/surajjeoor/Human_vs_machine_Analysis