HYBRID CNN-DRIVEN RANDOM FOREST AND FASTTEXT EMBEDDINGS FOR UNMASKING AI-GENERATED TWEETS

Authors

  • K. Manohar Rao
  • Manaswi Ramidi
  • Nikitha Manga
  • Mythili Koppula

DOI:

https://doi.org/10.62643/ijerst.v21.n3(1).pp528-535

Keywords:

Deepfake Tweet Detection, AI-generated Tweets, Social Media Misinformation, Convolutional Neural Network, Random Forest Classifier.

Abstract

This study introduces a framework for detecting fake tweets using a dedicated Twitter fake tweet
dataset, combining natural language processing (NLP) and machine learning to improve classification
accuracy. Manual detection does not scale, is subjective, and often misses subtle linguistic or
contextual signals in large volumes of social media data. To overcome these limitations, the
proposed approach employs a multi-stage pipeline. It begins with NLP preprocessing to clean and
normalize the tweet content, followed by FastText embeddings that convert the text into numerical
vectors. The data is then partitioned into training and testing sets so that evaluation is
performed on held-out tweets. A deep learning convolutional neural network (DLCNN) extracts
features from the embedded text, and a random forest classifier uses these features to label each
tweet as real or fake. The model's performance is evaluated using key metrics to assess its
accuracy and its applicability to real-world misinformation detection.
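
The first stages of the pipeline described in the abstract (preprocessing, subword embedding, and the train-test split) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. All function names, the cleaning rules, the 16-dimensional vectors, and the 80/20 split are assumptions made for illustration. The n-gram vectors are hash-seeded pseudo-random stand-ins for trained FastText weights, and the CNN feature extractor and random forest classifier are omitted.

```python
import hashlib
import random
import re


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, mentions, and punctuation (assumed rules)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation, '#', emoji
    return text.split()


def char_ngrams(word, n_min=3, n_max=5):
    """FastText-style subword units: character n-grams of the padded '<word>'."""
    padded = f"<{word}>"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def ngram_vector(ngram, dim):
    """Deterministic pseudo-random vector per n-gram.

    Stand-in for trained embedding weights (an assumption of this sketch).
    """
    seed = int.from_bytes(hashlib.md5(ngram.encode("utf-8")).digest()[:8], "big")
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def embed_tweet(text, dim=16):
    """Average the n-gram vectors of every token into one fixed-size vector."""
    total = [0.0] * dim
    count = 0
    for token in clean_tweet(text):
        for ngram in char_ngrams(token):
            vec = ngram_vector(ngram, dim)
            total = [a + b for a, b in zip(total, vec)]
            count += 1
    return [v / count for v in total] if count else total


def train_test_split(items, test_ratio=0.2, seed=42):
    """Shuffle a copy of the items and hold out a fraction for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]
```

In a full pipeline, the vectors returned by `embed_tweet` (or per-token vectors from a trained FastText model) would feed the CNN, and the CNN's pooled activations would become the random forest's input features. Averaging over all n-grams is a simplification chosen here to keep the sketch short.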

Published

14-07-2025

How to Cite

HYBRID CNN-DRIVEN RANDOM FOREST AND FASTTEXT EMBEDDINGS FOR UNMASKING AI-GENERATED TWEETS. (2025). International Journal of Engineering Research and Science & Technology, 21(3 (1)), 528-535. https://doi.org/10.62643/ijerst.v21.n3(1).pp528-535