HYBRID CNN-DRIVEN RANDOM FOREST AND FASTTEXT EMBEDDINGS FOR UNMASKING AI-GENERATED TWEETS
DOI:
https://doi.org/10.62643/ijerst.v21.n3(1).pp528-535
Keywords:
Deepfake Tweet Detection, AI-generated Tweets, Social Media Misinformation, Convolutional Neural Network, Random Forest Classifier
Abstract
This study introduces a robust framework for detecting fake tweets using a dedicated Twitter fake-tweet
dataset, combining natural language processing (NLP) and machine learning to improve classification
accuracy. Manual detection scales poorly, is subjective, and often misses the subtle linguistic or
contextual signals present in large volumes of social media data. To overcome these limitations, the
proposed approach uses a multi-stage pipeline. NLP preprocessing first cleans and normalizes the tweet
content, and FastText embeddings then convert the text into numerical vectors. The data is partitioned
into training and testing sets with a train-test split so that evaluation is performed on unseen tweets.
A deep learning convolutional neural network (DLCNN) extracts features from the embedded text, and a
random forest classifier uses these features to label each tweet as real or fake. The model's
performance is evaluated using key metrics to validate its accuracy and its applicability to real-world
misinformation detection scenarios.
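The first and third stages of the pipeline described above (text cleaning and the train-test split) can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' implementation: the cleaning rules, the 80/20 split ratio, the helper names `clean_tweet` and `train_test_split`, and the example tweets are all assumptions made here, since the abstract does not specify them. The FastText, DLCNN, and random forest stages are omitted because they depend on external libraries and on settings the abstract does not give.

```python
import random
import re

# Assumed cleaning rules: the abstract only says tweets are "cleaned and normalized".
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text):
    """Lowercase a tweet, strip URLs, mentions, and punctuation, return tokens."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    text = NON_ALNUM_RE.sub(" ", text)
    return text.split()


def train_test_split(samples, test_ratio=0.2, seed=42):
    """Shuffle a copy of the samples and split it into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up tweets with illustrative labels; not taken from the paper's dataset.
tweets = [
    ("Check THIS out!! https://t.co/abc @user #BreakingNews", "fake"),
    ("Heading to the game tonight with @sam", "real"),
    ("Scientists confirm results www.example.com #science", "fake"),
    ("Coffee first, then emails.", "real"),
    ("You won't BELIEVE what happened next!!!", "fake"),
]

dataset = [(clean_tweet(text), label) for text, label in tweets]
train_set, test_set = train_test_split(dataset, test_ratio=0.2, seed=42)
```

In a full pipeline, the token lists in `train_set` would be mapped to FastText vectors before being passed to the CNN feature extractor, and the fixed seed keeps the split reproducible across runs.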
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.