DEEP LEARNING APPROACH FOR AUTOMATED DEEPFAKE VIDEO FORGERY DETECTION
DOI:
https://doi.org/10.62643/ijerst.2026.v22.i1(2).pp112-115

Keywords:
Bidirectional LSTM; Deepfake Detection; Face Forgery; GoogLeNet; Grad-CAM; Temporal Modelling; Threshold Optimisation; Video Analysis.

Abstract
Deepfake videos, generated through face-swapping and manipulation techniques powered by Generative Adversarial Networks (GANs) and autoencoders, pose a significant threat to digital media integrity, online trust, and democratic discourse. This paper presents an end-to-end deepfake video detection system that combines GoogLeNet-based spatial feature extraction with a Bidirectional Long Short-Term Memory (BiLSTM) classifier for temporal sequence modelling. Each video is uniformly sampled into frames; faces are localised using the Viola-Jones Haar cascade detector; and per-frame 1024-dimensional feature vectors are extracted from the final pooling layer of a pretrained GoogLeNet. The resulting feature sequences are fed to a BiLSTM network trained with cross-entropy loss, Adam optimisation, and gradient clipping. A data-driven threshold-tuning procedure sweeps the fake-class probability threshold from 0.35 to 0.60 to maximise the F1-score, yielding an optimal threshold of 0.35, which achieves 88.5% accuracy, 98.7% recall, and an F1-score of 89.56%. Grad-CAM heatmaps computed on the inception5b layer of GoogLeNet provide spatial explainability by highlighting manipulated facial regions. A Streamlit web application integrates all pipeline stages, offering real-time prediction, confidence calibration, and frame-level Grad-CAM overlays. Experiments conducted on the Celeb-DF benchmark demonstrate the effectiveness of the proposed approach for high-recall deepfake detection.
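The threshold-tuning step described in the abstract can be sketched as a simple sweep over candidate fake-class probability thresholds, selecting the one that maximises the F1-score. The sketch below is a minimal illustration under assumed inputs (the probability and label lists in the usage note are synthetic placeholders, not the paper's data); the function names `f1_at_threshold` and `tune_threshold` are hypothetical, not from the paper:

```python
def f1_at_threshold(probs, labels, threshold):
    """F1-score for the fake class (label 1) at a given probability threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(probs, labels, lo=0.35, hi=0.60, step=0.05):
    """Sweep thresholds in [lo, hi] and return the one maximising F1."""
    candidates = []
    t = lo
    while t <= hi + 1e-9:  # tolerate float accumulation at the upper bound
        candidates.append(round(t, 2))
        t += step
    return max(candidates, key=lambda th: f1_at_threshold(probs, labels, th))
```

For example, `tune_threshold([0.4, 0.5, 0.7, 0.2, 0.45], [1, 1, 1, 0, 1])` returns 0.35 on this toy data, since the lowest threshold recovers all fake samples without false positives. Favouring a low threshold in this way trades a small amount of precision for the high recall the paper reports (98.7%), which is the desirable operating point for a forgery screening system.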
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.