International Journal of Scientific Research and Engineering Development (International Peer Reviewed Open Access Journal), ISSN (Online): 2581-7175

📑 Paper Information
| 📑 Paper Title | Authentica: A Multimodal DeepFake Detection System Using Computer Vision and Audio Signal Processing |
| 👤 Authors | Abhijeet Rameshwar Patil, Aryan Deepak Punde, Pratik Suresh Morye, Prof. Supriya Kale |
| 📘 Published Issue | Volume 9 Issue 2 |
| 📅 Year of Publication | 2026 |
| 🆔 Unique Identification Number | IJSRED-V9I2P109 |
📝 Abstract
The proliferation of AI-generated synthetic media, commonly referred to as deepfakes, poses an escalating threat to digital authenticity, personal identity, and public trust. Manually verifying the authenticity of video content is impractical at scale, creating a critical need for automated detection systems. This paper presents an Enhanced Multimodal DeepFake Detection System that leverages computer vision, audio signal processing, and deep learning techniques to classify video content as real or synthetically manipulated. The proposed system accepts video files in standard formats and analyzes two modalities in parallel. The video pipeline extracts 13 facial feature attributes (including eye aspect ratio, inter-pupillary distance, head pose estimation, skin tone, and GLCM texture features) across multiple frames using OpenCV Haar cascade classifiers, and these features are processed by a PyTorch-based Artificial Neural Network (ANN). The audio pipeline generates mel-spectrogram representations from FFmpeg-extracted audio tracks, which are analyzed by a TensorFlow-based EfficientNetB0 Convolutional Neural Network. Final classification employs a confidence-weighted multimodal fusion algorithm that dynamically assigns weights to each modality based on prediction certainty, producing an interpretable verdict of REAL or DEEPFAKE accompanied by a percentage confidence score and risk level categorization. The system is deployed as a Flask web application supporting drag-and-drop video upload, automatic audio extraction, thumbnail generation, and structured result reporting. A memory-optimized training pipeline accommodates datasets of up to 56,000 labeled samples with data augmentation, achieving a 92% reduction in memory consumption through spectrogram dimensionality optimization.
Experimental results demonstrate that the multimodal fusion approach yields more robust detection than single-modality analysis, making the system suitable for practical deployment in digital forensics, journalism verification, and social media monitoring contexts.
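The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one way a confidence-weighted fusion could work. It assumes each modality outputs a deepfake probability in [0, 1] and that "prediction certainty" is measured as distance from 0.5. The function name `fuse_predictions`, the certainty measure, and the 0.5 decision threshold are illustrative assumptions, not the authors' implementation.

```python
def fuse_predictions(p_video: float, p_audio: float, eps: float = 1e-6):
    """Fuse two per-modality deepfake probabilities (hypothetical helper).

    Assumption: certainty is the distance of a probability from 0.5,
    rescaled to [0, 1]. The paper's actual weighting rule is not given.
    """
    # Certainty of each modality: 0.0 at p = 0.5, 1.0 at p = 0 or p = 1.
    c_video = abs(p_video - 0.5) * 2.0
    c_audio = abs(p_audio - 0.5) * 2.0

    total = c_video + c_audio
    if total < eps:
        # Neither modality is certain: fall back to equal weights.
        w_video = w_audio = 0.5
    else:
        # Normalize certainties so the weights sum to 1.
        w_video = c_video / total
        w_audio = c_audio / total

    # Weighted average of the two deepfake probabilities.
    p_fused = w_video * p_video + w_audio * p_audio

    # Assumed decision threshold of 0.5 on the fused probability.
    verdict = "DEEPFAKE" if p_fused >= 0.5 else "REAL"
    # Confidence in the chosen verdict, as a percentage.
    confidence = max(p_fused, 1.0 - p_fused) * 100.0
    return verdict, confidence, (w_video, w_audio)


if __name__ == "__main__":
    print(fuse_predictions(0.9, 0.5))
```

Under this scheme a modality that is undecided (probability near 0.5) contributes almost nothing to the verdict, which is one plausible reading of "dynamically assigns weights to each modality based on prediction certainty".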
📝 How to Cite
Abhijeet Rameshwar Patil, Aryan Deepak Punde, Pratik Suresh Morye, Prof. Supriya Kale, "Authentica: A Multimodal DeepFake Detection System Using Computer Vision and Audio Signal Processing," International Journal of Scientific Research and Engineering Development, V9(2): Page(734-740), Mar-Apr 2026. ISSN: 2581-7175. www.ijsred.com. Published by Scientific and Academic Research Publishing.