MelSpectroNet: Enhancing Voice Authentication Security with AI-based Siamese Model and Noise Reduction for Seamless User Experience

Gitesh Kambli; Jay Oza; Amit Maity

doi:10.22541/au.170258945.57517510/v1

Abstract

Voice authentication has become critical for access control that is both secure and usable. Background noise and rising security requirements, however, remain open problems. This paper presents MelSpectroNet, a voice authentication system built on a Siamese neural network trained on over one million samples. It uses mel spectrograms for efficient feature extraction and applies noise reduction to improve reliability. The model achieves 96.62% test accuracy. Our methodology comprises audio denoising, careful spectrogram preprocessing, a tailored Siamese architecture, and rigorous training. Testing indicates strong performance and an ability to generalize. However, improving longitudinal accuracy by accounting for natural voice variation over time still needs exploration. Overall, MelSpectroNet delivers accurate and usable voice authentication with enhanced security, balancing user convenience against stringent authentication needs. This research motivates further work to optimize such systems for diverse conditions while advancing security and inclusiveness.
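
The pipeline summarized above (log-mel features fed to two weight-sharing branches whose embeddings are compared) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the architecture, the denoising method, or any hyperparameters, so every name and value here (`log_mel_spectrogram`, `SharedEncoder`, `siamese_distance`, 40 mel bands, a 16-dimensional embedding) is a hypothetical choice. The encoder is an untrained random projection standing in for a trained network, and the denoising step is omitted.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft//2 + 1)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):          # rising edge of the triangle
            fb[i, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):          # falling edge of the triangle
            fb[i, k] = (hi - k) / max(hi - mid, 1)
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=256, n_mels=40):
    """Frame the signal, take the power spectrum, and project onto mel bands."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[t * hop: t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T     # (n_frames, n_mels)
    return np.log(mel + 1e-10)

class SharedEncoder:
    """Assumption: a fixed random projection standing in for a trained encoder."""
    def __init__(self, n_mels=40, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((dim, n_mels)) / np.sqrt(n_mels)

    def __call__(self, log_mel):
        pooled = log_mel.mean(axis=0)                     # average over time
        z = self.W @ (pooled - pooled.mean())             # remove overall level
        return z / (np.linalg.norm(z) + 1e-12)            # unit-length embedding

def siamese_distance(encoder, sig_a, sig_b, sr=16000):
    """Both inputs pass through the SAME encoder; smaller distance means more similar."""
    za = encoder(log_mel_spectrogram(sig_a, sr))
    zb = encoder(log_mel_spectrogram(sig_b, sr))
    return float(np.linalg.norm(za - zb))

# Usage with synthetic tones in place of real speech recordings.
t = np.arange(16000) / 16000.0
tone_a = np.sin(2 * np.pi * 200.0 * t)
tone_b = np.sin(2 * np.pi * 2000.0 * t)
enc = SharedEncoder()
print(siamese_distance(enc, tone_a, tone_a))
print(siamese_distance(enc, tone_a, tone_b))
```

In a trained system the encoder weights would be learned from labeled same-speaker and different-speaker pairs, and an accept/reject threshold on the distance would be tuned on held-out data; neither step is shown here.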