Ifrah Andleeb

and 2 more

Machine learning (ML) has become essential for detection and classification tasks in autonomous vehicles (AVs). However, ML models are vulnerable to adversarial attacks, which can undermine passenger trust and raise safety concerns in autonomous driving systems. This is especially critical in traffic sign recognition (TSR), where a misclassification caused by an adversarial attack could create serious safety risks. In this work, we propose a lightweight yet accurate defense system against adversarial attacks on TSR systems. We first investigate the vulnerability of TSR models to two adversarial attacks, projected gradient descent (PGD) and the fast gradient sign method (FGSM). We then propose an attack pipeline that uses a ScoreCAM-based approach to focus the perturbation on specific regions of interest (ROI) in traffic signs, which improves the effectiveness of FGSM and PGD attacks on TSR models. These attacks manipulate the input data to mislead the models and achieve a high attack success rate (ASR) by exploiting their vulnerabilities. To counter them, we propose a dual autoencoder-based defense system for traffic sign images. The model combines two encoders that work collaboratively: one optimized for global feature extraction and the other for local features. The defense mechanism also integrates residual connections to retain important features of the input and an attention mechanism to highlight critical regions in traffic sign images. Experimental results demonstrate that the proposed defense model outperforms existing works, with test classification accuracies of 96.08% and 96.69% under PGD and FGSM attacks, respectively, while maintaining a small model size (approximately 9 MB) and a reduced parameter count, thereby making it the most lightweight and high-performance model available for TSR.
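To make the attack side concrete, the following is a minimal sketch of ROI-restricted FGSM and PGD together with an ASR computation. It is an illustration under stated assumptions, not the pipeline used in this work: a toy linear softmax classifier (`W`, `b`) stands in for the TSR model, a precomputed binary `mask` stands in for a ScoreCAM-derived region of interest, and all function names and parameters (`eps`, `alpha`, `steps`) are hypothetical.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def loss_grad_wrt_input(x, y, W, b):
    """Gradient of the cross-entropy loss w.r.t. the input x.

    Assumption: the classifier is linear, logits = x @ W + b, so the
    gradient has the closed form (softmax(logits) - onehot(y)) @ W.T.
    """
    p = softmax(x @ W + b)
    p[np.arange(len(y)), y] -= 1.0
    return p @ W.T


def masked_fgsm(x, y, W, b, mask, eps):
    """Single-step FGSM, with the perturbation confined to mask == 1."""
    g = loss_grad_wrt_input(x, y, W, b)
    return np.clip(x + eps * mask * np.sign(g), 0.0, 1.0)


def masked_pgd(x, y, W, b, mask, eps, alpha, steps):
    """Iterative PGD in an L-infinity ball of radius eps, masked to the ROI."""
    x_adv = x.copy()
    for _ in range(steps):
        g = loss_grad_wrt_input(x_adv, y, W, b)
        x_adv = x_adv + alpha * mask * np.sign(g)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv


def attack_success_rate(x, x_adv, y, W, b):
    """Fraction of correctly classified inputs that the attack flips."""
    clean_pred = (x @ W + b).argmax(axis=1)
    adv_pred = (x_adv @ W + b).argmax(axis=1)
    correct = clean_pred == y
    if not correct.any():
        return 0.0
    return float((adv_pred[correct] != y[correct]).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, size=(8, 16))  # 8 flattened toy "images"
    W = rng.normal(size=(16, 4))             # toy 4-class linear model
    b = np.zeros(4)
    y = (x @ W + b).argmax(axis=1)           # use model predictions as labels
    mask = np.zeros(16)
    mask[:8] = 1.0                           # hypothetical ROI: first 8 pixels
    x_pgd = masked_pgd(x, y, W, b, mask, eps=0.1, alpha=0.02, steps=10)
    print("ASR under masked PGD:", attack_success_rate(x, x_pgd, y, W, b))
```

In a real TSR setting the closed-form gradient would be replaced by backpropagation through the network, and the mask would come from thresholding a saliency map; the masking, projection, and ASR logic would stay the same.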