BERTGuard: Robust Text Classification against Adversarial Attacks
  • Laxmi Shaw,
  • Mohammed Wasim Ansari,
  • Tahir Ekin
Laxmi Shaw

Corresponding Author: [email protected]

Mohammed Wasim Ansari
Tahir Ekin

Abstract

This paper introduces BERTGuard, a text classification framework that makes BERT models more resilient to adversarial attacks by applying feature trimming and sub-sampling during adversarial training. The approach outperforms baseline BERT models under various augmented white-box adversarial attacks. Its resilience is demonstrated on the benchmark IMDB movie review dataset.