Zhao Li
Public Documents
1
SVIT-SSR: A sEMG-based Vision Transformer Approach for Silent Speech Recognition
Zhao Li
and 5 more
August 01, 2024
Silent Speech Recognition (SSR) based on surface electromyography (sEMG) is a voice interaction technology for scenarios that require silent operation. In this article, we cast sEMG-based SSR as a short-term image sequence classification task. We perform time-frequency feature extraction and data reconstruction on the muscle activity segments, and we analyze the temporal and spatial dimensions to capture the intrinsic correlations of muscle activity. Building on the Vision Transformer (ViT) framework, we propose the SVIT-SSR model. In experiments on 33 typical silent speech commands from the SSR dataset, the proposed model achieves an accuracy of 96.67±1.15%, outperforming similar algorithms.
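
The abstract does not give the details of the feature extraction, so the following is only a minimal sketch of what reframing a multi-channel sEMG segment as a short-term image sequence could look like. It is not the authors' pipeline: the Hann-windowed DFT, the frame length, the hop size, and all function names are assumptions chosen for illustration. Each time frame becomes one small "image" whose rows are electrode channels and whose columns are frequency bins.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames; a trailing partial frame is dropped."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Hann-windowed DFT magnitudes for the non-negative frequency bins."""
    n = len(frame)
    windowed = [x * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def semg_to_image_sequence(channels, frame_len=16, hop=8):
    """Convert a multi-channel segment into a sequence of 2-D images.

    channels: list of equal-length sample lists, one per electrode (assumed layout).
    Returns a list indexed by time frame; each item is a grid of shape
    (n_channels, frame_len // 2 + 1) holding spectral magnitudes.
    """
    spectrograms = [[magnitude_spectrum(f) for f in frame_signal(ch, frame_len, hop)]
                    for ch in channels]
    n_frames = min(len(spec) for spec in spectrograms)
    return [[spec[t] for spec in spectrograms] for t in range(n_frames)]


# Synthetic two-channel segment: pure tones centred on DFT bins 2 and 5.
segment = [
    [math.sin(2 * math.pi * 2 * t / 16) for t in range(64)],
    [math.sin(2 * math.pi * 5 * t / 16) for t in range(64)],
]
sequence = semg_to_image_sequence(segment)
print(len(sequence), len(sequence[0]), len(sequence[0][0]))
```

A sequence like this could then be split into patches and passed to a ViT-style classifier; that stage is omitted here because the abstract does not specify the model's configuration.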