Xiang Zhang

Vital sign (breathing and heartbeat) monitoring is essential for patient care and sleep disorder prevention. Most current solutions are based on wearable sensors or cameras; however, the former can affect sleep quality, while the latter often raise privacy concerns. To address these shortcomings, we propose Wital, a contactless vital sign monitoring system based on low-cost, widely available commercial off-the-shelf (COTS) Wi-Fi devices. Two challenges must be overcome. First, the torso deformations caused by breathing and heartbeats are weak; how can such deformations be effectively captured? Second, movements such as turning over affect the accuracy of vital sign monitoring; how can their detrimental effects be avoided? For the former, we propose a non-line-of-sight (NLOS) sensing model that uses Ricean K theory to characterize the relationship between the energy ratio of line-of-sight (LOS) to NLOS signals and vital sign monitoring capability, and we use this model to guide the system design so that the deformations caused by breathing and heartbeats are better captured. For the latter, we propose a motion segmentation method based on motion regularity detection that accurately distinguishes respiration from other motions, and we remove periods containing movements such as turning over to eliminate their detrimental effects. We have implemented and validated Wital on low-cost COTS devices, and the experimental results demonstrate its effectiveness in monitoring vital signs.
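The following Python sketch is illustrative only and is not Wital's actual implementation. Under simplifying assumptions, it shows two of the ideas mentioned above: a moment-based estimate of the Ricean K-factor (the LOS-to-NLOS energy ratio) from the amplitude of one Wi-Fi CSI subcarrier, and a crude regularity check that keeps only windows whose spectrum is dominated by a peak in the human breathing band, discarding windows affected by larger motions such as turning over. The sampling rate, breathing band, and thresholds are assumptions, not values from the paper.

```python
"""Illustrative sketch only: NOT Wital's actual implementation."""
import numpy as np

def ricean_k_factor(amplitude: np.ndarray) -> float:
    """Moment-based K-factor estimate from envelope samples.

    K = sqrt(2*mu2^2 - mu4) / (mu2 - sqrt(2*mu2^2 - mu4)),
    with mu2 = E[r^2], mu4 = E[r^4].
    """
    mu2 = np.mean(amplitude ** 2)
    mu4 = np.mean(amplitude ** 4)
    d = 2.0 * mu2 ** 2 - mu4
    if d <= 0:                      # no dominant (LOS) component detected
        return 0.0
    s = np.sqrt(d)
    return float(s / (mu2 - s)) if mu2 > s else float("inf")

def breathing_rate_if_regular(amplitude: np.ndarray, fs: float,
                              band=(0.1, 0.5), min_peak_share=0.4):
    """Return breaths-per-minute if the window looks periodic, else None.

    A window is treated as 'regular' when the strongest in-band FFT peak
    carries at least `min_peak_share` of the DC-removed spectral energy;
    otherwise the window is assumed to contain other motions (e.g.,
    turning over) and is dropped. Both values are assumed thresholds.
    """
    x = amplitude - np.mean(amplitude)
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    if spec.sum() == 0 or not in_band.any():
        return None
    peak_idx = np.argmax(np.where(in_band, spec, 0.0))
    if spec[peak_idx] / spec[1:].sum() < min_peak_share:
        return None                 # irregular window: discard
    return 60.0 * freqs[peak_idx]

if __name__ == "__main__":
    fs = 20.0                                   # assumed CSI sampling rate (Hz)
    t = np.arange(0, 60, 1.0 / fs)
    # synthetic CSI amplitude: strong static path + small breathing ripple
    amp = 1.0 + 0.05 * np.sin(2 * np.pi * 0.25 * t) + 0.02 * np.random.randn(t.size)
    print("K-factor:", ricean_k_factor(amp))
    print("breaths/min:", breathing_rate_if_regular(amp, fs))
```

In this toy setup, a large K-factor corresponds to a strong static (LOS-like) component relative to the dynamic reflections, and the regularity gate rejects windows in which body movement rather than breathing dominates the spectrum.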
Facial expression recognition (FER) is a challenging task in computer vision because of the data uncertainty rooted in the ambiguity of facial expressions. As a complement to existing FER studies, which suppress such uncertainty at the data or feature level, we propose a simple yet efficient coarse-to-fine learning strategy at the task level, inspired by the way humans perceive emotions. Specifically, a child quickly learns whether its behavior is allowed by reading adults' facial expressions for a coarse attitude such as positive or negative, and then adjusts accordingly by further interpreting the fine sentiment, such as happy or angry. Following the same divide-and-conquer paradigm, we decompose FER into two correlated, easier sub-tasks: i) coarse classification for attitude analysis and ii) fine recognition for sentiment interpretation, and we build a multi-branch deep network (CFNet) to tackle them jointly. The key idea is to aggregate the discrete universal facial expressions into several coarse groups reflecting attitude tendency, based on their empirical projections in the continuous Valence-Arousal (VA) emotion space. However, the coarse classification sub-task is inherently harder because its intra-class variations are significantly larger than those of the fine recognition sub-task. This discord can lead to immature learning and degrade the overall performance. To overcome this issue, CFNet leverages a synchronization mechanism that controls the learning process via knowledge sharing between the two sub-tasks. In addition, a novel center loss is introduced to enhance the discriminative power of the network, extracting compact intra-class representations while preserving intrinsic inter-class relationships. Experiments on three benchmark datasets show that our method achieves state-of-the-art performance, demonstrating its superiority. The code is available at https://github.com/codpub/CFNet.
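As a rough illustration of the coarse-to-fine idea (not the authors' CFNet), the Python sketch below maps the seven universal expressions to three coarse attitude groups and trains a shared backbone with a coarse head and a fine head jointly, using cross-entropy on both branches plus a simple center loss on the fine features. The grouping by valence sign, the backbone, the dimensions, and the loss weights are all assumptions; the paper's VA-based grouping, synchronization mechanism, and specific center-loss formulation are omitted.

```python
"""Illustrative sketch only: not the authors' CFNet implementation."""
import torch
import torch.nn as nn
import torch.nn.functional as F

FINE_CLASSES = ["happy", "surprise", "neutral", "sad", "fear", "disgust", "angry"]
# assumed coarse grouping by valence sign: 0=positive, 1=neutral, 2=negative
FINE_TO_COARSE = torch.tensor([0, 0, 1, 2, 2, 2, 2])

class CenterLoss(nn.Module):
    """Pull each feature toward the (learnable) center of its fine class."""
    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        return ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()

class CoarseToFineNet(nn.Module):
    """Shared backbone with a coarse (attitude) and a fine (expression) head."""
    def __init__(self, in_dim: int = 512, feat_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.coarse_head = nn.Linear(feat_dim, 3)
        self.fine_head = nn.Linear(feat_dim, len(FINE_CLASSES))

    def forward(self, x):
        feats = self.backbone(x)
        return feats, self.coarse_head(feats), self.fine_head(feats)

def joint_loss(feats, coarse_logits, fine_logits, fine_labels,
               center_loss: CenterLoss, lam_coarse=0.5, lam_center=0.01):
    """Cross-entropy on both branches plus a center-loss regularizer."""
    coarse_labels = FINE_TO_COARSE[fine_labels]
    return (F.cross_entropy(fine_logits, fine_labels)
            + lam_coarse * F.cross_entropy(coarse_logits, coarse_labels)
            + lam_center * center_loss(feats, fine_labels))

if __name__ == "__main__":
    model = CoarseToFineNet()
    center = CenterLoss(len(FINE_CLASSES), feat_dim=128)
    x = torch.randn(8, 512)                        # e.g., pooled CNN features
    y = torch.randint(0, len(FINE_CLASSES), (8,))
    feats, c_logits, f_logits = model(x)
    loss = joint_loss(feats, c_logits, f_logits, y, center)
    loss.backward()
    print("joint loss:", float(loss))
```

The point of the sketch is only the structure: the fine labels induce the coarse labels, both branches are supervised from shared features, and the center loss encourages compact per-class feature clusters.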