Human-vehicle co-piloting has recently become a research focus in autonomous driving. However, the driver's state and human-vehicle interaction behavior involve many subjective uncertainties, which are a crucial factor constraining the safety of human-in-the-loop co-driving. To advance research in this domain, we propose a new visual-tactile perception method based on a driving simulation platform and build a new large-scale dataset. The dataset covers drivers' visual and tactile behaviors and states during driver fatigue and human-vehicle switched driving, satisfying different research needs. For the visual data, we extract features of the drivers' behaviors and states from multiple viewing angles with a deep convolutional network, then adjust the model with subjective human-in-the-loop feedback to improve the accuracy of the extracted features through continuous iteration. For the haptic data, we design a steering wheel made of flexible conductive electronic-skin material, which extracts time-series data of the driver's heart rate and its rate of change from the haptic signal produced when the driver grips the steering wheel; the extracted data are calibrated against a wearable heart rate sensor. Finally, we synchronize the visual and tactile time series by timestamp, forming a cross-modal database that provides data support for cross-modal driver-behavior perception algorithms. VTD Dataset download address: VTD Datasets (merakoswang.github.io)
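To illustrate the timestamp-based synchronization step, the following is a minimal sketch of pairing visual frames with the nearest tactile samples using pandas `merge_asof`. The file names, column names, and the 50 ms tolerance are illustrative assumptions, not part of the released VTD dataset specification.

```python
# Minimal sketch: align visual frames and tactile samples by timestamp.
# File names, column names, and the 50 ms tolerance are assumptions for
# illustration only, not the VTD dataset's actual schema.
import pandas as pd

# Hypothetical per-trial logs: one row per camera frame / tactile sample,
# each carrying a timestamp in seconds.
visual = pd.read_csv("visual_frames.csv")     # columns: timestamp, frame_path, fatigue_label
tactile = pd.read_csv("tactile_samples.csv")  # columns: timestamp, heart_rate, heart_rate_delta

# merge_asof requires both inputs to be sorted on the join key.
visual = visual.sort_values("timestamp")
tactile = tactile.sort_values("timestamp")

# For every visual frame, attach the nearest tactile sample within 50 ms,
# producing one cross-modal record per frame.
paired = pd.merge_asof(
    visual,
    tactile,
    on="timestamp",
    direction="nearest",
    tolerance=0.05,
)

# Drop frames with no tactile sample close enough in time.
paired = paired.dropna(subset=["heart_rate"])
paired.to_csv("vtd_cross_modal_pairs.csv", index=False)
```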