In this paper, a new path loss prediction model based on multi-modal sensory data is proposed to enhance the accuracy of path loss prediction in vehicular communication scenarios. First, a new dataset for intelligent multi-modal sensing-communication integration is constructed for vehicular urban crossroad scenarios. In addition, the mapping relationship between physical space and electromagnetic space is explored. Furthermore, path loss is predicted from environmental information obtained from multi-modal sensory data. Simulation results validate the proposed model against the ground truth, with a mean squared error (MSE) of $1.9283 \times 10^{-6}$. The proposed model improves accuracy by two orders of magnitude over the 3GPP TR 38.901 channel models. It also achieves the highest accuracy among the compared methods, namely the artificial neural network (ANN), support vector regression (SVR), random forest (RF), and gradient tree boosting (GTB). Finally, the effectiveness of multi-modal sensory data fusion for path loss prediction in vehicular communication scenarios is validated, showing a 19.8\% improvement in accuracy over prediction based on uni-modal data.