Hybrid precoding has been envisaged as an attractive alternative to fully digital precoding in massive multiple-input multiple-output (mMIMO) systems, where it can reduce cost and power consumption while maintaining an acceptable sum rate. Within this context, deep learning (DL) has been proposed as an optimization tool for designing the hybrid precoder. However, most existing DL-based hybrid precoding solutions rely on convolutional neural networks (CNNs), whose inductive bias limits their ability to learn the highly stochastic mMIMO channel. In contrast, the present contribution proposes a novel DL-based hybrid precoder that achieves improved performance at a reduced computation time compared to CNN-based techniques. This is achieved through the design of a Perceiver Neural Network (PNN) architecture for hybrid precoding, where the PNN accepts a noisy channel matrix as input and produces the analog precoder and combiners as output. In addition, the proposed architecture learns how to reshape the input data for the best accuracy through a novel trainable reshaping module. Our design involves an offline phase, in which we build a realistic ray-tracing-based mMIMO dataset to train our Perceiver-based hybrid precoder (PBHP). We then conduct extensive experiments to compare the PBHP with a CNN-based hybrid precoder (CNN-HP). The results show that the proposed PBHP outperforms the CNN-HP in terms of both accuracy and inference time. Moreover, the PBHP is more robust when the transmit power is low and the number of antennas is large. Finally, the PBHP exhibits a drastically lower inference time than the CNN-HP (by nearly an order of magnitude), especially for larger numbers of antennas, rendering it a promising candidate for next-generation wireless networks.
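The abstract does not detail the internals of the PNN. As a rough illustration of why a Perceiver-style model can scale well with the number of antennas, the sketch below shows the generic Perceiver cross-attention step: a small, fixed-size latent array queries all entries of the channel matrix, so the cost grows with M x N (latents times channel entries) rather than N x N. This is not the authors' architecture: the function names, the array sizes, the random channel, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math
import random


def channel_to_tokens(H):
    """Flatten a complex channel matrix into real tokens [Re, Im], one per entry."""
    return [[h.real, h.imag] for row in H for h in row]


def cross_attend(latents, inputs):
    """Single-head cross-attention: each latent row queries every input row.

    Assumption: inputs serve directly as keys and values (no learned
    projections), which keeps the sketch short.
    """
    d = len(latents[0])
    out = []
    for q in latents:
        # Scaled dot-product score of this latent against all N input tokens.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in inputs]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]  # numerically stable softmax
        total = sum(weights)
        # Weighted average of the input tokens (the values).
        out.append([sum(w * v[j] for w, v in zip(weights, inputs)) / total
                    for j in range(d)])
    return out


random.seed(0)
n_rx, n_tx, n_latents = 4, 16, 8  # hypothetical sizes, not taken from the paper

# Hypothetical noisy channel: i.i.d. complex Gaussian entries.
H = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n_tx)]
     for _ in range(n_rx)]

tokens = channel_to_tokens(H)  # N = n_rx * n_tx tokens of width 2
latents = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n_latents)]
summary = cross_attend(latents, tokens)  # n_latents rows, regardless of N
```

Here `summary` always has `n_latents` rows, however many antennas contribute tokens; in a full Perceiver, subsequent layers would operate only on this fixed-size array, which is one plausible reason for a lower inference time at large antenna counts.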