In recent years, transformers have attracted considerable attention across a range of natural language processing tasks. More recently, they have also been applied to multimodal data such as images, video, and audio, where their effectiveness has been demonstrated. Processing multimodal data is extremely important for robot intelligence. Multimodal transformers therefore have the potential to contribute to the development of robotics in various domains. In this paper, we review applications of transformers to robots and discuss the potential of transformers to solve problems in current intelligent robotics.