The concept of a Cyber-Physical-Social System (CPSS) is relatively new and has emerged in response to the need to understand the influence that Cyber-Physical Systems (CPS) have on people and vice versa. Conversational assistants (CAs), also called bots, are dedicated to spoken or written communication. Over time, CAs have gradually diversified and today touch various fields such as e-commerce, healthcare, tourism, fashion, and travel, among many other sectors. Natural Language Understanding (NLU) is a fundamental component of the Natural Language Processing (NLP) field. Identifying user intents from natural language utterances is a crucial step in conversational systems, and the diversity of user utterances makes intent detection a particularly challenging problem. Recently, with the emergence of deep neural networks, new state-of-the-art (SOA) results have been achieved for different NLP tasks. Recurrent Neural Networks (RNNs) and the more recent Transformer architectures are the two major players behind those improvements. RNNs have been playing an increasingly important role in sequence modeling across different application areas, while Transformer models are newer architectures that benefit from the attention mechanism, extensive training datasets, and large-scale compute. This review paper first presents a comprehensive overview of RNN and Transformer models. It then provides a comparative study of the performance of different RNN and Transformer architectures on the specific task of intent recognition for CAs, which is a fundamental task of NLU.
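To make the intent-recognition task concrete, the following is a minimal, illustrative sketch of an RNN-based intent classifier in PyTorch. The vocabulary size, hidden dimensions, and intent labels are hypothetical and are not taken from the models compared in this review; the sketch only shows the general shape of the task (an utterance encoded as a token sequence, mapped to a probability distribution over a fixed set of intents).

```python
# Minimal, illustrative sketch of an RNN-based intent classifier.
# All sizes and intent labels below are hypothetical placeholders.
import torch
import torch.nn as nn


class GRUIntentClassifier(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int, hidden_dim: int, num_intents: int):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_intents)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) integer-encoded utterances
        embedded = self.embedding(token_ids)
        _, last_hidden = self.encoder(embedded)         # last_hidden: (1, batch, hidden_dim)
        return self.classifier(last_hidden.squeeze(0))  # logits over intent classes


# Hypothetical usage with three example intents such as
# "book_flight", "check_weather", "play_music".
model = GRUIntentClassifier(vocab_size=10_000, embed_dim=128, hidden_dim=256, num_intents=3)
batch = torch.randint(1, 10_000, (2, 12))               # two utterances of 12 tokens each
predicted_intent = model(batch).argmax(dim=-1)
```

A Transformer-based classifier follows the same overall pattern, with the recurrent encoder replaced by self-attention layers (typically a pretrained encoder that is fine-tuned on the intent-labeled utterances).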