In recent years, deep learning methods have significantly improved the classification accuracy of remotely sensed images. However, most of these methods focus only on spectral information and ignore spatial information, and thus extract only low-level features from a hyperspectral image. In this study, we propose a multi-level 3-dimensional convolutional neural network (3-D CNN). The 3-D convolutions incorporate both spatial and spectral information, while the multi-level architecture uses varying kernel sizes to extract features at different levels. This helps in distinguishing classes at multiple spatial scales and aspect ratios. We evaluate the proposed approach on four standard hyperspectral datasets to verify its generalisation ability. Compared with other state-of-the-art methods, it improves overall accuracy and the kappa coefficient by 2% to 5%. We also analyse the effect of spatial window size on classification accuracy. Furthermore, in comparison with earlier deep learning models, our approach is found to be less sensitive to the network parameters and achieves better accuracy even with a shallower network.
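
For reference, the two evaluation metrics named in the abstract, overall accuracy and the kappa coefficient, can be computed from true and predicted class labels as in the minimal sketch below. The function names and the toy labels are illustrative assumptions and are not taken from the experiments of this study; the kappa coefficient is assumed here to be Cohen's kappa, the usual choice in land-cover classification.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance agreement.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    (the overall accuracy) and p_e is the agreement expected by chance
    from the marginal class frequencies.
    """
    n = len(y_true)
    p_o = overall_accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of the product of the marginals.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case: a single class in both sequences.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy labels for three classes (not data from this study).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
oa = overall_accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"OA = {oa:.4f}, kappa = {kappa:.4f}")
```

Because kappa discounts agreement that would occur by chance, it is lower than overall accuracy whenever the classifier is imperfect, which is why the two are commonly reported together on class-imbalanced hyperspectral scenes.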