A Brain-Computer Interface (BCI) is a technology that enables direct communication between the brain and external devices, typically by interpreting neural signals. BCI-based solutions for neurodegenerative disorders need datasets in patients' native languages. However, BCI research suffers from a lack of language-specific datasets, as is the case for Odia, a language spoken by 35-40 million people in India. To address this gap, we developed an Electroencephalography (EEG)-based BCI dataset featuring EEG signal samples of commonly spoken Odia words. Using the OpenBCI Cyton device, EEG recordings were collected from a volunteer who is a native Odia speaker. The dataset is divided into four parts: (i) vocalized Odia words, (ii) vocalized English translations of these Odia words, (iii) sub-vocalized Odia words, and (iv) sub-vocalized English words. The dataset covers 100 distinct words, each recorded over ten trials. By training Machine Learning (ML) and Deep Learning (DL) models on this dataset, a BCI system can be designed to translate EEG signals into both vocal and sub-vocal speech in Odia and English. Such a system can enhance the communication ability and quality of life of Odia-speaking patients with neurodegenerative diseases.
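
To illustrate how such a dataset could feed an ML pipeline, the sketch below trains a simple word classifier on per-trial EEG epochs. It is a minimal illustration, not the authors' pipeline: the file layout, epoch length, feature choice, and classifier are all assumptions (synthetic data stands in for the real recordings), while the 100-word/10-trial structure comes from the dataset description and the 8-channel, 250 Hz figures are the standard OpenBCI Cyton specifications.

```python
# Hypothetical baseline sketch: classify which of 100 words a trial belongs to.
# All shapes, windows, and model choices here are assumptions for illustration.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

N_WORDS, N_TRIALS = 100, 10      # as described in the dataset
N_CHANNELS, FS = 8, 250          # OpenBCI Cyton: 8 EEG channels at 250 Hz
TRIAL_SAMPLES = 2 * FS           # assumed 2-second epoch per trial

# Placeholder data: one epoch per (word, trial) pair; replace with the
# actual loading code for the recorded EEG files.
rng = np.random.default_rng(0)
X = rng.standard_normal((N_WORDS * N_TRIALS, N_CHANNELS, TRIAL_SAMPLES))
y = np.repeat(np.arange(N_WORDS), N_TRIALS)

# Crude feature: per-channel log-variance of each epoch (a band-power proxy).
features = np.log(X.var(axis=2))

X_tr, X_te, y_tr, y_te = train_test_split(
    features, y, test_size=0.2, stratify=y, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_tr, y_tr)
print(f"Held-out word-classification accuracy: {clf.score(X_te, y_te):.3f}")
```

The same loop could be run separately on each of the four parts (vocal/sub-vocal, Odia/English), or with a DL model in place of the SVM; on the synthetic placeholder data above, accuracy will naturally sit at chance level (about 1%).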