Parvathy S S - 21DOCS Test Area

Parvathy S S

Master's Student

Thiruvananthapuram

Public Documents 7

IIST BCI Dataset-1 for Selected Common Malayalam Words

Parvathi Nair

and 6 more

January 08, 2024

Designing Brain-Computer Interfaces (BCIs) to help patients requires datasets relevant to the patients' language. There is a significant shortage of datasets for Indian languages that can be used for BCI research. Malayalam is a prominent south Indian language spoken by more than 34 million people, yet no BCI datasets exist for it. We address this gap by creating a dataset for selected Malayalam words from electroencephalograph (EEG) signal samples. The EEG samples were recorded with the OpenBCI Cyton device while a volunteer spoke commonly used Malayalam words. The dataset consists of three major types of data: (i) EEG data for spoken Malayalam words, (ii) EEG data for spoken English words closest to the English translation of the corresponding Malayalam words, and (iii) EEG data for sub-vocal (silent) pronunciation of the Malayalam words. The dataset covers 26 words, each recorded in all three types. For each word, 10 EEG samples over 8 channels were recorded. Given the scarcity of datasets in Indian languages, this dataset is useful for developing BCI solutions for patients with neurodegenerative diseases, in particular Machine Learning (ML) classifiers that translate EEG signals into Malayalam words, vocal or sub-vocal.
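As a rough illustration of the dataset's size, the counts stated in the abstract can be enumerated as below. This is only a sketch, not the dataset's actual file layout: the type labels and the `RecordingKey` structure are hypothetical, and it assumes the 10 samples per word apply to each of the three data types separately, which the abstract does not state explicitly.

```python
from dataclasses import dataclass
from itertools import product

# Figures stated in the abstract.
N_WORDS = 26
N_SAMPLES_PER_WORD = 10   # assumed to apply per data type (not confirmed by the abstract)
N_CHANNELS = 8

# Hypothetical labels for the three data types described in the abstract.
RECORDING_TYPES = ("malayalam_vocal", "english_vocal", "malayalam_subvocal")


@dataclass(frozen=True)
class RecordingKey:
    """Hypothetical index identifying one EEG sample in the dataset."""
    word_index: int
    recording_type: str
    sample_index: int


def enumerate_recordings():
    """List one key per (word, data type, sample) combination."""
    return [
        RecordingKey(word, rec_type, sample)
        for word, rec_type, sample in product(
            range(N_WORDS), RECORDING_TYPES, range(N_SAMPLES_PER_WORD)
        )
    ]


keys = enumerate_recordings()
total_recordings = len(keys)                           # 26 * 3 * 10
total_channel_traces = total_recordings * N_CHANNELS   # one trace per channel per sample

print(total_recordings, total_channel_traces)
```

Under that assumption the dataset would hold 780 multi-channel samples; if the 10 samples are instead shared across the three types, the totals would be correspondingly smaller.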

IIST BCI Dataset-2 for Selected Common Marathi Words

Shubham Tayade

and 5 more

March 14, 2024

To address the problems of patients with neurodegenerative disorders, Brain-Computer Interface (BCI) based solutions require datasets relevant to the languages the patients speak. BCI research is sometimes restricted by the lack of such datasets. For example, Marathi, a prominent language spoken by over 83 million people in India, lacks BCI datasets for research purposes. To close this gap, we created a dataset comprising electroencephalograph (EEG) signal samples of selected common Marathi words. The EEG samples were captured with the OpenBCI Cyton device while volunteers pronounced commonly used Marathi words. The dataset encompasses three main categories: (i) utterances of Marathi words (vocal), (ii) English translations of these Marathi words (vocal), and (iii) silent pronunciation (sub-vocalization) of the Marathi words. We compiled data for 100 distinct words, each with recordings for these three categories. Ten trials were conducted for each word. This dataset is valuable for developing BCI solutions to assist Marathi-speaking patients with neurodegenerative diseases. BCI solutions using Machine Learning (ML) classifiers and Deep Learning methods can be used to translate EEG signals into Marathi words, both vocal and sub-vocal.

IIST BCI Dataset-8 for Selected Common Telugu words

Likhith Boddapu

and 5 more

August 20, 2024

To serve patients with neurodegenerative illnesses, Brain-Computer Interface (BCI) systems require datasets corresponding to the languages the patients speak. There is a shortage of such BCI datasets for Indian languages. The primary objective of this work is to provide BCI datasets for Telugu, a prominent Indian language spoken by over 90 million people. This dataset includes Telugu BCI data from both male and female speakers. Its goal is to aid the advancement of BCI research for native Telugu speakers. Using machine learning (ML) and other classifiers, BCI systems can classify EEG data and present the result as displayed or pronounced Telugu words. We recorded 100 distinct Telugu words with ten trials each in both vocal and sub-vocal forms, along with their equivalent English translations. The aim of the IIST-BCI Dataset for Telugu is to increase the applicability of BCI technology for Telugu speakers. By encouraging further developments in BCI application research, this dataset enhances the support available to Telugu-speaking patients with neurological diseases.

IIST BCI Dataset-3 for 100 Malayalam Words

Parvathy S S

and 5 more

April 08, 2024

This paper introduces a dataset capturing brain signals generated by the recognition of 100 Malayalam words, accompanied by their English translations. For the Malayalam vocabulary, the dataset contains recordings in both vocal and sub-vocal modalities. For the English equivalents, only vocal signals were collected. The dataset was created to help Malayalam-speaking patients with neurodegenerative diseases. It contributes to the advancement of brain-computer interface technology and also holds promise for effective communication solutions for individuals with restricted verbal abilities.

IIST BCI Dataset-5 for Malayalam Vowels and Consonants

Nancy Sunil

and 5 more

May 09, 2024

This paper presents a dataset of brain electroencephalogram (EEG) signals recorded while Malayalam vowels and consonants are spoken. The EEG signals were captured with the OpenBCI Cyton device while a volunteer spoke Malayalam vowels and consonants. The dataset includes both vocal and sub-vocal recordings. It aims to support individuals who speak Malayalam and suffer from neurodegenerative diseases. Moreover, this dataset is expected to advance brain-computer interface technology and has potential for developing effective communication solutions for individuals with limited verbal abilities.

IIST BCI Dataset-7 for Human Space Missions

Adithya Pramod Menon

and 4 more

August 06, 2024

We created a unique dataset for Brain-Computer Interface (BCI) system development by recording brain activity from a dedicated volunteer pronouncing 100 specially selected Malayalam words, and their English translations, relevant to astronauts during human space missions. These words were spoken vocally and sub-vocally 50 times each while brain activity was recorded with non-invasive electroencephalography (EEG) sensors. This dataset paves the way for future BCI applications, potentially transforming communication for astronauts and individuals with limitations.

IIST BCI Dataset-4 for Selected 100 Telugu words

Chittaloori Likhitha

and 6 more

April 23, 2024

To overcome the challenges faced by people with neurodegenerative diseases, Brain-Computer Interface (BCI) systems must use datasets relevant to the patients' spoken languages. However, BCI research frequently faces setbacks because such datasets are absent for the target language population. This paper deals with BCI datasets for Telugu, a major Indian language used by more than 90 million people that still lacks essential BCI datasets capturing its linguistic characteristics. To address this, we created a dataset of EEG signal samples corresponding to frequently used Telugu words, aiming to fill this void and facilitate advancements in BCI research for native Telugu speakers. Using Machine Learning and other classifiers, BCI systems can potentially classify EEG signals and present the result as displayed or pronounced Telugu words. Our dataset consists of vocal and sub-vocal datasets of Telugu words and a dataset of the corresponding English equivalents. This IIST-BCI Dataset for Telugu is the first of its kind. It is dedicated to improving the accessibility of BCI technologies for Telugu-speaking individuals and fostering further research progress in this area.