BiLSTM-SLA for enhancing biomedical entity linking in short texts:
Bidirectional LSTM approach with Stacked Layers and Attention mechanism
Abstract
Biomedical Entity Linking (BEL) aims to map biomedical mentions, such as
diseases and drugs, to standard entities in a given knowledge base. BEL
within short texts is typically framed as a ranking task. Most current
research on this topic targets long texts, relying on their rich
contextual information, which does not transfer well to shorter texts.
Furthermore, supervised approaches face hurdles in training data
suitability, particularly in domains like biomedicine characterized by
entities with diverse naming conventions, including abbreviations,
acronyms, synonyms with different morphological variations and word
orderings. To address these challenges, we propose BiLSTM-SLA (BiLSTM
with Stacked Layers and Attention), a novel approach that integrates a
BiLSTM with stacked layers and an attention mechanism to enhance BEL in
short texts. BiLSTM-SLA aims to provide
a deeper understanding of the input text by considering bidirectional
context analysis and leveraging stacked layers for nuanced temporal
dependencies within the candidate sequence. Moreover, the attention
mechanism enables the model to dynamically focus on and assign weights
to the most important parts of the text. BiLSTM-SLA is assessed against
two types of gold standards: KORE50 and Webscope for general English
short texts, and the NCBI Disease Corpus, TAC2017ADR and ShARe/CLEF
datasets for the biomedical domain. The experimental findings
demonstrate that BiLSTM-SLA attains state-of-the-art results on all
datasets, significantly outperforming baseline methods.
Notably, it achieves accuracies of 87.63%, 92.87%, 91.23%, 91.74%,
and 92.78% on these benchmark datasets, respectively.
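To make the attention step concrete, the following is a minimal sketch of attention-weighted pooling over (Bi)LSTM hidden states followed by dot-product candidate ranking. It is an illustration only, not the paper's implementation: the dot-product scoring function, the `query` vector, and the function names (`attention_pool`, `rank_candidates`) are all assumptions for this example, and real hidden states would come from the stacked BiLSTM rather than be hand-written.

```python
import math

def softmax(scores):
    # Numerically stable softmax over raw attention scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    # Score each hidden state against a query vector via dot product,
    # then return the attention-weighted sum: the model "focuses" on
    # the states with the highest scores.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return context, weights

def rank_candidates(mention_vec, candidate_vecs):
    # Rank candidate entities by dot-product similarity with the
    # attention-pooled mention representation (hypothetical scoring).
    scored = [(sum(m * c for m, c in zip(mention_vec, cand)), name)
              for name, cand in candidate_vecs.items()]
    return sorted(scored, reverse=True)
```

In this sketch, the candidate whose embedding best aligns with the attention-pooled mention vector ranks first, mirroring the abstract's framing of BEL in short texts as a ranking task.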