Digital content has grown exponentially, requiring efficient and context-aware information retrieval systems. Keyword-based search methods such as BM25 rely on lexical matching but often fail to capture semantic relationships. Small Language Models (SLMs) used as dense vector embedding models can improve retrieval accuracy, while Retrieval-Augmented Generation (RAG) adds contextual ranking through generative artificial intelligence to further enhance search relevance. An experiment on a single-document corpus with a cryptography-related query evaluates BM25, SLM embeddings, and SLM + RAG for document retrieval. In the experimental results, BM25 achieves a moderate relevance score of 0.500, retrieving documents through exact matches but lacking contextual understanding. Because SLM embeddings identify semantically similar concepts, they increase recall for queries with conceptual variations and reach a relevance score of 0.7668. SLM + RAG outperforms both approaches with a relevance score of 0.9202, retrieving the most relevant document accurately and producing contextually enriched responses. A hybrid retrieval model that combines dense embeddings and generative ranking therefore improves search quality substantially. This exploratory study has multiple implications for information retrieval, including the need for scalable multi-document retrieval, FAISS/DPR integration for efficient vector search, and domain-specific fine-tuning of SLMs. Using this approach, enterprises can achieve faster, more accurate, and contextually aware document retrieval by combining SLMs and RAG methods. Hybrid approaches could be explored in large-scale retrieval settings, with stable workflow integration across diverse industries such as legal research, healthcare, government, and finance.
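The BM25 baseline compared above can be sketched in a few lines of pure Python. This is a minimal illustration of Okapi BM25 scoring, not the study's implementation; the example documents, the query, and the parameter values k1 = 1.5 and b = 0.75 are common defaults assumed here for demonstration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (lexical matching only)."""
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N
    # Document frequency: how many documents contain each term.
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for q in query.lower().split():
            if q not in tf:
                continue  # no lexical match, no contribution
            idf = math.log((N - df[q] + 0.5) / (df[q] + 0.5) + 1)
            score += idf * tf[q] * (k1 + 1) / (
                tf[q] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(score)
    return scores

docs = [
    "symmetric key encryption",
    "public key cryptography uses key pairs",
    "weather forecast today",
]
scores = bm25_scores("key cryptography", docs)
```

Because BM25 rewards only exact term overlap, the document containing both "key" and "cryptography" ranks first, while a document that expressed the same concept in different words would score zero, which is the gap that dense SLM embeddings address.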