This paper presents a new hybrid methodology for evaluating Retrieval-Augmented Generation (RAG) systems. Existing approaches provide a limited, one-dimensional assessment of RAG systems that lacks generalizability and cannot be extended across domains or to interdisciplinary applications. To address this, we propose a comprehensive framework built on semantic metrics that assess several aspects, including query relevance, factual accuracy, context consistency, semantic coherence, relevance, hallucination detection, and answer correctness, using modern natural language processing tools. We compared our methodology against existing state-of-the-art approaches such as LLM-as-judge and outperformed them by up to 10%. The architecture is designed for flexibility, making it applicable not only to RAG systems but also to a range of natural language generation tasks. This work extends the existing body of knowledge in both RAG systems and natural language generation by providing a robust, multidimensional evaluation approach. The scoring system penalizes low scores through harmonic-mean aggregation combined with PCA-based, adaptive, and entropy-based weighting, identifying areas for improvement and providing specific recommendations on retrieval and chunking methods. The semantic metrics provide a low-cost alternative evaluation technique suitable for closed or offline settings across multiple domains.
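As a minimal sketch of how harmonic-mean aggregation penalizes low scores (the paper's exact formulation and weight definitions may differ), consider metric scores $s_1, \dots, s_n \in (0, 1]$ with weights $w_i$ obtained from a scheme such as PCA, adaptive, or entropy weighting:

$$
S_{\mathrm{HM}} = \frac{\sum_{i=1}^{n} w_i}{\sum_{i=1}^{n} \dfrac{w_i}{s_i}}, \qquad \sum_{i=1}^{n} w_i = 1 .
$$

Because the weighted harmonic mean is dominated by its smallest terms, a single weak dimension (for example, a low hallucination-detection score) pulls the aggregate sharply downward, which is the intended penalization behavior.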