An Enhanced RAG Chatbot Using Multi-Metric LLM Evaluation

P Harish; M Praneeth Kumar

doi:10.62643/ijerst.2026.v22.n2(1).pp379-384

Keywords:

Retrieval-Augmented Generation (RAG), Large Language Model (LLM), RAG-based chatbot, Enhanced RAG, Multi-metric evaluation, LLM-based evaluation, Document retrieval.

Abstract

Retrieval-Augmented Generation (RAG) improves the factual accuracy of Large Language Model (LLM) chatbots by grounding responses in external knowledge. However, hallucinations, incomplete answers, and weak evaluation methods remain open challenges. This project presents an Enhanced RAG-Based Chatbot with Multi-Metric LLM-Based Answer Evaluation. The framework combines efficient document retrieval, context-aware response generation, and automated multi-dimensional evaluation. It retrieves relevant information using embedding-based similarity search and passes it to an LLM, which generates the response. A multi-metric evaluation module then assesses each answer on semantic correctness, contextual relevance, fidelity to sources, completeness, and language quality, enabling scalable and consistent assessment with minimal human input. Experimental results show significant improvements in answer reliability and evaluation consistency over traditional single-metric RAG systems, making the framework suitable for enterprise knowledge systems and domain-specific applications where accuracy is essential.
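
The two mechanisms named in the abstract, embedding-based similarity search and multi-metric scoring, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a real embedding model, the per-metric scores are assumed to come from an LLM judge that is not shown, and the function names, the 0-to-1 score range, and the equal default weights are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

# The five criteria listed in the abstract (identifier names are assumptions).
METRICS = (
    "semantic_correctness",
    "contextual_relevance",
    "fidelity_to_sources",
    "completeness",
    "language_quality",
)


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def aggregate(scores, weights=None):
    """Weighted mean of per-metric scores (assumed to lie in [0, 1]).

    `scores` would normally be produced by an LLM-based judge; here it is
    simply a dict supplied by the caller.
    """
    if weights is None:
        weights = {m: 1.0 for m in METRICS}  # equal weighting is an assumption
    missing = [m for m in METRICS if m not in scores]
    if missing:
        raise ValueError(f"missing metric scores: {missing}")
    total_weight = sum(weights[m] for m in METRICS)
    return sum(scores[m] * weights[m] for m in METRICS) / total_weight


if __name__ == "__main__":
    docs = [
        "RAG grounds answers in retrieved documents",
        "Bananas are yellow",
        "Embedding similarity search retrieves relevant documents",
    ]
    print(retrieve("how does similarity search retrieve documents", docs))
    print(aggregate({m: 0.8 for m in METRICS}))
```

Reporting the per-metric scores alongside the aggregate, rather than the aggregate alone, keeps the evaluation interpretable: an answer that is fluent but unfaithful to its sources remains visible as such instead of being averaged away.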