TekRAG Persian Benchmark: Which AI Model Hallucinates Less? Results and Rankings
Arsam Sabbagh · 1405/02/18 · 34 min read
Retrieval-augmented generation has become the default architecture for connecting large language models to external knowledge, but its reliability is still uneven in lower-resource, mixed-script, and technical domains.…
Abstract
Retrieval-augmented generation, or RAG, improves language-model reliability by grounding generation in external documents. Yet most RAG evaluation remains concentrated in English or in broad-domain b…
Highlights
1. Introduction
Large language models are now used as interfaces to knowledge bases, documentation portals, customer-support repositories, news archives, and internal corporate memory. In these settings, a fluent an…
2. Contributions
TekRAG-Persian is designed around three contributions. First, it defines a Persian technical QA benchmark built from curated technology articles with explicit question, answer, and evidence mapping.…
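The question, answer, and evidence mapping mentioned above can be pictured as a small record type. The following is only a minimal sketch under stated assumptions: the field names, the `evidence_is_grounded` helper, and the sample article text are hypothetical illustrations, not the benchmark's actual schema or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkItem:
    """One QA record. All field names are hypothetical, not the real schema."""
    question: str
    answer: str
    evidence: str    # passage that is supposed to support the answer
    article_id: str  # identifier of the source article (assumed field)


def evidence_is_grounded(item: BenchmarkItem, articles: dict[str, str]) -> bool:
    """Return True if the evidence passage appears verbatim in its source article."""
    article_text = articles.get(item.article_id, "")
    return bool(item.evidence) and item.evidence in article_text


# Invented sample data, used only to exercise the sketch.
articles = {"a1": "HTTP/3 runs over QUIC. QUIC uses UDP as its transport."}
item = BenchmarkItem(
    question="Which transport does QUIC use?",
    answer="UDP",
    evidence="QUIC uses UDP as its transport.",
    article_id="a1",
)
```

A check like `evidence_is_grounded(item, articles)` is one simple way to confirm that each record's evidence can be traced back to its article. A real benchmark would likely need looser matching than a verbatim substring test.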
Read the full analysis on Teknav (تکناو)
TekRAG-Persian: The First Data-Driven Benchmark for Evaluating Persian RAG (Method, Metrics, and Model Results)