Data Science

The TekRAG Persian Benchmark: Which AI Model Hallucinates Less? Results and Rankings

Arsam Sabbagh · 1405/02/18 · 34 min read


Retrieval-augmented generation has become the default architecture for connecting large language models to external knowledge, but its reliability is still uneven in lower-resource, mixed-script, and technical domains.…

Abstract

Retrieval-augmented generation, or RAG, improves language-model reliability by grounding generation in external documents. Yet most RAG evaluation remains concentrated in English or in broad-domain b…

Highlights

1. Introduction

Large language models are now used as interfaces to knowledge bases, documentation portals, customer-support repositories, news archives, and internal corporate memory. In these settings, a fluent an…

2. Contributions

TekRAG-Persian is designed around three contributions. First, it defines a Persian technical QA benchmark built from curated technology articles with explicit question, answer, and evidence mapping.…

Teknav

Read the full analysis on Teknav

TekRAG-Persian, the first data-driven benchmark for evaluating Persian RAG: method, metrics, and model results
