Research
Discover our latest research
Blog
Follow UltraRAG open-source progress and technical updates.
UltraRAG 3.0: No More Black Boxes, Full Transparency in Reasoning
To address the pain point that "algorithm prototyping takes a week, but building a usable system takes months," UltraRAG 3.0 brings three core upgrades: full-chain visible reasoning, a modular MCP architecture, and a unified evaluation system.
UltraRAG 2.1: Deep Knowledge Integration, Cross-Modal Support
Comprehensive upgrades focused on native multimodal support, automated knowledge integration and corpus construction, and unified RAG workflows for building and evaluating systems.
UltraRAG 2.0: Minimal Code, Maximum Innovation
The first RAG framework built on the Model Context Protocol (MCP) architecture, enabling researchers to implement multi-stage reasoning systems with just YAML files.
Models
Our open-source core models, providing foundational capabilities for the RAG ecosystem.
AgentCPM-Report
An agent model for long-document generation and report writing, enabling automated generation of research reports.
MiniCPM-Embedding-Light
A lightweight, efficient text embedding model that achieves leading performance on multiple retrieval benchmarks and is suited to large-scale semantic retrieval in RAG scenarios.
Selected Papers
Representative research work from our team in the RAG domain.
VisRAG: Vision-based Retrieval-Augmented Generation on Multi-modality Documents
Shi Yu, Chaoyue Tang, et al.
Proposes a "vision-first" retrieval-augmented generation paradigm that converts documents directly into visual vectors for matching and generation, avoiding the information degradation that traditional text parsing causes in documents with complex layouts.
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li, Sen Mei, et al.
Proposes a new RAG optimization paradigm based on "differentiable data rewards", significantly improving the model's ability to extract core information from external knowledge and resolve knowledge conflicts through end-to-end reward alignment between retriever and generator.
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
Kunlun Zhu, Yifan Luo, et al.
Proposes a new paradigm for automatically constructing RAG evaluation benchmarks, enabling efficient customization of evaluation datasets for specific vertical scenarios (such as finance, law, and healthcare) through schema-based knowledge distillation and document generation.
DeepNote: Note-Centric Deep Retrieval-Augmented Generation
Ruobing Wang, et al.
Proposes a "note-centric" adaptive retrieval-augmented generation paradigm that significantly improves the model's depth and robustness in handling complex open-domain QA tasks by introducing an iterative knowledge accumulation mechanism.