Research

Discover our latest research

Blog

Follow UltraRAG open-source progress and technical updates.

UltraRAG 3.0: No More Black Boxes, Full Transparency in Reasoning

To address the pain point that "algorithm prototyping takes a week, but building a usable system takes months", UltraRAG 3.0 brings three core upgrades: full-chain visible reasoning, a modular MCP architecture, and a unified evaluation system.

2026.01.23·Sen Mei, Haidong Xin

Release

UltraRAG 2.1: Deep Knowledge Integration, Cross-Modal Support

Comprehensive upgrades focused on native multimodal support, automated knowledge integration and corpus construction, and unified RAG workflows for building and evaluating systems.

2025.11.11·Sen Mei, Haidong Xin

Release

UltraRAG 2.0: Minimal Code, Maximum Innovation

The first RAG framework built on the Model Context Protocol (MCP) architecture, enabling researchers to implement multi-stage reasoning systems with just YAML files.

2025.08.28·Sen Mei, Haidong Xin, Chunyi Peng

Models

Our open-source core models, providing foundational capabilities for the RAG ecosystem.

AgentCPM-Report

An agent model for long-document generation and report writing, enabling automated research report generation.

Agent · Long-Text Generation · DeepResearch

MiniCPM-Embedding-Light

A lightweight, efficient text embedding model that achieves leading performance on multiple retrieval benchmarks and is suited to large-scale semantic retrieval in RAG scenarios.

Embedding · Retrieval

Selected Papers

Representative research work from our team in the RAG domain.

2025.04 · ICLR 2025

VisRAG: Vision-based Retrieval-Augmented Generation on Multi-modality Documents

Shi Yu, Chaoyue Tang, et al.

Proposes a "vision-first" retrieval-augmented generation paradigm that converts documents directly into visual vectors for matching and generation, avoiding the information degradation that traditional text parsing causes on documents with complex layouts.

2025.04 · ICLR 2025

RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards

Xinze Li, Sen Mei, et al.

Proposes a new RAG optimization paradigm based on "differentiable data rewards". End-to-end reward alignment between the retriever and the generator significantly improves the model's ability to extract core information from external knowledge and to resolve knowledge conflicts.

2025.07 · ACL 2025

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework

Kunlun Zhu, Yifan Luo, et al.

Proposes a new paradigm for automatically constructing RAG evaluation benchmarks. Schema-based knowledge distillation and document generation make it efficient to customize evaluation datasets for specific vertical scenarios such as finance, law, and healthcare.

2025.11 · EMNLP 2025

DeepNote: Note-Centric Deep Retrieval-Augmented Generation

Ruobing Wang, et al.

Proposes a "note-centric" adaptive retrieval-augmented generation paradigm that introduces an iterative knowledge accumulation mechanism, significantly improving the model's depth and robustness on complex open-domain QA tasks.