Releases: OpenBMB/UltraRAG

v0.2.1

22 Oct 07:20
f3bec0e

Release date: 2025.10.22

Highlights

  1. Comprehensive Multimodal Upgrade: Both the Retriever and Generation Servers now support multimodal inputs, enabling a complete end-to-end multimodal workflow from retrieval to generation.
  2. Corpus Parsing and Chunking Redesign: The Corpus Server adds multi-format file parsing with deep MinerU integration, supporting token-level, sentence-level, and customizable chunking strategies to flexibly adapt to diverse corpus structures.
  3. Unified Deployment and Efficient Inference: The Retriever and Generation Servers are fully compatible with standardized deployment frameworks such as vLLM, supporting offline inference, multi-engine adaptation, and accelerated experimentation.
  4. Enhanced Evaluation and Experimentation Workflow: Introduces TREC-based retrieval evaluation and significance testing modules, and supports parallel experiment execution and multimodal result visualization to streamline research assessment and experimental workflows.
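
To make the significance testing in highlight 4 concrete, here is a minimal sketch of one test commonly used to compare two retrieval systems: a two-sided paired randomization (sign-flip) test over per-query scores. These notes do not say which test the Evaluation Server runs, so treat this as an illustration only. The function name `paired_randomization_test` and its parameters are hypothetical and not part of UltraRAG.

```python
import random


def paired_randomization_test(scores_a, scores_b, trials=10_000, seed=0):
    """Two-sided paired randomization (sign-flip) test on per-query scores.

    Hypothetical helper, not an UltraRAG API. Returns an estimated p-value
    for the null hypothesis that both systems have the same mean score.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same queries")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    extreme = 0
    for _ in range(trials):
        # Under the null, each per-query difference is equally likely
        # to carry either sign, so flip signs at random.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed:
            extreme += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (extreme + 1) / (trials + 1)
```

A small p-value suggests the difference in mean score between the two systems is unlikely to come from per-query noise alone. The per-query scores would typically be a metric such as nDCG or average precision computed from a TREC-format run and qrels file.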

What's Changed

  • Corpus Server supports plain text extraction from .txt, .md, .pdf, .xps, .oxps, .epub, .mobi, and .fb2 files. @mssssss123
  • Corpus Server adds simple per-page image conversion for .pdf files. @mssssss123
  • Corpus Server integrates MinerU for high-precision PDF parsing. @mssssss123
  • Corpus Server introduces a new chunking strategy supporting token-level (word/character) segmentation. @mssssss123
  • Corpus Server supports sentence-level chunking. @mssssss123
  • Corpus Server supports customizable chunking rules (default rule recognizes Markdown sections; other rules can be extended via config files). @mssssss123
  • Retriever Server supports three retrieval engines: Infinity, Sentence-Transformers, and OpenAI. @mssssss123
  • Retriever Server supports multimodal retrieval. @mssssss123
  • Retriever Server adds BM25 sparse retrieval. @xhd0728
  • Retriever Server supports hybrid retrieval (dense + sparse). @mssssss123
  • Retriever Server provides standardized deployment based on vLLM, unified under the OpenAI-compatible API. @xhd0728
  • Retriever Server supports online retrieval via Exa, Tavily, and ZhipuAI. @xhd0728
  • Reranker Server supports Infinity, Sentence-Transformers, and OpenAI ranking engines. @xhd0728
  • Generation Server supports multimodal inference. @mssssss123
  • Generation Server introduces vLLM offline inference, significantly improving experimental efficiency. @mssssss123
  • Generation Server supports Hugging Face inference for local debugging. @xhd0728
  • Evaluation Server supports TREC retrieval evaluation. @xhd0728
  • Evaluation Server supports TREC significance testing. @xhd0728
  • VisRAG Pipeline enables an end-to-end workflow from local PDF ingestion to multimodal retrieval and generation. @mssssss123
  • RAG Client supports running multiple experiments in parallel under the same pipeline through custom parameter files. @mssssss123
  • UltraRAG Benchmark adds six new VQA datasets, including wiki2024 and corresponding VQA corpora. @mssssss123 @xhd0728 @hm1229
  • Case Study UI adds multimodal result visualization support. @mssssss123
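
The token-level and sentence-level chunking items above can be illustrated with a short sketch. It shows the general idea only and is not the Corpus Server's implementation. The function names, the `overlap` parameter, and the naive sentence-splitting regex are all assumptions made for this example.

```python
import re


def chunk_by_tokens(text, chunk_size, overlap=0, unit="word"):
    """Split text into chunks of `chunk_size` words or characters.

    Hypothetical helper: `unit` is "word" (whitespace-separated tokens)
    or "char" (single characters); `overlap` tokens are shared between
    consecutive chunks.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    tokens = text.split() if unit == "word" else list(text)
    joiner = " " if unit == "word" else ""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(joiner.join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reached the end of the text
    return chunks


def chunk_by_sentences(text, sentences_per_chunk):
    """Group sentences into chunks.

    Sentences are split naively after '.', '!' or '?' followed by
    whitespace; a real parser would handle abbreviations and other cases.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]
```

Word-level chunks keep tokens intact, character-level chunks give exact size control, and sentence-level chunks preserve sentence boundaries at the cost of uneven chunk lengths.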

v0.2.0

21 Oct 06:42
b2779da

Release date: 2025.08.28

Low-Code Construction of Complex Pipelines

UltraRAG 2.0 natively supports sequential, loop, and conditional-branch inference control structures.
Developers can now build iterative RAG pipelines (e.g., Search-o1) using just a few lines of YAML—eliminating the need for extensive orchestration code.
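
To show how sequential, loop, and conditional-branch steps can be driven from declarative data, the sketch below interprets a nested pipeline description of the kind a YAML loader might produce. The schema (`loop`, `times`, `branch`, `router`, `cases`), the `run_pipeline` function, and the toy tools are all invented for this illustration. They are not UltraRAG's actual pipeline format.

```python
def run_pipeline(steps, tools, state):
    """Execute a list of steps against a shared state dict (hypothetical schema)."""
    for step in steps:
        if isinstance(step, str):
            # Sequential step: call the named tool.
            tools[step](state)
        elif "loop" in step:
            # Loop: repeat the nested steps up to `times` iterations,
            # stopping early once a tool sets state["done"].
            for _ in range(step["loop"]["times"]):
                run_pipeline(step["loop"]["steps"], tools, state)
                if state.get("done"):
                    break
        elif "branch" in step:
            # Branch: a router tool returns the name of the case to run.
            choice = tools[step["branch"]["router"]](state)
            run_pipeline(step["branch"]["cases"][choice], tools, state)
        else:
            raise ValueError(f"unknown step: {step!r}")
    return state


# Toy tools standing in for retrieval and generation servers.
tools = {
    "retrieve": lambda s: s["docs"].append(f"doc{len(s['docs'])}"),
    "generate": lambda s: s.update(answer=f"answer from {len(s['docs'])} docs"),
    "check": lambda s: "stop" if len(s["docs"]) >= 2 else "continue",
    "finish": lambda s: s.update(done=True),
}

# Iterative pipeline: retrieve once, then generate and decide whether
# to retrieve again, for at most five rounds.
pipeline = [
    "retrieve",
    {"loop": {"times": 5, "steps": [
        "generate",
        {"branch": {"router": "check", "cases": {
            "stop": ["finish"],
            "continue": ["retrieve"],
        }}},
    ]}},
]

state = run_pipeline(pipeline, tools, {"docs": []})
```

Here the loop ends after the second round, once two documents have been gathered and the router selects the `stop` case.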

Rapid Reproduction & Extensibility

Built on the Model Context Protocol (MCP) architecture, all components are encapsulated as independent, reusable servers.

  • Freely customize or reuse existing modules.
  • Each server exposes its capabilities as function-level tools, allowing new features to be added with a single function.
  • Supports integration with external MCP servers, making it effortless to extend pipelines across domains and applications.
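
The idea of exposing each capability as a function-level tool can be sketched with a small registry. `ToolServer`, its `tool` decorator, and the sample tools are hypothetical stand-ins written for this example. They are neither the MCP SDK nor UltraRAG's server API.

```python
class ToolServer:
    """Toy stand-in for a server that exposes functions as named tools."""

    def __init__(self, name):
        self.name = name
        self._tools = {}

    def tool(self, func):
        """Decorator: register `func` under its own name."""
        self._tools[func.__name__] = func
        return func

    def list_tools(self):
        """Return the registered tool names in sorted order."""
        return sorted(self._tools)

    def call(self, tool_name, **kwargs):
        """Look up a tool by name and invoke it with keyword arguments."""
        if tool_name not in self._tools:
            raise KeyError(f"{self.name} has no tool named {tool_name!r}")
        return self._tools[tool_name](**kwargs)


retriever = ToolServer("retriever")


@retriever.tool
def search(query, top_k=3):
    """Return up to `top_k` documents containing the query string."""
    corpus = ["ultrarag release notes", "mcp overview", "rag evaluation"]
    hits = [doc for doc in corpus if query.lower() in doc]
    return hits[:top_k]


# Adding a new capability is one more decorated function.
@retriever.tool
def count_hits(query):
    """Count all documents containing the query string."""
    return len(search(query, top_k=100))
```

A client only needs the tool name and arguments, for example `retriever.call("search", query="rag")`, so servers can be swapped or extended without changing the calling code.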

Unified Evaluation & Comparison

UltraRAG 2.0 introduces a standardized evaluation and metrics management suite that is ready to use with 17+ mainstream research benchmarks.

  • Continuous integration of the latest baselines
  • Built-in leaderboard-style result visualization
  • Systematic comparison and optimization for research experiments