Releases: xorbitsai/inference

v2.5.0

13 Apr 04:30
c9b597e

What's new in 2.5.0 (2026-04-13)

These are the changes in inference v2.5.0.

New features

Enhancements

  • ENH: update model "DeepSeek-OCR" JSON by @amumu96 in #4751
  • ENH: update 2 models JSON ("Ernie4.5", "qwen3.5") by @XprobeBot in #4754
  • ENH: update model "DeepSeek-V3.2" JSON by @amumu96 in #4762
  • ENH: update 2 models JSON ("Qwen3-ASR-0.6B", "Qwen3-ASR-1.7B") by @qinxuye in #4765
  • ENH: auto-detect PyTorch CUDA version for virtual environment setup by @qinxuye in #4766
  • ENH: update model "jina-embeddings-v4" JSON by @qinxuye in #4775
  • ENH: Optimize worker details for deployment progress tooltip. by @leslie2046 in #4746
  • ENH: update model "qwen3.5" JSON by @llyycchhee in #4782
  • ENH: update 2 models JSON ("Kokoro-82M-v1.1-zh", "Kokoro-82M") by @qinxuye in #4795
  • ENH: update model "gemma-3-it" JSON by @qinxuye in #4794
  • ENH: update models JSON [llm] by @XprobeBot in #4796
  • ENH: add lightweight heartbeat mechanism for worker liveness detection by @qinxuye in #4785
  • ENH: update model "ChatTTS" JSON by @qinxuye in #4793
  • BLD: Fix the front-end UI access issue for aarch64 image by @zwt-1234 in #4743
  • BLD: Fix the front-end UI access issue for aarch64 image by @zwt-1234 in #4749
  • BLD: Fix the front-end UI access issue by @zwt-1234 in #4758
  • BLD: limit gptqmodel installation to specified version by @zwt-1234 in #4798

Bug fixes

Documentation

Others

New Contributors

Full Changelog: v2.4.0...v2.5.0

v2.4.0

29 Mar 03:21
d189e2a

What's new in 2.4.0 (2026-03-29)

These are the changes in inference v2.4.0.

New features

Enhancements

Bug fixes

  • BUG: Fix async client FormData handling and response lifecycle issues by @qinxuye in #4687
  • BUG: MLX backend accumulates intermediate generation steps into final output (tested on 1.17.0, 2.0.0, 2.1.0) #4615 by @nasircsms in #4617
  • fix(worker): inject parent site-packages into child venv via .pth file by @nasircsms in #4692
  • BUG: launch multi gpu qwen3.5 error by @llyycchhee in #4700
  • fix(tool_call): add qwen3.5 by @llyycchhee in #4703
  • fix(qwen3.5): support tool calls by @llyycchhee in #4709
  • FIX: qwen3.5 reasoning parse by @llyycchhee in #4719
  • fix(qwen3.5): support XML-like tool call format in non-streaming mode by @amumu96 in #4715
  • FIX: webui crash when gpu_utilization is none by @leslie2046 in #4728

Documentation

New Contributors

Full Changelog: v2.3.0...v2.4.0

v2.3.0

13 Mar 11:29
8984aef

What's new in 2.3.0 (2026-03-13)

These are the changes in inference v2.3.0.

New features

Enhancements

Bug fixes

  • BUG: fix error WorkerWrapperBase.__init__() got multiple values for argument 'rpc_rank' by @llyycchhee in #4649
  • BUG: fix vLLM embedding check for qwen3-vl-embedding by @ace-xc in #4647
  • FIX: update the QR code URL by @yiboyasss in #4668
  • BUG: fix chat for multiple gpus by @llyycchhee in #4671
  • BUG: [UI] initialize formData with default values from modelFormConfig. by @yiboyasss in #4678
  • BUG: fix qwen 3.5 vllm since no generation_config.json exists by @llyycchhee in #4681

Documentation

New Contributors

Full Changelog: v2.2.0...v2.3.0

v2.2.0

28 Feb 12:01
cdd0c67

What's new in 2.2.0 (2026-02-28)

These are the changes in inference v2.2.0.

New features

Enhancements

Bug fixes

Documentation

New Contributors

Full Changelog: v2.1.0...v2.2.0

v2.1.0

14 Feb 05:11
45e7c90

What's new in 2.1.0 (2026-02-14)

These are the changes in inference v2.1.0.

New features

Enhancements

Bug fixes

Documentation

Others

New Contributors

Full Changelog: v2.0.0...v2.1.0

v2.0.0

31 Jan 04:09
ea7001b

What's new in 2.0.0 (2026-01-31)

These are the changes in inference v2.0.0.

New features

Enhancements

Bug fixes

Documentation

Others

Full Changelog: v1.17.0...v2.0.0

v1.17.1

13 Jan 07:18

v1.17.1 is a hotfix version of v1.17.0

Full Changelog: v1.17.0...v1.17.1

v1.17.0

10 Jan 12:40
6fef085

What's new in 1.17.0 (2026-01-10)

These are the changes in inference v1.17.0.

New features

Enhancements

Bug fixes

Documentation

New Contributors

Full Changelog: v1.16.0...v1.17.0

v1.16.0

27 Dec 01:44
ae7b4ad

What's new in 1.16.0 (2025-12-27)

These are the changes in inference v1.16.0.

New features

Enhancements

Bug fixes

Documentation

  • DOC: update new models and release notes for v1.15.0 by @qinxuye in #4359

Full Changelog: v1.15.0...v1.16.0

v1.15.0

13 Dec 15:30
b2adcee

What's new in 1.15.0 (2025-12-13)

These are the changes in inference v1.15.0.

New features

Enhancements

Bug fixes

Documentation

  • DOC: add new models and v1.14.0 release notes by @qinxuye in #4305

Others

New Contributors

Full Changelog: v1.14.0...v1.15.0