FEAT: [model] gemma-4 support by qinxuye · Pull Request #4768 · xorbitsai/inference

qinxuye · 2026-04-05T05:11:42Z

Automated sync for model "gemma-4" (llm) by user qinxuye.

… support

- Add _sanitize_generate_config to MLXVisionModel to handle stop/stop_token_ids from model family
- Initialize reasoning parser in MLXVisionModel.load() for hybrid models
- Pass reasoning_parser to completion methods for proper content parsing
- Pass tokenizer to get_full_context for correct jinja template rendering
- Set enable_thinking=False as default for MLX models
- Enable stream by default for multimodal interface in Gradio UI

…abled

- Fix _force_virtualenv_engine_params: only include engines with matching specs; previously it fell back to all specs when matched_specs was empty
- Add Gemma4ForConditionalGeneration to VLLM_SUPPORTED_MULTI_MODEL_LIST for vllm >= 0.19.0
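The spec-matching fix can be pictured with a minimal sketch. This is not the code from the PR: the function name `filter_engines_by_spec`, the shape of the spec dictionaries, and the use of `model_format` as the matching criterion are all assumptions made for illustration. The point it shows is that an engine with no matching specs is dropped rather than kept with all of its specs.

```python
from typing import Any, Dict, List


def filter_engines_by_spec(
    engine_specs: Dict[str, List[Dict[str, Any]]],
    model_format: str,
) -> Dict[str, List[Dict[str, Any]]]:
    """Keep only engines that have at least one matching spec.

    Hypothetical helper: matching on "model_format" is an assumption,
    not the criterion used by _force_virtualenv_engine_params.
    """
    result: Dict[str, List[Dict[str, Any]]] = {}
    for engine, specs in engine_specs.items():
        matched_specs = [s for s in specs if s.get("model_format") == model_format]
        # No fallback to all specs: an empty match means the engine is excluded.
        if matched_specs:
            result[engine] = matched_specs
    return result
```

With this shape, a request for a format that no engine supports yields an empty mapping instead of silently offering every engine.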

- VLLMMultiModel: fall back to tokenizer's chat_template when model_family.chat_template is empty
- SGLANGVisionModel: same logic as VLLMMultiModel, align with MLX core behavior
- SGLANGModel: use empty string instead of asserting when chat_template is empty

When chat_template is empty, engines will:

1. Get tokenizer and retrieve its chat_template attribute
2. If still empty, raise ValueError
3. Pass tokenizer to get_full_context for proper template application

This aligns the behavior across vllm, sglang, and mlx engines.
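The first two fallback steps above can be sketched roughly as follows. The helper name `resolve_chat_template` is hypothetical and does not appear in the PR; the sketch only assumes that the tokenizer exposes a `chat_template` attribute, as the commit message states. The third step, passing the tokenizer on to get_full_context, is left out because its signature is not shown here.

```python
from typing import Any, Optional


def resolve_chat_template(family_template: Optional[str], tokenizer: Any) -> str:
    """Pick a chat template, preferring the model family's own.

    Hypothetical helper illustrating the fallback order only.
    """
    if family_template:
        return family_template
    # Step 1: fall back to the tokenizer's chat_template attribute.
    tokenizer_template = getattr(tokenizer, "chat_template", None)
    # Step 2: if it is still empty, fail loudly instead of continuing.
    if not tokenizer_template:
        raise ValueError("chat_template is empty in both model family and tokenizer")
    return tokenizer_template
```

Raising ValueError at this point keeps the failure close to its cause, rather than surfacing later as a malformed prompt.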

When thinking mode is enabled (enable_thinking=True), special tokens are needed for the thinking/reasoning format. Set skip_special_tokens=False in this case to preserve the special tokens.
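As a rough illustration of that rule, the sketch below adjusts a generation config before it is handed to an engine. The function name `apply_thinking_defaults` and the plain-dict config are assumptions for the example, not the PR's implementation; only the `enable_thinking` and `skip_special_tokens` names come from the description above.

```python
from typing import Any, Dict


def apply_thinking_defaults(
    generate_config: Dict[str, Any], enable_thinking: bool
) -> Dict[str, Any]:
    """Return a copy of the config that preserves special tokens in thinking mode.

    Hypothetical helper: the real engines may apply this rule elsewhere.
    """
    config = dict(generate_config)  # copy so the caller's dict is not mutated
    if enable_thinking:
        # The thinking/reasoning markers are special tokens, so keep them in the output.
        config["skip_special_tokens"] = False
    return config
```

When enable_thinking is False the config is returned unchanged, so the engine's usual default for skip_special_tokens still applies.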

amumu96

LGTM

XprobeBot added the feature label Apr 5, 2026

XprobeBot added this to the v2.x milestone Apr 5, 2026

qinxuye force-pushed the chore/models-sync/user-qinxuye/gemma-4 branch 2 times, most recently from 5ef21fb to 7fab628 April 10, 2026 15:19

qinxuye and others added 25 commits April 12, 2026 12:07

[models-hub] Update llm:gemma-4

96ce40d

chore(docs): auto-run gen_docs.py

fbc6a2a

[models-hub] Update llm:gemma-4

83a0193

chore(docs): auto-run gen_docs.py

8bf6091

[models-hub] Update llm:gemma-4

084392b

[models-hub] Update llm:gemma-4

6046d25

[models-hub] Update llm:gemma-4

b1f43fb

chore(docs): auto-run gen_docs.py

9720fe1

[models-hub] Update llm:gemma-4

07441cd

[models-hub] Update llm:gemma-4

bb8cddc

fix(mlx): improve chat template fallback and logging

fb0de3b

[models-hub] Update llm:gemma-4

7382931

fix(mlx): improve reasoning logging and gradio display

dd9846b

chore(utils): centralize context length helper

26722cd

feat(gemma): add tool call parser

6f3afc2

remove vllm dependency upper version

2d792f5

update cu130 vllm version

99a1ce2

[models-hub] Update llm:gemma-4

c47cf9d

[models-hub] Update llm:gemma-4

a9aa6d7

FIX: don't skip special tokens when enable_thinking is True

682324e

[models-hub] Update llm:gemma-4

b15bb51

FIX: type errors in vllm and sglang multimodal chat engines

5fd9e51

qinxuye force-pushed the chore/models-sync/user-qinxuye/gemma-4 branch from f36ea85 to 5fd9e51 April 12, 2026 04:33

chore(docs): auto-run gen_docs.py

7ead102

amumu96 approved these changes Apr 12, 2026

qinxuye merged commit d6d1007 into xorbitsai:main Apr 12, 2026
2 checks passed

qinxuye deleted the chore/models-sync/user-qinxuye/gemma-4 branch April 12, 2026 05:55

qinxuye merged 26 commits into xorbitsai:main from qinxuye:chore/models-sync/user-qinxuye/gemma-4
