[INC-302] Add Gemma 4 to inference by dkosowski87 · Pull Request #2227 · roboflow/inference

dkosowski87 · 2026-04-13T12:16:26Z

What does this PR do?

Linked issue: INC-302

Added Gemma-4 to inference models
Amended the `download_chunk` function to resume from partial progress, rather than restarting from an empty file, when a connection error occurs.
Added `RangeRequestNotSupportedError` with accompanying documentation.
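
To illustrate the resume behaviour described above, here is a minimal sketch, not the PR's actual `download_chunk`. It assumes a hypothetical `fetch` callable that takes request headers and yields body pieces; the helper names `resume_range_header` and `download_chunk_resumable` are invented for this example. The idea is that bytes already on disk are counted before each attempt, so the next `Range` request starts where the previous one stopped.

```python
import os
from typing import Callable, Dict, Iterator


def resume_range_header(start: int, end: int, bytes_on_disk: int) -> Dict[str, str]:
    """Build a Range header that skips the bytes already written for this chunk."""
    return {"Range": f"bytes={start + bytes_on_disk}-{end}"}


def _size(path: str) -> int:
    """Size of the partial file, or 0 if nothing has been written yet."""
    return os.path.getsize(path) if os.path.exists(path) else 0


def download_chunk_resumable(
    fetch: Callable[[Dict[str, str]], Iterator[bytes]],  # hypothetical transport
    path: str,
    start: int,
    end: int,
    max_attempts: int = 3,
) -> None:
    """Write bytes [start, end] to `path`, resuming after connection errors."""
    expected = end - start + 1
    for _ in range(max_attempts):
        done = _size(path)
        if done >= expected:
            return
        try:
            # Append mode keeps whatever earlier attempts already saved.
            with open(path, "ab") as f:
                for piece in fetch(resume_range_header(start, end, done)):
                    f.write(piece)
        except ConnectionError:
            continue  # retry from the new offset, not from an empty file
    if _size(path) < expected:
        raise ConnectionError("chunk incomplete after all attempts")
```

Injecting `fetch` keeps the sketch independent of any HTTP library; in practice it would wrap a streaming request.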

Testing

I have tested this change locally
I have added/updated tests for this change

Test details:

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code where necessary, particularly in hard-to-understand areas
My changes generate no new warnings or errors
I have updated the documentation accordingly (if applicable)

Additional Context

- Introduced new configuration parameters for Gemma 4, including max new tokens, sampling options, and temperature settings.
- Registered multiple Gemma 4 model variants in the models registry.
- Added Gemma 4 multimodal model implementation in `gemma4_hf.py`.
- Created an `__init__.py` for the Gemma 4 module.
- Added unit tests to verify model resolution for Gemma 4 variants.

- Introduced a new example configuration file for Gemma 4 model.
- Added a script to run Gemma 4 locally, demonstrating end-to-end inference with a sample image.
- Included instructions for using the script with both hosted and local model setups.

- Created `__init__.py` files for the examples directory and the Gemma 4 module.
- Added `count_backpacks.py` script demonstrating end-to-end inference with the Gemma 4 model using a sample image.
- Included prompts and instructions for running the example locally.

- Updated the `count_backpacks.py` example to clarify its purpose and usage instructions.
- Moved the image URL definition to a more appropriate location in the script.
- Enhanced type annotations in `gemma4_hf.py` for better clarity and maintainability.
- Changed variable names to be more consistent and descriptive.

- Added `RangeRequestNotSupportedError` to handle cases where the server does not support range requests.
- Introduced new configuration parameters for chunk download timeouts and maximum attempts.
- Updated the `download_chunk` function to implement retry logic for connectivity errors and range request failures.
- Improved documentation for file and download errors, including scenarios and troubleshooting steps.

- Improved the `count_backpacks.py` script by clarifying usage instructions and ensuring proper package resolution when run as a script.
- Updated the `threaded_download_file` function in `download.py` to use specific connect and read timeouts for better error handling during downloads.

- Introduced `RangeRequestNotSupportedError` to manage cases where the server returns a 200 status instead of 206 for range requests.
- Updated documentation to clarify error handling steps for file downloads.
- Refactored the `download_chunk` function to raise the new error appropriately and adjusted related test cases to ensure robust coverage.
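
The 200-versus-206 check can be sketched as follows. This is an illustration under assumptions, not the library's code: the exception class below is a local stand-in that only shares its name with the one the PR adds, and `ensure_partial_content` is a hypothetical helper. The check matters because a server that ignores the `Range` header answers 200 with the whole file, and appending that body at a non-zero offset would corrupt the chunk.

```python
class RangeRequestNotSupportedError(Exception):
    """Local stand-in for the error added in this PR (real class lives in the library)."""


def ensure_partial_content(status_code: int, range_header: str) -> None:
    """Raise if a ranged request was answered with the full body.

    206 Partial Content means the server honoured the Range header.
    200 OK means it ignored the header and is sending the entire file.
    Other statuses are left to the caller's usual HTTP error handling.
    """
    if status_code == 200:
        raise RangeRequestNotSupportedError(
            f"server returned 200 instead of 206 for '{range_header}'"
        )
```

A caller would run this right after receiving the response headers and before writing any bytes to the partial file.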

- Increased `CHUNK_DOWNLOAD_CONNECT_TIMEOUT` from 15.0 to 30.0 seconds.
- Increased `CHUNK_DOWNLOAD_READ_TIMEOUT` from 30.0 to 60.0 seconds.
- Increased `CHUNK_DOWNLOAD_MAX_ATTEMPTS` from 10 to 60 to improve resilience during downloads.
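
As a rough sketch of how such settings might be consumed, the snippet below reads the three values with the new defaults. It assumes they are overridable through environment variables of the same name, which this thread does not confirm; the helper functions are hypothetical.

```python
import os
from typing import Mapping, Tuple

# Defaults taken from the commit message above.
DEFAULT_CONNECT_TIMEOUT = 30.0
DEFAULT_READ_TIMEOUT = 60.0
DEFAULT_MAX_ATTEMPTS = 60


def chunk_download_timeout(env: Mapping[str, str] = os.environ) -> Tuple[float, float]:
    """Return (connect, read) timeouts in seconds. Env override is an assumption."""
    connect = float(env.get("CHUNK_DOWNLOAD_CONNECT_TIMEOUT", DEFAULT_CONNECT_TIMEOUT))
    read = float(env.get("CHUNK_DOWNLOAD_READ_TIMEOUT", DEFAULT_READ_TIMEOUT))
    return (connect, read)


def chunk_download_max_attempts(env: Mapping[str, str] = os.environ) -> int:
    """Return the retry budget for one chunk. Env override is an assumption."""
    return int(env.get("CHUNK_DOWNLOAD_MAX_ATTEMPTS", DEFAULT_MAX_ATTEMPTS))
```

The tuple shape matches what the `requests` library accepts for its `timeout` argument, for example `requests.get(url, stream=True, timeout=chunk_download_timeout())`, where the first value bounds connection setup and the second bounds each wait for data.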

- Included Gemma 4 in the navigation of `mkdocs.yml`.
- Updated `index.md` to feature Gemma 4 in the list of Vision-Language Models.
- Added detailed documentation for Gemma 4 in a new `gemma4.md` file, covering usage, supported backends, and performance tips.
- Documented environment variables specific to Gemma 4 in `environment-variables.md` for configuration of model defaults.

PawelPeczek-Roboflow · 2026-04-21T16:19:50Z

+
+model = AutoModel.from_pretrained(
+    "gemma-4-e2b-it",
+    backend="hugging-face",


Is `backend` essential here?

No, the default is `BackendType.HF`, so I can drop it from this example.

- Consolidated Gemma 4 model IDs under a single architecture key `gemma-4` for clarity.
- Updated `gemma4.md` to reflect changes in model ID usage and local package loading instructions.
- Modified example configuration to align with the new architecture structure.
- Adjusted tests to ensure they validate the new model architecture setup.

…tation

- Added parameters `max_chunk_fetch_passes` and `max_consecutive_retryable_http` to control retry behavior for chunk downloads.
- Updated the function's docstring to clarify usage and parameters.
- Implemented logic to cap consecutive retryable HTTP responses without advancing the byte offset, raising a `RetryError` when exceeded.
- Adjusted unit tests to reflect changes in parameter names and added new tests for retry logic behavior.
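
The two caps can be sketched roughly as below. Only the parameter names `max_chunk_fetch_passes` and `max_consecutive_retryable_http` come from the commit message; the `RetryError` class is a local stand-in, the set of retryable statuses is an assumption, and `fetch_pass` is a hypothetical callable returning `(status_code, bytes_received)` for one pass starting at the given offset.

```python
from typing import Callable, Tuple

RETRYABLE_STATUSES = {429, 500, 502, 503, 504}  # assumption, not from the PR


class RetryError(Exception):
    """Local stand-in; the PR does not say here which RetryError it raises."""


def fetch_with_caps(
    fetch_pass: Callable[[int], Tuple[int, int]],  # hypothetical: offset -> (status, bytes)
    total: int,
    max_chunk_fetch_passes: int = 10,
    max_consecutive_retryable_http: int = 3,
) -> int:
    """Drive fetch passes until `total` bytes arrive, bounding stalled retries."""
    offset = 0
    stalled = 0  # consecutive retryable responses that delivered no bytes
    for _ in range(max_chunk_fetch_passes):
        if offset >= total:
            return offset
        status, received = fetch_pass(offset)
        if status in RETRYABLE_STATUSES and received == 0:
            stalled += 1
            if stalled > max_consecutive_retryable_http:
                raise RetryError("too many retryable responses without progress")
            continue
        offset += received
        stalled = 0  # any progress resets the stall counter
    if offset < total:
        raise RetryError("pass budget exhausted before chunk completed")
    return offset
```

Resetting the counter on progress means a slow but advancing download is limited only by the pass budget, while a server that keeps answering with errors fails fast.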

dkosowski87 changed the title ~~[WIP]~~ [INC-302] Add Gemma 4 to inference Apr 13, 2026

dkosowski87 force-pushed the INC-302/gemma4-support branch 2 times, most recently from 41f9ed2 to 303232f Compare April 20, 2026 10:14

dkosowski87 added 17 commits April 21, 2026 17:07

Update transformers to 5.5 withouth minor lock

e88a367

Release candidate 1 for 0.25.0

21bae38

Bump inference models to 0.25.0rc1

94e475f

Bump inference models version to 0.25.0rc2

8d437a9

Bump inference models version to 0.25.0rc2 across all requirements files

b1c6ae1

Revert lock file

1faab54

Update uv.lock

1fb45cc

Fix pre-release versions

737a12f

Bump inference models version to 0.25.1rc3 in project configuration a…

8d8b104

…nd lock file

dkosowski87 force-pushed the INC-302/gemma4-support branch from edb1b55 to 8d8b104 Compare April 21, 2026 15:08

dkosowski87 added 2 commits April 21, 2026 17:25

Bump inference models version to 0.26.0rc1 and update changelog with …

d45bae4

…Gemma 4 documentation additions

dkosowski87 marked this pull request as ready for review April 21, 2026 16:19

dkosowski87 requested review from PawelPeczek-Roboflow, grzegorz-roboflow, hansent, probicheaux, rafel-roboflow and yeldarby as code owners April 21, 2026 16:19

PawelPeczek-Roboflow reviewed Apr 21, 2026

dkosowski87 added 2 commits April 21, 2026 20:49

PawelPeczek-Roboflow previously approved these changes Apr 21, 2026

Update version to 0.25.0 in project configuration and changelog

019029e

dkosowski87 dismissed PawelPeczek-Roboflow’s stale review via 019029e April 21, 2026 20:22

dkosowski87 added 2 commits April 21, 2026 23:05

Update version to 0.25.1 in project configuration, lock file, and cha…

c754b83

…ngelog

Update inference models version to 0.25.1 across all requirements files

e09f9c6

PawelPeczek-Roboflow approved these changes Apr 21, 2026

PawelPeczek-Roboflow merged commit 30b0996 into main Apr 21, 2026
44 checks passed

PawelPeczek-Roboflow deleted the INC-302/gemma4-support branch April 21, 2026 21:42

PawelPeczek-Roboflow merged 24 commits into main from INC-302/gemma4-support

2 participants