[INC-302] Add Gemma 4 to inference #2227

Merged
PawelPeczek-Roboflow merged 24 commits into main from INC-302/gemma4-support on Apr 21, 2026

dkosowski87 commented Apr 13, 2026

What does this PR do?

Linked issue: INC-302

  • Added Gemma-4 to inference models
  • Amended the `download_chunk` function so that, after a connection error, it resumes from the bytes already downloaded instead of restarting from an empty file.
  • Added `RangeRequestNotSupportedError`, with accompanying documentation.
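
A minimal sketch of the resume behaviour described above, not the PR's actual implementation. The `fetch` callable, its signature, and the name `download_chunk_resuming` are hypothetical stand-ins so the example runs without a network; `RangeRequestNotSupportedError` is redefined locally for the same reason.

```python
from typing import Callable, Iterable, Tuple


class RangeRequestNotSupportedError(Exception):
    """Local stand-in for the PR's error: the server ignored the Range header."""


# Hypothetical transport: (first_byte, last_byte) -> (HTTP status, body blocks).
Fetcher = Callable[[int, int], Tuple[int, Iterable[bytes]]]


def download_chunk_resuming(
    fetch: Fetcher, start: int, end: int, max_attempts: int = 3
) -> bytes:
    """Return bytes start..end (inclusive), resuming after connection errors."""
    expected = end - start + 1
    received = bytearray()
    for _ in range(max_attempts):
        if len(received) == expected:
            break
        # Resume from the first byte we do not have yet, not from `start`.
        offset = start + len(received)
        try:
            status, blocks = fetch(offset, end)
            if status == 200:
                # A 200 carries the whole file, so our offsets would be wrong.
                raise RangeRequestNotSupportedError("expected 206, got 200")
            if status != 206:
                raise RuntimeError(f"unexpected HTTP status {status}")
            for block in blocks:
                received.extend(block)
        except ConnectionError:
            # Keep the partial bytes; the next attempt asks for the remainder.
            continue
    if len(received) != expected:
        raise ConnectionError("chunk incomplete after all attempts")
    return bytes(received)
```

The key design point is that the partial buffer survives the `ConnectionError`, so each retry requests a shorter range than the one before it.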

Testing

  • I have tested this change locally
  • I have added/updated tests for this change

Test details:

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code where necessary, particularly in hard-to-understand areas
  • My changes generate no new warnings or errors
  • I have updated the documentation accordingly (if applicable)

Additional Context

dkosowski87 changed the title from "[WIP] [INC-302] Add Gemma 4 to inference" to "[INC-302] Add Gemma 4 to inference" on Apr 13, 2026
dkosowski87 force-pushed the INC-302/gemma4-support branch 2 times, most recently from 41f9ed2 to 303232f, on April 20, 2026 10:14
- Introduced new configuration parameters for Gemma 4, including max new tokens, sampling options, and temperature settings.
- Registered multiple Gemma 4 model variants in the models registry.
- Added Gemma 4 multimodal model implementation in `gemma4_hf.py`.
- Created an `__init__.py` for the Gemma 4 module.
- Added unit tests to verify model resolution for Gemma 4 variants.
- Introduced a new example configuration file for Gemma 4 model.
- Added a script to run Gemma 4 locally, demonstrating end-to-end inference with a sample image.
- Included instructions for using the script with both hosted and local model setups.
- Created `__init__.py` files for the examples directory and the Gemma 4 module.
- Added `count_backpacks.py` script demonstrating end-to-end inference with the Gemma 4 model using a sample image.
- Included prompts and instructions for running the example locally.
- Updated the `count_backpacks.py` example to clarify its purpose and usage instructions.
- Moved the image URL definition to a more appropriate location in the script.
- Enhanced type annotations in `gemma4_hf.py` for better clarity and maintainability.
- Changed variable names to be more consistent and descriptive.
- Added `RangeRequestNotSupportedError` to handle cases where the server does not support range requests.
- Introduced new configuration parameters for chunk download timeouts and maximum attempts.
- Updated the `download_chunk` function to implement retry logic for connectivity errors and range request failures.
- Improved documentation for file and download errors, including scenarios and troubleshooting steps.
- Improved the `count_backpacks.py` script by clarifying usage instructions and ensuring proper package resolution when run as a script.
- Updated the `threaded_download_file` function in `download.py` to use specific connect and read timeouts for better error handling during downloads.
- Introduced `RangeRequestNotSupportedError` to manage cases where the server returns a 200 status instead of 206 for range requests.
- Updated documentation to clarify error handling steps for file downloads.
- Refactored the `download_chunk` function to raise the new error appropriately and adjusted related test cases to ensure robust coverage.
- Increased `CHUNK_DOWNLOAD_CONNECT_TIMEOUT` from 15.0 to 30.0 seconds.
- Increased `CHUNK_DOWNLOAD_READ_TIMEOUT` from 30.0 to 60.0 seconds.
- Increased `CHUNK_DOWNLOAD_MAX_ATTEMPTS` from 10 to 60 to improve resilience during downloads.
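
As a rough illustration of how the three settings above could be grouped, here is a sketch that reads them with the defaults quoted in the commit message (30.0, 60.0, 60). It assumes the values are overridable through environment variables of the same names, which the commit messages do not confirm; `ChunkDownloadConfig` and `load_chunk_download_config` are hypothetical names.

```python
import os
from typing import Mapping, NamedTuple, Tuple


class ChunkDownloadConfig(NamedTuple):
    """Hypothetical container for the chunk download settings."""

    connect_timeout: float
    read_timeout: float
    max_attempts: int

    @property
    def timeout(self) -> Tuple[float, float]:
        # (connect, read) is the tuple form `requests` accepts for `timeout=`.
        return (self.connect_timeout, self.read_timeout)


def load_chunk_download_config(
    env: Mapping[str, str] = os.environ,
) -> ChunkDownloadConfig:
    """Read settings from `env`, falling back to the defaults from the PR."""
    return ChunkDownloadConfig(
        # Assumption: env var names mirror the constant names.
        connect_timeout=float(env.get("CHUNK_DOWNLOAD_CONNECT_TIMEOUT", "30.0")),
        read_timeout=float(env.get("CHUNK_DOWNLOAD_READ_TIMEOUT", "60.0")),
        max_attempts=int(env.get("CHUNK_DOWNLOAD_MAX_ATTEMPTS", "60")),
    )
```

Separate connect and read timeouts let a download fail fast on an unreachable host while still tolerating a slow but live stream.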
dkosowski87 force-pushed the INC-302/gemma4-support branch from edb1b55 to 8d8b104 on April 21, 2026 15:08
- Included Gemma 4 in the navigation of mkdocs.yml.
- Updated index.md to feature Gemma 4 in the list of Vision-Language Models.
- Added detailed documentation for Gemma 4 in a new gemma4.md file, covering usage, supported backends, and performance tips.
- Documented environment variables specific to Gemma 4 in environment-variables.md for configuration of model defaults.

model = AutoModel.from_pretrained(
    "gemma-4-e2b-it",
    backend="hugging-face",

Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is backend essential here?

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the default is `BackendType.HF`, so I can drop it from this example.

- Consolidated Gemma 4 model IDs under a single architecture key `gemma-4` for clarity.
- Updated `gemma4.md` to reflect changes in model ID usage and local package loading instructions.
- Modified example configuration to align with the new architecture structure.
- Adjusted tests to ensure they validate the new model architecture setup.
…tation

- Added parameters `max_chunk_fetch_passes` and `max_consecutive_retryable_http` to control retry behavior for chunk downloads.
- Updated the function's docstring to clarify usage and parameters.
- Implemented logic to cap consecutive retryable HTTP responses without advancing the byte offset, raising a `RetryError` when exceeded.
- Adjusted unit tests to reflect changes in parameter names and added new tests for retry logic behavior.
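
The retry caps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `fetch_pass`, `fetch_with_caps`, the set of retryable status codes, and the default limits are all hypothetical, and `RetryError` is a local stand-in. Only the parameter names `max_chunk_fetch_passes` and `max_consecutive_retryable_http` come from the commit message.

```python
from typing import Callable, Tuple

# Assumption: which statuses count as retryable is not stated in the PR.
RETRYABLE_STATUS_CODES = frozenset({429, 500, 502, 503, 504})


class RetryError(Exception):
    """Local stand-in for the error raised when retries are exhausted."""


# Hypothetical transport: byte offset -> (HTTP status, payload from that offset).
FetchPass = Callable[[int], Tuple[int, bytes]]


def fetch_with_caps(
    fetch_pass: FetchPass,
    total_size: int,
    max_chunk_fetch_passes: int = 60,
    max_consecutive_retryable_http: int = 5,
) -> bytes:
    """Collect `total_size` bytes, bounding both total passes and stalled retries."""
    received = bytearray()
    consecutive_retryable = 0
    for _ in range(max_chunk_fetch_passes):
        if len(received) >= total_size:
            break
        status, payload = fetch_pass(len(received))
        if status in RETRYABLE_STATUS_CODES:
            # The byte offset did not advance, so count this pass as a stall.
            consecutive_retryable += 1
            if consecutive_retryable > max_consecutive_retryable_http:
                raise RetryError("too many consecutive retryable responses")
            continue
        if status != 206:
            raise RuntimeError(f"unexpected HTTP status {status}")
        received.extend(payload)
        if payload:
            # Progress was made, so the stall counter starts over.
            consecutive_retryable = 0
    if len(received) < total_size:
        raise RetryError("fetch passes exhausted before chunk completed")
    return bytes(received)
```

Resetting the counter only when bytes arrive means a server that alternates errors with real progress can still finish, while one that keeps failing at the same offset is cut off early.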
PawelPeczek-Roboflow merged commit 30b0996 into main on Apr 21, 2026
44 checks passed
PawelPeczek-Roboflow deleted the INC-302/gemma4-support branch on April 21, 2026 21:42