[INC-302] Add Gemma 4 to inference #2227
Merged
PawelPeczek-Roboflow merged 24 commits into main on Apr 21, 2026
41f9ed2 to 303232f
- Introduced new configuration parameters for Gemma 4, including max new tokens, sampling options, and temperature settings.
- Registered multiple Gemma 4 model variants in the models registry.
- Added Gemma 4 multimodal model implementation in `gemma4_hf.py`.
- Created an `__init__.py` for the Gemma 4 module.
- Added unit tests to verify model resolution for Gemma 4 variants.
- Introduced a new example configuration file for the Gemma 4 model.
- Added a script to run Gemma 4 locally, demonstrating end-to-end inference with a sample image.
- Included instructions for using the script with both hosted and local model setups.
- Created `__init__.py` files for the examples directory and the Gemma 4 module.
- Added `count_backpacks.py` script demonstrating end-to-end inference with the Gemma 4 model using a sample image.
- Included prompts and instructions for running the example locally.
- Updated the `count_backpacks.py` example to clarify its purpose and usage instructions.
- Moved the image URL definition to a more appropriate location in the script.
- Enhanced type annotations in `gemma4_hf.py` for better clarity and maintainability.
- Changed variable names to be more consistent and descriptive.
- Added `RangeRequestNotSupportedError` to handle cases where the server does not support range requests.
- Introduced new configuration parameters for chunk download timeouts and maximum attempts.
- Updated the `download_chunk` function to implement retry logic for connectivity errors and range request failures.
- Improved documentation for file and download errors, including scenarios and troubleshooting steps.
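The PR description says `download_chunk` resumes from partial progress rather than from an empty file after a connection error. The following is a minimal sketch of that idea, not the code from this PR: `download_with_resume` and the injected `open_stream` callable are hypothetical names, and the transport is abstracted away so the resume logic stays visible.

```python
from typing import Callable, Iterable


def download_with_resume(
    open_stream: Callable[[int, int], Iterable[bytes]],
    start: int,
    end: int,
    max_attempts: int = 60,
) -> bytes:
    """Fetch bytes start..end (inclusive), resuming after connection errors.

    `open_stream(offset, end)` is a hypothetical callable that yields the
    bytes from `offset` to `end` and may raise ConnectionError mid-stream.
    """
    received = bytearray()
    for _ in range(max_attempts):
        # Resume from the first byte we do not have yet, not from `start`.
        offset = start + len(received)
        try:
            for piece in open_stream(offset, end):
                received.extend(piece)
        except ConnectionError:
            # Keep what was already received and try again from the new offset.
            continue
        return bytes(received)
    raise ConnectionError(f"gave up after {max_attempts} attempts")
```

With a real HTTP client, `open_stream` would typically send a `Range: bytes={offset}-{end}` header and stream the response body.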
- Improved the `count_backpacks.py` script by clarifying usage instructions and ensuring proper package resolution when run as a script.
- Updated the `threaded_download_file` function in `download.py` to use specific connect and read timeouts for better error handling during downloads.
- Introduced `RangeRequestNotSupportedError` to manage cases where the server returns a 200 status instead of 206 for range requests.
- Updated documentation to clarify error handling steps for file downloads.
- Refactored the `download_chunk` function to raise the new error appropriately and adjusted related test cases to ensure robust coverage.
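The 200-versus-206 check described in this commit can be sketched as follows. This is an illustration under assumptions, not the library's code: the exception class below is a local stand-in for the one added in this PR, and `ensure_partial_content` is a hypothetical helper name.

```python
class RangeRequestNotSupportedError(Exception):
    """Local stand-in for the error class introduced in this PR."""


def ensure_partial_content(status_code: int) -> None:
    """Validate the status of a response to a request that sent a Range header."""
    if status_code == 206:
        # 206 Partial Content: the server honored the requested byte range.
        return
    if status_code == 200:
        # 200 OK: the server ignored the Range header and is sending the whole
        # file, so chunked download cannot work against this server.
        raise RangeRequestNotSupportedError(
            "server answered 200 instead of 206 to a range request"
        )
    # How other statuses are handled is an assumption made for this sketch.
    raise RuntimeError(f"unexpected status {status_code} for range request")
```

Raising a dedicated error here lets callers tell "this server cannot serve ranges" apart from transient failures that are worth retrying.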
- Increased `CHUNK_DOWNLOAD_CONNECT_TIMEOUT` from 15.0 to 30.0 seconds.
- Increased `CHUNK_DOWNLOAD_READ_TIMEOUT` from 30.0 to 60.0 seconds.
- Increased `CHUNK_DOWNLOAD_MAX_ATTEMPTS` from 10 to 60 to improve resilience during downloads.
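As a rough illustration of how these three settings fit together, the sketch below groups them with the new defaults from this commit. It assumes the values can be overridden through environment-style string variables of the same names, which the commit message does not confirm; `ChunkDownloadConfig` and `load_chunk_download_config` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class ChunkDownloadConfig:
    # Defaults mirror the values this commit raises the settings to.
    connect_timeout: float = 30.0
    read_timeout: float = 60.0
    max_attempts: int = 60

    @property
    def timeout(self) -> Tuple[float, float]:
        # A (connect, read) pair, the form accepted by HTTP clients such as
        # `requests` for their `timeout` argument.
        return (self.connect_timeout, self.read_timeout)


def load_chunk_download_config(env: Mapping[str, str]) -> ChunkDownloadConfig:
    """Build the config from a mapping of string settings (assumed override path)."""
    defaults = ChunkDownloadConfig()
    return ChunkDownloadConfig(
        connect_timeout=float(
            env.get("CHUNK_DOWNLOAD_CONNECT_TIMEOUT", defaults.connect_timeout)
        ),
        read_timeout=float(
            env.get("CHUNK_DOWNLOAD_READ_TIMEOUT", defaults.read_timeout)
        ),
        max_attempts=int(
            env.get("CHUNK_DOWNLOAD_MAX_ATTEMPTS", defaults.max_attempts)
        ),
    )
```

Passing `os.environ` as `env` would pick up overrides from the process environment; passing `{}` yields the defaults.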
edb1b55 to 8d8b104
- Included Gemma 4 in the navigation of `mkdocs.yml`.
- Updated `index.md` to feature Gemma 4 in the list of Vision-Language Models.
- Added detailed documentation for Gemma 4 in a new `gemma4.md` file, covering usage, supported backends, and performance tips.
- Documented environment variables specific to Gemma 4 in `environment-variables.md` for configuration of model defaults.
…Gemma 4 documentation additions
    model = AutoModel.from_pretrained(
        "gemma-4-e2b-it",
        backend="hugging-face",
Collaborator
Is `backend` essential here?
Contributor
Author
No, the backend defaults to `BackendType.HF`, so I can drop it from this example.
- Consolidated Gemma 4 model IDs under a single architecture key `gemma-4` for clarity.
- Updated `gemma4.md` to reflect changes in model ID usage and local package loading instructions.
- Modified example configuration to align with the new architecture structure.
- Adjusted tests to ensure they validate the new model architecture setup.
…tation
- Added parameters `max_chunk_fetch_passes` and `max_consecutive_retryable_http` to control retry behavior for chunk downloads.
- Updated the function's docstring to clarify usage and parameters.
- Implemented logic to cap consecutive retryable HTTP responses without advancing the byte offset, raising a `RetryError` when exceeded.
- Adjusted unit tests to reflect changes in parameter names and added new tests for retry logic behavior.
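The capping behavior described in this commit might look roughly like the sketch below. Only the parameter names `max_chunk_fetch_passes` and `max_consecutive_retryable_http` come from the commit message; the `fetch` callable, the set of retryable statuses, the default values, and the local `RetryError` class are assumptions made for illustration.

```python
from typing import Callable, Tuple


class RetryError(Exception):
    """Local stand-in for the RetryError raised by the real implementation."""


# Which statuses count as retryable is an assumption made for this sketch.
RETRYABLE_STATUSES = frozenset({429, 500, 502, 503, 504})


def fetch_range(
    fetch: Callable[[int, int], Tuple[int, bytes]],
    start: int,
    end: int,
    max_chunk_fetch_passes: int = 60,
    max_consecutive_retryable_http: int = 5,
) -> bytes:
    """Fetch bytes start..end (inclusive) via `fetch(offset, end)`.

    `fetch` is a hypothetical callable returning (status_code, payload).
    """
    buffer = bytearray()
    consecutive_retryable = 0
    for _ in range(max_chunk_fetch_passes):
        offset = start + len(buffer)
        if offset > end:
            break  # the whole range has been received
        status, payload = fetch(offset, end)
        if status in RETRYABLE_STATUSES:
            # No bytes arrived, so the offset did not advance: count this pass.
            consecutive_retryable += 1
            if consecutive_retryable > max_consecutive_retryable_http:
                raise RetryError(
                    f"{consecutive_retryable} retryable responses in a row "
                    f"at offset {offset}"
                )
            continue
        if status != 206:
            raise RuntimeError(f"unexpected status {status}")
        # Progress was made, so the streak of retryable responses is reset.
        consecutive_retryable = 0
        buffer.extend(payload)
    if start + len(buffer) <= end:
        raise RetryError("fetch passes exhausted before the range completed")
    return bytes(buffer)
```

Keeping the two limits separate means a slow but progressing download can use many passes, while a server that keeps answering with errors at the same offset fails fast.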
PawelPeczek-Roboflow previously approved these changes, Apr 21, 2026
PawelPeczek-Roboflow approved these changes, Apr 21, 2026
What does this PR do?
Linked issue: INC-302
- Updates the `download_chunk` function to resume from partial progress instead of an empty file when a connection error is received.
- Adds the `RangeRequestNotSupportedError` error with proper docs.
Testing
Test details:
Checklist
Additional Context