Skip to content

Improved dependency handling#400

Open
pjbull wants to merge 14 commits intomasterfrom
dep-split
Open

Improved dependency handling#400
pjbull wants to merge 14 commits intomasterfrom
dep-split

Conversation

@pjbull
Copy link
Copy Markdown
Member

@pjbull pjbull commented Feb 16, 2026

Lots of updates with the goals of:

  • Separate image/video extras and dependencies (along with tests for isolated envs)
  • Move to pixeltable-yolox, which is pip-installable and builds on Windows
  • Windows support (partly from above)
  • Make zamba pip-installable again

Changelog:

  • Split core dependencies into optional extras: video (av, ffmpeg-python, pytorchvideo, pixeltable-yolox, etc.), image (megadetector, Pillow), tests (pytest, black, flake8, coverage, nvidia-ml-py, etc.), and docs (mkdocs, mike, mkdocstrings). Base install no longer pulls in video/image stacks; use pip install zamba[video], zamba[image], or zamba[video,image].
  • Remove requirements-dev.txt and requirements-dev/; use pip install -e ".[tests]" for development and pip install -e ".[docs]" for building docs.
  • Replace mlflow with mlflow-skinny in core dependencies. Add pyarrow>=23.0.0. Add Windows-specific torch version constraint for gloo bug. Declare Python 3.11–3.13 support and requires-python = ">=3.11, <3.14".
  • Remove DensePose from optional dependencies so the package is publishable to PyPI. DensePose (detectron2, detectron2-densepose) must be installed from GitHub; see docs. Running zamba densepose without them prints install instructions and exits. Add [tool.uv.extra-build-dependencies] for detectron2 (torch) so CI can build it from GitHub.
  • Pin setuptools<82 so the image extra works on Python 3.12 (megadetector’s yolov5 stack requires pkg_resources, which setuptools 82 removed). Add [tool.uv] override-dependencies for protobuf and setuptools.
  • Add zamba.models.config_common with shared types and validators (ModelEnum, MonitorEnum, ZambaBaseModel, RegionEnum, get_filepaths, validate_model_cache_dir, etc.). Move video-specific file/checkpoint logic into zamba.models.config and reuse common helpers.
  • Add zamba.models.instantiation and move instantiate_model, head-replacement, and resume logic out of model_manager for clearer separation.
  • Add lazy model registration in zamba.models.registry (ensure_registered()): video model classes are imported only when needed so the package can be imported without video dependencies. Config validation calls ensure_registered() when resolving checkpoint/model name.
  • Register image and utils sub-apps lazily so zamba --help and non-image/non-utils commands work without image or densepose dependencies. Defer imports of VideoLoaderConfig and config classes into the train, predict, and depth command callbacks.
  • Handle missing DensePose dependencies in the densepose command: catch ImportError when running the model and print install-from-GitHub instructions and exit with code 1.
  • Make NPY cache path hashing stable: use a JSON-serializable, order-invariant representation of the config (including Enum and Path) instead of str(hashed_part). For local absolute paths, use a relative-style path in the cache key.
  • Fix cache cleanup in npy_cache.__del__ by comparing Path(cache_path).parents[0] with Path(tempfile.gettempdir()).
  • Import image classifier lazily from zamba.images to avoid pulling in torch at package import time. Images config and manager use zamba.models.config_common and zamba.models.instantiation instead of config/utils from video.
  • Make megadetector import resilient: try megadetector.detection.run_detector, fall back to detection.run_detector. Make MLflow optional in image training (log warning and continue without it if import or setup fails). Disable torch.compile on Windows in addition to macOS.
  • Add pytest markers video and image and skip test files when the corresponding extra is not installed (conftest collect_ignore based on _HAS_VIDEO / _HAS_IMAGE). Define video-only fixtures and the dummy video model only when video deps are present. Set CUDA_VISIBLE_DEVICES=0 in conftest to avoid DDP issues under pytest-xdist.
  • Add Makefile targets tests-fast (fail on first failure) and test-image-only / test-video-only (isolated venvs with only image or video extra). CI runs these isolation steps and installs image,video extras for the main test matrix. DensePose tests install detectron2 from GitHub with --no-build-isolation instead of using a densepose extra.
  • Update tests to use new config/instantiation imports and markers; add image dataset import check in the install smoke test.
  • Installation docs: document optional extras (video, image), PyPI install (pip install zamba[video] etc.), and Windows (image extra no extra tools; video needs Visual Studio Build Tools and FFmpeg). State FFmpeg only required for video workflows. Contribute page: dev install with .[tests,image,video,docs], DensePose deps from GitHub, make requirements with uv. DensePose doc: install detectron2/detectron2-densepose from GitHub; note that zamba densepose prints instructions if deps missing.
  • Re-enable PyPI publish steps in the release workflow (Test PyPI and Production PyPI). Docs workflows and release job install .[docs] instead of requirements-dev/docs.txt.

pjbull and others added 6 commits February 15, 2026 18:25
* Switch zamba to pixeltable-yolox

Replace the YOLOX dependency in pyproject and add compatibility shims for the updated YOLOX and megadetector package layouts so CLI inference works in the isolated env.

* test fixes on gpu

* Pin npy cache hash test to cpu device
@pjbull pjbull marked this pull request as draft February 16, 2026 21:55
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 16, 2026

@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 17, 2026

Codecov Report

❌ Patch coverage is 82.96530% with 54 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.4%. Comparing base (3e9400b) to head (f044755).

Files with missing lines Patch % Lines
zamba/models/config_common.py 85.0% 20 Missing ⚠️
zamba/cli.py 67.7% 10 Missing ⚠️
zamba/models/instantiation.py 85.1% 8 Missing ⚠️
zamba/images/manager.py 71.4% 4 Missing ⚠️
zamba/models/registry.py 69.2% 4 Missing ⚠️
zamba/data/video.py 84.2% 3 Missing ⚠️
zamba/images/data.py 50.0% 2 Missing ⚠️
zamba/object_detection/yolox/yolox_model.py 84.6% 2 Missing ⚠️
zamba/models/densepose/densepose_manager.py 50.0% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           master    #400     +/-   ##
========================================
- Coverage    82.8%   82.4%   -0.4%     
========================================
  Files          37      39      +2     
  Lines        3194    3352    +158     
========================================
+ Hits         2646    2764    +118     
- Misses        548     588     +40     
Files with missing lines Coverage Δ
zamba/images/config.py 92.4% <100.0%> (-0.1%) ⬇️
zamba/models/config.py 97.3% <100.0%> (+1.2%) ⬆️
zamba/models/model_manager.py 84.6% <100.0%> (+0.4%) ⬆️
.../object_detection/yolox/megadetector_lite_yolox.py 99.1% <100.0%> (+<0.1%) ⬆️
zamba/pytorch/dataloaders.py 91.2% <100.0%> (ø)
zamba/pytorch_lightning/video_modules.py 100.0% <100.0%> (ø)
zamba/utils_cli.py 48.3% <100.0%> (ø)
zamba/models/densepose/densepose_manager.py 92.3% <50.0%> (+<0.1%) ⬆️
zamba/images/data.py 84.0% <50.0%> (-1.7%) ⬇️
zamba/object_detection/yolox/yolox_model.py 93.3% <84.6%> (-1.5%) ⬇️
... and 6 more
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@pjbull pjbull marked this pull request as ready for review February 17, 2026 02:02
@pjbull
Copy link
Copy Markdown
Member Author

pjbull commented Feb 17, 2026

@ejm714 This should be ready to go, I ran the test suite on a Windows machine, a Mac, and a linux box with a GPU.

The diff looks big, but almost all of the substantive code changes are just refactoring so that image/video code paths are separable and imports are lazy loaded if need be so that you don't have to have all the deps for both sides available. Other code changes are almost all about stabilizing the test suite across platforms.

Easiest way to start the review may be to look at the pyproject.toml changes, the Makefile changes, and the README changes to see what surface area changed and then the things in the weeds will make more sense.

Don't want to rock the boat if you're demoing this week but could be nice to have the new version for folks pip-installable!

@pjbull pjbull requested a review from ejm714 February 17, 2026 02:06
@pjbull pjbull changed the title WIP: Improved dependency handling Improved dependency handling Feb 18, 2026
Copy link
Copy Markdown
Collaborator

@ejm714 ejm714 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much good stuff in here! Just a couple small things

$ git clone https://github.com/drivendataorg/zamba.git
$ cd zamba
$ pip install -r requirements-dev.txt
$ uv pip install -e ".[tests,image,video,docs]"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe worth creating a "dev" dependency group for simpler install? https://docs.astral.sh/uv/concepts/projects/dependencies/#nesting-groups

Comment thread docs/docs/install.md


#### FFmpeg version 4
#### FFmpeg > 4.3 (only needed for video workflows)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we saw different predictions with ffmpeg 5 #180 that were sufficient enough to change the top class. Probably needs more testing before we can remove the ceiling

filtered_frames[:, 0, 0, 0]
== np.array(
[90, 80, 70, 60, 44, 67, 73, 5, 54, 64, 65, 34, 93, 72, 56, 50, 87, 83, 47, 88]
[90, 80, 70, 60, 72, 96, 66, 75, 95, 93, 69, 23, 76, 57, 45, 83, 50, 51, 56, 73]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these differences concerning or expected? want to make sure we're not dramatically changing the output of MDLite where it's going to affect overall classification accuracy

filtered_frames[:, 0, 0, 0]
== np.array(
[90, 80, 70, 60, 50, 87, 30, 40, 34, 10, 22, 20, 71, 16, 39, 14, 77, 65, 42, 13]
[90, 80, 70, 60, 50, 13, 30, 40, 49, 10, 34, 14, 52, 36, 81, 99, 79, 24, 0, 20]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same Q as above

DensePose dependencies are not installed. They are not distributed on PyPI.
Install them from GitHub after installing PyTorch:

pip install torch
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth having this be uv pip to match docs?

Comment thread pyproject.toml
]

[tool.uv]
override-dependencies = [
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TWIL!

Comment thread Makefile
pytest tests -vv

## Dev: fail fast on the first test failure
tests-fast: clean-test
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] should we standardize so all test commands are test (singular) not test?

  • make test
  • make test-fast
  • make test-debug
  • make test-image-only
  • make test-video-only

Comment thread Makefile
ZAMBA_RUN_DENSEPOSE_TESTS=1 pytest tests/test_densepose.py tests/test_cli.py::test_densepose_cli_options -vv

## Test image deps in isolation: fresh venv, install image extra only, run image + agnostic tests, cleanup
test-image-only:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this is duplicative of images-tests which i don't think we need any more. can you double check all test commands in the makefile?

Comment thread HISTORY.md
- Import image classifier lazily from `zamba.images` to avoid pulling in torch at package import time. Images config and manager use `zamba.models.config_common` and `zamba.models.instantiation` instead of config/utils from video.
- Make megadetector import resilient: try `megadetector.detection.run_detector`, fall back to `detection.run_detector`. Make MLflow optional in image training (log warning and continue without it if import or setup fails). Disable `torch.compile` on Windows in addition to macOS.
- Add pytest markers `video` and `image` and skip test files when the corresponding extra is not installed (conftest `collect_ignore` based on `_HAS_VIDEO` / `_HAS_IMAGE`). Define video-only fixtures and the dummy video model only when video deps are present. Set `CUDA_VISIBLE_DEVICES=0` in conftest to avoid DDP issues under pytest-xdist.
- Add Makefile targets `tests-fast` (fail on first failure) and `test-image-only` / `test-video-only` (isolated venvs with only image or video extra). CI runs these isolation steps and installs image,video extras for the main test matrix. DensePose tests install detectron2 from GitHub with `--no-build-isolation` instead of using a densepose extra.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flagging that if we rename commands in the makefile, need to reflect that here too

Comment thread docs/docs/install.md
# Installing `zamba`

Zamba has been developed and tested on macOS and Ubuntu Linux for both CPU and
Zamba has been developed and tested on macOS, Ubuntu Linux, and Windows for both CPU and
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from a user-perspective, this file would probably be easier to follow if the departure point was the operating system rather than the install step.

so instead of

step 1
- linux
- macos
- windows

it was

linux
- step 1
- step 2
- step 3

macos
- ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants