Commit 5f05879
Trackers 2.2.0 release (#247)
* Replace ByteTrack with ByteTrackTracker in README examples (#203)
* Fix ByteTrack code snippets: use ByteTrackTracker instead of ByteTrack (#204)
* Add Apache 2.0 license headers to source files (#205)
* chore: Remove redundant types from docstrings (#206)
* Drop redundant types
* Update docstrings
* CI: Align tag patterns across workflows for consistency (#209)
* Update publish-docs workflow to support 'develop' & custom tags (#196)
* Enhance publish-docs workflow to allow custom ref deployments via workflow_dispatch
* Update .github/workflows/publish-docs.yml
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* 🔄 Bump astral-sh/setup-uv in the github-actions group (#213)
Bumps the github-actions group with 1 update: [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv).
Updates `astral-sh/setup-uv` from 7.2.0 to 7.2.1
- [Release notes](https://github.com/astral-sh/setup-uv/releases)
- [Commits](https://github.com/astral-sh/setup-uv/compare/61cb8a9741eeb8a550a1b8544337180c0fc8476b...803947b9bd8e9f986429fa0c5a41c367cd732b41)
---
updated-dependencies:
- dependency-name: astral-sh/setup-uv
dependency-version: 7.2.1
dependency-type: direct:production
update-type: version-update:semver-patch
dependency-group: github-actions
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* :arrow_up: Bump twine from 6.1.0 to 6.2.0 (#216)
Bumps [twine](https://github.com/pypa/twine) from 6.1.0 to 6.2.0.
- [Release notes](https://github.com/pypa/twine/releases)
- [Changelog](https://github.com/pypa/twine/blob/main/docs/changelog.rst)
- [Commits](https://github.com/pypa/twine/compare/6.1.0...6.2.0)
---
updated-dependencies:
- dependency-name: twine
dependency-version: 6.2.0
dependency-type: direct:development
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* :arrow_up: Bump mkdocs-material from 9.6.13 to 9.7.1 (#217)
Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.6.13 to 9.7.1.
- [Release notes](https://github.com/squidfunk/mkdocs-material/releases)
- [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG)
- [Commits](https://github.com/squidfunk/mkdocs-material/compare/9.6.13...9.7.1)
---
updated-dependencies:
- dependency-name: mkdocs-material
dependency-version: 9.7.1
dependency-type: direct:development
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* :arrow_up: Bump pre-commit from 4.2.0 to 4.5.1 (#218)
Bumps [pre-commit](https://github.com/pre-commit/pre-commit) from 4.2.0 to 4.5.1.
- [Release notes](https://github.com/pre-commit/pre-commit/releases)
- [Changelog](https://github.com/pre-commit/pre-commit/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pre-commit/pre-commit/compare/v4.2.0...v4.5.1)
---
updated-dependencies:
- dependency-name: pre-commit
dependency-version: 4.5.1
dependency-type: direct:development
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* :arrow_up: Bump build from 1.2.2.post1 to 1.4.0 (#219)
Bumps [build](https://github.com/pypa/build) from 1.2.2.post1 to 1.4.0.
- [Release notes](https://github.com/pypa/build/releases)
- [Changelog](https://github.com/pypa/build/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pypa/build/compare/1.2.2.post1...1.4.0)
---
updated-dependencies:
- dependency-name: build
dependency-version: 1.4.0
dependency-type: direct:development
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* :arrow_up: Bump mkdocs-glightbox from 0.4.0 to 0.5.2 (#220)
Bumps [mkdocs-glightbox](https://github.com/blueswen/mkdocs-glightbox) from 0.4.0 to 0.5.2.
- [Release notes](https://github.com/blueswen/mkdocs-glightbox/releases)
- [Changelog](https://github.com/blueswen/mkdocs-glightbox/blob/main/CHANGELOG)
- [Commits](https://github.com/blueswen/mkdocs-glightbox/compare/v0.4.0...v0.5.2)
---
updated-dependencies:
- dependency-name: mkdocs-glightbox
dependency-version: 0.5.2
dependency-type: direct:development
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add evaluation module with box IoU/IoA calculations (TrackEval Part 1) (#210)
* Add box IoU/IoA calculation module from TrackEval
- Add box_iou() for Intersection over Union calculation
- Add box_ioa() for Intersection over Area calculation
- Support both xyxy and xywh box formats
- Include 32 unit tests covering edge cases and floating point precision
- Adapted from TrackEval with MIT license attribution
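For orientation, a minimal sketch of the vectorized pairwise IoU described above, assuming xyxy float arrays; the function name and signature here are illustrative, not the module's actual `box_iou()` API:

```python
import numpy as np

def pairwise_box_iou(boxes_a: np.ndarray, boxes_b: np.ndarray) -> np.ndarray:
    """Pairwise IoU for xyxy boxes of shapes (N, 4) and (M, 4) -> (N, M)."""
    # Intersection rectangle coordinates via broadcasting
    x1 = np.maximum(boxes_a[:, None, 0], boxes_b[None, :, 0])
    y1 = np.maximum(boxes_a[:, None, 1], boxes_b[None, :, 1])
    x2 = np.minimum(boxes_a[:, None, 2], boxes_b[None, :, 2])
    y2 = np.minimum(boxes_a[:, None, 3], boxes_b[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    union = area_a[:, None] + area_b[None, :] - inter
    # Guard degenerate boxes; IoA would divide by one set's area instead of union
    return np.where(union > 0, inter / union, 0.0)
```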
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix ruff and mypy linting errors
- Add S101 ignore for test files in pyproject.toml
- Split error test cases into separate test functions
- Remove type: ignore comments by simplifying type hints
- Clean up unused imports
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add Hungarian matching utility for detection assignment (TrackEval Part 2) (#211)
* Add Hungarian matching utility for detection assignment
- Add match_detections() using Jonker-Volgenant algorithm (scipy)
- Maximize similarity with threshold filtering
- Return matched/unmatched indices for GT and tracker detections
- Include 17 unit tests covering edge cases and scipy consistency
- Adapted from TrackEval matching patterns
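The matching described above reduces to a single scipy call: `scipy.optimize.linear_sum_assignment` implements a modified Jonker-Volgenant algorithm. A hedged sketch (the real `match_detections()` return shape and threshold semantics may differ):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_by_similarity(similarity: np.ndarray, threshold: float = 0.5):
    """Maximize total similarity over a (num_gt, num_tracker) matrix, then
    drop matched pairs whose similarity falls below the threshold."""
    rows, cols = linear_sum_assignment(similarity, maximize=True)
    keep = similarity[rows, cols] >= threshold  # >= vs > is an assumption here
    matched_gt, matched_tracker = rows[keep], cols[keep]
    unmatched_gt = np.setdiff1d(np.arange(similarity.shape[0]), matched_gt)
    unmatched_tracker = np.setdiff1d(np.arange(similarity.shape[1]), matched_tracker)
    return matched_gt, matched_tracker, unmatched_gt, unmatched_tracker
```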
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix unused variable warnings in tests
* Fix remaining unused variable warning
* Add diverse test cases for match_detections
- Off-diagonal matches (GT0->TR2, GT1->TR0, etc.)
- Swapped/reversed matches
- Sparse non-sequential matches
- Very unbalanced sizes (1 GT vs 5 trackers, 5 GTs vs 1 tracker)
- Only middle/corner elements matching
- Threshold edge cases
- Optimal vs greedy assignment tests
* Remove redundant tests from test_matching.py
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add CLEAR metrics module (MOTA, MOTP, IDSW, MT/PT/ML) (TrackEval Part 3) (#212)
* Add CLEAR metrics module for multi-object tracking evaluation
Implement compute_clear_metrics function adapted from TrackEval with
exact numerical parity. The function computes standard CLEAR metrics
including MOTA, MOTP, IDSW, and track quality metrics (MT/PT/ML).
Key implementation details:
- Score matrix construction with IDSW prioritization (1000x bonus)
- Hungarian matching with proper threshold filtering
- ID switch detection based on previous tracker associations
- MT/PT/ML thresholds (>80%, >=20%, <20%)
- Fragmentation counting for track interruptions
- Vectorized GT ID mapping using np.searchsorted
Tests are fully parametrized and only test the public API.
Reference: trackeval/metrics/clear.py:38-129 (eval_sequence method)
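For reference, the headline values reported follow the standard CLEAR definitions; a sketch with illustrative field names (not the module's actual return type):

```python
def clear_summary(tp: int, fn: int, fp: int, idsw: int, matched_iou_sum: float) -> dict:
    """MOTA penalizes misses, false positives, and ID switches relative to the
    number of GT boxes; MOTP averages localization similarity over matches."""
    num_gt = tp + fn
    mota = 1.0 - (fn + fp + idsw) / num_gt if num_gt else 0.0
    motp = matched_iou_sum / tp if tp else 0.0
    return {"MOTA": mota, "MOTP": motp, "IDSW": idsw}
```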
* fix(pre_commit): 🎨 auto format pre-commit hooks
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add MOT format file loading and sequence preparation for evaluation (TrackEval Part 4) (#214)
* Add MOT format file loading and sequence preparation for E2E evaluation
- Add MOTFrameData dataclass for per-frame detection data
- Add MOTSequenceData dataclass for prepared sequence data ready for metrics
- Add load_mot_file() to parse MOT Challenge format files
- Add prepare_mot_sequence() to compute IoU and remap IDs for evaluation
- Update __init__.py to export new types and functions
- Add comprehensive unit tests (20 tests for io module)
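The MOT Challenge text format being parsed is plain CSV with one detection per row, `frame,id,x,y,w,h,conf,...`. A minimal reader sketch for orientation (not the actual `load_mot_file()` implementation; it drops the confidence and class columns):

```python
from collections import defaultdict
import numpy as np

def load_mot_rows(path: str) -> dict[int, np.ndarray]:
    """Group detections per frame as (track_id, x, y, w, h) rows."""
    frames: dict[int, list[list[float]]] = defaultdict(list)
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            frame, track_id, x, y, w, h = [float(v) for v in line.split(",")[:6]]
            frames[int(frame)].append([track_id, x, y, w, h])
    return {k: np.asarray(v) for k, v in frames.items()}
```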
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Add integration tests for CLEAR metrics validation against TrackEval
- Add test/conftest.py with fixture to download and cache SoccerNet test data
- Add test/eval/test_integration.py with 49 parametrized tests for all sequences
- Add ci-integration-tests.yml workflow that runs only on eval code changes
- Update ci-tests.yml to exclude integration tests from regular CI
- Add integration marker to pyproject.toml pytest config
All 49 sequences pass with exact numerical parity to TrackEval for:
- Integer metrics: CLR_TP, CLR_FN, CLR_FP, IDSW, MT, PT, ML, Frag
- Float metrics: MOTA, MOTP, MTR, PTR, MLR
* Refactor integration tests and improve code quality
- Rename CI job to 'TrackEval Parity Validation' for clarity
- Derive sequence names dynamically from expected_results.json
- Simplify test_io.py from 21 to 10 test cases while maintaining coverage
- Restore useful comments in io.py around ID remapping and IoU computation
- Clean up conftest.py and test_integration.py docstrings
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Update integration tests to use SportsMOT and DanceTrack datasets (#221)
* Update integration tests to use SportsMOT and DanceTrack data
Replace SoccerNet test data with SportsMOT (45 sequences) and DanceTrack
(25 sequences) datasets. The tests now validate CLEAR metrics against
TrackEval results for 70 total sequences.
- Update URLs to new GCS-hosted test data zips
- Add multi-dataset fixture support in conftest.py
- Parametrize tests across both datasets
- Fix metric comparison (new format uses fractions, not percentages)
* Fix mypy conftest module name conflict
Use a pytest fixture instead of directly importing from conftest.py to avoid
mypy seeing the file under two module names ("conftest" and "test.conftest").
* Add type annotation for test_cases variable
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add high-level evaluation API and CLI (TrackEval Part 5) (#215)
* Add high-level evaluation API and CLI for tracker evaluation
Introduce SDK functions and CLI for evaluating multi-object tracking
results against ground truth using MOT Challenge format data.
SDK:
- evaluate_mot_sequence(): Evaluate single sequence, returns CLEAR metrics
- evaluate_benchmark(): Evaluate multiple sequences with aggregation
CLI (beta):
- `trackers eval --gt <file> --tracker <file>` for single sequence
- `trackers eval --gt-dir <dir> --tracker-dir <dir>` for benchmarks
- Optional rich output with `pip install trackers[cli]`
- Beta warning displayed on CLI usage
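A hypothetical SDK usage mirroring the CLI flags above; the keyword names and import path are assumptions, only the function names come from this commit:

```python
from trackers.eval import evaluate_mot_sequence, evaluate_benchmark  # import path assumed

# Single sequence, analogous to `trackers eval --gt <file> --tracker <file>`
seq_result = evaluate_mot_sequence(gt="gt/seq-01.txt", tracker="results/seq-01.txt")

# Benchmark, analogous to `trackers eval --gt-dir <dir> --tracker-dir <dir>`
bench_result = evaluate_benchmark(gt_dir="data/gt", tracker_dir="results")
```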
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Add smart auto-detection and structured result objects to evaluation API (#222)
* Add smart auto-detection and structured result objects to evaluation API
- Remove data_format parameter from SDK and --data-format flag from CLI
- Add smart detection: auto-detect format (flat vs MOT17), benchmark, split, and tracker name
- Log detection results for transparency; error helpfully when ambiguous
- Add BenchmarkResult and SequenceResult dataclasses with json/table/save/load methods
- Add all CLEAR metrics (MOTA, MOTP, MODA, CLR_Re, CLR_Pr, MTR, PTR, MLR, sMOTA, etc.)
- Simplify test fixtures to use auto-detection instead of explicit format parameters
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Simplify CLI table output to use SDK result.table() method
- Remove custom Rich table formatting that truncated columns
- Use result.table(columns) for consistent, readable output
- Set sensible default columns: MOTA, MOTP, IDSW, CLR_FP, CLR_FN, MT, ML
- Remove unused Rich imports and helper functions
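A sketch of the structured-result workflow this adds; the `json`/`table`/`save`/`load` method names and the `columns` parameter come from these commit messages, while arguments and paths are illustrative:

```python
from trackers.eval import evaluate_benchmark  # import path assumed, as above

result = evaluate_benchmark(gt_dir="data/gt", tracker_dir="results")
# Limit the per-sequence table to a few columns; by default the sensible
# set listed above (MOTA, MOTP, IDSW, CLR_FP, CLR_FN, MT, ML) is shown
print(result.table(columns=["MOTA", "MOTP", "IDSW"]))
result.save("benchmark_result.json")  # round-trips via BenchmarkResult.load()
```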
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add HOTA (Higher Order Tracking Accuracy) metrics (TrackEval Part 6) (#223)
* Add HOTA (Higher Order Tracking Accuracy) metrics
- Create trackers/eval/hota.py with compute_hota_metrics() and
aggregate_hota_metrics() functions implementing HOTA algorithm
- Add HOTAMetrics dataclass for storing HOTA results
- Update SequenceResult and BenchmarkResult to support optional HOTA metrics
- Integrate HOTA computation into evaluate_mot_sequence and evaluate_benchmark
- Update CLI to support --metrics HOTA option
- Add comprehensive unit tests for HOTA computation and aggregation
HOTA evaluates tracking at multiple IoU thresholds (0.05-0.95) and computes:
- Detection metrics: DetA, DetRe, DetPr
- Association metrics: AssA, AssRe, AssPr
- Combined HOTA = sqrt(DetA * AssA)
- Localization accuracy: LocA
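The combination step in the last three bullets, written out as a sketch with placeholder per-alpha inputs (array names are illustrative):

```python
import numpy as np

alphas = np.arange(0.05, 0.96, 0.05)         # 19 IoU thresholds, 0.05 .. 0.95
det_a = np.linspace(0.9, 0.3, len(alphas))   # placeholder DetA per alpha
ass_a = np.linspace(0.8, 0.2, len(alphas))   # placeholder AssA per alpha

hota_per_alpha = np.sqrt(det_a * ass_a)      # HOTA(alpha) = sqrt(DetA * AssA)
hota = float(hota_per_alpha.mean())          # reported scalar: mean over alphas
```

The mean over alphas is the scalar that the test-data fix further down pins against TrackEval (HOTA, not HOTA(0)).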
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Show all metrics by default in table output
Remove the column limitation that was originally added as a workaround
for Rich table truncation. Now using plain-text tables that don't
truncate, so show all available metrics by default:
- CLEAR-only: 17 metrics
- HOTA+CLEAR: 28 metrics (11 HOTA + 17 CLEAR)
Users can still limit columns via the columns parameter if desired.
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix HOTA aggregation TypeError with numpy arrays
HOTAMetrics.to_dict() was converting numpy arrays to lists for JSON
serialization, but aggregate_hota_metrics() needs numpy arrays for
weighted averaging math.
Added arrays_as_list parameter to to_dict() to control this behavior:
- arrays_as_list=True (default): convert to lists for JSON
- arrays_as_list=False: keep as numpy arrays for aggregation
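A sketch of that toggle; the dataclass and field are illustrative, only the `arrays_as_list` parameter name comes from the commit message:

```python
from dataclasses import dataclass, fields
import numpy as np

@dataclass
class Metrics:
    hota_per_alpha: np.ndarray  # illustrative array-valued field

    def to_dict(self, arrays_as_list: bool = True) -> dict:
        out = {f.name: getattr(self, f.name) for f in fields(self)}
        if arrays_as_list:  # JSON-friendly lists vs raw arrays for aggregation
            out = {k: v.tolist() if isinstance(v, np.ndarray) else v
                   for k, v in out.items()}
        return out
```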
* Add HOTA verification to integration tests
Update all 4 integration tests to:
- Pass metrics=["CLEAR", "HOTA"] to evaluate_benchmark()
- Verify HOTA metrics (HOTA, DetA, AssA, LocA) against TrackEval expected results
This ensures numerical parity is verified for both CLEAR and HOTA metrics.
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix HOTA expected values in test data
Update test data URLs to use regenerated zips with correct HOTA metrics.
The previous expected_results.json files incorrectly used HOTA(0) (value at
alpha=0.05) instead of the mean HOTA across all alpha thresholds.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add Identity metrics (IDF1, IDR, IDP) to evaluation module (TrackEval Part 7) (#224)
* Add Identity metrics (IDF1, IDR, IDP) implementation
Implements Identity metrics following TrackEval's algorithm:
- compute_identity_metrics() for single sequence evaluation
- aggregate_identity_metrics() for benchmark aggregation
- IdentityMetrics dataclass for structured results
Updates evaluate_mot_sequence and evaluate_benchmark to support
metrics=["Identity"] alongside CLEAR and HOTA.
Integration tests now verify CLEAR, HOTA, and Identity metrics
against TrackEval precomputed values.
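The three scores follow the standard ID-measure definitions over globally matched identities; a sketch with illustrative names, where `idtp`/`idfp`/`idfn` are the global ID true positives, false positives, and false negatives:

```python
def identity_scores(idtp: int, idfp: int, idfn: int) -> dict[str, float]:
    """Standard Identity formulas; a sketch, not the IdentityMetrics dataclass."""
    idr = idtp / (idtp + idfn) if (idtp + idfn) else 0.0   # ID recall
    idp = idtp / (idtp + idfp) if (idtp + idfp) else 0.0   # ID precision
    denom = 2 * idtp + idfp + idfn
    idf1 = 2 * idtp / denom if denom else 0.0              # harmonic mean of IDR/IDP
    return {"IDF1": idf1, "IDR": idr, "IDP": idp}
```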
* Add Identity metrics support to CLI
Update --metrics argument to accept Identity alongside CLEAR and HOTA.
* Address code review feedback (part 1)
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Address code review feedback (part 2)
* Remove unused match_detections utility
The match_detections function was not used by any metric internally.
Each metric (CLEAR, HOTA, Identity) implements specialized matching
logic that cannot be generalized into a shared utility.
Deleted:
- trackers/eval/matching.py
- test/eval/test_matching.py
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* TrackEval integration part 8 (#226)
* Add box IoU/IoA calculation module from TrackEval
- Add box_iou() for Intersection over Union calculation
- Add box_ioa() for Intersection over Area calculation
- Support both xyxy and xywh box formats
- Include 32 unit tests covering edge cases and floating point precision
- Adapted from TrackEval with MIT license attribution
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix ruff and mypy linting errors
- Add S101 ignore for test files in pyproject.toml
- Split error test cases into separate test functions
- Remove type: ignore comments by simplifying type hints
- Clean up unused imports
* Add Hungarian matching utility for detection assignment
- Add match_detections() using Jonker-Volgenant algorithm (scipy)
- Maximize similarity with threshold filtering
- Return matched/unmatched indices for GT and tracker detections
- Include 17 unit tests covering edge cases and scipy consistency
- Adapted from TrackEval matching patterns
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix unused variable warnings in tests
* Fix remaining unused variable warning
* Add diverse test cases for match_detections
- Off-diagonal matches (GT0->TR2, GT1->TR0, etc.)
- Swapped/reversed matches
- Sparse non-sequential matches
- Very unbalanced sizes (1 GT vs 5 trackers, 5 GTs vs 1 tracker)
- Only middle/corner elements matching
- Threshold edge cases
- Optimal vs greedy assignment tests
* Remove redundant tests from test_matching.py
* Add CLEAR metrics module for multi-object tracking evaluation
Implement compute_clear_metrics function adapted from TrackEval with
exact numerical parity. The function computes standard CLEAR metrics
including MOTA, MOTP, IDSW, and track quality metrics (MT/PT/ML).
Key implementation details:
- Score matrix construction with IDSW prioritization (1000x bonus)
- Hungarian matching with proper threshold filtering
- ID switch detection based on previous tracker associations
- MT/PT/ML thresholds (>80%, >=20%, <20%)
- Fragmentation counting for track interruptions
- Vectorized GT ID mapping using np.searchsorted
Tests are fully parametrized and only test the public API.
Reference: trackeval/metrics/clear.py:38-129 (eval_sequence method)
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Add CLEAR metrics module for multi-object tracking evaluation
Implement compute_clear_metrics function adapted from TrackEval with
exact numerical parity. The function computes standard CLEAR metrics
including MOTA, MOTP, IDSW, and track quality metrics (MT/PT/ML).
Key implementation details:
- Score matrix construction with IDSW prioritization (1000x bonus)
- Hungarian matching with proper threshold filtering
- ID switch detection based on previous tracker associations
- MT/PT/ML thresholds (>80%, >=20%, <20%)
- Fragmentation counting for track interruptions
- Vectorized GT ID mapping using np.searchsorted
Tests are fully parametrized and only test the public API.
Reference: trackeval/metrics/clear.py:38-129 (eval_sequence method)
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Add MOT format file loading and sequence preparation for E2E evaluation
- Add MOTFrameData dataclass for per-frame detection data
- Add MOTSequenceData dataclass for prepared sequence data ready for metrics
- Add load_mot_file() to parse MOT Challenge format files
- Add prepare_mot_sequence() to compute IoU and remap IDs for evaluation
- Update __init__.py to export new types and functions
- Add comprehensive unit tests (20 tests for io module)
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Add integration tests for CLEAR metrics validation against TrackEval
- Add test/conftest.py with fixture to download and cache SoccerNet test data
- Add test/eval/test_integration.py with 49 parametrized tests for all sequences
- Add ci-integration-tests.yml workflow that runs only on eval code changes
- Update ci-tests.yml to exclude integration tests from regular CI
- Add integration marker to pyproject.toml pytest config
All 49 sequences pass with exact numerical parity to TrackEval for:
- Integer metrics: CLR_TP, CLR_FN, CLR_FP, IDSW, MT, PT, ML, Frag
- Float metrics: MOTA, MOTP, MTR, PTR, MLR
* Refactor integration tests and improve code quality
- Rename CI job to 'TrackEval Parity Validation' for clarity
- Derive sequence names dynamically from expected_results.json
- Simplify test_io.py from 21 to 10 test cases while maintaining coverage
- Restore useful comments in io.py around ID remapping and IoU computation
- Clean up conftest.py and test_integration.py docstrings
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Add high-level evaluation API and CLI for tracker evaluation
Introduce SDK functions and CLI for evaluating multi-object tracking
results against ground truth using MOT Challenge format data.
SDK:
- evaluate_mot_sequence(): Evaluate single sequence, returns CLEAR metrics
- evaluate_benchmark(): Evaluate multiple sequences with aggregation
CLI (beta):
- `trackers eval --gt <file> --tracker <file>` for single sequence
- `trackers eval --gt-dir <dir> --tracker-dir <dir>` for benchmarks
- Optional rich output with `pip install trackers[cli]`
- Beta warning displayed on CLI usage
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Add high-level evaluation API and CLI for tracker evaluation
Introduce SDK functions and CLI for evaluating multi-object tracking
results against ground truth using MOT Challenge format data.
SDK:
- evaluate_mot_sequence(): Evaluate single sequence, returns CLEAR metrics
- evaluate_benchmark(): Evaluate multiple sequences with aggregation
CLI (beta):
- `trackers eval --gt <file> --tracker <file>` for single sequence
- `trackers eval --gt-dir <dir> --tracker-dir <dir>` for benchmarks
- Optional rich output with `pip install trackers[cli]`
- Beta warning displayed on CLI usage
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Update integration tests to use SportsMOT and DanceTrack datasets (#221)
* Update integration tests to use SportsMOT and DanceTrack data
Replace SoccerNet test data with SportsMOT (45 sequences) and DanceTrack
(25 sequences) datasets. The tests now validate CLEAR metrics against
TrackEval results for 70 total sequences.
- Update URLs to new GCS-hosted test data zips
- Add multi-dataset fixture support in conftest.py
- Parametrize tests across both datasets
- Fix metric comparison (new format uses fractions, not percentages)
* Fix mypy conftest module name conflict
Use a pytest fixture instead of directly importing from conftest.py to avoid
mypy seeing the file under two module names ("conftest" and "test.conftest").
* Add type annotation for test_cases variable
* Add smart auto-detection and structured result objects to evaluation API (#222)
* Add smart auto-detection and structured result objects to evaluation API
- Remove data_format parameter from SDK and --data-format flag from CLI
- Add smart detection: auto-detect format (flat vs MOT17), benchmark, split, and tracker name
- Log detection results for transparency; error helpfully when ambiguous
- Add BenchmarkResult and SequenceResult dataclasses with json/table/save/load methods
- Add all CLEAR metrics (MOTA, MOTP, MODA, CLR_Re, CLR_Pr, MTR, PTR, MLR, sMOTA, etc.)
- Simplify test fixtures to use auto-detection instead of explicit format parameters
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Simplify CLI table output to use SDK result.table() method
- Remove custom Rich table formatting that truncated columns
- Use result.table(columns) for consistent, readable output
- Set sensible default columns: MOTA, MOTP, IDSW, CLR_FP, CLR_FN, MT, ML
- Remove unused Rich imports and helper functions
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add HOTA (Higher Order Tracking Accuracy) metrics
- Create trackers/eval/hota.py with compute_hota_metrics() and
aggregate_hota_metrics() functions implementing HOTA algorithm
- Add HOTAMetrics dataclass for storing HOTA results
- Update SequenceResult and BenchmarkResult to support optional HOTA metrics
- Integrate HOTA computation into evaluate_mot_sequence and evaluate_benchmark
- Update CLI to support --metrics HOTA option
- Add comprehensive unit tests for HOTA computation and aggregation
HOTA evaluates tracking at multiple IoU thresholds (0.05-0.95) and computes:
- Detection metrics: DetA, DetRe, DetPr
- Association metrics: AssA, AssRe, AssPr
- Combined HOTA = sqrt(DetA * AssA)
- Localization accuracy: LocA
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Add HOTA (Higher Order Tracking Accuracy) metrics
- Create trackers/eval/hota.py with compute_hota_metrics() and
aggregate_hota_metrics() functions implementing HOTA algorithm
- Add HOTAMetrics dataclass for storing HOTA results
- Update SequenceResult and BenchmarkResult to support optional HOTA metrics
- Integrate HOTA computation into evaluate_mot_sequence and evaluate_benchmark
- Update CLI to support --metrics HOTA option
- Add comprehensive unit tests for HOTA computation and aggregation
HOTA evaluates tracking at multiple IoU thresholds (0.05-0.95) and computes:
- Detection metrics: DetA, DetRe, DetPr
- Association metrics: AssA, AssRe, AssPr
- Combined HOTA = sqrt(DetA * AssA)
- Localization accuracy: LocA
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Show all metrics by default in table output
Remove the column limitation that was originally added as a workaround
for Rich table truncation. Now using plain-text tables that don't
truncate, so show all available metrics by default:
- CLEAR-only: 17 metrics
- HOTA+CLEAR: 28 metrics (11 HOTA + 17 CLEAR)
Users can still limit columns via the columns parameter if desired.
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix HOTA aggregation TypeError with numpy arrays
HOTAMetrics.to_dict() was converting numpy arrays to lists for JSON
serialization, but aggregate_hota_metrics() needs numpy arrays for
weighted averaging math.
Added arrays_as_list parameter to to_dict() to control this behavior:
- arrays_as_list=True (default): convert to lists for JSON
- arrays_as_list=False: keep as numpy arrays for aggregation
* Add HOTA verification to integration tests
Update all 4 integration tests to:
- Pass metrics=["CLEAR", "HOTA"] to evaluate_benchmark()
- Verify HOTA metrics (HOTA, DetA, AssA, LocA) against TrackEval expected results
This ensures numerical parity is verified for both CLEAR and HOTA metrics.
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix HOTA expected values in test data
Update test data URLs to use regenerated zips with correct HOTA metrics.
The previous expected_results.json files incorrectly used HOTA(0) (value at
alpha=0.05) instead of the mean HOTA across all alpha thresholds.
* Add Identity metrics (IDF1, IDR, IDP) implementation
Implements Identity metrics following TrackEval's algorithm (formulas sketched below):
- compute_identity_metrics() for single sequence evaluation
- aggregate_identity_metrics() for benchmark aggregation
- IdentityMetrics dataclass for structured results
Updates evaluate_mot_sequence and evaluate_benchmark to support
metrics=["Identity"] alongside CLEAR and HOTA.
Integration tests now verify CLEAR, HOTA, and Identity metrics
against TrackEval precomputed values.
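For reference, a small sketch of the standard Identity-metric formulas as TrackEval defines them, given global ID true positives / false positives / false negatives from the optimal ID assignment; the counts are made-up example values:

```python
idtp, idfp, idfn = 900, 50, 100  # placeholder counts from an ID assignment

idr = idtp / (idtp + idfn)                  # ID recall
idp = idtp / (idtp + idfp)                  # ID precision
idf1 = 2 * idtp / (2 * idtp + idfp + idfn)  # harmonic mean of IDR and IDP

print(f"IDF1={idf1:.4f} IDR={idr:.4f} IDP={idp:.4f}")
```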
* Add Identity metrics support to CLI
Update --metrics argument to accept Identity alongside CLEAR and HOTA.
* code review pt1
* fix(pre_commit): 🎨 auto format pre-commit hooks
* code review pt2
* Remove unused match_detections utility
The match_detections function was not used by any metric internally.
Each metric (CLEAR, HOTA, Identity) implements specialized matching
logic that cannot be generalized into a shared utility.
Deleted:
- trackers/eval/matching.py
- test/eval/test_matching.py
* Add comprehensive evaluation documentation and API reference (TrackEval Part 8)
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Refine evaluation docs and unify docstring style
- Remove premature CLI and Python quickstart pages
- Fix copy-code.js to handle empty continuation lines (>>>)
- Remove broken links to cli.md from evaluate.md
- Remove duplicate API sections from SORT/ByteTrack pages
- Unify docstring style in results.py to match evaluate.py
- Remove examples from results.py docstrings for consistency
- Standardize attribute documentation format (no backticks on names)
* Revamp documentation with Stripe-style formatting
- Rewrite evaluate.md with concise intro, "What you'll learn" section,
clean tabs for Python/CLI, and collapsible troubleshooting
- Apply same style to install.md for consistency
- Add full-width table CSS styling
- Remove help.md page from navigation
* fix(pre_commit): 🎨 auto format pre-commit hooks
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* bump version from `2.1.0` to `2.2.0rc0`
* Enable mdformat and codespell pre-commit hooks (#227)
* Add TrackEval-compatible evaluation metrics (CLEAR, HOTA, Identity) with SDK and CLI (#225)
* Add evaluation module with box IoU/IoA calculations (TrackEval Part 1) (#210)
* Add box IoU/IoA calculation module from TrackEval
- Add box_iou() for Intersection over Union calculation (sketched below)
- Add box_ioa() for Intersection over Area calculation
- Support both xyxy and xywh box formats
- Include 32 unit tests covering edge cases and floating point precision
- Adapted from TrackEval with MIT license attribution
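A self-contained re-derivation of the IoU computation for the xyxy case, written as a hypothetical box_iou_xyxy helper; the shipped box_iou() also handles xywh input, and box_ioa() divides by one box's area instead of the union:

```python
import numpy as np


def box_iou_xyxy(boxes_a: np.ndarray, boxes_b: np.ndarray) -> np.ndarray:
    """Pairwise IoU for (N, 4) and (M, 4) boxes in xyxy format."""
    # Intersection corners, broadcast to an (N, M) grid of box pairs
    x1 = np.maximum(boxes_a[:, None, 0], boxes_b[None, :, 0])
    y1 = np.maximum(boxes_a[:, None, 1], boxes_b[None, :, 1])
    x2 = np.minimum(boxes_a[:, None, 2], boxes_b[None, :, 2])
    y2 = np.minimum(boxes_a[:, None, 3], boxes_b[None, :, 3])

    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    union = area_a[:, None] + area_b[None, :] - inter
    return inter / np.maximum(union, np.finfo(float).eps)  # guard zero-area pairs
```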
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix ruff and mypy linting errors
- Add S101 ignore for test files in pyproject.toml
- Split error test cases into separate test functions
- Remove type: ignore comments by simplifying type hints
- Clean up unused imports
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add Hungarian matching utility for detection assignment (TrackEval Part 2) (#211)
* Add Hungarian matching utility for detection assignment
- Add match_detections() using the Jonker-Volgenant algorithm (scipy); see the sketch below
- Maximize similarity with threshold filtering
- Return matched/unmatched indices for GT and tracker detections
- Include 17 unit tests covering edge cases and scipy consistency
- Adapted from TrackEval matching patterns
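A sketch of the described utility under an assumed signature. scipy's linear_sum_assignment implements a Jonker-Volgenant variant; the similarity matrix is negated to turn maximization into the solver's minimization. Note that a later commit removes this helper in favor of per-metric matching:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_detections_sketch(similarity: np.ndarray, threshold: float = 0.5):
    """Hungarian matching over a (num_gt, num_tracker) similarity matrix."""
    rows, cols = linear_sum_assignment(-similarity)  # maximize total similarity
    keep = similarity[rows, cols] >= threshold       # drop sub-threshold pairs
    matched_gt, matched_tr = rows[keep], cols[keep]
    unmatched_gt = np.setdiff1d(np.arange(similarity.shape[0]), matched_gt)
    unmatched_tr = np.setdiff1d(np.arange(similarity.shape[1]), matched_tr)
    return matched_gt, matched_tr, unmatched_gt, unmatched_tr
```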
* fix(pre_commit): 🎨 auto format pre-commit hooks
* Fix unused variable warnings in tests
* Fix remaining unused variable warning
* Add diverse test cases for match_detections
- Off-diagonal matches (GT0->TR2, GT1->TR0, etc.)
- Swapped/reversed matches
- Sparse non-sequential matches
- Very unbalanced sizes (1 GT vs 5 trackers, 5 GTs vs 1 tracker)
- Only middle/corner elements matching
- Threshold edge cases
- Optimal vs greedy assignment tests
* Remove redundant tests from test_matching.py
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add CLEAR metrics module (MOTA, MOTP, IDSW, MT/PT/ML) (TrackEval Part 3) (#212)
* Add CLEAR metrics module for multi-object tracking evaluation
Implement the compute_clear_metrics function, adapted from TrackEval with
exact numerical parity. The function computes standard CLEAR metrics
including MOTA, MOTP, IDSW, and track quality metrics (MT/PT/ML).
Key implementation details (score-matrix step sketched below):
- Score matrix construction with IDSW prioritization (1000x bonus)
- Hungarian matching with proper threshold filtering
- ID switch detection based on previous tracker associations
- MT/PT/ML thresholds (MT: >80% of GT tracked; PT: >=20%; ML: <20%)
- Fragmentation counting for track interruptions
- Vectorized GT ID mapping using np.searchsorted
Tests are fully parametrized and only test the public API.
Reference: trackeval/metrics/clear.py:38-129 (eval_sequence method)
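A sketch of the score-matrix step with the 1000x continuity bonus, under assumed names and shapes: similarity is the per-frame GT-by-tracker IoU matrix, and prev_tracker_for_gt holds each GT row's previously matched tracker id (or -1):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def frame_matches(similarity, tracker_ids, prev_tracker_for_gt, threshold=0.5):
    # Continuing GT-tracker pairs get a large bonus so they outrank any
    # plain-IoU alternative; an ID switch then only occurs when the old
    # match is no longer available (IDSW prioritization).
    continuity = prev_tracker_for_gt[:, None] == tracker_ids[None, :]
    score = 1000.0 * continuity + similarity
    score[similarity < threshold] = 0.0         # forbid sub-threshold pairs

    rows, cols = linear_sum_assignment(-score)  # Hungarian, maximizing score
    keep = similarity[rows, cols] >= threshold
    return rows[keep], cols[keep]
```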
* fix(pre_commit): 🎨 auto format pre-commit hooks
85 files changed: 13,761 additions and 950 deletions
File tree:
- .github/workflows
- demo
- docs
  - api
  - hooks
  - javascripts
  - learn
  - overrides/stylesheets
  - trackers
- test
  - annotators
  - core
  - eval
  - io_tests
  - motion
  - scripts
  - utils
- trackers
  - annotators
  - core
    - bytetrack
    - sort
  - eval
  - io
  - motion
  - scripts
  - utils