Releases: pytorch/tensordict
TensorDict v0.12.2
Patch release with a bug fix for consolidated nested tensors.
Bug Fixes
- Fix `_ragged_idx` loss during consolidation of nested tensors, which caused numerical incorrectness when the nested tensor had more than 2 dimensions and `ragged_idx != 1` (#1675)
Installation
pip install tensordict==0.12.2
Full Changelog: v0.12.1...v0.12.2
TensorDict v0.12.1
Patch release with a torch.compile bug fix.
Bug Fixes
- Fix `unravel_keys` inconsistency that prevented `torch.compile` from working correctly when called with a single key (#1674)
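To make the kind of key normalization involved here concrete, the following plain-Python sketch shows what unraveling a nested key could look like. It is an illustration only, not the library's implementation: the helper names `unravel_key_sketch` and `unravel_keys_sketch` are hypothetical, and the convention assumed is that a one-element tuple collapses to its string while nested tuples are flattened.

```python
from typing import Tuple, Union


def unravel_key_sketch(key: Union[str, tuple]) -> Union[str, Tuple[str, ...]]:
    """Flatten a possibly nested key; collapse 1-tuples to a plain string."""
    if isinstance(key, str):
        return key
    flat = []
    for part in key:
        sub = unravel_key_sketch(part)
        # A string contributes one level; a tuple contributes all its levels.
        flat.extend([sub] if isinstance(sub, str) else sub)
    return flat[0] if len(flat) == 1 else tuple(flat)


def unravel_keys_sketch(*keys: Union[str, tuple]) -> tuple:
    """Normalize several keys; the result is always a tuple, even for one key."""
    return tuple(unravel_key_sketch(k) for k in keys)
```

In this sketch a single-key call still returns a one-element tuple, so callers see the same return shape regardless of how many keys are passed.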
Installation
pip install tensordict==0.12.1
TensorDict v0.12.0
Highlights
TensorDict v0.12.0 introduces TypedTensorDict for schema-enforced tensor dictionaries, a full distributed collectives suite (broadcast, all_reduce, all_gather, scatter), TensorDictStore with Redis/Dragonfly/KeyDB backends, and major torch.compile and performance improvements. The UnbatchedTensor has been rewritten as a proper tensor subclass, and state_dict handling has been overhauled for consistency.
Breaking Changes
- `UnbatchedTensor` is now a `__torch_dispatch__`-based tensor subclass (was previously a wrapper) (#1638, #1648)
- `state_dict` is now flat by default, with auto-detection in `load_state_dict` for backwards compatibility
- TensorClass `state_dict` now uses logical keys
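As a rough illustration of a flat layout and of auto-detection on load, the plain-Python sketch below flattens a nested mapping into dot-separated keys and accepts either layout when loading. The `.` separator and the helper names `flatten_state` and `load_state` are assumptions made for this example; the library's actual key format and detection logic may differ.

```python
from typing import Any, Dict


def flatten_state(nested: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn {"a": {"b": 1}} into {"a.b": 1} (the separator is an assumption)."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_state(value, prefix=full_key + "."))
        else:
            flat[full_key] = value
    return flat


def load_state(state: Dict[str, Any]) -> Dict[str, Any]:
    """Accept either layout: nested input is detected and flattened first."""
    is_nested = any(isinstance(v, dict) for v in state.values())
    return flatten_state(state) if is_nested else dict(state)


nested = {"obs": {"pixels": 1, "vector": 2}, "reward": 3}
flat = flatten_state(nested)
```

Both `load_state(nested)` and `load_state(flat)` yield the same flat mapping, which is the backwards-compatibility property the auto-detection is meant to provide.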
Features
- TypedTensorDict: Schema-enforced TensorDicts with type annotations, cross-class compatibility, and `torch.compile` support (#1657, #1659, #1660, #1662, #1663)
- TensorDictStore: Redis/Dragonfly/KeyDB-backed TensorDict with TensorClass support, lazy stack storage, and optimized indexed ops
- Distributed collectives: `broadcast`, `all_reduce`, `all_gather`, `scatter`, consolidated `send`/`recv` and `init_remote`/`from_remote_init` with UCXX transport support (#1611)
- `set_printoptions`: Configurable TensorDict repr with verbose mode (#1654, #1655, #1665)
- `torch.func` support: `jacrev`, `jacfwd`, and `hessian` now work with TensorDict (#1613)
- `vmap` with unbatched data: TensorDicts containing unbatched tensors can now be vmapped (#1625)
- `TensorClass.select(as_tensordict=...)` parameter (#1544)
- `TensorDictBase.is_non_tensor(key)` for consistent non-tensor key detection
Bug Fixes
- Fix `HigherOrderOperator` support in `__torch_function__` (#1668)
- Fix `td[key] = []` handling (#1666)
- Fix `UnbatchedTensor.tolist()` (#1664)
- Fix `UnbatchedTensor` CUDA pickling for multiprocessing (#1656)
- Fix `UnbatchedTensor` indexing without batch dim, GPU failures, getitem/stack (#1607, #1626, #1633)
- Fix `replace()` recompiles under `torch.compile` (#1605)
- Fix `auto_batch_size` regression with `NonTensorStack` (#1609)
- Fix `NonTensorData` positional args causing graph breaks under `torch.compile` (#1630)
- Fix `state_dict` error messages, params forwarding, detach
- Pin `pybind11>=2.13` for Python 3.13 compatibility
Performance
- Speed up `TensorDict.__init__` by bypassing `set()` dispatch chain (#1551)
- Speed up `TensorClass.__init__` by bypassing `__setattr__` and dataclass `__init__` (#1552)
- Use CUDA event sync instead of `torch.cuda.synchronize()` in `.to()` (#1545)
- Reduce guards and recompiles for TensorDict under `torch.compile` (#1636)
- Compile-friendly TensorDict build (#1624)
TensorDict v0.11.0
Highlights
This release brings Python 3.14 support (including free-threaded mode), drops Python 3.9, adds extensive torch.compile compatibility improvements, introduces new tensor manipulation methods, and expands NPU device support.
Breaking Changes
- Dropped Python 3.9 support - Python 3.10+ is now required (#1488)
- Removed deprecated features scheduled for v0.11 (#1526):
  - `pad_sequence` `device` parameter removed (deprecated since v0.5)
  - `MemoryMappedTensor._tensor` property now raises `RuntimeError` - use the tensor directly since it's a tensor subclass
  - Class decoration with context managers like `set_lazy_legacy` now raises `RuntimeError`
  - `lock`, `unlock`, `rename_key` (without trailing underscore) now raise `RuntimeError` - use `lock_`, `unlock_`, `rename_key_` instead
Features
- Python 3.14 and free-threading support (#1481, #1438) - Fixed race conditions in value caching that caused segfaults under free-threaded Python (`PYTHON_GIL=0`). The library now declares it does not require the GIL.
- `TensorClassModuleBase` (#1473) - New base class for type-safe PyTorch modules that operate on TensorClass instances. Provides compile-time type checking, automatic TensorDict conversion via `as_td_module()`, nested TensorClass support, and ONNX export compatibility.
- New tensor manipulation methods - Added several torch-compatible operations:
  - `td.movedim()` (#1504)
  - `td.swapaxes()`
  - `td.flip()`, `td.fliplr()`
  - `td.roll()`
  - `td.rot90()`
  - `td.narrow()`, `td.tile()`, `td.broadcast_to()`
  - `td.atleast_Nd()`
  - `td.mod()`
- Quantiles support (#1450) - Added quantile computation for TensorDicts
- Context manager for `td.to()` (#1448) - `td.to()` can now be used as a context manager for temporary device transfers
- `key_transform` for reduce ops (#1451) - Added key transformation support for reduction operations
- `selected_out_keys` for `ProbabilisticTensorDictSequential` (#1497) - Added ability to specify which output keys to keep
- Extended NPU support (#1471, #1460) - Expanded NPU device support across more use cases previously limited to CUDA
Bug Fixes
- `torch.compile` compatibility fixes
- Robust key setting with memmap tensordicts (#1443) - Major fix for handling key operations with memory-mapped tensordicts
- Saving and loading metadata (#1444) - Fixed memmap + serialization issues with MetaData typed objects
- `tensordict.cat` preserves TensorClass (#1456)
- Names propagation for nested TensorDicts (#1520) - Fixed names propagation with extra batch dimensions
- Device mismatch in CUDA/NPU scenarios (#1439) - Fixed "expect all tensors on the same device" errors
- Nested lazy stacks handling (#1461) - Fixed `as_list` and other operations with nested lazy stacks
- PyTorch version compatibility (#1498, #1499, #1437) - Various fixes for compatibility with older PyTorch versions
TensorDict 0.10.0: MDS, type annotation and typed `MetaData`
TensorDict 0.10.0 Release Notes
We are excited to announce the release of TensorDict 0.10.0! This release includes significant improvements to type annotations, new features for metadata handling, enhanced tensor operations, and numerous bug fixes that improve the overall stability and usability of the library.
🎉 Highlights
- Typed MetaData: Complete rewrite of metadata handling with full type support (#1428)
- TensorCollection Parent Class: New parent class providing better type annotations and enhanced functionality (#1388)
- Enhanced String Support: `to_struct_array` now supports string data types (#1410)
- Improved Type Safety: Comprehensive type annotation improvements across the entire codebase
- Better TensorClass Support: Enhanced ClassVar support and super() functionality
- MDS data interface: the `to_mds` method creates an MDS dataset at a location of your choice, with no manual column definitions (#1426).
- Support for autograd's `grad` function (#1417)
✨ New Features
Core Functionality
- [Feature] Typed MetaData (#1428): Complete rewrite of metadata handling system with full type support, enabling better static analysis and runtime type checking
- [Feature] TensorCollection parent class and better type annotation (#1388): New parent class that provides enhanced type annotations and improved inheritance hierarchy
- [Feature] to_struct_array with strings (#1410): Extended `to_struct_array` functionality to handle string data types
- [Feature] MDS dataset helper functions (#1426): New helper functions for working with MDS (Mosaic Data Shard, the MosaicML Streaming format) datasets
- [Feature] implement tensor_split (#1386): Added support for `tensor_split` operation to match PyTorch tensor API
- [Feature] accept cap-str as input to set_interaction_type (#1387): Enhanced flexibility in interaction type setting by accepting capitalized strings
- [Feature] Allow in-place modification of lazy stacks (#1384): Enabled in-place modifications for lazy stacked tensors, improving memory efficiency
- [Feature] Ensure super() works with TensorClass (#1381): Fixed super() functionality in TensorClass inheritance chains
- [Feature] Add `__all__` everywhere (#1389): Added comprehensive `__all__` declarations across all modules for better IDE support and import control
Type System Improvements
- [Typing] `@overload` for methods that have a reduce arg (#1427): Added proper type overloads for methods with reduce parameters
- [BE] A bunch of type annotation improvements (#1409): Comprehensive type annotation improvements across the codebase
- [BE] Better CompatibleType definition (#1404): Enhanced type definitions for better compatibility checking
- [BE] Add _from_tensordict to TensorClass (#1403): Added internal method for TensorClass construction from TensorDict
- [BE] Better type annotation for `__getitem__` (#1402): Improved type annotations for indexing operations
🐛 Bug Fixes
Critical Fixes
- [BugFix] Fix stacking typed MetaData (#1429): Fixed issues with stacking operations on typed metadata
- [BugFix] Call synchronization when using the td.to("cpu") operation on third-party devices (#1425): Fixed potential precision issues when transferring tensors from third-party devices to CPU
- [BugFix] Fix missing _maybe_broadcast_other in base.py (#1422): Fixed missing broadcast functionality in base operations
- [BugFix] lock_() consolidated tds to avoid overriding values (#1408): Fixed value override issues in locked TensorDicts during consolidation
TensorClass Fixes
- [BugFix] Args for TC with ClassVar (#1401): Fixed argument handling for TensorClass with ClassVar annotations
- [BugFix] Fix ClassVar support in tensorclass (#1398): Enhanced ClassVar support in tensorclass decorator
- [BugFix] Fix MetaData assignment in tensorclasses (#1394): Fixed metadata assignment issues in TensorClass instances
Type and API Fixes
- [BugFix,TypeHint] Fix type annotations in tensorclass stub file (#1421): Fixed type hints in stub files for better IDE support
- [Bugfix] Fix type annotation for tensordict.keys().iter() (#1413): Fixed iterator type annotations for TensorDict keys
- [Bugfix] Fix `TensorDictModuleWrapper` forward (#1415): Fixed forward pass in TensorDictModuleWrapper
- [Bugfix] Improve various typing issues (#1424): General improvements to typing across the codebase
Tensor Operations
- [BugFix] repeat_interleave supports tensors (#1391): Fixed `repeat_interleave` to properly support tensor arguments
- [BugFix] Fix chunk following split fix (#1377): Fixed chunking operations after split functionality improvements
- [BugFix] Uneven splits (#1376): Fixed handling of uneven tensor splits
- [BugFix] JSON/orjson compatibility (#1373): Improved compatibility between JSON and orjson serialization
🔄 Deprecations and Breaking Changes
- [Deprecation] Upgrade set_list_to_stack behavior (#1382): Updated behavior of `set_list_to_stack` with proper deprecation warnings for the old API
🛠️ Development and Infrastructure
CI and Testing
- [CI] Better versioning (#1433): Improved versioning system for better release management
- [CI] Fix benchmark CI upload with conditional PR testing (#1397): Enhanced CI pipeline for benchmark uploads
- [CI,Tests] Fix tests (#1396): General test fixes and improvements
- [CI] Update OSX target (#1378): Updated macOS build targets
- [CI] Downgrade OSX version in builds (#1375): Adjusted macOS version requirements for broader compatibility
Documentation
- [Doc,CI] Fix installation of the lib for releases in doc CI (#1432): Fixed library installation in documentation CI
- [Doc] Fix doc errors (#1431): General documentation error fixes
- [Doc, CI] Fix Doc CI (#1430): Fixed documentation CI pipeline
- Fix typos in export tutorial (#1405): Corrected typos in export tutorial documentation
👥 Contributors
Special thanks to all the contributors who made this release possible:
- Vincent Moens (@vmoens) - Lead maintainer, major features and bug fixes
- Yichao Zhou (@Yichao-Zhou) - Type system improvements and bug fixes
- Huazhong (@huazhongyang) - Device synchronization fixes
- Yoann Poupart (@Xmaster6y) - TensorDictModuleWrapper fixes
- Chi Zhang (@chz8494) - tensor_split implementation
- Heon Song (@heonsong) - Documentation improvements
📚 Documentation
For comprehensive documentation, tutorials, and examples, visit:
For a complete list of changes, see the full changelog.
If you encounter any issues, please report them on our GitHub Issues page.
v0.9.1: Orjson/Json Interoperability
This minor release brings the following improvements and bug fixes:
- Fixing orjson / json interoperability #1373
- Downgrade OSX build target to 14 #1378
- Fixing split and chunk #1376 and #1377
Full Changelog: v0.9.0...v0.9.1
v0.9.0
TensorDict 0.9.0 Release Notes
Overview
TensorDict 0.9.0 introduces significant improvements in performance, new features for lazy operations, enhanced CUDA graph support, and various bug fixes. This release focuses on stability improvements and new functionality for distributed and lazy tensor operations.
🚀 New Features
Lazy Operations and Stacking
- `to_lazystack()`: New method to convert TensorDict instances to lazy stacks (#1351) (a5aab97)
- Stack name preservation: `tensordict.stack` now preserves names when stacking TensorDict instances (#1348) (2053031)
- `update_batch_size` in `where()`: Enhanced `where()` operation now supports `update_batch_size` parameter (#1365) (847a86c)
- `tolist_first()`: New method for converting TensorDict to list with first-level flattening (#1334) (73fe89b)
Torch Function Integration
- `torch.maximum` support: Added support for `torch.maximum` operation in TensorDict (#1362) (85f26e4)
- Enhanced loss functions: Added support for `torch.sum`, `torch.mean`, `torch.var` and loss functions (`l1`, `smooth_l1`, `mse`) (#1361) (17ca2ff)
CUDA Graph Enhancements
- `CudaGraphModule.state_dict()`: New method to access state dictionary of CUDA graph modules (#1346) (909907b)
- Improved device handling: Better support for CUDA graph operations on non-zero devices (#1315) (89d05a1)
- Stream management: Enhanced stream handling for CUDA graph operations (#1314) (2fd4843)
Non-Tensor Data Support
- `NonTensorDataBase` and `MetaData`: New base classes for handling non-tensor data in TensorDict (#1324) (8d0241d)
- Enhanced metadata handling: Improved support for metadata operations
Copy Operations
- `TensorDict.__copy__()`: New method for creating shallow copies of TensorDict instances (#1321) (b6feadd)
Distributed tensordicts
- broadcast tensordicts: New functionality for broadcasting TensorDict instances across different shapes (#1307) (2959863)
- `remote_init` with subclasses: Enhanced remote initialization support for TensorDict subclasses (#1308) (5859a2c)
- `return_early` for `isend`: New parameter for early return in send operations (#1306) (4012767)
🐛 Bug Fixes
TensorDict Operations
- Fixed "none"/"None" environment variable handling (#1372) (cb104a1)
- Fixed `split_size` validation in `TensorDict.split()` (#1370) (0bb94c0)
- Fixed `update_batch_size` when source is TD and destination is LTD (#1371) (c8bfda2)
- Fixed device argument in TensorDict constructor to respect CUDA current device (#1369) (afcbcec)
- Fixed `new_` operations on NonTensorStack (#1366) (5e67c32)
- Fixed `tensor_only` construction (#1364) (75e2c26)
- Fixed missing `update_batch_size` in lazy stack updates (#1359) (c9f0e40)
- Fixed context managers update when a key is in `_non_tensordict` (#1353) (c11a95b)
- Fixed `tensorclass` `__enter__` and `__exit__` methods (#1352) (08abb06)
Stacking and Chunking
- Fixed `tensordict.stack` forcing all names to `None` when no match (#1350) (c90df00)
- Fixed chunk/split memmap index when dim!=0 (#1345) (297a514)
- Fixed nested key iterations for lazy stacks within tensorclasses (#1344) (2e616f8)
- Fixed leaf check in stack function (#1341) (e03c25e)
- Fixed nested tensorclass `maybe_dense_stacks` (#1340) (9c8dd2d)
- Fixed chunk of NJTs (Nested Jagged Tensors) (#1339) (3477e96)
Compilation and Device Issues
- Fixed compilation of TensorClass with non-tensor + batch-size + device (#1337) (5c98749)
- Fixed `new_*` operations for Lazy stacks (#1317) (1c8be19)
- Fixed improper name setting in `__setitem__` (#1313) (1d642b0)
- Fixed `CudaGraphModule` on devices that are not 0 (#1315) (89d05a1)
- Fixed lazy stack `isend` early return (#1316) (8d3c470)
Memory and Performance
- Fixed memory leak caused by `_validate_value` (#1310) (a36f7f9)
- Fixed flatten operation with start=end dim (#1333) (d9972b4)
- Fixed expansion of lazy stacks (#1331) (56b4493)
- Fixed `return_composite` so that it defaults to True only when there is more than one distribution (#1328) (2c73924)
Distribution and Probabilistic Modules
- Fixed `TDParams` compatibility with export (#1285) (ecdde0b)
- Improved list assignment in tensorclasses (#1284) (6d8119c)
- Fixed method `_is_list_tensor_compatible` missing return value (#1277) (a9cc632)
- Fixed `.item()` warning on tensors that require grad (#1283) (910c953)
⚡ Performance Improvements
- Faster `_get_item`: Optimized item retrieval operations (#1288) (1e33a18)
- Dedicated validation functions: Improved validation performance (#1281) (604b471)
- `tensor_only` for tensorclass: Enhanced performance for tensor-only operations (#1280) (d4bc34c)
- Second attempt at caching validation: Improved caching mechanisms (#1311) (5f26a8b)
- Better property handling in TC: Optimized property operations in TensorClass
🔧 Setup and CI Improvements
- Static linking: `_C` extension now statically linked against Python library (#1304) (af17524)
- Better version checking: Improved version validation in smoke tests (#1303) (e84d44f)
- Python 3.13 support: Added support for Python 3.13 nightly builds (#1279) (0eb2ad3)
- Enhanced CI workflows: Improved continuous integration for various platforms
- Simplified setup: Streamlined package setup process (#1286) (fffffe5)
🚨 Deprecations and Breaking Changes
Deprecated Features
- `NormalParamWrapper`: Deprecated in favor of `tensordict.nn.NormalParamExtractor`
- Functional modules: `is_functional`, `make_functional`, and `get_functional` have been removed from tensordict
Future Changes
- List-to-stack behavior: In version 0.10.0, lists will be automatically stacked by default. A `FutureWarning` will be raised if lists are assigned to TensorDict without setting the appropriate context manager.
🛠️ Quality of Life Improvements
- Simplified error handling: Better error messages and handling in TensorDictSequential execution (#1326) (1330b72)
- Enhanced flatten operations: Made flatten operation idempotent (#1332) (49698e2)
- Better list handling: Improved list assignment in TensorDict instances (#1282) (6ad496b)
- Enhanced validation: Better validation functions for different data types
📦 Dependencies
- Python: Support for Python 3.9, 3.10, 3.11, 3.12, and 3.13
- PyTorch: Compatible with PyTorch 1.12 and upward
- Additional: numpy, cloudpickle, packaging, importlib_metadata, orjson (for Python < 3.13)
🔗 Migration Guide
For Users Upgrading from 0.8.0
- Update functional module usage: If using `is_functional`, `make_functional`, or `get_functional`, these have been removed
- NormalParamWrapper replacement: Use `tensordict.nn.NormalParamExtractor` instead of `NormalParamWrapper`
- List handling: Consider using the new `set_list_to_stack` context manager for consistent list behavior
For Developers
- The new lazy stacking features provide better memory efficiency for large datasets
- CUDA graph support has been enhanced for better GPU performance
- Non-tensor data handling has been improved with new base classes
🎯 Contributors
Special thanks to all contributors who made this release possible, including:
- Vincent Moens
- Nikolai Karpov
- Jiahao Li
- Faury Louis
- Douglas Boubert
- Albert Bou
📝 Full Changelog
For a complete list of all changes, please refer to the git log from version 0.8.0 to 0.9.0.
v0.8.3: Better CudaGraphModule
This minor release provides some fixes to CudaGraphModule, allowing the module to run on different devices than the default.
It also adds `__copy__` to the TensorDict ops, such that `copy(td)` triggers `td.copy()`, resulting in a copy of the TD structure without new memory allocation.
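The delegation pattern described here can be shown with a toy class. `Container` is a hypothetical stand-in, not TensorDict code: `copy.copy` looks up `__copy__`, which forwards to `copy()`, so the outer structure is new while the stored values are shared.

```python
import copy


class Container:
    """Toy stand-in for a dict-like container of large buffers."""

    def __init__(self, data):
        self._data = dict(data)

    def copy(self):
        # New outer structure, same underlying value objects.
        return Container(self._data)

    def __copy__(self):
        # Lets the standard library's copy.copy() reuse the method above.
        return self.copy()


buffer = bytearray(1024)
original = Container({"x": buffer})
shallow = copy.copy(original)
shallow._data["y"] = bytearray(8)  # only the copy's structure changes
```

Adding a key to the copy leaves the original untouched, while the large buffer under `"x"` is the same object in both.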
Full Changelog: v0.8.2...v0.8.3
v0.8.2: Fix memory leakage due to validate
This release fixes an apparent memory leak due to the value validation in tensordict.
The leak is only apparent: it disappears once `gc.collect()` is invoked.
See #1309 for context.
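One common way for memory to look leaked yet vanish after `gc.collect()` is a reference cycle. The sketch below shows that general mechanism on CPython with a hypothetical `Node` class; it is not a reproduction of the issue tracked in #1309, whose cause may differ.

```python
import gc
import weakref


class Node:
    """Object that refers to itself, forming a reference cycle."""

    def __init__(self):
        self.me = self


gc.disable()  # keep the collector from running on its own during the demo
try:
    node = Node()
    probe = weakref.ref(node)
    del node
    # Reference counting alone cannot free a cycle, so the object lingers.
    alive_before = probe() is not None
    gc.collect()
    # The cycle collector reclaims it, so the growth was only apparent.
    alive_after = probe() is not None
finally:
    gc.enable()
```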
Full Changelog: v0.8.1...v0.8.2
Minor fix: Statically link _C extension against the Python library
This minor release fixes the `_C` build pipeline, which was failing on some machines because the extension was built with dynamic linkage against libpython.