Releases: pytorch/tensordict

TensorDict v0.12.2

20 Apr 15:13

Patch release with a bug fix for consolidated nested tensors.

Bug Fixes

  • Fix _ragged_idx loss during consolidation of nested tensors, which caused numerical incorrectness when the nested tensor had more than 2 dimensions and ragged_idx != 1 (#1675)

Installation

pip install tensordict==0.12.2

Full Changelog: v0.12.1...v0.12.2

TensorDict v0.12.1

11 Apr 20:03

Patch release with a torch.compile bug fix.

Bug Fixes

  • Fix unravel_keys inconsistency that prevented torch.compile from working correctly when called with a single key (#1674)

Installation

pip install tensordict==0.12.1

TensorDict v0.12.0

08 Apr 12:08

Highlights

TensorDict v0.12.0 introduces TypedTensorDict for schema-enforced tensor dictionaries, a full distributed collectives suite (broadcast, all_reduce, all_gather, scatter), TensorDictStore with Redis/Dragonfly/KeyDB backends, and major torch.compile and performance improvements. The UnbatchedTensor has been rewritten as a proper tensor subclass, and state_dict handling has been overhauled for consistency.

Breaking Changes

  • UnbatchedTensor is now a __torch_dispatch__-based tensor subclass (was previously a wrapper) (#1638, #1648)
  • state_dict is now flat by default, with auto-detection in load_state_dict for backwards compatibility
  • TensorClass state_dict now uses logical keys

Features

  • TypedTensorDict: Schema-enforced TensorDicts with type annotations, cross-class compatibility, and torch.compile support (#1657, #1659, #1660, #1662, #1663)
  • TensorDictStore: Redis/Dragonfly/KeyDB-backed TensorDict with TensorClass support, lazy stack storage, and optimized indexed ops
  • Distributed collectives: broadcast, all_reduce, all_gather, scatter, consolidated send/recv and init_remote/from_remote_init with UCXX transport support (#1611)
  • set_printoptions: Configurable TensorDict repr with verbose mode (#1654, #1655, #1665)
  • torch.func support: jacrev, jacfwd, and hessian now work with TensorDict (#1613)
  • vmap with unbatched data: TensorDicts containing unbatched tensors can now be vmapped (#1625)
  • TensorClass.select(as_tensordict=...) parameter (#1544)
  • TensorDictBase.is_non_tensor(key) for consistent non-tensor key detection

Bug Fixes

  • Fix HigherOrderOperator support in __torch_function__ (#1668)
  • Fix td[key] = [] handling (#1666)
  • Fix UnbatchedTensor.tolist() (#1664)
  • Fix UnbatchedTensor CUDA pickling for multiprocessing (#1656)
  • Fix UnbatchedTensor indexing without batch dim, GPU failures, getitem/stack (#1607, #1626, #1633)
  • Fix replace() recompiles under torch.compile (#1605)
  • Fix auto_batch_size regression with NonTensorStack (#1609)
  • Fix NonTensorData positional args causing graph breaks under torch.compile (#1630)
  • Fix state_dict error messages, params forwarding, detach
  • Pin pybind11>=2.13 for Python 3.13 compatibility

Performance

  • Speed up TensorDict.__init__ by bypassing set() dispatch chain (#1551)
  • Speed up TensorClass.__init__ by bypassing __setattr__ and dataclass __init__ (#1552)
  • Use CUDA event sync instead of torch.cuda.synchronize() in .to() (#1545)
  • Reduce guards and recompiles for TensorDict under torch.compile (#1636)
  • Compile-friendly TensorDict build (#1624)

TensorDict v0.11.0

26 Jan 12:59
92e2383

Highlights

This release brings Python 3.14 support (including free-threaded mode), drops Python 3.9, adds extensive torch.compile compatibility improvements, introduces new tensor manipulation methods, and expands NPU device support.

Breaking Changes

  • Dropped Python 3.9 support - Python 3.10+ is now required (#1488)
  • Removed deprecated features scheduled for v0.11 (#1526):
    • pad_sequence device parameter removed (deprecated since v0.5)
    • MemoryMappedTensor._tensor property now raises RuntimeError - use the tensor directly since it's a tensor subclass
    • Class decoration with context managers like set_lazy_legacy now raises RuntimeError
    • lock, unlock, rename_key (without trailing underscore) now raise RuntimeError - use lock_, unlock_, rename_key_ instead

Features

  • Python 3.14 and free-threading support (#1481, #1438) - Fixed race conditions in value caching that caused segfaults under free-threaded Python (PYTHON_GIL=0). The library now declares it does not require the GIL.

  • TensorClassModuleBase (#1473) - New base class for type-safe PyTorch modules that operate on TensorClass instances. Provides compile-time type checking, automatic TensorDict conversion via as_td_module(), nested TensorClass support, and ONNX export compatibility.

  • New tensor manipulation methods - Added several torch-compatible operations:

    • td.movedim() (#1504)
    • td.swapaxes()
    • td.flip(), td.fliplr()
    • td.roll()
    • td.rot90()
    • td.narrow(), td.tile(), td.broadcast_to()
    • td.atleast_Nd()
    • td.mod()
  • Quantiles support (#1450) - Added quantile computation for TensorDicts

  • Context manager for td.to() (#1448) - td.to() can now be used as a context manager for temporary device transfers

  • key_transform for reduce ops (#1451) - Added key transformation support for reduction operations

  • selected_out_keys for ProbabilisticTensorDictSequential (#1497) - Added ability to specify which output keys to keep

  • Extended NPU support (#1471, #1460) - Expanded NPU device support across more use cases previously limited to CUDA

  • torch.compile support for tensordict.copy() (#1515, #1516)

Bug Fixes

  • torch.compile compatibility fixes:

    • Fixed del_ in LazyStackedTensorDict (#1500)
    • Fixed td.pop() compilation (#1500)
    • Fixed select with strict=False
    • Fixed names setter incompatibility (#1521)
    • Fixed compiling tensordicts with metadata (#1519)
  • Robust key setting with memmap tensordicts (#1443) - Major fix for handling key operations with memory-mapped tensordicts

  • Saving and loading metadata (#1444) - Fixed memmap + serialization issues with MetaData typed objects

  • tensordict.cat preserves TensorClass (#1456)

  • Names propagation for nested TensorDicts (#1520) - Fixed names propagation with extra batch dimensions

  • Device mismatch in CUDA/NPU scenarios (#1439) - Fixed "expect all tensors on the same device" errors

  • Nested lazy stacks handling (#1461) - Fixed as_list and other operations with nested lazy stacks

  • PyTorch version compatibility (#1498, #1499, #1437) - Various fixes for compatibility with older PyTorch versions

TensorDict 0.10.0: MDS, type annotation and typed `MetaData`

08 Sep 10:08

TensorDict 0.10.0 Release Notes

We are excited to announce the release of TensorDict 0.10.0! This release includes significant improvements to type annotations, new features for metadata handling, enhanced tensor operations, and numerous bug fixes that improve the overall stability and usability of the library.

🎉 Highlights

  • Typed MetaData: Complete rewrite of metadata handling with full type support (#1428)
  • TensorCollection Parent Class: New parent class providing better type annotations and enhanced functionality (#1388)
  • Enhanced String Support: to_struct_array now supports string data types (#1410)
  • Improved Type Safety: Comprehensive type annotation improvements across the entire codebase
  • Better TensorClass Support: Enhanced ClassVar support and super() functionality
  • MDS data interface: the to_mds method writes an MDS dataset to a location of your choice, with no need to define the columns by hand (#1426).
  • Support for autograd's grad function (#1417)

✨ New Features

Core Functionality

  • [Feature] Typed MetaData (#1428): Complete rewrite of metadata handling system with full type support, enabling better static analysis and runtime type checking
  • [Feature] TensorCollection parent class and better type annotation (#1388): New parent class that provides enhanced type annotations and improved inheritance hierarchy
  • [Feature] to_struct_array with strings (#1410): Extended to_struct_array functionality to handle string data types
  • [Feature] MDS dataset helper functions (#1426): New helper functions for working with MDS (Mosaic Data Shard, the MosaicML Streaming format) datasets
  • [Feature] implement tensor_split (#1386): Added support for tensor_split operation to match PyTorch tensor API
  • [Feature] accept cap-str as input to set_interaction_type (#1387): Enhanced flexibility in interaction type setting by accepting capitalized strings
  • [Feature] Allow in-place modification of lazy stacks (#1384): Enabled in-place modifications for lazy stacked tensors, improving memory efficiency
  • [Feature] Ensure super() works with TensorClass (#1381): Fixed super() functionality in TensorClass inheritance chains
  • [Feature] Add __all__ everywhere (#1389): Added comprehensive __all__ declarations across all modules for better IDE support and import control

Type System Improvements

  • [Typing] @overload for methods that have a reduce arg (#1427): Added proper type overloads for methods with reduce parameters
  • [BE] A bunch of type annotation improvements (#1409): Comprehensive type annotation improvements across the codebase
  • [BE] Better CompatibleType definition (#1404): Enhanced type definitions for better compatibility checking
  • [BE] Add _from_tensordict to TensorClass (#1403): Added internal method for TensorClass construction from TensorDict
  • [BE] Better type annotation for __getitem__ (#1402): Improved type annotations for indexing operations

🐛 Bug Fixes

Critical Fixes

  • [BugFix] Fix stacking typed MetaData (#1429): Fixed issues with stacking operations on typed metadata
  • [BugFix] Call synchronization when using the td.to("cpu") operation on third-party devices (#1425): Fixed potential precision issues when transferring tensors from third-party devices to CPU
  • [BugFix] Fix missing _maybe_broadcast_other in base.py (#1422): Fixed missing broadcast functionality in base operations
  • [BugFix] lock_() consolidated tds to avoid overriding values (#1408): Fixed value override issues in locked TensorDicts during consolidation

TensorClass Fixes

  • [BugFix] Args for TC with ClassVar (#1401): Fixed argument handling for TensorClass with ClassVar annotations
  • [BugFix] Fix ClassVar support in tensorclass (#1398): Enhanced ClassVar support in tensorclass decorator
  • [BugFix] Fix MetaData assignment in tensorclasses (#1394): Fixed metadata assignment issues in TensorClass instances

Type and API Fixes

  • [BugFix,TypeHint] Fix type annotations in tensorclass stub file (#1421): Fixed type hints in stub files for better IDE support
  • [Bugfix] Fix type annotation for tensordict.keys().__iter__() (#1413): Fixed iterator type annotations for TensorDict keys
  • [Bugfix] Fix TensorDictModuleWrapper forward (#1415): Fixed forward pass in TensorDictModuleWrapper
  • [Bugfix] Improve various typing issues (#1424): General improvements to typing across the codebase

Tensor Operations

  • [BugFix] repeat_interleave supports tensors (#1391): Fixed repeat_interleave to properly support tensor arguments
  • [BugFix] Fix chunk following split fix (#1377): Fixed chunking operations after split functionality improvements
  • [BugFix] Uneven splits (#1376): Fixed handling of uneven tensor splits
  • [BugFix] JSON/orjson compatibility (#1373): Improved compatibility between JSON and orjson serialization

🔄 Deprecations and Breaking Changes

  • [Deprecation] Upgrade set_list_to_stack behavior (#1382): Updated behavior of set_list_to_stack with proper deprecation warnings for the old API

🛠️ Development and Infrastructure

CI and Testing

  • [CI] Better versioning (#1433): Improved versioning system for better release management
  • [CI] Fix benchmark CI upload with conditional PR testing (#1397): Enhanced CI pipeline for benchmark uploads
  • [CI,Tests] Fix tests (#1396): General test fixes and improvements
  • [CI] Update OSX target (#1378): Updated macOS build targets
  • [CI] Downgrade OSX version in builds (#1375): Adjusted macOS version requirements for broader compatibility

Documentation

  • [Doc,CI] Fix installation of the lib for releases in doc CI (#1432): Fixed library installation in documentation CI
  • [Doc] Fix doc errors (#1431): General documentation error fixes
  • [Doc, CI] Fix Doc CI (#1430): Fixed documentation CI pipeline
  • Fix typos in export tutorial (#1405): Corrected typos in export tutorial documentation

👥 Contributors

Special thanks to all the contributors who made this release possible:

  • Vincent Moens (@vmoens) - Lead maintainer, major features and bug fixes
  • Yichao Zhou (@Yichao-Zhou) - Type system improvements and bug fixes
  • Huazhong (@huazhongyang) - Device synchronization fixes
  • Yoann Poupart (@Xmaster6y) - TensorDictModuleWrapper fixes
  • Chi Zhang (@chz8494) - tensor_split implementation
  • Heon Song (@heonsong) - Documentation improvements

📚 Documentation

For comprehensive documentation, tutorials, and examples, visit:


For a complete list of changes, see the full changelog.

If you encounter any issues, please report them on our GitHub Issues page.

v0.9.1: Orjson/Json Interoperability

14 Jul 12:53

This minor release brings the following improvements and bug fixes:

  • Fixing orjson / json interoperability #1373
  • Downgrade OSX build target to 14 #1378
  • Fixing split and chunk #1376 and #1377

Full Changelog: v0.9.0...v0.9.1

v0.9.0

09 Jul 16:09

TensorDict 0.9.0 Release Notes

Overview

TensorDict 0.9.0 introduces significant improvements in performance, new features for lazy operations, enhanced CUDA graph support, and various bug fixes. This release focuses on stability improvements and new functionality for distributed and lazy tensor operations.

🚀 New Features

Lazy Operations and Stacking

  • to_lazystack(): New method to convert TensorDict instances to lazy stacks (#1351) (a5aab97)
  • Stack name preservation: tensordict.stack now preserves names when stacking TensorDict instances (#1348) (2053031)
  • update_batch_size in where(): Enhanced where() operation now supports update_batch_size parameter (#1365) (847a86c)
  • tolist_first(): New method for converting TensorDict to list with first-level flattening (#1334) (73fe89b)

Torch Function Integration

  • torch.maximum support: Added support for torch.maximum operation in TensorDict (#1362) (85f26e4)
  • Enhanced loss functions: Added support for torch.sum, torch.mean, torch.var and loss functions (l1, smooth_l1, mse) (#1361) (17ca2ff)

CUDA Graph Enhancements

  • CudaGraphModule.state_dict(): New method to access state dictionary of CUDA graph modules (#1346) (909907b)
  • Improved device handling: Better support for CUDA graph operations on non-zero devices (#1315) (89d05a1)
  • Stream management: Enhanced stream handling for CUDA graph operations (#1314) (2fd4843)

Non-Tensor Data Support

  • NonTensorDataBase and MetaData: New base classes for handling non-tensor data in TensorDict (#1324) (8d0241d)
  • Enhanced metadata handling: Improved support for metadata operations

Copy Operations

  • TensorDict.__copy__(): New method for creating shallow copies of TensorDict instances (#1321) (b6feadd)

Distributed tensordicts

  • broadcast tensordicts: New functionality for broadcasting TensorDict instances across different shapes (#1307) (2959863)
  • remote_init with subclasses: Enhanced remote initialization support for TensorDict subclasses (#1308) (5859a2c)
  • return_early for isend: New parameter for early return in send operations (#1306) (4012767)

🐛 Bug Fixes

TensorDict Operations

  • Fixed "none"/"None" environment variable handling (#1372) (cb104a1)
  • Fixed split_size validation in TensorDict.split() (#1370) (0bb94c0)
  • Fixed update_batch_size when source is TD and destination is LTD (#1371) (c8bfda2)
  • Fixed device argument in TensorDict constructor to respect CUDA current device (#1369) (afcbcec)
  • Fixed new_ operations on NonTensorStack (#1366) (5e67c32)
  • Fixed tensor_only construction (#1364) (75e2c26)
  • Fixed missing update_batch_size in lazy stack updates (#1359) (c9f0e40)
  • Fixed context managers update when a key is in _non_tensordict (#1353) (c11a95b)
  • Fixed tensorclass __enter__ and __exit__ methods (#1352) (08abb06)

Stacking and Chunking

  • Fixed tensordict.stack forcing all names to None when no match (#1350) (c90df00)
  • Fixed chunk/split memmap index when dim!=0 (#1345) (297a514)
  • Fixed nested key iterations for lazy stacks within tensorclasses (#1344) (2e616f8)
  • Fixed leaf check in stack function (#1341) (e03c25e)
  • Fixed nested tensorclass maybe_dense_stacks (#1340) (9c8dd2d)
  • Fixed chunk of NJTs (Nested Jagged Tensors) (#1339) (3477e96)

Compilation and Device Issues

  • Fixed compilation of TensorClass with non-tensor + batch-size + device (#1337) (5c98749)
  • Fixed new_* operations for Lazy stacks (#1317) (1c8be19)
  • Fixed improper name setting in __setitem__ (#1313) (1d642b0)
  • Fixed CudaGraphModule on devices that are not 0 (#1315) (89d05a1)
  • Fixed lazy stack isend early return (#1316) (8d3c470)

Memory and Performance

  • Fixed memory leak caused by _validate_value (#1310) (a36f7f9)
  • Fixed flatten operation with start=end dim (#1333) (d9972b4)
  • Fixed expansion of lazy stacks (#1331) (56b4493)
  • Fixed return_composite so that it defaults to True only when there is more than one distribution (#1328) (2c73924)

Distribution and Probabilistic Modules

  • Fixed TDParams compatibility with export (#1285) (ecdde0b)
  • Improved list assignment in tensorclasses (#1284) (6d8119c)
  • Fixed method _is_list_tensor_compatible missing return value (#1277) (a9cc632)
  • Fixed .item() warning on tensors that require grad (#1283) (910c953)

⚡ Performance Improvements

  • Faster _get_item: Optimized item retrieval operations (#1288) (1e33a18)
  • Dedicated validation functions: Improved validation performance (#1281) (604b471)
  • tensor_only for tensorclass: Enhanced performance for tensor-only operations (#1280) (d4bc34c)
  • Second attempt at caching validation: Improved caching mechanisms (#1311) (5f26a8b)
  • Better property handling in TC: Optimized property operations in TensorClass

🔧 Setup and CI Improvements

  • Static linking: _C extension now statically linked against Python library (#1304) (af17524)
  • Better version checking: Improved version validation in smoke tests (#1303) (e84d44f)
  • Python 3.13 support: Added support for Python 3.13 nightly builds (#1279) (0eb2ad3)
  • Enhanced CI workflows: Improved continuous integration for various platforms
  • Simplified setup: Streamlined package setup process (#1286) (fffffe5)

🚨 Deprecations and Breaking Changes

Deprecated Features

  • NormalParamWrapper: Deprecated in favor of tensordict.nn.NormalParamExtractor
  • Functional modules: is_functional, make_functional, and get_functional have been removed from tensordict

Future Changes

  • List-to-stack behavior: In version 0.10.0, lists will be automatically stacked by default. A FutureWarning will be raised if lists are assigned to TensorDict without setting the appropriate context manager.

🛠️ Quality of Life Improvements

  • Simplified error handling: Better error messages and handling in TensorDictSequential execution (#1326) (1330b72)
  • Enhanced flatten operations: Made flatten operation idempotent (#1332) (49698e2)
  • Better list handling: Improved list assignment in TensorDict instances (#1282) (6ad496b)
  • Enhanced validation: Better validation functions for different data types

📦 Dependencies

  • Python: Support for Python 3.9, 3.10, 3.11, 3.12, and 3.13
  • PyTorch: Compatible with PyTorch 1.12 and upward
  • Additional: numpy, cloudpickle, packaging, importlib_metadata, orjson (for Python < 3.13)

🔗 Migration Guide

For Users Upgrading from 0.8.0

  1. Update functional module usage: If using is_functional, make_functional, or get_functional, these have been removed
  2. NormalParamWrapper replacement: Use tensordict.nn.NormalParamExtractor instead of NormalParamWrapper
  3. List handling: Consider using the new set_list_to_stack context manager for consistent list behavior

For Developers

  • The new lazy stacking features provide better memory efficiency for large datasets
  • CUDA graph support has been enhanced for better GPU performance
  • Non-tensor data handling has been improved with new base classes

🎯 Contributors

Special thanks to all contributors who made this release possible, including:

  • Vincent Moens
  • Nikolai Karpov
  • Jiahao Li
  • Faury Louis
  • Douglas Boubert
  • Albert Bou

📝 Full Changelog

For a complete list of all changes, please refer to the git log from version 0.8.0 to 0.9.0.

v0.8.3: Better CudaGraphModule

16 May 15:24

This minor release provides some fixes to CudaGraphModule, allowing the module to run on different devices than the default.

It also adds __copy__ to the TensorDict ops, so that copy(td) triggers td.copy(), resulting in a copy of the TD structure without new memory allocation.

Full Changelog: v0.8.2...v0.8.3

v0.8.2: Fix memory leakage due to validate

05 May 20:49

This release fixes an apparent memory leak caused by value validation in tensordict.
The leak is only apparent: it disappears when gc.collect() is invoked.
See #1309 for context.
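To illustrate what an "apparent" leak looks like in general, here is a generic standard-library sketch. It is not the tensordict code path discussed in #1309, and the Holder class is hypothetical. Objects caught in a reference cycle are not freed by reference counting, but they are reclaimed once the cyclic collector runs.

```python
import gc
import weakref


class Holder:
    """Hypothetical object that references itself, forming a cycle."""

    def __init__(self):
        self.me = self


gc.disable()  # keep the cyclic collector from running on its own
try:
    obj = Holder()
    probe = weakref.ref(obj)  # observe the object without keeping it alive
    del obj

    # Reference counting alone cannot free a cycle, so the object lingers.
    alive_before = probe() is not None

    gc.collect()  # the cyclic collector reclaims the cycle
    alive_after = probe() is not None
finally:
    gc.enable()
```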

Full Changelog: v0.8.1...v0.8.2

Minor fix: Statically link _C extension against the Python library

30 Apr 12:37

This minor release fixes the _C build pipeline, which was failing on some machines because the extension was built with dynamic linkage against libpython.