Release v0.0.4 · pytorch/rl

What's Changed

[CI, Doc] Update functorch source installation command by @zou3519 in #446
[BugFix] TransformedEnv attributes inheritance by @vmoens in #467
[Feature] Cleanup mocking envs init and new by @vmoens in #469
[Tests] Adding tensordict __repr__ tests by @sladebot in #435
[Logging]: implement MLFlow logging integration by @rayanht in #432
[BugFix] MLFlow import fix by @vmoens in #473
[BugFix] Fixed pip install by @brandonsj in #475
[Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in #464
[Feature]: ModelBased Envs by @nicolas-dufour in #333
[Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in #476
[Tutorial] DQN tutorial by @vmoens in #474
[Feature] reader hooks for GymLike by @vmoens in #478
[BugFix] TensorSpec.zero(None) failure fix by @vmoens in #483
[Feature]: Support for planners and CEM by @nicolas-dufour in #384
[Feature] Replaced device_safe() with device by @ordinskiy in #485
[Feature]: TensorDictPrimer transform by @nicolas-dufour in #456
[Feature]: erase() method for torchrl.timeit by @nicolas-dufour in #480
[Feature] Added support for single collector in sync_async_collector by @nicolas-dufour in #482
[BugFix] removing unwanted device_safe() by @vmoens in #486
[Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in #481
[Feature] VIP Integration by @JasonMa2016 in #487
[Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in #489
[Feature]: Deactivate typechecks in envs by @nicolas-dufour in #490
[BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in #400
[BugFix] Fix TensorDictPrimer init by @vmoens in #491
[Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in #492
[BugFix] Defaulting passing_devices to None by @himjohntang in #477
Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in #494
[BugFix] Multi-agent fixes by @vmoens in #488
[BugFix] Defaulting passing_devices to None by @vmoens in #495
[Feature] Lazy initialization of CatTensors by @vmoens in #497
[Cleanup] Removing cuda 10.2 references by @vmoens in #498
[BugFix] Migration to pytorch org by @vmoens in #499
[Refactoring] Import at root to enable vmap monkey-patching by @vmoens in #500
[BugFix] python version for linting checks by @vmoens in #502
[Feature] Replay Buffers refactor by @bamaxw in #330
[Feature] Rename step_tensordict to step_mdp by @romainjln in #512
[Lint] re-instantiate F821 by @vmoens in #516
[BugFix] run_type_checks for TransformedEnvs by @vmoens in #513
[BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in #511
[Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in #504
[BugFix] Changing the dm_control import to fail if not installed by @zeenolife in #515
[CI] Add coverage with codecov by @silvestrebahi in #523
Revert "[CI] Add coverage with codecov" by @vmoens in #525
[Quality] Use relative imports for local c++ deps by @apbard in #526
[Feature] Nightly release by @vmoens in #519
[Feature] Add make_tensordict() function by @sicong-huang in #522
[Doc] Misc readme fixes by @GavinPHR in #532
[BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in #530
[BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in #531
[Doc] Add coverage banner by @vmoens in #533
[BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in #543
[BugFix] Fix optional imports by @vmoens in #535
[BugFix] Restore missing keys in data collector output by @tcbegley in #521
[Lint] reorganize imports by @apbard in #545
[BugFix] Single-cpu compatibility by @vmoens in #548
[BugFix] vision install and other deps in optdeps by @vmoens in #552
[Feature] Implemented device argument for modules.models by @yushiyangk in #524
[BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in #559
[BugFix] Additive gaussian exploration spec fix by @vmoens in #560
[BugFix] Disabling video step for wandb by @vmoens in #561
[BugFix] Various device fix by @vmoens in #558
[Feature] Allow collectors to accept regular modules as policies by @tcbegley in #546
[BugFix] Fix push binary nightly action by @psolikov in #566
[BugFix] TensorDict comparison by @vmoens in #567
[BugFix] Fix SyncDataCollector reset by @jrobine in #571
[Doc] Banners on README.md by @vmoens in #572
[Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in #573
[BugFix] Add eps to reward normalization by @vmoens in #574
[BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in #578
[Feature] Restructure torchrl/objectives by @sgrigory in #580
[Docs] Documentation revamp by @vmoens in #581
[Doc] Publishing on pytorch.org by @vmoens in #582
Revert "[Doc] Publishing on pytorch.org" by @vmoens in #584
[Doc] Publishing on pytorch.org by @vmoens in #585
Revert "[Doc] Publishing on pytorch.org" by @vmoens in #586
[Doc] Publishing on pytorch.org by @vmoens in #587
[Feature] More restrictive tests on docstrings by @vmoens in #457
[BugFix] Wrong stack import in tests by @vmoens in #590
[Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in #589
[Feature]: Dreamer support by @nicolas-dufour in #341
[Doc] Missing doc for prototype RB by @vmoens in #595
[Feature] Update list of supported libraries by @vmoens in #594
[BugFix] Fix timeit count registration by @vmoens in #598
[Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in #603
[Feature] Categorical encoding for action space by @artkorenev in #593
[BugFix] ReplayBuffer's storage now signals back when changes happen by @paulomarciano in #614
[Doc] Typos in tensordict tutorial by @PaLeroy in #621
[Doc] Integrate knowledge base in docs by @hatala91 in #622
[Doc] Updating docs requirements by @vmoens in #624
[Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in #386
[Feature] Habitat integration by @vmoens in #514
[Feature] Checkpointing by @vmoens in #549
Add support for null dim argument in TensorDict.squeeze by @jgonik in #608
[Version] Updating to torch 1.13 by @vmoens in #627
[Feature] Sub-memmap tensors by @vmoens in #626
[BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in #631
[Feature] Unfold transforms for folded TransformedEnv by @alexanderlobov in #630
[BugFix] make TensorDictReplayBuffer.extend call super().extend with stacked_td by @vmoens in #634
[BugFix] correct the use of step_mdp method in data collector by @adityagandhamal in #637
[Feature] Added implement_for decorator by @ordinskiy in #618
[Feature] Make DQN compatible with nn.Module by @svarolgunes in #632
[Example] Distributed Replay Buffer Prototype Example Implementation by @adityagoel4512 in #615
[Feature] Benchmark storage types by @adityagoel4512 in #633
[Feature] Remove wild imports in the library by @sosmond in #642
[BugFix] Prevent transform parent from being reassigned by @jasonfkut in #641
[Feature] Too many deepcopy in transforms.py by @romainjln in #625
[Naming] Rename keys_in to in_keys in transforms.py and related modules by @sardaankita in #656
[Refactoring] Refactor dreamer helper in smaller pieces by @vmoens in #662
[Feature] VIPRewardTransform by @vmoens in #658
[BugFix] make_trainer possible bug for on-policy cases by @albertbou92 in #655
[Naming] Fixing key names by @vmoens in #668
[Test] Check dtypes of envs by @vmoens in #666
[Refactor] Relying on the standalone tensordict -- phase 1 by @vmoens in #650
[Doc] More doc on trainers by @vmoens in #663
[BugFix] PPO example GAE import by @albertbou92 in #671
[BugFix] Use GitHub for flake8 pre-commit hook by @vmoens in #679
[BugFix] Update to strict select by @vmoens in #675
[Feature] Auto-compute stats for ObservationNorm by @romainjln in #669
[Doc] _make_collector helper function by @albertbou92 in #678
[Doc] BatchSubSampler class docstrings example by @albertbou92 in #677
[BugFix] PPO objective crashes if advantage_module is None by @albertbou92 in #676
[Refactor] Refactor 'next_' into nested tensordicts by @vmoens in #649
[Doc] More doc about environments by @vmoens in #683
[Doc] Fix missing tensordict install for doc by @vmoens in #685
[CI] Added CircleCI pipeline to test compatibility across supported gym versions by @ordinskiy in #645
[BugFix] ConvNet forward method with tensors of more than 4 dimensions by @albertbou92 in #686
[Feature] add standard_normal for RewardScaling by @adityagandhamal in #682
[Feature] Jumanji envs by @yingchenlin in #674
[Feature] Default collate_fn by @vmoens in #688
[BugFix] Fix Examples by @vmoens in #687
[Refactoring] Replace direct gym version checks with decorated functions (#) by @ordinskiy in #691
Version 0.0.3 by @vmoens in #696
[Docs] Host TensorDict docs inside TorchRL docs by @tcbegley in #693
[BugFix] Fix docs build by @tcbegley in #698
[BugFix] Proper error messages for orphan transform creation by @vmoens in #697
[Feature] Append, init and insert transforms in ReplayBuffer by @altre in #695
[Feature] A2C objective class and train example by @albertbou92 in #680
[Doc, Test] Add A2C script test and doc by @vmoens in #702
[BugFix] Initialising the classes LazyTensorStorage with a nested TensorDict raises error by @albertbou92 in #703
[BugFix] Fix init_random_frames in A2C example test by @vmoens in #706
[Formatting] Upgrade formatting libs by @vmoens in #705
[Doc] Document undefined symbol error with torch version < 1.13 by @nickspell in #707
[Doc] Tuto integration by @vmoens in #681
[Quality] Deprecate .ipynb tutos by @vmoens in #710
[Test] Fix wrong skip message when functorch is installed by @vmoens in #711
[BugFix, Doc] Clone TensorDict docs into _local_build by @tcbegley in #712
[Feature] Migrate to tensordict.nn.TensorDictModule by @tcbegley in #700
[Doc] Fix Tutos TODOs by @vmoens in #713
[BugFix] RoundRobinWriter, possible duplicated code in the extend method by @albertbou92 in #709
[Feature] Add OptimizerHook by @aakhundov in #716
[Feature] Support for in-place functionalization by @tcbegley in #714
[BugFix] Fix TorchRL demo tutorial by @vmoens in #721
[Docs] Update tutorial links in readme by @tcbegley in #724
[Feature] Extend PPO loss helper to allow for more customisation by @albertbou92 in #718
[BugFix] Model maker functions for A2C and PPO fail for discrete action space envs by @albertbou92 in #717
[Minor] docstrings and setup fixes by @vmoens in #726
[BugFix] Avoid wrongfully erasing observation keys from specs in CatTensors by @vmoens in #727
[BugFix] Avoid wrongfully erasing observation keys from tensordict in CatTensors by @vmoens in #729
[Doc] More doc for data collectors by @vmoens in #732
[Feature] Port test_fake_tensordict to torchrl by @vmoens in #731
[Feature] Use ObservationNorm.init_stats for stats computation in example scripts by @romainjln in #715
[BugFix] init_stats over multiple dimensions by @vmoens in #735
[Refactor] logger creation in examples by @acforvs in #733
[Feature] Brax envs by @yingchenlin in #722
[Refactor] Adopt prototype ProbabilisticTensorDictModule and ProbabilisticTensorDictSequential by @tcbegley in #728
[Doc] Link to doc in README by @vmoens in #740
[Feature] Make GAE return a 'value_target' entry by @vmoens in #741
[Feature] SamplerWithoutReplacement by @vmoens in #742
[Doc, CI] Update doc workflow to run on PR and only publish doc on main by @EmGarr in #745
[Feature] Better advantage API for higher order derivatives by @vmoens in #744
[Refactor] Cosmetic improvements to advantage modules by @vmoens in #746
[BugFix] Fix NoopReset in parallel settings by @vmoens in #747
[Refactor] Remove env.is_done attribute by @vmoens in #748
[Refactor] Drop prototype imports by @tcbegley in #738
[BugFix] Fixes for speed branch merge on tensordict by @vmoens in #755
[BugFix] Fix size-match unsqueeze deprecation by @vmoens in #750
[Feature] FrameSkipTransform by @vmoens in #749
[BugFix] Better memory management for collectors by @vmoens in #763
Minor cleaning in BaseEnv classes by @matteobettini in #767
Revert "Minor cleaning in BaseEnv classes" by @vmoens in #768
Cleaning in envs common.py by @matteobettini in #769
Making _set_seed abstract by @matteobettini in #770
[Feature] Remove the Nd*TensorSpec classes by @riiswa in #772
[BugFix] Reinstantiate custom value key for multioutput value networks by @vmoens in #754
[Feature] Add Step Counter transform by @riiswa in #756
[BugFix] Batched environments with non empty batch size by @matteobettini in #774
Allow unbounded boxes creation from gym spaces by @matteobettini in #778
[BugFix] Doc built cmake error by @vmoens in #780
[Feature] Lazy TensorClass storage by @tcbegley in #752
[BugFix] SyncDataCollector init when device and env_device are different by @albertbou92 in #765
[Feature] RewardSum transform by @albertbou92 in #751
[BugFix] Fix PPO clip by @vmoens in #786
[Feature] MultiDiscreteTensorSpec by @riiswa in #783
[Doc] Doc revamp by @vmoens in #782
[BugFix] ParallelEnv handling of done flag by @matteobettini in #788
[BugFix] Sorting nested keys by @matteobettini in #787
[Doc] README index by @vmoens in #791
Add windows wheel build to CircleCI by @yohann-benchetrit in #759
[Algorithm] MPPI planner by @vmoens in #701
[Doc] Better doc links by @vmoens in #795
[Doc] Missing headers by @vmoens in #796
[Doc] Knowledge base section by @vmoens in #797
[Feature] Vmas library wrapper by @matteobettini in #785
[Doc] Duplicate HabitatEnv entry in docs by @matteobettini in #798
[Feature] MultiDiscreteTensorSpec nvec with several axes by @riiswa in #789
[Refactor] Graduate Replay Buffer prototype by @KamilPiechowiak in #794
[BugFix] Solve R3MTransform init problem by @vmoens in #803
[Refactor] Simplify FlattenObservation default kwargs by @vmoens in #805
[Format] Fix lint by @vmoens in #811
[Doc, BugFix] Fix tutos errors by @vmoens in #817
[Doc] Pretrained models tutorial by @vmoens in #814
[Doc, BugFix] Fix tensordictmodule tutorial by @vmoens in #819
[BugFix] Fix MultOneHotDiscreteTensorSpec.is_in by @riiswa in #818
[Doc] Using R3M with a replay buffer by @vmoens in #820
[CodeQuality] call all() without making a list by @riiswa in #821
[BugFix] [Feature] "_reset" flag for env reset by @matteobettini in #800
[CI] Add unit test workflows for Windows by @yohann-benchetrit in #804
[BugFix] Fix habitat integration and doc by @vmoens in #812
[Minor] Better error reporting by @vmoens in #822
[Minor] Add ninja to deps in toml file by @vmoens in #823
[BugFix] Device of info specs by @vmoens in #824
[BugFix] Fix envs specs and info reading by @vmoens in #825
[Feature] Dtype in vmas tests by @matteobettini in #827
[BugFix] Fix R3M observation spec transform by @vmoens in #830
small change to make @robandpdx a contributor by @robandpdx in #831
[Feature] Exclude and select transforms by @vmoens in #832
[BugFix] Updating Recorder to accommodate "solved" key by @ShahRutav in #833
[BugFix] Changed "set_count" set in collectors by @matteobettini in #835
[Algorithm] Td3 by @BY571 in #684
[Doc] A Succinct Summary of Reinforcement Learning by @vmoens in #840
[Feature, BugFix] ObservationNorm keep_dims and RewardSum init by @vmoens in #839
[BugFix] Improve done checking of collectors by @matteobettini in #838
[BugFix] Sync with tensordict (meta-tensor deprecation) by @vmoens in #842
[Feature] Refactor CatFrames using a proper preallocated buffer by @vmoens in #847
[CI] Add Github-Actions workflows for Windows wheels & nightly-build by @yohann-benchetrit in #837
[Doc] Fix broken link Dreamer by @atonkamanda in #853
[BugFix] Loading state_dict on uninitialized CatFrames by @vmoens in #855
[Refactor] Move loggers to torchrl.record by @vmoens in #854
[Refactor] specs batch size refactoring by @vmoens in #829
[Feature] Max pool Transform by @albertbou92 in #841
[Feature] Refactor advantages for continuous batches by @vmoens in #848
[BugFix, Doc] Minor fix in doc by @vmoens in #858
[Versioning] Version 0.0.4a by @vmoens in #859
[Feature] Vmas to device by @matteobettini in #850
[BugFix] Fix zero-ing from specs in RewardSum by @vmoens in #860
[Feature] Loading R3M and VIP from ResNet by @vmoens in #863
[Feature] SAC V2 by @vmoens in #864
[BugFix] Avoid collision of "step_count" key from transform and collector by @vmoens in #868
[Refactor] Better init for CatFrames buffers + removing default init values by @vmoens in #874
[Refactor] Minor refactorings to envs by @vmoens in #872
[Refactor] Removing inplace transform attribute by @vmoens in #871
[BugFix] Run checks when creating fake_td by @vmoens in #877
[Refactor] Box device by @vmoens in #881
[Feature] Multithreaded env by @sgrigory in #734
[Refactor] Turn off default advantage normalization in PPO by @vmoens in #887
[CI] Fix habitat-gym imports by @vmoens in #890
[CI] Fix cuda versions by @vmoens in #889
[CI] Fix windows install by @vmoens in #888
MacOS CPU unit test workflow using GitHub Actions by @robandpdx in #886
Linux CPU unit test workflow using GitHub Actions by @robandpdx in #826
[Major, BugFix, Test] Refactor Transforms tests by @vmoens in #878
[Bugfix] Codecov does not cover multiprocessed tests #879 by @kadeng in #893
[CI, BugFix] Fix gym related errors by @vmoens in #895
[WIP] Linux GPU unit test workflow using GitHub Actions by @robandpdx in #885
[BugFix] Compose cloning fix by @vmoens in #899
[Feature] Simplifying collector envs by @vmoens in #870
[CI,Feature] Upgrade to gymnasium by @vmoens in #898
[Doc] Add record utils to doc by @vmoens in #904
[Test] Improve exception message match by @apbard in #906
[BugFix] Dreamer helpers are broken with batched envs by @vmoens in #903
[Feature] RandomCropTensorDict transform by @vmoens in #908
[Versioning] Version 0.0.4b by @vmoens in #909

New Contributors

@sladebot made their first contribution in #435
@rayanht made their first contribution in #432
@brandonsj made their first contribution in #475
@ordinskiy made their first contribution in #485
@JasonMa2016 made their first contribution in #487
@himjohntang made their first contribution in #477
@romainjln made their first contribution in #512
@apbard made their first contribution in #526
@sicong-huang made their first contribution in #522
@psolikov made their first contribution in #566
@jrobine made their first contribution in #571
@nikhlrao made their first contribution in #573
@sgrigory made their first contribution in #580
@jlesuffleur made their first contribution in #589
@artkorenev made their first contribution in #593
@paulomarciano made their first contribution in #614
@hatala91 made their first contribution in #622
@jgonik made their first contribution in #608
@adityagandhamal made their first contribution in #637
@svarolgunes made their first contribution in #632
@adityagoel4512 made their first contribution in #615
@jasonfkut made their first contribution in #641
@sardaankita made their first contribution in #656
@albertbou92 made their first contribution in #655
@yingchenlin made their first contribution in #674
@altre made their first contribution in #695
@nickspell made their first contribution in #707
@aakhundov made their first contribution in #716
@acforvs made their first contribution in #733
@EmGarr made their first contribution in #745
@matteobettini made their first contribution in #767
@riiswa made their first contribution in #772
@yohann-benchetrit made their first contribution in #759
@KamilPiechowiak made their first contribution in #794
@robandpdx made their first contribution in #831
@ShahRutav made their first contribution in #833
@BY571 made their first contribution in #684
@atonkamanda made their first contribution in #853
@kadeng made their first contribution in #893

Full Changelog: v0.0.2a...v0.0.4b
