[RLlib] Add separate learning rates for policy and alpha to SAC.
#47078
Merged
sven1977 merged 43 commits into ray-project:master from simonsays1980:add-actor-specific-learning-rate
Aug 21, 2024
+135 −32
Commits
ffe5048 Added spearate learnign rates for policy, critic, and alpha to SAC. T… (simonsays1980)
3fc16cf Merge branch 'master' into add-actor-specific-learning-rate (sven1977)
cd5450c Added an additional 'ciritc_lr', change 'policy_lr' to 'actor_lr', an… (simonsays1980)
38cac91 Merge branch 'master' into add-actor-specific-learning-rate (simonsays1980)
8f64c7c Merge branch 'master' into add-actor-specific-learning-rate (simonsays1980)
15e2898 Apply suggestions from code review (sven1977)
d0679b7 [RLlib; Offline RL] Implement twin-Q net option for CQL. (#47105) (simonsays1980)
e15ef64 [core] remove unused GcsAio(Publisher|Subscriber) methods and subclas… (rynewang)
fc0f1fe [Core] Fix a bug where we submit the actor creation task to the wrong… (jjyao)
387a083 [doc][build] Update all changed files timestamp to latest (#47115) (khluu)
326eaae [serve] split `test_proxy.py` into unit and e2e tests (#47112) (zcin)
33d574a [Utility] add `env_float` utility into `ray._private.ray_constants` (… (hongpeng-guo)
eff647b [Data] Fix progress bars not showing % progress (#47120) (scottjlee)
ca98c7f [data] change data17 to datal (#47082) (aslonnie)
530f511 [ci] change data build for all python versions to arrow 17 (#47121) (can-anyscale)
cbaad59 [doc][rllib] add missing public api references (#47111) (can-anyscale)
ce283ad [Core] Clarify docstring for get_gpu_ids() that it is only called ins… (petern48)
88031fe [doc][rllib] the rest of missing api references + lint checker (#47114) (can-anyscale)
a137979 Light up Ask AI button When Seach is Open (#47054) (cristianjd)
aac0a6e [serve] immediately send ping in router when receiving new replica se… (zcin)
d560f3e [data] Add label to indicate if operator is backpressured (#47095) (omatthew98)
2626ea7 [Core] Add ray[adag] option to pip install (#47009) (ruisearch42)
bb5c322 [Doc] Run pre-commit on tune docs (#47108) (peytondmurray)
7ae5621 [release tests] update anyscale service utils (#46397) (zcin)
2f77ddd [core][experimental] Build an operation-based execution schedule for … (kevin85421)
9c17e13 [serve] remove warnings about ongoing requests default change (#47085) (zcin)
7cc321a [serve] `__init__` functions have no return values (#47144) (aslonnie)
8e6f9cc Merge branch 'master' into add-actor-specific-learning-rate (simonsays1980)
8d98825 Merge branch 'master' into add-actor-specific-learning-rate (simonsays1980)
6808cbb Turned off test 'self_play_with_policy_checkpoint' b/c it was failing… (simonsays1980)
3a75d15 Uncommented 'pretrain_bc_single_agent_evaluate_as_multi_agent' b/c hy… (simonsays1980)
f1dde3e Merge branch 'master' into add-actor-specific-learning-rate (simonsays1980)
39f90a6 Merge branch 'master' into add-actor-specific-learning-rate (simonsays1980)
79f34ff Switched test for old stack CQL to 'torch-only' b/c 'tf2' fails perma… (simonsays1980)
3944b31 Fixed a small bug with uninitialized learning rates on old stack SAC. (simonsays1980)
0a7aaa5 Merged master. (simonsays1980)
3d670c5 Added actor- and critic-specific learning rates to HalfCheetah tests … (simonsays1980)
380e8e6 Merge branch 'master' into add-actor-specific-learning-rate (simonsays1980)
179fce4 Fixed error in 'test_worker_failures' due to the base 'lr' not set to… (simonsays1980)
d6d4d5a Fixed error in doc codes not implementing 'lr=None' and adapted learn… (simonsays1980)
44336ae Tuned learning rates on multi-agent SAC example. (simonsays1980)
2dbb93d Merge branch 'master' into add-actor-specific-learning-rate (simonsays1980)
351b0a8 Added tuned learning rates to single agent SAC tuned example and Half… (simonsays1980)
Suggestions:
- Add `self.critic_lr` and set `self.lr` to None by default. This increases clarity and expressiveness. Otherwise, users (and we!) will forever have to open this sac.py file just to check which one of the three lrs is covered by the default `self.lr` and which two have their own property.
- Add to `validate()` a quick check for a) new stack and, if yes, b) `self.lr` must be None; otherwise raise an informative error explaining that there are three different learning rate properties and that `self.lr` should NOT be used.
- `config.lr` vs `config.critic_lr` in the respective `SACLearner` methods.
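The validation check suggested in this comment could look roughly like the sketch below. It is not the code from this PR: `SACLikeConfig`, the `new_api_stack` flag, the `alpha_lr` name, the `learning_rates()` helper, and all default values are hypothetical stand-ins chosen for illustration. Only `lr`, `actor_lr`, `critic_lr`, and `validate()` are names that appear in the comment or the commit messages.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SACLikeConfig:
    """Hypothetical stand-in for a SAC config; not RLlib's actual class."""

    # Generic learning rate. Defaults to None so that nobody has to guess
    # which of the three components it would apply to.
    lr: Optional[float] = None
    # Component-specific learning rates (default values are assumptions).
    actor_lr: float = 3e-5
    critic_lr: float = 3e-4
    alpha_lr: float = 3e-4
    # Hypothetical flag marking the new API stack.
    new_api_stack: bool = True

    def validate(self) -> None:
        # On the new stack, the generic `lr` must stay unset.
        if self.new_api_stack and self.lr is not None:
            raise ValueError(
                "SAC uses three separate learning rates: `actor_lr`, "
                "`critic_lr`, and `alpha_lr`. Do not set `lr`; leave it as "
                "None and set the component-specific values instead."
            )

    def learning_rates(self) -> Dict[str, float]:
        # One entry per optimized component, so a learner can build a
        # separate optimizer for each.
        return {
            "actor": self.actor_lr,
            "critic": self.critic_lr,
            "alpha": self.alpha_lr,
        }


if __name__ == "__main__":
    config = SACLikeConfig(actor_lr=1e-4)
    config.validate()  # Passes: `lr` is None.
    print(config.learning_rates())
```

Defaulting `lr` to None and failing loudly in `validate()` means a misconfiguration surfaces at setup time instead of silently training one component with an unintended learning rate.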