[WIP] ReinforcementLearning.jl integration #9
base: main
Conversation
Codecov Report
@@            Coverage Diff             @@
##             main       #9      +/-   ##
==========================================
- Coverage   92.41%   92.31%   -0.10%
==========================================
  Files          81       81
  Lines        3823     3761      -62
==========================================
- Hits         3533     3472      -61
+ Misses        290      289       -1
==========================================
Continue to review full report at Codecov.
Force-pushed from 22e4549 to b606aa1
examples/deeprl/cartpole_ppo.jl
Outdated
actor = Chain(
    Dense(ns, 256, relu; init = glorot_uniform(rng)),
    Dense(256, na; init = glorot_uniform(rng)),
),
Note that you are using the discrete version of PPO here, but the cart pole env here seems to be a continuous one (the action space is [-1.0, 1.0]). So you may refer to https://github.com/JuliaReinforcementLearning/ReinforcementLearning.jl/blob/935f68b6cb378f9929a8d9914eb388e86213c86d/src/ReinforcementLearningExperiments/deps/experiments/experiments/Policy%20Gradient/JuliaRL_PPO_Pendulum.jl#L43-L50
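For reference, a continuous-action actor could be sketched roughly as follows. This is a hedged sketch, not the author's implementation: the `GaussianNetwork` constructor with `pre`/`μ`/`logσ` heads follows the pattern in the linked Pendulum experiment, and `ns`, `na`, and `rng` are assumed to be defined as in the surrounding diff.

```julia
# Sketch of a continuous-action PPO actor (assumption: mirrors the linked
# JuliaRL_PPO_Pendulum experiment; `ns`, `na`, `rng` taken from the
# surrounding example).
using Flux
using ReinforcementLearning

actor = GaussianNetwork(
    pre = Chain(
        Dense(ns, 256, relu; init = glorot_uniform(rng)),
        Dense(256, 256, relu; init = glorot_uniform(rng)),
    ),
    μ = Chain(Dense(256, na, tanh; init = glorot_uniform(rng))),
    logσ = Chain(Dense(256, na; init = glorot_uniform(rng))),
)
```

The Gaussian head emits a mean and log standard deviation per action dimension; sampled actions can then be clamped or squashed into [-1.0, 1.0] before being passed to the environment.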
Good point! Thanks for checking in. Currently I also need to define the reward/cost function for cartpole on the Dojo side.
We should probably rethink the interface to ReinforcementLearning.jl once their updates are done (JuliaReinforcementLearning/ReinforcementLearning.jl#614).
I realized that CommonRLInterface.jl never settled on what to do with continuous action spaces, so I am integrating directly with RLBase from ReinforcementLearning.jl. Will add tests and examples with PPO and DDPG.