Commit

Merge 30c0bb1 into 25928f2
Gaiejj authored May 26, 2023
2 parents 25928f2 + 30c0bb1 commit abf9a0b
Showing 109 changed files with 2,330 additions and 135 deletions.
14 changes: 13 additions & 1 deletion CHANGELOG.md
@@ -11,6 +11,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Features

+- Feat(off-policy): support off-policy pid and update performance for navigation by [@Jiayi Zhou](https://github.com/Gaiejj) in PR [#245](https://github.com/OmniSafeAI/omnisafe/pull/245).
+
+- Style(model-based): fix mypy and polish api docstring by [@Jiayi Zhou](https://github.com/Gaiejj) in PR [#244](https://github.com/OmniSafeAI/omnisafe/pull/244).
+
+- Feat: improve test coverage and clear redundant code by [@Jiayi Zhou](https://github.com/Gaiejj) in PR [#238](https://github.com/OmniSafeAI/omnisafe/pull/238).
+
+- Feat: update benchmarks and provide configs for reproducing results by [@Jiayi Zhou](https://github.com/Gaiejj) in PR [#238](https://github.com/OmniSafeAI/omnisafe/pull/236).
+
 - Feat: add CODEOWNERS and refine ISSUE TEMPLATE by [@Jiaming Ji](https://github.com/zmsn-2077) in PR [#233](https://github.com/PKU-Alignment/omnisafe/pull/233).

 - Style: support mypy checking and update docstring style by [@Jiayi Zhou](https://github.com/Gaiejj) in PR [#221](https://github.com/PKU-Alignment/omnisafe/pull/221).
@@ -27,13 +35,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Documentations

+- Docs: polish algorithms tutorial by [@Jiayi Zhou](https://github.com/Gaiejj) in PR [#242](https://github.com/OmniSafeAI/omnisafe/pull/242).
+
+- Docs: change link to PKU-Alignment by [@Jiayi Zhou](https://github.com/Gaiejj) in PR [#239](https://github.com/OmniSafeAI/omnisafe/pull/239).
+
 - Docs: polish readme by [@Jiaming Ji](https://github.com/zmsn-2077) in PR [#231](https://github.com/PKU-Alignment/omnisafe/pull/231).

 - Docs: polish algorithm tutorial and update API docs by [@Jiayi Zhou](https://github.com/Gaiejj) in PR [#225](https://github.com/PKU-Alignment/omnisafe/pull/225).

 ### Fixes

-### Refactor
+- Fix: fix adapter device and exp grid by [@Jiayi Zhou](https://github.com/Gaiejj) in PR [#243](https://github.com/OmniSafeAI/omnisafe/pull/243).

 ## v0.4.0

10 changes: 7 additions & 3 deletions README.md
@@ -195,9 +195,9 @@ python train_policy.py --algo PPOLag --env-id SafetyPointGoal1-v0 --parallel 1 -
 </thead>
 <tbody>
 <tr>
-<td rowspan="4">On Policy</td>
+<td rowspan="5">On Policy</td>
 <td rowspan="2">Primal Dual</td>
-<td>TRPOLag; PPOLag; PDO; RCPO; OnCRPO</td>
+<td>TRPOLag; PPOLag; PDO; RCPO</td>
 </tr>
 <tr>
 <td>TRPOPID; CPPOPID</td>
@@ -210,13 +210,17 @@ python train_policy.py --algo PPOLag --env-id SafetyPointGoal1-v0 --parallel 1 -
 <td>Penalty Function</td>
 <td>IPO; P3O</td>
 </tr>
+<tr>
+<td>Primal</td>
+<td>OnCRPO</td>
+</tr>
 <tr>
 <td rowspan="2">Off Policy</td>
 <td rowspan="2">Primal-Dual</td>
 <td>SACLag; DDPGLag; TD3Lag</td>
 </tr>
 <tr>
-<td><span style="font-weight:400;font-style:normal">SACPID; TD3PID; DDPGPID; CVPO</span></td>
+<td><span style="font-weight:400;font-style:normal">SACPID; TD3PID; DDPGPID</span></td>
 </tr>
 <tr>
 <td rowspan="3">Model-based</td>
3 changes: 3 additions & 0 deletions benchmarks/README.md
@@ -52,6 +52,9 @@ The OmniSafe Safety-Gymnasium Benchmark evaluates the effectiveness of OmniSafe'
 - **[Preprint 2019]<sup>[[1]](#footnote1)</sup>** [The Lagrangian version of DDPG (DDPGLag)](https://cdn.openai.com/safexp-short.pdf)
 - **[Preprint 2019]<sup>[[1]](#footnote1)</sup>** [The Lagrangian version of TD3 (TD3Lag)](https://cdn.openai.com/safexp-short.pdf)
 - **[Preprint 2019]<sup>[[1]](#footnote1)</sup>** [The Lagrangian version of SAC (SACLag)](https://cdn.openai.com/safexp-short.pdf)
+- **[ICML 2020]** [Responsive Safety in Reinforcement Learning by PID Lagrangian Methods (DDPGPID)](https://arxiv.org/abs/2007.03964)
+- **[ICML 2020]** [Responsive Safety in Reinforcement Learning by PID Lagrangian Methods (TD3PID)](https://arxiv.org/abs/2007.03964)
+- **[ICML 2020]** [Responsive Safety in Reinforcement Learning by PID Lagrangian Methods (SACPID)](https://arxiv.org/abs/2007.03964)

 > **More details can be refer to [Off Policy Experiment](https://github.com/PKU-Alignment/omnisafe/tree/main/benchmarks/off-policy/README.md).**