[RLlib] AssertionError using Simplex with default concentration #45804

Closed
ema-pe opened this issue Jun 7, 2024 · 1 comment · Fixed by #47880
Labels
bug, P1, rllib, rllib-env

Comments

ema-pe commented Jun 7, 2024

What happened + What you expected to happen

Hi, I'm using the Simplex space defined in RLlib as the action space for an environment. I want an action space that contains a single point with three coordinates, each in [0, 1], that sum to 1. The Simplex constructor takes shape and concentration parameters:

  • I set shape to (1,3) (1 independent 3d Dirichlet)
  • I set the concentration to np.array([1, 1, 1]) (uniform).

This concentration is the same as the default that the constructor computes when none is passed. The problem is that I cannot initialize this environment, because the Simplex constructor raises an AssertionError.

I think the problem is in simplex.py, in the __init__ function. Why is there an assertion concentration.shape == shape[:-1]? Why is there [:-1] in the assertion? For shape=(1,3) and concentration=np.array([1,1,1]):

  • shape[:-1] is (1,)
  • concentration.shape is (3,)

The two tuples differ, so the assertion fails, even though this input should be valid.
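The mismatch can be checked with NumPy alone, without Ray installed. This is a minimal sketch; the names `batch_shape` and `event_dim` are illustrative and do not come from RLlib.

```python
import numpy as np

shape = (1, 3)
concentration = np.array([1, 1, 1])

# shape[:-1] drops the last axis, leaving only the "batch" part of the shape.
batch_shape = tuple(shape[:-1])
# shape[-1] is the number of simplex coordinates (d in the docstring).
event_dim = shape[-1]

# This is the comparison the assertion performs; it is False for this input.
print(concentration.shape == batch_shape)

# Comparing against the last axis instead matches a per-coordinate concentration.
print(concentration.shape == (event_dim,))
```

The first comparison is `(3,) == (1,)`, which is why the assertion message reads `(3,) vs (1,)`.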

class Simplex(gym.Space):
    """Represents a d - 1 dimensional Simplex in R^d.
    That is, all coordinates are in [0, 1] and sum to 1.
    The dimension d of the simplex is assumed to be shape[-1].
    Additionally one can specify the underlying distribution of
    the simplex as a Dirichlet distribution by providing concentration
    parameters. By default, sampling is uniform, i.e. concentration is
    all 1s.
    Example usage:
    self.action_space = spaces.Simplex(shape=(3, 4))
    --> 3 independent 4d Dirichlet with uniform concentration
    """
    def __init__(self, shape, concentration=None, dtype=np.float32):
        assert type(shape) in [tuple, list]
        super().__init__(shape, dtype)
        self.dim = self.shape[-1]
        if concentration is not None:
            assert (
                concentration.shape == shape[:-1]
            ), f"{concentration.shape} vs {shape[:-1]}"
            self.concentration = concentration
        else:
            self.concentration = np.array([1] * self.dim)
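For comparison, here is a sketch of a check that would accept a per-coordinate concentration of length shape[-1], which is what the default branch produces. This is an assumption about the intended behaviour, not necessarily how #47880 resolves it; `validate_concentration` and `sample_simplex` are hypothetical helpers, and sampling uses NumPy's Dirichlet rather than RLlib code.

```python
import numpy as np


def validate_concentration(shape, concentration=None):
    """Hypothetical helper: return a concentration vector of length shape[-1]."""
    dim = shape[-1]
    if concentration is None:
        # Same default as the quoted constructor: all ones (uniform).
        return np.ones(dim)
    concentration = np.asarray(concentration, dtype=np.float64)
    if concentration.shape != (dim,):
        raise ValueError(
            f"expected concentration of shape {(dim,)}, got {concentration.shape}"
        )
    return concentration


def sample_simplex(shape, concentration=None, seed=None):
    """Hypothetical helper: draw one sample of the given shape from a Dirichlet."""
    alpha = validate_concentration(shape, concentration)
    rng = np.random.default_rng(seed)
    # One independent Dirichlet draw per leading ("batch") index.
    size = tuple(shape[:-1]) or None
    return rng.dirichlet(alpha, size=size)


sample = sample_simplex((1, 3), np.array([1, 1, 1]), seed=0)
print(sample.shape, sample.sum(axis=-1))
```

With this check, shape=(1, 3) and concentration=np.array([1, 1, 1]) are accepted, and each row of the sample sums to 1.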

Versions / Dependencies

  • Python 3.10.14
  • Fedora 39 (Server Edition) (kernel 6.8.7-200.fc39.x86_64)
  • Ray 2.23.0
  • Gymnasium 0.29.1
  • Torch 2.3.0
  • Pandas 2.2.2
  • Numpy 1.26.4

Reproduction script

Run the following test script, saved as main_ray.py; it crashes with an AssertionError.

import numpy as np

import gymnasium as gym

from ray.rllib.utils.spaces.simplex import Simplex

class SimplexTestEnv(gym.Env):
    def __init__(self, env_config):
        self.action_space = Simplex(shape=(1,3), concentration=np.array([1, 1, 1]))
        self.observation_space = gym.spaces.Box(shape=(1,), low=-1, high=1)

    def reset(self, *, seed=None, options=None):
        return np.zeros(1), {}

    def step(self, action):
        return np.zeros(1), 0, False, False, {}


def main():
    env = SimplexTestEnv({})

    result = env.step(np.zeros(1))

if __name__ == "__main__":
    main()

The program crashes with the following stack trace:

(.env2.23) emanuele@fedora-t4:~/ray/test-simplex$ python main_ray.py 
Traceback (most recent call last):
  File "/home/emanuele/ray/test-simplex/main_ray.py", line 25, in <module>
    main()
  File "/home/emanuele/ray/test-simplex/main_ray.py", line 20, in main
    env = SimplexTestEnv({})
  File "/home/emanuele/ray/test-simplex/main_ray.py", line 9, in __init__
    self.action_space = Simplex(shape=(1,3), concentration=np.array([1, 1, 1]))
  File "/home/emanuele/ray/.env2.23/lib64/python3.10/site-packages/ray/rllib/utils/spaces/simplex.py", line 32, in __init__
    concentration.shape == shape[:-1]
AssertionError: (3,) vs (1,)
(.env2.23) emanuele@fedora-t4:~/ray/test-simplex$
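The same failure can be reproduced without Ray or Gymnasium by copying only the constructor logic quoted earlier into a stand-in class. `SimplexStandIn` and `try_build` are hypothetical names used for illustration; the sketch also shows that, according to the quoted source, leaving out concentration takes the default branch and skips the assertion, which may serve as a workaround when a uniform concentration is all that is needed.

```python
import numpy as np


class SimplexStandIn:
    """Hypothetical stand-in that copies only the quoted constructor logic."""

    def __init__(self, shape, concentration=None):
        assert type(shape) in [tuple, list]
        self.shape = tuple(shape)
        self.dim = self.shape[-1]
        if concentration is not None:
            # Same comparison as in the quoted simplex.py.
            assert (
                concentration.shape == shape[:-1]
            ), f"{concentration.shape} vs {shape[:-1]}"
            self.concentration = concentration
        else:
            self.concentration = np.array([1] * self.dim)


def try_build(**kwargs):
    """Return (space, None) on success or (None, message) on AssertionError."""
    try:
        return SimplexStandIn(**kwargs), None
    except AssertionError as exc:
        return None, str(exc)


# Explicit uniform concentration: fails the same way as in the stack trace.
_, err_explicit = try_build(shape=(1, 3), concentration=np.array([1, 1, 1]))
print(err_explicit)

# No concentration: the default branch builds the same all-ones vector.
default_space, err_default = try_build(shape=(1, 3))
print(default_space.concentration, err_default)
```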

Issue Severity

High: It blocks me from completing my task.

ema-pe added the bug and triage labels on Jun 7, 2024
anyscalesam added the rllib label on Jun 12, 2024
@simonsays1980
Collaborator

@ema-pe Thanks for raising this issue, and sorry that you ran into it. I opened a PR that should fix it; it is waiting for review by my colleague.

simonsays1980 added the P1 and rllib-env labels and removed the triage label on Oct 2, 2024