[rllib] AlphaZero breaks in compute_action() #13822

Closed
2 tasks done
lairning opened this issue Jan 30, 2021 · 2 comments
Labels
bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component)

Comments

@lairning

lairning commented Jan 30, 2021

What is the problem?

RLlib's AlphaZero trainer raises an AttributeError in compute_action() after training.

  • Ubuntu 19.10
  • python=3.7.9=h7579374_0
  • pytorch=1.7.1=py3.7_cpu_0
  • ray=1.1.0

Reproduction (REQUIRED)

import argparse

import ray
from ray.tune.registry import register_env
from ray.rllib.contrib.alpha_zero.models.custom_torch_models import DenseModel
from ray.rllib.contrib.alpha_zero.environments.cartpole import CartPole
from ray.rllib.contrib.alpha_zero.core.alpha_zero_trainer import AlphaZeroTrainer
from ray.rllib.models.catalog import ModelCatalog

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--training-iteration", default=2, type=int)
    args = parser.parse_args()
    ray.init()

    ModelCatalog.register_custom_model("dense_model", DenseModel)
    register_env("CartPoleEnv", lambda _: CartPole())

    config = {
        "num_workers"       : 0,
        "rollout_fragment_length": 50,
        "train_batch_size"  : 500,
        "sgd_minibatch_size": 64,
        "lr"                : 1e-4,
        "num_sgd_iter"      : 1,
        "mcts_config"       : {
            "puct_coefficient"   : 1.5,
            "num_simulations"    : 100,
            "temperature"        : 1.0,
            "dirichlet_epsilon"  : 0.20,
            "dirichlet_noise"    : 0.03,
            "argmax_tree_policy" : False,
            "add_dirichlet_noise": True,
        },
        "ranked_rewards"    : {
            "enable": True,
        },
        "model"             : {
            "custom_model": "dense_model",
        },
    }

    agent = AlphaZeroTrainer(config=config, env="CartPoleEnv")

    for _ in range(args.training_iteration):
        agent.train()

    env = CartPole()
    episode_reward = 0
    done = False
    obs = env.reset
    while not done:
        print(obs)
        action = agent.compute_action(obs)
        obs, episode_reward, done, info = env.step(action)

    print(episode_reward)

    ray.shutdown()
  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.

Error Message

Traceback (most recent call last):
  File "alpha0_err1.py", line 58, in <module>
    action = agent.compute_action(obs)
  File "/home/md/miniconda3/envs/simpy/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 819, in compute_action
    policy_id].transform(observation)
  File "/home/md/miniconda3/envs/simpy/lib/python3.7/site-packages/ray/rllib/models/preprocessors.py", line 240, in transform
    self.write(observation, array, 0)
  File "/home/md/miniconda3/envs/simpy/lib/python3.7/site-packages/ray/rllib/models/preprocessors.py", line 247, in write
    observation = OrderedDict(sorted(observation.items()))
AttributeError: 'function' object has no attribute 'items'
lairning changed the title from "RLLIB / AlphaZero breaks in compute_action()" to "[rllib] AlphaZero breaks in compute_action()" on Jan 30, 2021
@lairning
Author

Same as issue #13177

@andras-kth

Have you tried changing obs = env.reset to obs = env.reset()?
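
To illustrate the suggestion above, here is a minimal sketch that does not use Ray at all. ToyEnv and sort_observation are hypothetical stand-ins: ToyEnv assumes the environment returns a dict observation (the keys are made up for illustration), and sort_observation only mirrors the single line shown at the bottom of the traceback. Without the parentheses, obs holds the bound method itself rather than an observation, so the call to .items() fails.

```python
from collections import OrderedDict


class ToyEnv:
    """Hypothetical stand-in for an environment with a dict observation."""

    def reset(self):
        # The keys are assumptions for illustration only.
        return {"obs": [0.0, 0.0, 0.0, 0.0], "action_mask": [1, 1]}


def sort_observation(observation):
    # Mirrors the line from the traceback:
    #   observation = OrderedDict(sorted(observation.items()))
    return OrderedDict(sorted(observation.items()))


env = ToyEnv()

obs_wrong = env.reset    # no call: this is the method object itself
obs_right = env.reset()  # call: this is the dict observation

try:
    sort_observation(obs_wrong)
    failed = False
except AttributeError as exc:
    failed = True
    print("AttributeError:", exc)

ordered = sort_observation(obs_right)
print(list(ordered))  # → ['action_mask', 'obs']
```

If this is the cause, the only change needed in the reproduction script is the call on the line that assigns obs before the while loop. Separately, the loop assigns each step's reward to episode_reward instead of adding it, so the final print would likely show only the last step's reward.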
