[tune] Cannot optimize a metric nested in result #14374

Closed
2 tasks done
ghost opened this issue Feb 26, 2021 · 0 comments · Fixed by #14375
Labels
bug: Something that is supposed to be working; but isn't
triage: Needs triage (eg: priority, bug/not-bug, and owning component)

ghost commented Feb 26, 2021

What is the problem?

The script at the bottom fails because it passes evaluation/episode_reward_mean as the metric to optimize. In the reported result dict, episode_reward_mean is nested under evaluation.
After digging a bit into the source code, I think Tune can optimize this kind of metric. However, the validation step fails because it checks the metric against the nested result.
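To illustrate the mismatch described above, here is a minimal, self-contained sketch. It does not use Ray at all: flatten_result is a hypothetical helper written for this example, and the result dict is made-up data shaped like the one in the report. It only shows why a "/"-delimited key is missing from a nested dict but present once the dict is flattened; the actual validation code in Tune, and the fix in #14375, may differ.

```python
def flatten_result(result, delimiter="/", prefix=""):
    """Flatten a nested dict into a single level with delimiter-joined keys.

    Hypothetical helper for illustration only; not Tune's implementation.
    """
    flat = {}
    for key, value in result.items():
        full_key = f"{prefix}{delimiter}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the joined key as prefix.
            flat.update(flatten_result(value, delimiter, full_key))
        else:
            flat[full_key] = value
    return flat


# Made-up result, shaped like the one described in the report:
# episode_reward_mean appears both at the top level and under "evaluation".
result = {
    "episode_reward_mean": 12.0,
    "evaluation": {"episode_reward_mean": 18.5},
}
metric = "evaluation/episode_reward_mean"

# A lookup against the nested dict misses the metric...
found_in_nested = metric in result

# ...while a lookup against the flattened dict finds it.
flat_result = flatten_result(result)
found_in_flat = metric in flat_result
```

Under these assumptions, found_in_nested is False and found_in_flat is True, which matches the reported symptom: the metric can be resolved, but a check against the nested result rejects it.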

Ray version and other system information (Python version, TensorFlow version, OS):
ray: 2.0.0

Reproduction (REQUIRED)

Please provide a short code snippet (less than 50 lines if possible) that can be copy-pasted to reproduce the issue. The snippet should have no external library dependencies (i.e., use fake or mock data / environments):

import ray
from ray.rllib.agents.dqn import dqn
from ray import tune


if __name__ == "__main__":
    ray.init(local_mode=True, num_cpus=2)

    config = {
        "env": "CartPole-v1",
        "framework": "torch",

        "timesteps_per_iteration": 10,
        "evaluation_interval": 1,
        "evaluation_num_episodes": 1,
    }

    analysis = tune.run(
        dqn.DQNTrainer,
        config=config,
        metric="evaluation/episode_reward_mean",
        mode="max",
        num_samples=10,
        stop={
            "evaluation/episode_reward_mean": 20
        }
    )

    ray.shutdown()

If the code snippet cannot be run by itself, the issue will be closed with "needs-repro-script".

  • I have verified my script runs in a clean environment and reproduces the issue.
  • I have verified the issue also occurs with the latest wheels.
ghost added the bug and triage labels Feb 26, 2021