Consistent interface for algorithms & single training script #79
We now have a SB3-like …
We've made a lot of progress since this issue was first opened, but there's still room for improvement, so I'm leaving this open for now.
I think this issue is required before we can address #587 and start a new take on #602. This is the current relationship between scripts and algorithms: … We should probably have an ingredient for each algorithm. This makes constructing more complex experiments (such as warm-starting) easier.
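To make the "ingredient per algorithm" idea concrete, here is a minimal Sacred sketch; `gail_ingredient`, `gail_config` and `train_gail` are hypothetical names for illustration, not existing imitation code:

```python
from sacred import Ingredient

# Hypothetical ingredient bundling one algorithm's config and training logic.
gail_ingredient = Ingredient("gail")


@gail_ingredient.config
def gail_config():
    total_timesteps = 100_000  # illustrative defaults, not tuned values
    demo_batch_size = 64


@gail_ingredient.capture
def train_gail(venv, demonstrations, total_timesteps, demo_batch_size):
    """Sacred injects total_timesteps/demo_batch_size from the `gail` namespace."""
    ...
```

An experiment that warm-starts from another algorithm could then reuse the same ingredient rather than duplicating its config.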
I feel torn here. It's nice to be able to do benchmarks with a variety of algorithms using a consistent interface. Having one script with different subcommands seems more natural here. You could split it up into different scripts, but it seems like a bit of a nightmare to maintain a consistent interface across many files, although perhaps doable if we can factor out most things into ingredients? However, the algorithms do have fundamental properties that differ enough that we can't literally have the same interface for all. The current approach was a compromise, grouping similar algorithms together: e.g. AIRL/GAIL are almost identical except for the parametric form of the reward network. But I agree the way they (and BC/DAgger) are currently merged is pretty ad hoc. Before we do a complete rewrite of the scripts I do think it's worth reflecting on whether we want to keep building on top of Sacred. I do like it, but the project is sadly basically abandonware at this point.
Good point. IMO the consistent interface should result from a consistent design of the ingredients and their configuration. The scripts just pull the ingredients together to form experiments: they make it possible to run an experiment with the given ingredients, but also to explore what configuration values it needs. I totally agree that we should review the usage of Sacred.
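As a sketch of that structure, a thin script could compose per-concern ingredients into an experiment (hypothetical module and function names; this is generic Sacred ingredient composition, not the current imitation layout):

```python
from sacred import Experiment

# Hypothetical ingredients, e.g. defined as in the earlier sketch.
from my_project.ingredients import env_ingredient, gail_ingredient, make_venv, train_gail

ex = Experiment("train_gail", ingredients=[env_ingredient, gail_ingredient])


@ex.automain
def main():
    venv = make_venv()  # config injected from the `env` namespace
    train_gail(venv=venv, demonstrations=None)  # config injected from `gail`
```

Sacred's built-in `print_config` command (`python train_gail.py print_config`) would then show the merged configuration of all ingredients, which covers the "explore what configuration values are needed" use case.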
That's a good point; I think I'm convinced it's worth splitting up the scripts, so long as we can avoid substantial code duplication by splitting things into ingredients.
I agree there are a lot of hacks. I still feel a bit uncertain about this; it's probably easier to discuss with a concrete idea of how to split them up and what can be shared via ingredients, etc. It would be nice to have an easy way to use shared configs between AIRL/GAIL, as the right hyperparameters are often quite similar, although this use case is less common now that we're using automatically tuned hyperparameters (the configs saved in …).
Abandoned was too strong a word. The original maintainers of Sacred stepped back from the project and did not hand over cleanly. Currently @thequilo has been doing a heroic job reviewing and fixing outstanding issues, and things are improving now that he's got access to make PyPI releases. However, the situation is a bit precarious: if @thequilo leaves there's no one else, and he still doesn't have full permissions on the project. IDSIA/sacred#871 (comment) and IDSIA/sacred#879 have some relevant discussion. Thanks for looking into alternatives. From a quick skim of Guild AI's docs I agree it seems promising, though there are some design decisions I feel hesitant about (e.g. treating global constants as flags by default...).
@AdamGleave is not wrong: there is a risk of sacred becoming unmaintained when I step out. As I am currently a Ph.D. student, and it's completely unclear what will happen after I finish my Ph.D., this might happen in the not-so-far future. But since my lab heavily uses sacred for all experiment tracking, I'm sure we'll find someone from my lab to continue maintaining sacred, given that Qwlouse gives us access.
Thanks for clarifying @thequilo -- good to know someone else in your lab is likely to pick it up. And good luck wrapping up your PhD!
I had a deeper look at Guild AI, MLflow and neptune.ai. They are all mostly geared towards logging, analysis and reproducibility. In comparison to sacred they lack the (hierarchical) configuration management system with experiments and (reusable) ingredients. I think we have two options here:

1. Stick with sacred... and hope that it stays maintained. Maybe put some of our time into maintaining it if necessary, maybe push the rl-baselines-zoo to also use sacred.
2. Use one of the other frameworks... (probably Guild AI) and use their configuration system (or something else like argparse/click).
I'll try to look at this next week. @qxcv, any thoughts on the above? I think you've used Sacred a fair bit?
I don't have a strong opinion on MLFlow/Guild AI/neptune.ai. I have used and like click (for simple stuff), especially over argparse. As for Sacred, I only used it for the EIRLI project. I've since moved away from using Sacred because (1) I only needed config parsing, and not the more sophisticated features (like tracking git revisions, logging to S3/wandb/whatever, etc.), (2) I couldn't get the configs-as-functions stuff to play well with type checking, and (3) I found argument parsing/option override behaviour hard to reason about. My current go-to is Hydra with structured configs. Structured configs mean that your config is just a dataclass. This was an improvement for me, but still has shortcomings, like poor support for union types and a slightly awkward API for turning on/off "magic" features. Also it may not do some of the things you want if you're already using all the features in Sacred. It has 7k GitHub stars & is used/maintained by Facebook, though, so I expect that it will get better over time. It also has a surprising number of plugins/integrations (submitit/Slurm, Ray, etc.).
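A minimal sketch of the structured-config style described above (generic Hydra usage, not imitation code; the exact `hydra.main` arguments vary slightly between Hydra versions):

```python
from dataclasses import dataclass

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import OmegaConf


@dataclass
class TrainConfig:
    # The config is an ordinary dataclass, so it plays well with type checkers.
    lr: float = 3e-4
    total_timesteps: int = 100_000


cs = ConfigStore.instance()
cs.store(name="train_config", node=TrainConfig)


@hydra.main(config_name="train_config")
def main(cfg: TrainConfig) -> None:
    # Fields can be overridden from the CLI, e.g. `python train.py lr=1e-3`.
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    main()
```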
I think we mostly use sacred for the configuration management and then W&B for logging? @AdamGleave, you probably know better what other features of sacred are used, but in the code it is mostly about configuration. I will have a look at Hydra, thanks for the hint! What made you move from click to Hydra?
Mostly the fact that I could declare my config as a (nested) dataclass & pass around the appropriate dataclass for each part of the code. Doing that with Click would require a lot of code duplication (repeating arguments as both Click options and dataclass members) and some extra parsing magic (to handle the nested options). The Sacred-like experiment management features in Hydra (automatically creating experiment directories, automatically setting up the …)
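A sketch of the nested-dataclass pattern being described, where each component receives only its own sub-config (field names are hypothetical):

```python
from dataclasses import dataclass, field


@dataclass
class OptimizerConfig:
    lr: float = 3e-4
    batch_size: int = 64


@dataclass
class EnvConfig:
    env_name: str = "CartPole-v1"
    n_envs: int = 8


@dataclass
class ExperimentConfig:
    optimizer: OptimizerConfig = field(default_factory=OptimizerConfig)
    env: EnvConfig = field(default_factory=EnvConfig)


def build_optimizer(cfg: OptimizerConfig):
    # Each component only sees the dataclass it needs, so there is no
    # duplication between CLI options and the config objects passed around.
    ...


def run(cfg: ExperimentConfig) -> None:
    build_optimizer(cfg.optimizer)
```

With Hydra, nested fields would be overridable from the command line as e.g. `optimizer.lr=1e-3`.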
I think the idea of putting configs in dataclasses is interesting. Edit: looks like Hydra would let us get rid of the …
I'm in favor of trying to port one script to Hydra to try it out. Eliminating … One feature Hydra seems to be missing is logging the Git commit hash and other things needed for reproducibility. That said, it seems like one could add that via callbacks, and WandB already does that, so I don't think we need our config framework to do it.
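A hedged sketch of what such a callback might look like; the `Callback` import path below is Hydra's (still experimental) callback API as of Hydra 1.1 and may differ in other versions, and `GitHashCallback` is a made-up name:

```python
import logging
import subprocess

from hydra.experimental.callback import Callback
from omegaconf import DictConfig

log = logging.getLogger(__name__)


class GitHashCallback(Callback):
    """Log the current git commit at the start of every job."""

    def on_job_start(self, config: DictConfig, **kwargs) -> None:
        try:
            sha = subprocess.check_output(
                ["git", "rev-parse", "HEAD"], text=True
            ).strip()
            log.info("git commit: %s", sha)
        except (OSError, subprocess.CalledProcessError):
            log.warning("could not determine git commit")
```

It would be enabled by listing it under `hydra.callbacks` in the config, with `_target_` pointing at the class.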
Feel free to have a look at #703 to see how it turned out. |
Wow, thanks for all that work converting things to Hydra! It's using a lot of patterns that I didn't even know existed, like instantiating objects with …
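One Hydra pattern along these lines is config-driven instantiation via `hydra.utils.instantiate` and a `_target_` key; this is a generic sketch, and whether #703 uses it exactly this way is not shown here:

```python
from hydra.utils import instantiate
from omegaconf import OmegaConf

# A config node whose `_target_` names the callable to construct.
cfg = OmegaConf.create({"_target_": "collections.Counter", "red": 2, "blue": 1})

counter = instantiate(cfg)  # equivalent to collections.Counter(red=2, blue=1)
print(counter)
```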
Currently we don't have any CLI script for behavioral cloning or the density baseline. I envisage this codebase as being particularly useful in being able to rapidly benchmark against a wide variety of algorithms. For this to be possible, we need to have (as far as is possible) a consistent interface between different imitation learning algorithms, similarly to how Stable Baselines has a consistent interface across all the RL algorithms it implements. Ideally we'd then also have imitation.scripts.train work for all of them, although there'll clearly need to be some algorithm-specific configs.
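To make "consistent interface" concrete, here is a hypothetical sketch of what a shared algorithm interface could look like; none of these class or method names are claimed to be real imitation APIs:

```python
import abc
from typing import Any


class ImitationAlgorithm(abc.ABC):
    """Hypothetical common interface, analogous to SB3's BaseAlgorithm."""

    def __init__(self, venv: Any, demonstrations: Any, **kwargs: Any) -> None:
        self.venv = venv
        self.demonstrations = demonstrations

    @abc.abstractmethod
    def train(self, total_timesteps: int) -> None:
        """Run the algorithm for a given training budget."""

    @property
    @abc.abstractmethod
    def policy(self) -> Any:
        """Return the trained policy for evaluation."""


# A single training script could then dispatch on an algorithm name, e.g.:
#   algo_cls = ALGO_REGISTRY[algo_name]   # hypothetical {"bc": ..., "gail": ..., "airl": ...}
#   algo = algo_cls(venv, demonstrations, **algo_kwargs)
#   algo.train(total_timesteps)
```

Algorithm-specific options would then live in the per-algorithm kwargs/config rather than in the shared interface.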