[air] Deprecate MlflowTrainableMixin, move to setup_mlflow() function #31295
Conversation
Signed-off-by: Kai Fricke <[email protected]>
Should we raise a warning if both the callback and the setup function are used?
It's hard to detect (as we are in different processes). Maybe it even works? :-D
It probably will, but I imagine it could be an easy mistake to make (I doubt you'd want to log the same metrics twice). Still, saying that in the docs should be enough.
@PublicAPI(stability="alpha")
def setup_mlflow(
    config: Optional[Dict] = None, rank_zero_only: bool = True, **kwargs
Why does `config` need to be passed in here if all the configuration is passed in through kwargs? Is this primarily for backwards compatibility?
I would prefer to have all the configuration args listed explicitly in the API instead of kwargs, and not require that any configuration be passed in via the `config` dict.
I had this in a previous iteration, and the reason I went with `config` was mainly to keep consistency with `setup_wandb`. I'm happy to move back to pure kwargs and not parse a config, but I think we should then also do this for Weights & Biases.
I guess the main reason we do this for W&B is that we pass the other config parameters as run metadata (`wandb.init` takes a `config` parameter). We could do the same in mlflow to justify the `config` argument.
I've updated the PR to also log parameters. Let's have a chat offline about which API we want to go with.
Context:
- We've gotten rid of the `@wandb_mixin` decorator in favor of a `setup_wandb(config)` function for W&B
- We're doing the same for MLflow
- The question is: should we use a config API, `setup_mlflow(config)`, or an explicit API, `setup_mlflow(experiment_name, experiment_id, tracking_uri, …)`?
Pro config:
- Same way we used to configure the mixins
- Required for W&B so it logs the config to the run (otherwise users have to do this manually; they are used to passing it to `wandb.init()`)
- We should be consistent between the wandb and mlflow mixins
Pro args:
- Explicit about which args are supported
- Passing the mixin configs via `config` was a workaround because it was the easiest way to pass these settings and still retain some flexibility. With the explicit `setup_xyz()` APIs this workaround isn't needed anymore
Proposal:
- How about we go with `setup_mlflow(config, experiment_name, experiment_id)` (similar for wandb)? If a `mlflow` or `wandb` config key is set, we print a deprecation warning and tell people to pass the configs as proper arguments instead.
- This will get rid of the "special" config keys, but we can still log the rest of the config automatically.
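The proposal above can be sketched as follows. This is a hypothetical illustration, not the actual Ray implementation: `resolve_mlflow_settings` and its return shape are invented names, and only the precedence/deprecation behavior described in the proposal is modeled.

```python
import warnings
from typing import Dict, Optional


def resolve_mlflow_settings(
    config: Optional[Dict] = None,
    experiment_name: Optional[str] = None,
    experiment_id: Optional[str] = None,
) -> Dict:
    """Sketch: prefer explicit args, honor a legacy 'mlflow' config key
    with a deprecation warning, and keep the rest of the config so it
    can still be logged automatically as run parameters."""
    config = dict(config or {})
    legacy = config.pop("mlflow", None)
    if legacy is not None:
        warnings.warn(
            "Passing MLflow settings via the 'mlflow' config key is "
            "deprecated; pass them as explicit arguments instead.",
            DeprecationWarning,
        )
        # Explicit arguments win over the legacy config key.
        experiment_name = experiment_name or legacy.get("experiment_name")
        experiment_id = experiment_id or legacy.get("experiment_id")
    return {
        "experiment_name": experiment_name,
        "experiment_id": experiment_id,
        "params_to_log": config,  # remaining keys are logged automatically
    }
```

With this shape, the "special" `mlflow` key keeps working during the deprecation window, while the rest of the config dict stays available for automatic parameter logging.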
Thanks @krfricke, lgtm! Would be good to get another review from @xwjiang2010, @Yard1, or @justinvyu before merging.
LGTM! just some nits
…#31295) Following #29828, this PR introduces a `setup_mlflow()` method to replace the previous `wandb_mixin` decorator. The current mlflow mixin can't be used with Ray AIR trainers. By adding a `setup_mlflow()` function that can just be called in (data parallel) function trainers, we follow the recent changes to wandb and provide a similar interface. `setup_mlflow` also follows the wandb integration by acting on the configuration dict. Lastly, this PR moves AIR integration tests to the correct subdirectory and combines the old `test_mlflow.py` into the new `test_integration_mlflow.py`. Signed-off-by: Kai Fricke <[email protected]> Signed-off-by: Kai Fricke <[email protected]> Co-authored-by: Amog Kamsetty <[email protected]>
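The pattern the PR describes — calling a setup function at the top of a (data parallel) function trainer, which logs the config dict as run parameters — can be sketched without ray or mlflow installed. `FakeRun`, `setup_tracking`, and `train_func` below are hypothetical stand-ins, not the real `setup_mlflow()` API.

```python
class FakeRun:
    """Stand-in for an MLflow run; just records logged parameters."""

    def __init__(self):
        self.params = {}

    def log_params(self, params):
        self.params.update(params)


def setup_tracking(config=None):
    # In the real setup_mlflow(), this would initialize an MLflow
    # client and start (or attach to) a run for this trial.
    run = FakeRun()
    run.log_params(config or {})  # config dict logged as run parameters
    return run


def train_func(config):
    # Called at the top of the function trainer, replacing the old mixin.
    run = setup_tracking(config)
    for _ in range(3):
        pass  # the training loop would report metrics here
    return run.params
```

The point of the sketch: because setup happens inside the training function itself, the same code works in data parallel trainers where a class-based mixin cannot be applied.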
Working with MLflow Projects in combination with Ray is quite buggy. When running an MLflow project, an mlflow session will be started; Ray then puts all mlflow logs into the same run, so the losses, parameters, etc. are just overwritten. Same as in #19909. Also, there are IMHO some errors in the setup when trying to put everything in separate runs. My suggestion is:
Ok, this seems to be an issue of the following:
Currently the config dict has the lowest priority. This means that when any defaults are generated in the code itself, the config dict will be ignored. An example is the following:
There the `experiment_id` and `experiment_name` are set from the `trial_id` and `trial_name`. IMHO there is a mix-up of `experiment_name` and `run_name`. Defaulting the `run_name` to the trial name makes sense to me, but for the `experiment_name` it does not; we should stick to `experiment_id=0` (the default experiment for mlflow). Setting the `experiment_id` to the default trial id makes no sense, because Ray's convention is not the same as mlflow's. Also, we should switch the order of the `or` statement so that the config dict rises in priority. Lastly, since Tune gets configured with dicts, I would stick to the call with a dictionary. Additionally, as with wandb, the config is logged as a dict.
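The precedence issue described above can be shown in a few lines. This is an illustrative sketch with invented names (`pick_experiment_name`), not the actual Ray code; it only demonstrates how the order of the `or` operands decides whether a user-supplied config value or a generated default wins.

```python
def pick_experiment_name(config, generated_trial_name):
    # Default-first ordering: the generated trial name always wins,
    # so a user-supplied experiment name in the config dict is ignored.
    default_first = generated_trial_name or config.get("experiment_name")
    # Config-first ordering: the config dict takes priority, and the
    # generated name is only used as a fallback.
    config_first = config.get("experiment_name") or generated_trial_name
    return default_first, config_first


default_first, config_first = pick_experiment_name(
    {"experiment_name": "my_exp"}, "trial_0001"
)
```

Here `default_first` is `"trial_0001"` while `config_first` is `"my_exp"`, which is the behavior change the comment asks for.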
…ks (#33782) These were soft deprecated in favor of their AIR counterparts in the following PRs: (Ray 2.2) #29828, (Ray 2.3) #31295. For Ray 2.4, these will be hard deprecated. --------- Signed-off-by: Matthew Deng <[email protected]>
Why are these changes needed?
Following #29828, this PR introduces a `setup_mlflow()` method to replace the previous `wandb_mixin` decorator. The current mlflow mixin can't be used with Ray AIR trainers. By adding a `setup_mlflow()` function that can just be called in (data parallel) function trainers, we follow the recent changes to wandb and provide a similar interface. `setup_mlflow` also follows the wandb integration by acting on the configuration dict. Lastly, this PR moves AIR integration tests to the correct subdirectory and combines the old `test_mlflow.py` into the new `test_integration_mlflow.py`.
Related issue number
Closes #27966
Checks
- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.