Add log_model param to MLFlowlogger #9187

Closed

AlessioQuercia
Contributor

What does this PR do?

This PR adds a log_model parameter to MLFlowLogger, similar to the one in WandbLogger (#9138).

The log_model parameter can be set to:

  • "all": checkpoints are logged during training.
  • True: checkpoints are logged at the end of training, except when
    ModelCheckpoint.save_top_k == -1, in which case every checkpoint is also logged during training.
  • False: no checkpoint is logged.
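
The three settings above can be sketched as a small decision helper. This is only an illustration of the semantics described in this PR, not the PR's actual code; the function name should_log_now and its training_finished argument are hypothetical.

```python
from typing import Union


def should_log_now(log_model: Union[str, bool], save_top_k: int, training_finished: bool) -> bool:
    """Return True if checkpoints should be uploaded at this point.

    Hypothetical helper that mirrors the log_model semantics listed above.
    """
    if log_model is False:
        # No checkpoint is ever logged.
        return False
    if log_model == "all":
        # Checkpoints are logged during training (and therefore also at the end).
        return True
    if log_model is True:
        # Logged at the end of training; with save_top_k == -1 also during training.
        return training_finished or save_top_k == -1
    raise ValueError(f"Unsupported log_model value: {log_model!r}")
```

For example, should_log_now(True, 1, False) is False in the middle of training, while should_log_now(True, -1, False) is True.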

The behavior is not exactly the same as in WandbLogger, because MlflowClient.log_artifact takes different parameters:

  • In the WandbLogger, the artifact is logged together with a metadata dictionary and some aliases.
  • In the MLFlowLogger, metadata and aliases cannot be logged together with the artifact. To stay consistent, a temporary folder is created containing metadata.yml and aliases.txt, and that folder is then logged to MLflow alongside the checkpoint artifact under model/checkpoints/CHECKPOINT_NAME/.
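
A minimal sketch of the temporary-folder approach described above, under stated assumptions: the function name log_checkpoint_with_sidecars is hypothetical, client is assumed to expose MlflowClient-style log_artifact(run_id, local_path, artifact_path) and log_artifacts(run_id, local_dir, artifact_path) methods, and the metadata is written as flat "key: value" lines rather than through a YAML library. The actual PR code may differ.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def log_checkpoint_with_sidecars(
    client: Any,
    run_id: str,
    checkpoint_path: str,
    metadata: Dict[str, Any],
    aliases: List[str],
) -> str:
    """Upload one checkpoint plus metadata.yml and aliases.txt (hypothetical helper).

    `client` is assumed to behave like mlflow.tracking.MlflowClient.
    """
    # One artifact folder per checkpoint: model/checkpoints/CHECKPOINT_NAME
    artifact_dir = f"model/checkpoints/{Path(checkpoint_path).stem}"
    client.log_artifact(run_id, checkpoint_path, artifact_dir)

    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp = Path(tmp_dir)
        # Simplification: flat "key: value" lines; real code would likely use a YAML library.
        (tmp / "metadata.yml").write_text("".join(f"{k}: {v}\n" for k, v in metadata.items()))
        (tmp / "aliases.txt").write_text("\n".join(aliases))
        # Upload both sidecar files into the same artifact folder as the checkpoint.
        client.log_artifacts(run_id, str(tmp), artifact_dir)
    return artifact_dir
```

Passing the client in as an argument keeps the sketch independent of a running MLflow server, so it can be exercised with a stub object.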

Does your PR introduce any breaking changes? If yes, please list them.

None.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Notes on tests:

When I run the tests locally with make test, the output is: 1 failed, 1860 passed, 400 skipped, 3 xfailed, 5859 warnings. The failed test is the following one:

FAILED tests/plugins/environments/test_slurm_environment.py::test_default_attributes - Failed: DID NOT RAISE <class 'KeyError'>

However, when I run the tests locally with the same command as the one in the makefile (python -m coverage run --source pytorch_lightning -m pytest pytorch_lightning tests pl_examples -v), the output is: 1861 passed, 400 skipped, 3 xfailed, 5859 warnings.

Moreover, if I run the test that fails under make test on its own with python -m coverage run --source pytorch_lightning -m pytest tests/plugins/environments/test_slurm_environment.py -v, it passes.

I suspect the failure is caused by my environment (Ubuntu on WSL2 on Windows), but I am not sure. In any case, I expect the tests to pass on the remote CI.

Did you have fun?

Coding this small PR was fun 😃

- Changed permissions to files (644)
- Added missing imports in pytorch_lightning/loggers/mlflow.py:
ModelCheckpoint, ReferenceType:
AlessioQuercia and others added 4 commits August 30, 2021 01:12
Co-authored-by: thomas chaton <[email protected]>
Co-authored-by: Rohit Gupta <[email protected]>
Co-authored-by: Rohit Gupta <[email protected]>
Co-authored-by: Rohit Gupta <[email protected]>
@awaelchli awaelchli added logger Related to the Loggers feature Is an improvement or enhancement labels Aug 30, 2021
_scan_and_log_checkpoints moved from wandb.py and mlflow.py to base.py,
and composed of:
- _scan_checkpoints()
- empty _log_checkpoints()

Add to mlflow.py:
- mlflow-specific _log_checkpoints()

Add to wandb.py:
- wandb-specific _log_checkpoints()
- Removed unused import (NOT INTRODUCED IN THIS PR)
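
The layout described in this commit resembles a template-method arrangement: shared scanning logic in the base class and a logger-specific upload hook. A minimal sketch under simplified assumptions follows; the class names BaseLoggerSketch and MLFlowLoggerSketch, the candidates mapping of path to modification time, and the uploaded list are all illustrative and not taken from the PR.

```python
from typing import Dict, List, Tuple


class BaseLoggerSketch:
    """Stand-in for the shared logic moved to base.py (signatures simplified)."""

    def __init__(self) -> None:
        # Last uploaded modification time per checkpoint path.
        self._logged_model_time: Dict[str, float] = {}

    def _scan_checkpoints(self, candidates: Dict[str, float]) -> List[Tuple[float, str]]:
        # Keep only checkpoints that are new or modified since the last upload.
        new = [
            (mtime, path)
            for path, mtime in candidates.items()
            if self._logged_model_time.get(path, float("-inf")) < mtime
        ]
        return sorted(new)

    def _log_checkpoints(self, checkpoints: List[Tuple[float, str]]) -> None:
        # Intentionally empty in the base class; each logger overrides it.
        pass

    def _scan_and_log_checkpoints(self, candidates: Dict[str, float]) -> None:
        checkpoints = self._scan_checkpoints(candidates)
        self._log_checkpoints(checkpoints)
        # Remember what was uploaded so it is not uploaded again.
        for mtime, path in checkpoints:
            self._logged_model_time[path] = mtime


class MLFlowLoggerSketch(BaseLoggerSketch):
    def __init__(self) -> None:
        super().__init__()
        self.uploaded: List[str] = []

    def _log_checkpoints(self, checkpoints: List[Tuple[float, str]]) -> None:
        # A real logger would call the MLflow client here; this sketch only records paths.
        self.uploaded.extend(path for _, path in checkpoints)
```

A wandb-specific subclass would override _log_checkpoints in the same way, which is what lets the scanning code live in one place.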
@stale

stale bot commented Oct 4, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Oct 4, 2021
@AlessioQuercia
Contributor Author

This is still a work in progress; I am waiting for the review of #9312 before proceeding with this one.

@stale stale bot removed the won't fix This will not be worked on label Oct 4, 2021
@stale

stale bot commented Oct 18, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Oct 18, 2021
@AlessioQuercia
Contributor Author

This is still a work in progress; I am waiting for the review of #9312 before proceeding with this one.

@stale stale bot removed the won't fix This will not be worked on label Oct 19, 2021
@tchaton tchaton added this to the v1.6 milestone Nov 1, 2021
@carmocca carmocca removed this from the 1.6 milestone Mar 28, 2022
@stale

stale bot commented Apr 16, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Apr 16, 2022
@carmocca carmocca removed the won't fix This will not be worked on label Apr 22, 2022
@stale

stale bot commented Jun 6, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Jun 6, 2022
@carmocca carmocca added community This PR is from the community and removed won't fix This will not be worked on labels Jun 6, 2022
@stale

stale bot commented Jun 23, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions.

@stale stale bot added the won't fix This will not be worked on label Jun 23, 2022
@stale

stale bot commented Jul 10, 2022

This pull request is going to be closed. Please feel free to reopen it or create a new one from the current master.

@stale stale bot closed this Jul 10, 2022
@AlessioQuercia
Contributor Author

The base PR (#9312) for this PR has been merged into master. Is this PR still needed?

@awaelchli
Contributor

@AlessioQuercia yes, I think continuing with #9138 is possible. We can reopen this if you would like to rebase it, or opening a fresh PR would also be fine.

@awaelchli awaelchli reopened this Oct 16, 2022
@AlessioQuercia
Contributor Author

AlessioQuercia commented Oct 22, 2022

@AlessioQuercia yes, I think continuing with #9138 is possible. We can reopen this if you would like to rebase it, or opening a fresh PR would also be fine.

I created a new PR (#15246) as I think I messed up when pulling the updated master into this branch. I think we can close this one then.

@stale stale bot removed the won't fix This will not be worked on label Oct 22, 2022
Labels: community (This PR is from the community), feature (Is an improvement or enhancement), logger: mlflow, logger (Related to the Loggers)