From f72240c5db8d9deb35afa0e26edd286ce86730b6 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Tue, 7 Feb 2023 17:11:13 -0800 Subject: [PATCH 01/24] Change directory from api_docs -> api Signed-off-by: Justin Yu --- doc/source/tune/{api_docs/overview.rst => api/api.rst} | 0 doc/source/tune/{api_docs => api}/cli.rst | 0 doc/source/tune/{api_docs => api}/client.rst | 0 doc/source/tune/{api_docs => api}/env.rst | 0 doc/source/tune/{api_docs => api}/execution.rst | 0 doc/source/tune/{api_docs => api}/integration.rst | 0 doc/source/tune/{api_docs => api}/internals.rst | 0 doc/source/tune/{api_docs => api}/logging.rst | 0 doc/source/tune/{api_docs => api}/reporters.rst | 0 doc/source/tune/{api_docs => api}/result_grid.rst | 0 doc/source/tune/{api_docs => api}/schedulers.rst | 0 doc/source/tune/{api_docs => api}/search_space.rst | 0 doc/source/tune/{api_docs => api}/sklearn.rst | 0 doc/source/tune/{api_docs => api}/stoppers.rst | 0 doc/source/tune/{api_docs => api}/suggestion.rst | 0 doc/source/tune/{api_docs => api}/syncing.rst | 0 doc/source/tune/{api_docs => api}/trainable.rst | 0 17 files changed, 0 insertions(+), 0 deletions(-) rename doc/source/tune/{api_docs/overview.rst => api/api.rst} (100%) rename doc/source/tune/{api_docs => api}/cli.rst (100%) rename doc/source/tune/{api_docs => api}/client.rst (100%) rename doc/source/tune/{api_docs => api}/env.rst (100%) rename doc/source/tune/{api_docs => api}/execution.rst (100%) rename doc/source/tune/{api_docs => api}/integration.rst (100%) rename doc/source/tune/{api_docs => api}/internals.rst (100%) rename doc/source/tune/{api_docs => api}/logging.rst (100%) rename doc/source/tune/{api_docs => api}/reporters.rst (100%) rename doc/source/tune/{api_docs => api}/result_grid.rst (100%) rename doc/source/tune/{api_docs => api}/schedulers.rst (100%) rename doc/source/tune/{api_docs => api}/search_space.rst (100%) rename doc/source/tune/{api_docs => api}/sklearn.rst (100%) rename doc/source/tune/{api_docs => api}/stoppers.rst (100%) rename doc/source/tune/{api_docs => api}/suggestion.rst (100%) rename doc/source/tune/{api_docs => api}/syncing.rst (100%) rename doc/source/tune/{api_docs => api}/trainable.rst (100%) diff --git a/doc/source/tune/api_docs/overview.rst b/doc/source/tune/api/api.rst similarity index 100% rename from doc/source/tune/api_docs/overview.rst rename to doc/source/tune/api/api.rst diff --git a/doc/source/tune/api_docs/cli.rst b/doc/source/tune/api/cli.rst similarity index 100% rename from doc/source/tune/api_docs/cli.rst rename to doc/source/tune/api/cli.rst diff --git a/doc/source/tune/api_docs/client.rst b/doc/source/tune/api/client.rst similarity index 100% rename from doc/source/tune/api_docs/client.rst rename to doc/source/tune/api/client.rst diff --git a/doc/source/tune/api_docs/env.rst b/doc/source/tune/api/env.rst similarity index 100% rename from doc/source/tune/api_docs/env.rst rename to doc/source/tune/api/env.rst diff --git a/doc/source/tune/api_docs/execution.rst b/doc/source/tune/api/execution.rst similarity index 100% rename from doc/source/tune/api_docs/execution.rst rename to doc/source/tune/api/execution.rst diff --git a/doc/source/tune/api_docs/integration.rst b/doc/source/tune/api/integration.rst similarity index 100% rename from doc/source/tune/api_docs/integration.rst rename to doc/source/tune/api/integration.rst diff --git a/doc/source/tune/api_docs/internals.rst b/doc/source/tune/api/internals.rst similarity index 100% rename from doc/source/tune/api_docs/internals.rst rename to 
doc/source/tune/api/internals.rst diff --git a/doc/source/tune/api_docs/logging.rst b/doc/source/tune/api/logging.rst similarity index 100% rename from doc/source/tune/api_docs/logging.rst rename to doc/source/tune/api/logging.rst diff --git a/doc/source/tune/api_docs/reporters.rst b/doc/source/tune/api/reporters.rst similarity index 100% rename from doc/source/tune/api_docs/reporters.rst rename to doc/source/tune/api/reporters.rst diff --git a/doc/source/tune/api_docs/result_grid.rst b/doc/source/tune/api/result_grid.rst similarity index 100% rename from doc/source/tune/api_docs/result_grid.rst rename to doc/source/tune/api/result_grid.rst diff --git a/doc/source/tune/api_docs/schedulers.rst b/doc/source/tune/api/schedulers.rst similarity index 100% rename from doc/source/tune/api_docs/schedulers.rst rename to doc/source/tune/api/schedulers.rst diff --git a/doc/source/tune/api_docs/search_space.rst b/doc/source/tune/api/search_space.rst similarity index 100% rename from doc/source/tune/api_docs/search_space.rst rename to doc/source/tune/api/search_space.rst diff --git a/doc/source/tune/api_docs/sklearn.rst b/doc/source/tune/api/sklearn.rst similarity index 100% rename from doc/source/tune/api_docs/sklearn.rst rename to doc/source/tune/api/sklearn.rst diff --git a/doc/source/tune/api_docs/stoppers.rst b/doc/source/tune/api/stoppers.rst similarity index 100% rename from doc/source/tune/api_docs/stoppers.rst rename to doc/source/tune/api/stoppers.rst diff --git a/doc/source/tune/api_docs/suggestion.rst b/doc/source/tune/api/suggestion.rst similarity index 100% rename from doc/source/tune/api_docs/suggestion.rst rename to doc/source/tune/api/suggestion.rst diff --git a/doc/source/tune/api_docs/syncing.rst b/doc/source/tune/api/syncing.rst similarity index 100% rename from doc/source/tune/api_docs/syncing.rst rename to doc/source/tune/api/syncing.rst diff --git a/doc/source/tune/api_docs/trainable.rst b/doc/source/tune/api/trainable.rst similarity index 100% rename from doc/source/tune/api_docs/trainable.rst rename to doc/source/tune/api/trainable.rst From 4c547570260f59d46e47ea8a32c943e2d0b9ca41 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Tue, 7 Feb 2023 22:58:25 -0800 Subject: [PATCH 02/24] gitignore for autogenerated api refs Signed-off-by: Justin Yu --- .gitignore | 2 ++ 1 file changed, 2 insertions(+) diff --git a/.gitignore b/.gitignore index 9268fa948c4e..04f5ae96a6c5 100644 --- a/.gitignore +++ b/.gitignore @@ -119,6 +119,8 @@ scripts/nodes.txt /doc/_build /doc/source/_static/thumbs /doc/source/tune/generated_guides/ +/doc/source/*/api/doc/ +/doc/source/cluster/running-applications/job-submission/doc/ # User-specific stuff: .idea/**/workspace.xml From d08c28fd11bb8becb159edf52a7bc805d9e1bf58 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Tue, 7 Feb 2023 22:58:50 -0800 Subject: [PATCH 03/24] Restructure + update trainable api refs Signed-off-by: Justin Yu --- doc/source/_toc.yml | 2 +- doc/source/ray-references/api.rst | 2 +- doc/source/tune/api/search_space.rst | 2 +- doc/source/tune/api/trainable.rst | 73 +++++++++---------- doc/source/tune/faq.rst | 2 +- doc/source/tune/tutorials/tune-resources.rst | 2 +- .../tune/tutorials/tune-search-spaces.rst | 2 +- .../tutorials/tune_get_data_in_and_out.md | 2 +- 8 files changed, 40 insertions(+), 47 deletions(-) diff --git a/doc/source/_toc.yml b/doc/source/_toc.yml index 4975ca63a112..1e226a8b20b1 100644 --- a/doc/source/_toc.yml +++ b/doc/source/_toc.yml @@ -282,7 +282,7 @@ parts: - file: tune/examples/exercises title: "Exercises" - file: 
tune/faq - - file: tune/api_docs/overview.rst + - file: tune/api/api.rst - file: serve/index title: Ray Serve diff --git a/doc/source/ray-references/api.rst b/doc/source/ray-references/api.rst index 2e97eac3dfba..c80175f43689 100644 --- a/doc/source/ray-references/api.rst +++ b/doc/source/ray-references/api.rst @@ -8,7 +8,7 @@ API References ../ray-air/package-ref.rst ../data/api/api.rst ../train/api.rst - ../tune/api_docs/overview.rst + ../tune/api/api.rst ../serve/package-ref.rst ../rllib/package_ref/index.rst ../workflows/api/api.rst diff --git a/doc/source/tune/api/search_space.rst b/doc/source/tune/api/search_space.rst index 0ab08b9edb16..092dd11e5943 100644 --- a/doc/source/tune/api/search_space.rst +++ b/doc/source/tune/api/search_space.rst @@ -19,7 +19,7 @@ This section covers the functions you can use to define your search spaces. .. tip:: Avoid passing large objects as values in the search space, as that will incur a performance overhead. - Use :ref:`tune-with-parameters` to pass large objects in or load them inside your trainable + Use :func:`tune.with_parameters <ray.tune.with_parameters>` to pass large objects in or load them inside your trainable from disk (making sure that all nodes have access to the files) or cloud storage. See :ref:`tune-bottlenecks` for more information. diff --git a/doc/source/tune/api/trainable.rst b/doc/source/tune/api/trainable.rst index 177fc60b6296..53d243beeb70 100644 --- a/doc/source/tune/api/trainable.rst +++ b/doc/source/tune/api/trainable.rst @@ -240,7 +240,7 @@ Trainables can themselves be distributed. If your trainable function / class cre that also consume CPU / GPU resources, you will want to add more bundles to the :class:`PlacementGroupFactory` to reserve extra resource slots. For example, if a trainable class requires 1 GPU itself, but also launches 4 actors, each using another GPU, -then you should use :ref:`tune-with-resources` like this: +then you should use :func:`tune.with_resources <ray.tune.with_resources>` like this: .. code-block:: python :emphasize-lines: 4-10 @@ -267,56 +267,49 @@ It is also possible to specify memory (``"memory"``, in bytes) and custom resour session (Function API) ---------------------- -.. autofunction:: ray.air.session.report - :noindex: +.. currentmodule:: ray -.. autofunction:: ray.air.session.get_checkpoint - :noindex: +.. autosummary:: + :toctree: doc/ -.. autofunction:: ray.air.session.get_trial_name - :noindex: + air.session.report + air.session.get_checkpoint + air.session.get_trial_name + air.session.get_trial_id + air.session.get_trial_resources + air.session.get_trial_dir -.. autofunction:: ray.air.session.get_trial_id - :noindex: - -.. autofunction:: ray.air.session.get_trial_resources - :noindex: +.. _tune-trainable-docstring: -.. autofunction:: ray.air.session.get_trial_dir - :noindex: +Trainable (Class API) +--------------------- .. _tune-trainable-docstring: -tune.Trainable (Class API) --------------------------- +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.Trainable - :member-order: groupwise - :private-members: - :members: + tune.Trainable + tune.Trainable.setup + tune.Trainable.save_checkpoint + tune.Trainable.load_checkpoint + tune.Trainable.step + tune.Trainable.reset_config + tune.Trainable.cleanup + tune.Trainable.default_resource_request + .. :member-order: groupwise + .. :private-members: + .. :members: .. _tune-util-ref: Utilities --------- -.. autofunction:: ray.tune.utils.wait_for_gpu - -.. autofunction:: ray.tune.utils.diagnose_serialization - -.. autofunction:: ray.tune.utils.validate_save_restore - - -.. 
_tune-with-parameters: tune.with_parameters -------------------- -.. autofunction:: ray.tune.with_parameters -.. _tune-with-resources: -tune.with_resources -------------------- +.. autosummary:: + :toctree: doc/ -.. autofunction:: ray.tune.with_resources \ No newline at end of file + tune.with_parameters + tune.with_resources + tune.utils.wait_for_gpu + tune.utils.diagnose_serialization + tune.utils.validate_save_restore diff --git a/doc/source/tune/faq.rst b/doc/source/tune/faq.rst index 20ff41119fc6..ca4db42ef5df 100644 --- a/doc/source/tune/faq.rst +++ b/doc/source/tune/faq.rst @@ -341,7 +341,7 @@ are efficiently stored and retrieved on your cluster machines. :func:`tune.with_parameters() <ray.tune.with_parameters>` also works with class trainables. Please see -:ref:`here for further details <tune-with-parameters>` and examples. +:func:`tune.with_parameters() <ray.tune.with_parameters>` for more details and examples. How can I reproduce experiments? diff --git a/doc/source/tune/tutorials/tune-resources.rst b/doc/source/tune/tutorials/tune-resources.rst index aca47a7a2f4f..540533582c5e 100644 --- a/doc/source/tune/tutorials/tune-resources.rst +++ b/doc/source/tune/tutorials/tune-resources.rst @@ -18,7 +18,7 @@ of CPUs (cores) on your machine. ) results = tuner.fit() -You can override this per trial resources with :ref:`tune-with-resources`. Here you can +You can override these per-trial resources with :func:`tune.with_resources <ray.tune.with_resources>`. Here you can specify your resource requests using either a dictionary, a :class:`~ray.air.config.ScalingConfig`, or a :class:`PlacementGroupFactory <ray.tune.execution.placement_groups.PlacementGroupFactory>` object. In any case, Ray Tune will try to start a placement group for each trial. diff --git a/doc/source/tune/tutorials/tune-search-spaces.rst b/doc/source/tune/tutorials/tune-search-spaces.rst index d10dd84b3a6e..3ae9df9493cb 100644 --- a/doc/source/tune/tutorials/tune-search-spaces.rst +++ b/doc/source/tune/tutorials/tune-search-spaces.rst @@ -130,7 +130,7 @@ for a total of 90 trials, each with randomly sampled values of ``alpha`` and ``b .. tip:: Avoid passing large objects as values in the search space, as that will incur a performance overhead. - Use :ref:`tune-with-parameters` to pass large objects in or load them inside your trainable + Use :func:`tune.with_parameters <ray.tune.with_parameters>` to pass large objects in or load them inside your trainable from disk (making sure that all nodes have access to the files) or cloud storage. See :ref:`tune-bottlenecks` for more information. diff --git a/doc/source/tune/tutorials/tune_get_data_in_and_out.md b/doc/source/tune/tutorials/tune_get_data_in_and_out.md index 01e12d05571a..2366e3d3b3dd 100644 --- a/doc/source/tune/tutorials/tune_get_data_in_and_out.md +++ b/doc/source/tune/tutorials/tune_get_data_in_and_out.md @@ -116,7 +116,7 @@ tuner = Tuner( TL;DR - use the `tune.with_parameters` util function to specify large constant parameters. ``` -If we have large objects that are constant across Trials, we can use the [`tune.with_parameters`](tune-with-parameters) utility to pass them into the Trainable directly. The objects will be stored in the [Ray object store](serialization-guide) so that each Trial worker may access them to obtain a local copy to use in its process. +If we have large objects that are constant across Trials, we can use the {func}`tune.with_parameters <ray.tune.with_parameters>` utility to pass them into the Trainable directly. The objects will be stored in the [Ray object store](serialization-guide) so that each Trial worker may access them to obtain a local copy to use in its process. ```{tip} Objects put into the Ray object store must be serializable. 
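For context on the ``tune.with_parameters`` cross-references this patch consolidates: the utility wraps a trainable together with large constant objects, putting those objects into the Ray object store once so that each trial resolves a local copy instead of re-serializing them into every trial. A minimal sketch of the pattern, assuming a function trainable (the ``train_fn``, ``data``, and ``large_data`` names below are illustrative, not part of the patch):

.. code-block:: python

    from ray import tune
    from ray.air import session


    def train_fn(config, data=None):
        # `data` is resolved from the Ray object store inside each trial,
        # rather than being pickled into every trial's config.
        session.report({"loss": config["lr"] * len(data)})


    large_data = list(range(1_000_000))  # stand-in for a real dataset

    tuner = tune.Tuner(
        tune.with_parameters(train_fn, data=large_data),
        param_space={"lr": tune.uniform(1e-4, 1e-1)},
    )
    results = tuner.fit()
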
From a740de3d33440688bbe586da9a5bc245af6f7c69 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Tue, 7 Feb 2023 23:10:45 -0800 Subject: [PATCH 04/24] update tune execution Signed-off-by: Justin Yu --- doc/source/tune/api/execution.rst | 35 +++++++++++++++++++------------ doc/source/tune/api/trainable.rst | 3 --- 2 files changed, 22 insertions(+), 16 deletions(-) diff --git a/doc/source/tune/api/execution.rst b/doc/source/tune/api/execution.rst index 834b5c669f78..8d6011258a48 100644 --- a/doc/source/tune/api/execution.rst +++ b/doc/source/tune/api/execution.rst @@ -1,26 +1,35 @@ -Tune Execution (Tuner, tune.Experiment) -======================================= +Tune Execution (Tuner) +====================== .. _tune-run-ref: Tuner ----- -.. autofunction:: ray.tune.Tuner +.. currentmodule:: ray.tune -tune.run_experiments --------------------- +.. autosummary:: + :toctree: doc/ + + Tuner + Tuner.fit -.. autofunction:: ray.tune.run_experiments +Restoring a Tuner +~~~~~~~~~~~~~~~~~ -tune.Experiment ---------------- +.. autosummary:: + :toctree: doc/ -.. autofunction:: ray.tune.Experiment + Tuner.restore + Tuner.can_restore + Tuner.get_results -.. _tune-sync-config: -tune.SyncConfig ---------------- +tune.run_experiments +-------------------- + +.. autosummary:: + :toctree: doc/ -.. autofunction:: ray.tune.SyncConfig + run_experiments + Experiment diff --git a/doc/source/tune/api/trainable.rst b/doc/source/tune/api/trainable.rst index 53d243beeb70..816f67d4368d 100644 --- a/doc/source/tune/api/trainable.rst +++ b/doc/source/tune/api/trainable.rst @@ -296,9 +296,6 @@ Trainable (Class API) tune.Trainable.cleanup tune.Trainable.default_resource_request - .. :member-order: groupwise - .. :private-members: - .. :members: .. _tune-util-ref: From 0e37dfb84cbb08dc5edde036f50d33b03deb155d Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Tue, 7 Feb 2023 23:42:00 -0800 Subject: [PATCH 05/24] update search space + searchers Signed-off-by: Justin Yu --- doc/source/tune/api/search_space.rst | 82 ++++---------- doc/source/tune/api/suggestion.rst | 160 +++++++++++++++++++-------- 2 files changed, 136 insertions(+), 106 deletions(-) diff --git a/doc/source/tune/api/search_space.rst b/doc/source/tune/api/search_space.rst index 092dd11e5943..49fba30b2dba 100644 --- a/doc/source/tune/api/search_space.rst +++ b/doc/source/tune/api/search_space.rst @@ -5,9 +5,6 @@ Tune Search Space API .. _tune-sample-docs: -Random Distributions API ------------------------- - This section covers the functions you can use to define your search spaces. .. caution:: @@ -83,70 +80,35 @@ For a high-level overview, see this example: "grid": tune.grid_search([32, 64, 128]) } -tune.uniform -~~~~~~~~~~~~ - -.. autofunction:: ray.tune.uniform - -tune.quniform -~~~~~~~~~~~~~ - -.. autofunction:: ray.tune.quniform - -tune.loguniform -~~~~~~~~~~~~~~~ - -.. autofunction:: ray.tune.loguniform - -tune.qloguniform -~~~~~~~~~~~~~~~~ - -.. autofunction:: ray.tune.qloguniform - -tune.randn -~~~~~~~~~~ - -.. autofunction:: ray.tune.randn +.. currentmodule:: ray -tune.qrandn -~~~~~~~~~~~ - -.. autofunction:: ray.tune.qrandn - -tune.randint -~~~~~~~~~~~~ - -.. autofunction:: ray.tune.randint - -tune.qrandint -~~~~~~~~~~~~~ - -.. autofunction:: ray.tune.qrandint - -tune.lograndint -~~~~~~~~~~~~~~~ - -.. autofunction:: ray.tune.lograndint - -tune.qlograndint -~~~~~~~~~~~~~~~~ - -.. autofunction:: ray.tune.qlograndint +Random Distributions API +------------------------ -tune.choice -~~~~~~~~~~~ +.. autosummary:: + :toctree: doc/ -.. 
autofunction:: ray.tune.choice + tune.uniform + tune.quniform + tune.loguniform + tune.qloguniform + tune.randn + tune.qrandn + tune.randint + tune.qrandint + tune.lograndint + tune.qlograndint + tune.choice -tune.sample_from -~~~~~~~~~~~~~~~~ -.. autofunction:: ray.tune.sample_from +Grid Search and Custom Function APIs +------------------------------------ -Grid Search API ---------------- +.. autosummary:: + :toctree: doc/ -.. autofunction:: ray.tune.grid_search + tune.grid_search + tune.sample_from References ---------- diff --git a/doc/source/tune/api/suggestion.rst b/doc/source/tune/api/suggestion.rst index 63d949106c01..af5c7c6b4dcc 100644 --- a/doc/source/tune/api/suggestion.rst +++ b/doc/source/tune/api/suggestion.rst @@ -12,66 +12,87 @@ You can utilize these search algorithms as follows: .. code-block:: python from ray import tune - from ray.tune.search.hyperopt import HyperOptSearch - tuner = tune.Tuner(my_function, tune_config=tune.TuneConfig(search_alg=HyperOptSearch(...))) + from ray.air import session + from ray.tune.search.optuna import OptunaSearch + + def train_fn(config): + # This objective function is just for demonstration purposes + session.report({"loss": config["param"]}) + + tuner = tune.Tuner( + train_fn, + tune_config=tune.TuneConfig( + search_alg=OptunaSearch(), + num_samples=100, + metric="loss", + mode="min", + ), + param_space={"param": tune.uniform(0, 1)}, + ) results = tuner.fit() -Saving and Restoring Tune Runs ------------------------------- +Saving and Restoring Tune Search Algorithms +------------------------------------------- .. TODO: what to do about this section? It doesn't really belong here and is not worth its own guide. .. TODO: at least check that this pseudo-code runs. Certain search algorithms have ``save/restore`` implemented, -allowing reuse of learnings across multiple tuning runs. +allowing reuse of searchers that are fitted on the results of multiple tuning runs. .. code-block:: python search_alg = HyperOptSearch() tuner_1 = tune.Tuner( - trainable, - tune_config=tune.TuneConfig(search_alg=search_alg)) + train_fn, + tune_config=tune.TuneConfig(search_alg=search_alg) + ) results_1 = tuner_1.fit() search_alg.save("./my-checkpoint.pkl") - # Restore the saved state onto another search algorithm + # Restore the saved state onto another search algorithm, + # in a new tuning script search_alg2 = HyperOptSearch() search_alg2.restore("./my-checkpoint.pkl") tuner_2 = tune.Tuner( - trainable, - tune_config=tune.TuneConfig(search_alg=search_alg2)) + train_fn, + tune_config=tune.TuneConfig(search_alg=search_alg2) + ) results_2 = tuner_2.fit() -Tune automatically saves its state inside the current experiment folder ("Result Dir") during tuning. +Tune automatically saves searcher state inside the current experiment folder during tuning. +See ``Result logdir: ...`` in the output logs for this location. Note that if you have two Tune runs with the same experiment folder, the previous state checkpoint will be overwritten. You can avoid this by making sure ``air.RunConfig(name=...)`` is set to a unique -identifier. +identifier: .. 
code-block:: python search_alg = HyperOptSearch() tuner_1 = tune.Tuner( - cost, + train_fn, tune_config=tune.TuneConfig( num_samples=5, - search_alg=search_alg), + search_alg=search_alg, + ), run_config=air.RunConfig( - verbose=0, name="my-experiment-1", - local_dir="~/my_results" - )) + local_dir="~/my_results", + ) + ) results = tuner_1.fit() search_alg2 = HyperOptSearch() search_alg2.restore_from_dir( - os.path.join("~/my_results", "my-experiment-1")) + os.path.join("~/my_results", "my-experiment-1") + ) .. _tune-basicvariant: @@ -86,23 +107,32 @@ The :class:`BasicVariantGenerator `. -.. autoclass:: ray.tune.search.basic_variant.BasicVariantGenerator +.. currentmodule:: ray.tune.search + +.. autosummary:: + :toctree: doc/ + + basic_variant.BasicVariantGenerator .. _tune-ax: Ax (tune.search.ax.AxSearch) ---------------------------- -.. autoclass:: ray.tune.search.ax.AxSearch +.. autosummary:: + :toctree: doc/ + + ax.AxSearch .. _bayesopt: Bayesian Optimization (tune.search.bayesopt.BayesOptSearch) ----------------------------------------------------------- +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.search.bayesopt.BayesOptSearch - :members: save, restore + bayesopt.BayesOptSearch .. _`BayesianOptimization search space specification`: https://github.com/fmfn/BayesianOptimization/blob/master/examples/advanced-tour.ipynb @@ -125,7 +155,10 @@ In order to use this search algorithm, you will need to install ``HpBandSter`` a See the `BOHB paper `_ for more details. -.. autoclass:: ray.tune.search.bohb.TuneBOHB +.. autosummary:: + :toctree: doc/ + + bohb.TuneBOHB .. _BlendSearch: @@ -144,7 +177,10 @@ In order to use this search algorithm, you will need to install ``flaml``: See the `BlendSearch paper `_ and documentation in FLAML `BlendSearch documentation `_ for more details. -.. autoclass:: ray.tune.search.flaml.BlendSearch +.. autosummary:: + :toctree: doc/ + + flaml.BlendSearch .. _CFO: @@ -164,39 +200,50 @@ In order to use this search algorithm, you will need to install ``flaml``: See the `CFO paper `_ and documentation in FLAML `CFO documentation `_ for more details. -.. autoclass:: ray.tune.search.flaml.CFO +.. autosummary:: + :toctree: doc/ + + flaml.CFO .. _Dragonfly: Dragonfly (tune.search.dragonfly.DragonflySearch) ------------------------------------------------- -.. autoclass:: ray.tune.search.dragonfly.DragonflySearch - :members: save, restore +.. autosummary:: + :toctree: doc/ + + dragonfly.DragonflySearch .. _tune-hebo: HEBO (tune.search.hebo.HEBOSearch) ---------------------------------- -.. autoclass:: ray.tune.search.hebo.HEBOSearch - :members: save, restore +.. autosummary:: + :toctree: doc/ + + hebo.HEBOSearch .. _tune-hyperopt: HyperOpt (tune.search.hyperopt.HyperOptSearch) ---------------------------------------------- -.. autoclass:: ray.tune.search.hyperopt.HyperOptSearch - :members: save, restore +.. autosummary:: + :toctree: doc/ + + hyperopt.HyperOptSearch .. _nevergrad: Nevergrad (tune.search.nevergrad.NevergradSearch) ------------------------------------------------- -.. autoclass:: ray.tune.search.nevergrad.NevergradSearch - :members: save, restore +.. autosummary:: + :toctree: doc/ + + nevergrad.NevergradSearch .. _`Nevergrad README's Optimization section`: https://github.com/facebookresearch/nevergrad/blob/master/docs/optimization.rst#choosing-an-optimizer @@ -205,7 +252,10 @@ Nevergrad (tune.search.nevergrad.NevergradSearch) Optuna (tune.search.optuna.OptunaSearch) ---------------------------------------- -.. 
autoclass:: ray.tune.search.optuna.OptunaSearch +.. autosummary:: + :toctree: doc/ + + optuna.OptunaSearch .. _`Optuna samplers`: https://optuna.readthedocs.io/en/stable/reference/samplers.html @@ -217,15 +267,20 @@ SigOpt (tune.search.sigopt.SigOptSearch) You will need to use the `SigOpt experiment and space specification `__ to specify your search space. -.. autoclass:: ray.tune.search.sigopt.SigOptSearch +.. autosummary:: + :toctree: doc/ + + sigopt.SigOptSearch .. _skopt: Scikit-Optimize (tune.search.skopt.SkOptSearch) ----------------------------------------------- -.. autoclass:: ray.tune.search.skopt.SkOptSearch - :members: save, restore +.. autosummary:: + :toctree: doc/ + + skopt.SkOptSearch .. _`skopt Optimizer object`: https://scikit-optimize.github.io/stable/modules/generated/skopt.Optimizer.html#skopt.Optimizer @@ -234,8 +289,10 @@ Scikit-Optimize (tune.search.skopt.SkOptSearch) ZOOpt (tune.search.zoopt.ZOOptSearch) ------------------------------------- -.. autoclass:: ray.tune.search.zoopt.ZOOptSearch - :members: save, restore +.. autosummary:: + :toctree: doc/ + + zoopt.ZOOptSearch .. _repeater: @@ -255,7 +312,10 @@ will run ``repeat`` trials of the configuration. It will then average the .. warning:: It is recommended to not use ``Repeater`` with a TrialScheduler. Early termination can negatively affect the average reported metric. -.. autoclass:: ray.tune.search.Repeater +.. autosummary:: + :toctree: doc/ + + Repeater .. _limiter: @@ -265,7 +325,10 @@ ConcurrencyLimiter (tune.search.ConcurrencyLimiter) Use ``ray.tune.search.ConcurrencyLimiter`` to limit the amount of concurrency when using a search algorithm. This is useful when a given optimization algorithm does not parallelize very well (like a naive Bayesian Optimization). -.. autoclass:: ray.tune.search.ConcurrencyLimiter +.. autosummary:: + :toctree: doc/ + + ConcurrencyLimiter .. _byo-algo: @@ -274,11 +337,13 @@ Custom Search Algorithms (tune.search.Searcher) If you are interested in implementing or contributing a new Search Algorithm, provide the following interface: -.. autoclass:: ray.tune.search.Searcher - :members: - :private-members: - :show-inheritance: +.. autosummary:: + :toctree: doc/ + Searcher + Searcher.suggest + Searcher.save + Searcher.restore If contributing, make sure to add test cases and an entry in the function described below. @@ -290,4 +355,7 @@ There is also a shim function that constructs the search algorithm based on the This can be useful if the search algorithm you want to use changes often (e.g., specifying the search algorithm via a CLI option or config file). -.. automethod:: ray.tune.create_searcher +.. autosummary:: + :toctree: doc/ + + create_searcher From 994b5549233bb08171491fe88fd88519da51e72f Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Tue, 7 Feb 2023 23:46:02 -0800 Subject: [PATCH 06/24] Add TuneConfig to execution api ref section Signed-off-by: Justin Yu --- doc/source/tune/api/execution.rst | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/doc/source/tune/api/execution.rst b/doc/source/tune/api/execution.rst index 8d6011258a48..584131fde480 100644 --- a/doc/source/tune/api/execution.rst +++ b/doc/source/tune/api/execution.rst @@ -14,6 +14,14 @@ Tuner Tuner Tuner.fit +Tuner Configuration +~~~~~~~~~~~~~~~~~~~ + +.. 
autosummary:: + :toctree: doc/ + + TuneConfig + Restoring a Tuner ~~~~~~~~~~~~~~~~~ From 0a30fe59b5a5a649ae8ad8a948a7b769ef22217d Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 00:03:21 -0800 Subject: [PATCH 07/24] Update schedulers Signed-off-by: Justin Yu --- doc/source/tune/api/schedulers.rst | 153 ++++++++++++++++++++--------- doc/source/tune/api/suggestion.rst | 2 + 2 files changed, 107 insertions(+), 48 deletions(-) diff --git a/doc/source/tune/api/schedulers.rst b/doc/source/tune/api/schedulers.rst index e53353e42399..a7ee7772092a 100644 --- a/doc/source/tune/api/schedulers.rst +++ b/doc/source/tune/api/schedulers.rst @@ -13,9 +13,26 @@ Trainable and is maximized or minimized according to ``mode``. .. code-block:: python from ray import tune - tuner = tune.Tuner( ... , tune_config=tune.TuneConfig(scheduler=Scheduler(metric="accuracy", mode="max"))) + from ray.air import session + from tune.schedulers import ASHAScheduler + + def train_fn(config): + # This objective function is just for demonstration purposes + session.report({"loss": config["param"]}) + + tuner = tune.Tuner( + train_fn, + tune_config=tune.TuneConfig( + scheduler=ASHAScheduler(), + metric="loss", + mode="min", + num_samples=10, + ), + param_space={"param": tune.uniform(0, 1)}, + ) results = tuner.fit() +.. currentmodule:: ray.tune.schedulers .. _tune-scheduler-hyperband: @@ -28,15 +45,21 @@ setting the ``scheduler`` parameter of ``tune.TuneConfig``, which is taken in by .. code-block:: python from ray import tune + from tune.schedulers import ASHAScheduler + asha_scheduler = ASHAScheduler( time_attr='training_iteration', - metric='episode_reward_mean', - mode='max', + metric='loss', + mode='min', max_t=100, grace_period=10, reduction_factor=3, - brackets=1) - tuner = tune.Tuner( ... , tune_config=tune.TuneConfig(scheduler=asha_scheduler)) + brackets=1, + ) + tuner = tune.Tuner( + train_fn, + tune_config=tune.TuneConfig(scheduler=asha_scheduler), + ) results = tuner.fit() Compared to the original version of HyperBand, this implementation provides better @@ -48,9 +71,11 @@ Even though the original paper mentions a bracket count of 3, discussions with t that the value should be left to 1 bracket. This is the default used if no value is provided for the ``brackets`` argument. -.. autoclass:: ray.tune.schedulers.AsyncHyperBandScheduler +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.schedulers.ASHAScheduler + AsyncHyperBandScheduler + ASHAScheduler .. _tune-original-hyperband: @@ -60,7 +85,10 @@ HyperBand (tune.schedulers.HyperBandScheduler) Tune implements the `standard version of HyperBand `__. **We recommend using the ASHA Scheduler over the standard HyperBand scheduler.** -.. autoclass:: ray.tune.schedulers.HyperBandScheduler +.. autosummary:: + :toctree: doc/ + + HyperBandScheduler HyperBand Implementation Details @@ -105,7 +133,10 @@ Median Stopping Rule (tune.schedulers.MedianStoppingRule) The Median Stopping Rule implements the simple strategy of stopping a trial if its performance falls below the median of other trials at similar points in time. -.. autoclass:: ray.tune.schedulers.MedianStoppingRule +.. autosummary:: + :toctree: doc/ + + MedianStoppingRule .. _tune-scheduler-pbt: @@ -117,23 +148,25 @@ This can be enabled by setting the ``scheduler`` parameter of ``tune.TuneConfig` .. 
code-block:: python + from ray import tune + from ray.tune.schedulers import PopulationBasedTraining + pbt_scheduler = PopulationBasedTraining( time_attr='training_iteration', - metric='mean_accuracy', - mode='max', - perturbation_interval=600.0, + metric='loss', + mode='min', + perturbation_interval=1, hyperparam_mutations={ "lr": [1e-3, 5e-4, 1e-4, 5e-5, 1e-5], - "alpha": lambda: random.uniform(0.0, 1.0), - ... + "alpha": tune.uniform(0.0, 1.0), } ) tuner = tune.Tuner( - ..., + train_fn, tune_config=tune.TuneConfig( num_samples=4, - scheduler=pbt_scheduler - ) + scheduler=pbt_scheduler, + ), ) tuner.fit() @@ -150,7 +183,10 @@ Take a look at :doc:`/tune/examples/pbt_visualization/pbt_visualization` to get of how PBT operates. :doc:`/tune/examples/pbt_guide` gives more examples of PBT usage. -.. autoclass:: ray.tune.schedulers.PopulationBasedTraining +.. autosummary:: + :toctree: doc/ + + PopulationBasedTraining .. _tune-scheduler-pbt-replay: @@ -165,20 +201,26 @@ config according to the obtained schedule. .. code-block:: python + from ray import tune + from ray.tune.schedulers import PopulationBasedTrainingReplay + replay = PopulationBasedTrainingReplay( experiment_dir="~/ray_results/pbt_experiment/", - trial_id="XXXXX_00001") + trial_id="XXXXX_00001" + ) tuner = tune.Tuner( - ..., + train_fn, tune_config=tune.TuneConfig(scheduler=replay) - ) + ) results = tuner.fit() See :ref:`here for an example ` on how to use the replay utility in practice. -.. autoclass:: ray.tune.schedulers.PopulationBasedTrainingReplay +.. autosummary:: + :toctree: doc/ + PopulationBasedTrainingReplay .. _tune-scheduler-pb2: @@ -203,15 +245,16 @@ PB2 can be enabled by setting the ``scheduler`` parameter of ``tune.TuneConfig`` from ray.tune.schedulers.pb2 import PB2 pb2_scheduler = PB2( - time_attr='time_total_s', - metric='mean_accuracy', - mode='max', - perturbation_interval=600.0, - hyperparam_bounds={ - "lr": [1e-3, 1e-5], - "alpha": [0.0, 1.0], - ... - }) + time_attr='time_total_s', + metric='mean_accuracy', + mode='max', + perturbation_interval=600.0, + hyperparam_bounds={ + "lr": [1e-3, 1e-5], + "alpha": [0.0, 1.0], + ... + } + ) tuner = tune.Tuner( ... , tune_config=tune.TuneConfig(scheduler=pb2_scheduler)) results = tuner.fit() @@ -227,7 +270,10 @@ With that in mind, you can run this :doc:`PB2 PPO example ` for package requirements, examples, and d An example of this in use can be found here: :doc:`/tune/examples/includes/bohb_example`. -.. autoclass:: ray.tune.schedulers.HyperBandForBOHB + +.. autosummary:: + :toctree: doc/ + + HyperBandForBOHB .. _tune-resource-changing-scheduler: @@ -265,28 +315,32 @@ It wraps around another scheduler and uses its decisions. An example of this in use can be found here: :doc:`/tune/examples/includes/xgboost_dynamic_resources_example`. -.. autoclass:: ray.tune.schedulers.ResourceChangingScheduler +.. autosummary:: + :toctree: doc/ -DistributeResources -~~~~~~~~~~~~~~~~~~~ + ResourceChangingScheduler + resource_changing_scheduler.DistributeResources + resource_changing_scheduler.DistributeResourcesToTopJob -.. autoclass:: ray.tune.schedulers.resource_changing_scheduler.DistributeResources +FIFOScheduler (Default Scheduler) +--------------------------------- -DistributeResourcesToTopJob -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.schedulers.resource_changing_scheduler.DistributeResourcesToTopJob + FIFOScheduler -FIFOScheduler -------------- +TrialScheduler Interface +------------------------ -.. 
autoclass:: ray.tune.schedulers.FIFOScheduler +.. autosummary:: + :toctree: doc/ -TrialScheduler --------------- + TrialScheduler + TrialScheduler.choose_trial_to_run + TrialScheduler.on_trial_result + TrialScheduler.on_trial_complete -.. autoclass:: ray.tune.schedulers.TrialScheduler - :members: Shim Instantiation (tune.create_scheduler) ------------------------------------------ @@ -295,4 +349,7 @@ There is also a shim function that constructs the scheduler based on the provide This can be useful if the scheduler you want to use changes often (e.g., specifying the scheduler via a CLI option or config file). -.. automethod:: ray.tune.create_scheduler +.. autosummary:: + :toctree: doc/ + + create_scheduler diff --git a/doc/source/tune/api/suggestion.rst b/doc/source/tune/api/suggestion.rst index af5c7c6b4dcc..fdb997e748d7 100644 --- a/doc/source/tune/api/suggestion.rst +++ b/doc/source/tune/api/suggestion.rst @@ -344,6 +344,8 @@ If you are interested in implementing or contributing a new Search Algorithm, pr Searcher.suggest Searcher.save Searcher.restore + Searcher.on_trial_result + Searcher.on_trial_complete If contributing, make sure to add test cases and an entry in the function described below. From d686153a85ebf9866af197dbd0878795129e78ea Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 00:18:22 -0800 Subject: [PATCH 08/24] update reporters, results, stoppers Signed-off-by: Justin Yu --- doc/source/tune/api/reporters.rst | 29 ++++++++++-------- doc/source/tune/api/result_grid.rst | 29 +++++++++++++----- doc/source/tune/api/stoppers.rst | 47 +++++++++++------------------ 3 files changed, 57 insertions(+), 48 deletions(-) diff --git a/doc/source/tune/api/reporters.rst b/doc/source/tune/api/reporters.rst index ade3263deb1c..b163d5695099 100644 --- a/doc/source/tune/api/reporters.rst +++ b/doc/source/tune/api/reporters.rst @@ -88,21 +88,26 @@ The default reporting style can also be overridden more broadly by extending the results = tuner.fit() -CLIReporter ------------ +.. currentmodule:: ray.tune -.. autoclass:: ray.tune.CLIReporter - :members: add_metric_column +Reporter Interface (tune.ProgressReporter) +------------------------------------------ -JupyterNotebookReporter ------------------------ +.. autosummary:: + :toctree: doc/ + + ProgressReporter + ProgressReporter.report + ProgressReporter.should_report -.. autoclass:: ray.tune.JupyterNotebookReporter - :members: add_metric_column +Tune Built-in Reporters +----------------------- -ProgressReporter ----------------- +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.ProgressReporter - :members: + CLIReporter + CLIReporter.add_metric_column + JupyterNotebookReporter + JupyterNotebookReporter.add_metric_column diff --git a/doc/source/tune/api/result_grid.rst b/doc/source/tune/api/result_grid.rst index 9333494237bb..2e6b043dc6cb 100644 --- a/doc/source/tune/api/result_grid.rst +++ b/doc/source/tune/api/result_grid.rst @@ -3,23 +3,38 @@ .. _result-grid-docstring: ResultGrid (tune.ResultGrid) ----------------------------- +============================ -.. autoclass:: ray.tune.ResultGrid - :members: +.. currentmodule:: ray + +.. autosummary:: + :toctree: doc/ + + tune.ResultGrid + tune.ResultGrid.get_best_result + tune.ResultGrid.get_dataframe .. _result-docstring: Result (air.Result) ------------------- -.. autoclass:: ray.air.Result - :members: +.. autosummary:: + :toctree: doc/ + + air.Result .. 
_exp-analysis-docstring: ExperimentAnalysis (tune.ExperimentAnalysis) -------------------------------------------- -.. autoclass:: ray.tune.ExperimentAnalysis - :members: +.. note:: + + An experiment analysis is the output of the ``tune.run`` API. + It's now recommended to use ``Tuner.fit``, which outputs a ``ResultGrid`` object. + +.. autosummary:: + :toctree: doc/ + + tune.ExperimentAnalysis diff --git a/doc/source/tune/api/stoppers.rst b/doc/source/tune/api/stoppers.rst index 30edfa793e35..28dfa38d8044 100644 --- a/doc/source/tune/api/stoppers.rst +++ b/doc/source/tune/api/stoppers.rst @@ -1,6 +1,6 @@ .. _tune-stoppers: -Tune Stopping mechanisms (tune.stopper) +Tune Stopping Mechanisms (tune.stopper) ======================================= In addition to Trial Schedulers like :ref:`ASHA `, where a number of @@ -13,40 +13,29 @@ inherit from the :class:`Stopper ` class. Other stopping behaviors are described :ref:`in the user guide `. -.. contents:: - :local: - :depth: 1 - .. _tune-stop-ref: -Stopper (tune.Stopper) ----------------------- - -.. autoclass:: ray.tune.Stopper - :members: __call__, stop_all - -MaximumIterationStopper (tune.stopper.MaximumIterationStopper) --------------------------------------------------------------- +Stopper Interface (tune.Stopper) +-------------------------------- -.. autoclass:: ray.tune.stopper.MaximumIterationStopper +.. currentmodule:: ray.tune.stopper -ExperimentPlateauStopper (tune.stopper.ExperimentPlateauStopper) ----------------------------------------------------------------- +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.stopper.ExperimentPlateauStopper + Stopper + Stopper.__call__ + Stopper.stop_all -TrialPlateauStopper (tune.stopper.TrialPlateauStopper) ------------------------------------------------------- - -.. autoclass:: ray.tune.stopper.TrialPlateauStopper - -TimeoutStopper (tune.stopper.TimeoutStopper) --------------------------------------------- - -.. autoclass:: ray.tune.stopper.TimeoutStopper +Tune Built-in Stoppers +---------------------- -CombinedStopper (tune.stopper.CombinedStopper) ----------------------------------------------- +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.stopper.CombinedStopper + MaximumIterationStopper + ExperimentPlateauStopper + TrialPlateauStopper + TimeoutStopper + CombinedStopper From f8ea9931f7acd726d6036e8cbe2e486dbbf7e3eb Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 00:50:47 -0800 Subject: [PATCH 09/24] update syncing, logging, sklearn Signed-off-by: Justin Yu --- doc/source/tune/api/logging.rst | 52 +++++++++++++++++++++------------ doc/source/tune/api/sklearn.rst | 24 ++++++++++++--- doc/source/tune/api/syncing.rst | 37 ++++++++++++++++++----- 3 files changed, 82 insertions(+), 31 deletions(-) diff --git a/doc/source/tune/api/logging.rst b/doc/source/tune/api/logging.rst index ec9a3f9e0f90..e5eb8732b626 100644 --- a/doc/source/tune/api/logging.rst +++ b/doc/source/tune/api/logging.rst @@ -31,40 +31,54 @@ relevant ones (like accuracy, loss, etc.). .. image:: ../images/ray-tune-viskit.png -TBXLogger ---------- +.. currentmodule:: ray -.. autoclass:: ray.tune.logger.TBXLoggerCallback +Tune Built-in Loggers +--------------------- -JsonLogger ----------- +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.logger.JsonLoggerCallback + tune.logger.JsonLoggerCallback + tune.logger.CSVLoggerCallback + tune.logger.TBXLoggerCallback -CSVLogger ---------- -.. 
autoclass:: ray.tune.logger.CSVLoggerCallback - -MLFlowLogger ------------- +MLFlow Integration: MLFlowLoggerCallback +---------------------------------------- Tune also provides a logger for `MLflow `_. You can install MLflow via ``pip install mlflow``. You can see the :doc:`tutorial here `. -WandbLogger ------------ +.. autosummary:: + :toctree: doc/ + + air.integrations.mlflow.MLflowLoggerCallback + +Wandb Integration: WandbLoggerCallback +-------------------------------------- Tune also provides a logger for `Weights & Biases `_. You can install Wandb via ``pip install wandb``. -You can see the :doc:`tutorial here ` +You can see the :doc:`tutorial here `. +.. autosummary:: + :toctree: doc/ + + air.integrations.wandb.WandbLoggerCallback .. _logger-interface: -LoggerCallback --------------- +LoggerCallback Interface +------------------------ + +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.logger.LoggerCallback - :members: log_trial_start, log_trial_restore, log_trial_save, log_trial_result, log_trial_end + tune.logger.LoggerCallback + tune.logger.LoggerCallback.log_trial_start + tune.logger.LoggerCallback.log_trial_restore + tune.logger.LoggerCallback.log_trial_save + tune.logger.LoggerCallback.log_trial_result + tune.logger.LoggerCallback.log_trial_end diff --git a/doc/source/tune/api/sklearn.rst b/doc/source/tune/api/sklearn.rst index 2214303f77c4..d34c0b6d2888 100644 --- a/doc/source/tune/api/sklearn.rst +++ b/doc/source/tune/api/sklearn.rst @@ -8,13 +8,29 @@ Tune Scikit-Learn API (tune.sklearn) TuneGridSearchCV ---------------- -.. autoclass:: ray.tune.sklearn.TuneGridSearchCV - :members: +.. currentmodule:: ray.tune.sklearn + +.. autosummary:: + :toctree: doc/ + + TuneGridSearchCV + TuneGridSearchCV.fit + TuneGridSearchCV.score + TuneGridSearchCV.score_samples + TuneGridSearchCV.get_params + TuneGridSearchCV.set_params .. _tunesearchcv-docs: TuneSearchCV ------------ -.. autoclass:: ray.tune.sklearn.TuneSearchCV - :members: +.. autosummary:: + :toctree: doc/ + + TuneSearchCV + TuneSearchCV.fit + TuneSearchCV.score + TuneSearchCV.score_samples + TuneSearchCV.get_params + TuneSearchCV.set_params diff --git a/doc/source/tune/api/syncing.rst b/doc/source/tune/api/syncing.rst index 17e030794e26..72543d37556c 100644 --- a/doc/source/tune/api/syncing.rst +++ b/doc/source/tune/api/syncing.rst @@ -3,18 +3,39 @@ Syncing in Tune (tune.SyncConfig, tune.Syncer) .. _tune-syncconfig: -SyncConfig ----------- +.. currentmodule:: ray.tune.syncer -.. autoclass:: ray.tune.syncer.SyncConfig - :members: +Tune Syncing Configuration +-------------------------- +.. autosummary:: + :toctree: doc/ + SyncConfig .. _tune-syncer: -Syncer ------- +Remote Storage Syncer Interface (tune.Syncer) +--------------------------------------------- + +.. autosummary:: + :toctree: doc/ + + Syncer + Syncer.sync_up + Syncer.sync_down + Syncer.delete + Syncer.wait + Syncer.wait_or_retry + + +Tune Built-in Syncers +--------------------- + +.. autosummary:: + :toctree: doc/ + + SyncerCallback + _DefaultSyncer + _BackgroundSyncer -.. 
autoclass:: ray.tune.syncer.Syncer - :members: From 1d1e9a346ab0ca820bb51ff59248641d618e0878 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 01:07:53 -0800 Subject: [PATCH 10/24] update integrations Signed-off-by: Justin Yu --- doc/source/tune/api/integration.rst | 62 ++++++++++++++++++----------- 1 file changed, 38 insertions(+), 24 deletions(-) diff --git a/doc/source/tune/api/integration.rst b/doc/source/tune/api/integration.rst index 63daaa76b4f0..cdbab7e8da4d 100644 --- a/doc/source/tune/api/integration.rst +++ b/doc/source/tune/api/integration.rst @@ -3,17 +3,17 @@ External library integrations for Ray Tune (tune.integration) ============================================================= -.. contents:: - :local: - :depth: 1 - +.. currentmodule:: ray Comet (tune.integration.comet) ------------------------------------------- -:ref:`See also here `. +:ref:`See here for an example. ` + +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.air.integrations.comet.CometLoggerCallback + air.integrations.comet.CometLoggerCallback :noindex: .. _tune-integration-keras: @@ -21,9 +21,11 @@ Comet (tune.integration.comet) Keras (tune.integration.keras) ------------------------------------------------------ -.. autoclass:: ray.tune.integration.keras.TuneReportCallback +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.integration.keras.TuneReportCheckpointCallback + tune.integration.keras.TuneReportCallback + tune.integration.keras.TuneReportCheckpointCallback .. _tune-integration-mlflow: @@ -31,12 +33,14 @@ Keras (tune.integration.keras) MLflow (tune.integration.mlflow) -------------------------------- -:ref:`See also here `. +:ref:`See here for an example. ` -.. autoclass:: ray.air.integrations.mlflow.MLflowLoggerCallback - :noindex: +.. autosummary:: + :toctree: doc/ -.. autofunction:: ray.air.integrations.mlflow.setup_mlflow + air.integrations.mlflow.MLflowLoggerCallback + :noindex: + air.integrations.mlflow.setup_mlflow .. _tune-integration-mxnet: @@ -44,9 +48,11 @@ MLflow (tune.integration.mlflow) MXNet (tune.integration.mxnet) ------------------------------ -.. autoclass:: ray.tune.integration.mxnet.TuneReportCallback +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.integration.mxnet.TuneCheckpointCallback + tune.integration.mxnet.TuneReportCallback + tune.integration.mxnet.TuneCheckpointCallback .. _tune-integration-pytorch-lightning: @@ -54,21 +60,25 @@ MXNet (tune.integration.mxnet) PyTorch Lightning (tune.integration.pytorch_lightning) ------------------------------------------------------ -.. autoclass:: ray.tune.integration.pytorch_lightning.TuneReportCallback +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.integration.pytorch_lightning.TuneReportCheckpointCallback + tune.integration.pytorch_lightning.TuneReportCallback + tune.integration.pytorch_lightning.TuneReportCheckpointCallback .. _tune-integration-wandb: Weights and Biases (tune.integration.wandb) ------------------------------------------- -:ref:`See also here `. +:ref:`See here for an example. ` -.. autoclass:: ray.air.integrations.wandb.WandbLoggerCallback - :noindex: +.. autosummary:: + :toctree: doc/ -.. autofunction:: ray.air.integrations.wandb.setup_wandb + air.integrations.wandb.WandbLoggerCallback + :noindex: + air.integrations.wandb.setup_wandb .. _tune-integration-xgboost: @@ -76,9 +86,11 @@ Weights and Biases (tune.integration.wandb) XGBoost (tune.integration.xgboost) ---------------------------------- -.. autoclass:: ray.tune.integration.xgboost.TuneReportCallback +.. 
autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.integration.xgboost.TuneReportCheckpointCallback + tune.integration.xgboost.TuneReportCallback + tune.integration.xgboost.TuneReportCheckpointCallback .. _tune-integration-lightgbm: @@ -86,6 +98,8 @@ XGBoost (tune.integration.xgboost) LightGBM (tune.integration.lightgbm) ------------------------------------ -.. autoclass:: ray.tune.integration.lightgbm.TuneReportCallback +.. autosummary:: + :toctree: doc/ -.. autoclass:: ray.tune.integration.lightgbm.TuneReportCheckpointCallback + tune.integration.lightgbm.TuneReportCallback + tune.integration.lightgbm.TuneReportCheckpointCallback From 745654d4887f39f0a1c89fbce3d280a231297a23 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 01:08:02 -0800 Subject: [PATCH 11/24] Update sync config ref Signed-off-by: Justin Yu --- doc/source/tune/api/syncing.rst | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/doc/source/tune/api/syncing.rst b/doc/source/tune/api/syncing.rst index 72543d37556c..688b1a12c2fa 100644 --- a/doc/source/tune/api/syncing.rst +++ b/doc/source/tune/api/syncing.rst @@ -1,10 +1,11 @@ Syncing in Tune (tune.SyncConfig, tune.Syncer) ============================================== -.. _tune-syncconfig: .. currentmodule:: ray.tune.syncer +.. _tune-sync-config: + Tune Syncing Configuration -------------------------- From 82b88f69fb2c7770640c6d7da436ef5e626f351d Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 16:46:34 -0800 Subject: [PATCH 12/24] Fix invalid link refs Signed-off-by: Justin Yu --- doc/source/tune/api/suggestion.rst | 8 -------- python/ray/tune/search/bayesopt/bayesopt_search.py | 10 ++++++++-- python/ray/tune/search/optuna/optuna_search.py | 2 ++ python/ray/tune/search/skopt/skopt_search.py | 6 +++--- 4 files changed, 13 insertions(+), 13 deletions(-) diff --git a/doc/source/tune/api/suggestion.rst b/doc/source/tune/api/suggestion.rst index fdb997e748d7..bce657c9ecc5 100644 --- a/doc/source/tune/api/suggestion.rst +++ b/doc/source/tune/api/suggestion.rst @@ -134,8 +134,6 @@ Bayesian Optimization (tune.search.bayesopt.BayesOptSearch) bayesopt.BayesOptSearch -.. _`BayesianOptimization search space specification`: https://github.com/fmfn/BayesianOptimization/blob/master/examples/advanced-tour.ipynb - .. _suggest-TuneBOHB: BOHB (tune.search.bohb.TuneBOHB) @@ -245,8 +243,6 @@ Nevergrad (tune.search.nevergrad.NevergradSearch) nevergrad.NevergradSearch -.. _`Nevergrad README's Optimization section`: https://github.com/facebookresearch/nevergrad/blob/master/docs/optimization.rst#choosing-an-optimizer - .. _tune-optuna: Optuna (tune.search.optuna.OptunaSearch) @@ -257,8 +253,6 @@ Optuna (tune.search.optuna.OptunaSearch) optuna.OptunaSearch -.. _`Optuna samplers`: https://optuna.readthedocs.io/en/stable/reference/samplers.html - .. _sigopt: SigOpt (tune.search.sigopt.SigOptSearch) @@ -282,8 +276,6 @@ Scikit-Optimize (tune.search.skopt.SkOptSearch) skopt.SkOptSearch -.. _`skopt Optimizer object`: https://scikit-optimize.github.io/stable/modules/generated/skopt.Optimizer.html#skopt.Optimizer - .. 
_zoopt: ZOOpt (tune.search.zoopt.ZOOptSearch) diff --git a/python/ray/tune/search/bayesopt/bayesopt_search.py b/python/ray/tune/search/bayesopt/bayesopt_search.py index 160bab1918d6..d02752b52cbd 100644 --- a/python/ray/tune/search/bayesopt/bayesopt_search.py +++ b/python/ray/tune/search/bayesopt/bayesopt_search.py @@ -53,8 +53,14 @@ class BayesOptSearch(Searcher): pip install bayesian-optimization - This algorithm requires setting a search space using the - `BayesianOptimization search space specification`_. + Initializing this search algorithm with a ``space`` requires that it's + in the ``BayesianOptimization`` search space format. Otherwise, you + should instead pass in a Tune search space into ``Tuner(param_space=...)``, + and the search space will be automatically converted for you. + + See this `BayesianOptimization example notebook + `_ + for an example. Args: space: Continuous search space. Parameters will be sampled from diff --git a/python/ray/tune/search/optuna/optuna_search.py b/python/ray/tune/search/optuna/optuna_search.py index 2462ac1af482..b5d9ca80ded0 100644 --- a/python/ray/tune/search/optuna/optuna_search.py +++ b/python/ray/tune/search/optuna/optuna_search.py @@ -120,6 +120,8 @@ class OptunaSearch(Searcher): draw hyperparameter configurations. Defaults to ``MOTPESampler`` for multi-objective optimization with Optuna<2.9.0, and ``TPESampler`` in every other case. + See https://optuna.readthedocs.io/en/stable/reference/samplers.html + for available Optuna samplers. .. warning:: Please note that with Optuna 2.10.0 and earlier diff --git a/python/ray/tune/search/skopt/skopt_search.py b/python/ray/tune/search/skopt/skopt_search.py index a207f938f1df..c3d8acf752aa 100644 --- a/python/ray/tune/search/skopt/skopt_search.py +++ b/python/ray/tune/search/skopt/skopt_search.py @@ -43,10 +43,10 @@ class SkOptSearch(Searcher): pip install scikit-optimize - This Search Algorithm requires you to pass in a `skopt Optimizer object`_. + This Search Algorithm requires you to pass in a `skopt Optimizer object + `_. - This searcher will automatically filter out any NaN, inf or -inf - results. + This searcher will automatically filter out any NaN, inf or -inf results. Parameters: optimizer: Optimizer provided From 93503c8e795cc0de49f52eca2a80cb3ba0272ead Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 16:57:36 -0800 Subject: [PATCH 13/24] Strip parent modules for integrations section Signed-off-by: Justin Yu --- doc/source/tune/api/integration.rst | 36 ++++++++++++++--------------- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/doc/source/tune/api/integration.rst b/doc/source/tune/api/integration.rst index cdbab7e8da4d..3565752e447c 100644 --- a/doc/source/tune/api/integration.rst +++ b/doc/source/tune/api/integration.rst @@ -5,7 +5,7 @@ External library integrations for Ray Tune (tune.integration) .. currentmodule:: ray -Comet (tune.integration.comet) +Comet (air.integrations.comet) ------------------------------------------- :ref:`See here for an example. ` @@ -13,7 +13,7 @@ Comet (tune.integration.comet) .. autosummary:: :toctree: doc/ - air.integrations.comet.CometLoggerCallback + ~air.integrations.comet.CometLoggerCallback :noindex: .. _tune-integration-keras: @@ -24,13 +24,13 @@ Keras (tune.integration.keras) .. autosummary:: :toctree: doc/ - tune.integration.keras.TuneReportCallback - tune.integration.keras.TuneReportCheckpointCallback + ~tune.integration.keras.TuneReportCallback + ~tune.integration.keras.TuneReportCheckpointCallback .. 
_tune-integration-mlflow: -MLflow (tune.integration.mlflow) +MLflow (air.integrations.mlflow) -------------------------------- :ref:`See here for an example. ` @@ -38,9 +38,9 @@ MLflow (tune.integration.mlflow) .. autosummary:: :toctree: doc/ - air.integrations.mlflow.MLflowLoggerCallback + ~air.integrations.mlflow.MLflowLoggerCallback :noindex: - air.integrations.mlflow.setup_mlflow + ~air.integrations.mlflow.setup_mlflow .. _tune-integration-mxnet: @@ -51,8 +51,8 @@ MXNet (tune.integration.mxnet) .. autosummary:: :toctree: doc/ - tune.integration.mxnet.TuneReportCallback - tune.integration.mxnet.TuneCheckpointCallback + ~tune.integration.mxnet.TuneReportCallback + ~tune.integration.mxnet.TuneCheckpointCallback .. _tune-integration-pytorch-lightning: @@ -63,12 +63,12 @@ PyTorch Lightning (tune.integration.pytorch_lightning) .. autosummary:: :toctree: doc/ - tune.integration.pytorch_lightning.TuneReportCallback - tune.integration.pytorch_lightning.TuneReportCheckpointCallback + ~tune.integration.pytorch_lightning.TuneReportCallback + ~tune.integration.pytorch_lightning.TuneReportCheckpointCallback .. _tune-integration-wandb: -Weights and Biases (tune.integration.wandb) +Weights and Biases (air.integrations.wandb) ------------------------------------------- :ref:`See here for an example. ` @@ -76,9 +76,9 @@ Weights and Biases (tune.integration.wandb) .. autosummary:: :toctree: doc/ - air.integrations.wandb.WandbLoggerCallback + ~air.integrations.wandb.WandbLoggerCallback :noindex: - air.integrations.wandb.setup_wandb + ~air.integrations.wandb.setup_wandb .. _tune-integration-xgboost: @@ -89,8 +89,8 @@ XGBoost (tune.integration.xgboost) .. autosummary:: :toctree: doc/ - tune.integration.xgboost.TuneReportCallback - tune.integration.xgboost.TuneReportCheckpointCallback + ~tune.integration.xgboost.TuneReportCallback + ~tune.integration.xgboost.TuneReportCheckpointCallback .. _tune-integration-lightgbm: @@ -101,5 +101,5 @@ LightGBM (tune.integration.lightgbm) .. autosummary:: :toctree: doc/ - tune.integration.lightgbm.TuneReportCallback - tune.integration.lightgbm.TuneReportCheckpointCallback + ~tune.integration.lightgbm.TuneReportCallback + ~tune.integration.lightgbm.TuneReportCheckpointCallback From 223964f3a79497ce631aa3fb4cb0c26b27fcaf75 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 17:00:34 -0800 Subject: [PATCH 14/24] Add noindex for session apis in tune Signed-off-by: Justin Yu --- doc/source/tune/api/trainable.rst | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/doc/source/tune/api/trainable.rst b/doc/source/tune/api/trainable.rst index 816f67d4368d..76388a9edf64 100644 --- a/doc/source/tune/api/trainable.rst +++ b/doc/source/tune/api/trainable.rst @@ -273,11 +273,17 @@ session (Function API) :toctree: doc/ air.session.report + :noindex: air.session.get_checkpoint + :noindex: air.session.get_trial_name + :noindex: air.session.get_trial_id + :noindex: air.session.get_trial_resources + :noindex: air.session.get_trial_dir + :noindex: .. 
_tune-trainable-docstring: From d5fb51609a7927e5eb8542713fb544347bb48819 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 17:20:23 -0800 Subject: [PATCH 15/24] Add callbacks api section Signed-off-by: Justin Yu --- doc/source/tune/api/api.rst | 1 + doc/source/tune/api/callbacks.rst | 55 +++++++++++++++++++++++++++++++ 2 files changed, 56 insertions(+) create mode 100644 doc/source/tune/api/callbacks.rst diff --git a/doc/source/tune/api/api.rst b/doc/source/tune/api/api.rst index a2581cf88495..38df69d4389f 100644 --- a/doc/source/tune/api/api.rst +++ b/doc/source/tune/api/api.rst @@ -23,6 +23,7 @@ on `Github`_. reporters.rst syncing.rst logging.rst + callbacks.rst env.rst sklearn.rst integration.rst diff --git a/doc/source/tune/api/callbacks.rst b/doc/source/tune/api/callbacks.rst new file mode 100644 index 000000000000..d4211c7025d4 --- /dev/null +++ b/doc/source/tune/api/callbacks.rst @@ -0,0 +1,55 @@ +.. _tune-callbacks: + +Tune Callbacks (tune.Callback) +============================== + +See :doc:`this user guide ` for more details. + +.. seealso:: + + :doc:`Tune's built-in loggers ` use the ``Callback`` interface. + + +Callback Interface +------------------ + +Callback Initialization and Setup +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. currentmodule:: ray.tune +.. autosummary:: + :toctree: doc/ + + Callback + Callback.setup + + +Callback Hooks +~~~~~~~~~~~~~~ + +.. autosummary:: + :toctree: doc/ + + Callback.on_checkpoint + Callback.on_experiment_end + Callback.on_step_begin + Callback.on_step_end + Callback.on_trial_complete + Callback.on_trial_error + Callback.on_trial_restore + Callback.on_trial_result + Callback.on_trial_save + Callback.on_trial_start + + +Stateful Callbacks +~~~~~~~~~~~~~~~~~~ + +The following methods must be overridden for stateful callbacks to be saved/restored +properly by Tune. + +.. autosummary:: + :toctree: doc/ + + Callback.get_state + Callback.set_state From 4d6ee1d2fb10ffb008f885d8923a1134fc50a2a8 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 17:20:30 -0800 Subject: [PATCH 16/24] Add some misc improvements Signed-off-by: Justin Yu --- doc/source/tune/api/execution.rst | 6 +++--- doc/source/tune/api/syncing.rst | 4 ++++ 2 files changed, 7 insertions(+), 3 deletions(-) diff --git a/doc/source/tune/api/execution.rst b/doc/source/tune/api/execution.rst index 584131fde480..50154462a703 100644 --- a/doc/source/tune/api/execution.rst +++ b/doc/source/tune/api/execution.rst @@ -1,5 +1,5 @@ -Tune Execution (Tuner) -====================== +Tune Execution (tune.Tuner) +=========================== .. _tune-run-ref: Tuner ~~~~~ .. autosummary:: :toctree: doc/ Tuner Tuner.fit + Tuner.get_results Tuner Configuration ~~~~~~~~~~~~~~~~~~~ @@ -30,7 +31,6 @@ Restoring a Tuner Tuner.restore Tuner.can_restore - Tuner.get_results tune.run_experiments diff --git a/doc/source/tune/api/syncing.rst b/doc/source/tune/api/syncing.rst index 688b1a12c2fa..d7f24fd44920 100644 --- a/doc/source/tune/api/syncing.rst +++ b/doc/source/tune/api/syncing.rst @@ -1,6 +1,10 @@ Syncing in Tune (tune.SyncConfig, tune.Syncer) ============================================== +.. seealso:: + + See :doc:`this user guide ` for more details and examples. + ..
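A rough usage sketch for the syncing configuration referenced here (the bucket URI is a placeholder, and ``trainable`` is assumed to be defined elsewhere):

.. code-block:: python

    from ray import air, tune

    tuner = tune.Tuner(
        trainable,
        run_config=air.RunConfig(
            # Sync trial results and checkpoints to cloud storage.
            sync_config=tune.SyncConfig(upload_dir="s3://my-bucket/tune-results"),
        ),
    )
    tuner.fit()

..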
currentmodule:: ray.tune.syncer From 7d095f9d971f4cc2d4f8a98d0600ce14d02d7280 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 17:27:51 -0800 Subject: [PATCH 17/24] Move placement group ref + split trainable utilities into categories Signed-off-by: Justin Yu --- doc/source/tune/api/callbacks.rst | 2 +- doc/source/tune/api/internals.rst | 17 ----------------- doc/source/tune/api/trainable.rst | 24 ++++++++++++++++++++++-- 3 files changed, 23 insertions(+), 20 deletions(-) diff --git a/doc/source/tune/api/callbacks.rst b/doc/source/tune/api/callbacks.rst index d4211c7025d4..5cee421c13ba 100644 --- a/doc/source/tune/api/callbacks.rst +++ b/doc/source/tune/api/callbacks.rst @@ -1,4 +1,4 @@ -.. _tune-callbacks: +.. _tune-callbacks-docs: Tune Callbacks (tune.Callback) ============================== diff --git a/doc/source/tune/api/internals.rst b/doc/source/tune/api/internals.rst index 3b90dd7ed1d9..b4f590961118 100644 --- a/doc/source/tune/api/internals.rst +++ b/doc/source/tune/api/internals.rst @@ -23,23 +23,6 @@ Trial .. autoclass:: ray.tune.experiment.trial.Trial -.. _tune-callbacks-docs: - -Callbacks ---------- - -.. autoclass:: ray.tune.callback.Callback - :members: - - -.. _resources-docstring: - -PlacementGroupFactory ---------------------- - -.. autoclass:: ray.tune.execution.placement_groups.PlacementGroupFactory - - Registry -------- diff --git a/doc/source/tune/api/trainable.rst b/doc/source/tune/api/trainable.rst index 76388a9edf64..3b46f22694c1 100644 --- a/doc/source/tune/api/trainable.rst +++ b/doc/source/tune/api/trainable.rst @@ -305,14 +305,34 @@ Trainable (Class API) .. _tune-util-ref: -Utilities ---------- +Tune Trainable Utilities +------------------------- + +Tune Data Ingestion Utilities +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autosummary:: :toctree: doc/ tune.with_parameters + + +Tune Resource Assignment Utilities +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autosummary:: + :toctree: doc/ + tune.with_resources + ~tune.execution.placement_groups.PlacementGroupFactory tune.utils.wait_for_gpu + + +Tune Trainable Debugging Utilities +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. autosummary:: + :toctree: doc/ + tune.utils.diagnose_serialization tune.utils.validate_save_restore From 4dabaf05bbb5dd856eda6807920ab2b51e31cec9 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 17:29:02 -0800 Subject: [PATCH 18/24] Don't have special case for doc/clusters gitignore Signed-off-by: Justin Yu --- .gitignore | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/.gitignore b/.gitignore index 04f5ae96a6c5..56b81a8f39c8 100644 --- a/.gitignore +++ b/.gitignore @@ -119,8 +119,7 @@ scripts/nodes.txt /doc/_build /doc/source/_static/thumbs /doc/source/tune/generated_guides/ -/doc/source/*/api/doc/ -/doc/source/cluster/running-applications/job-submission/doc/ +/doc/source/**/doc/ # User-specific stuff: .idea/**/workspace.xml From 930e2f26360c5e570bdb83bba320285876860417 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 17:33:32 -0800 Subject: [PATCH 19/24] Add some stuff to tune internals Signed-off-by: Justin Yu --- doc/source/tune/api/internals.rst | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/doc/source/tune/api/internals.rst b/doc/source/tune/api/internals.rst index b4f590961118..d33717ec9ef0 100644 --- a/doc/source/tune/api/internals.rst +++ b/doc/source/tune/api/internals.rst @@ -3,6 +3,13 @@ Tune Internals .. _raytrialexecutor-docstring: +TunerInternal +--------------- + +.. 
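For context on the trainable utilities regrouped above: the data ingestion and resource assignment helpers compose. A minimal sketch (``train_fn`` and ``large_dataset`` are assumed placeholders):

.. code-block:: python

    from ray import tune


    def train_fn(config, data=None):
        ...  # train using the pre-loaded data


    # Broadcast a large object to all trials via the Ray object store,
    # then request 2 CPUs per trial.
    trainable = tune.with_parameters(train_fn, data=large_dataset)
    trainable = tune.with_resources(trainable, {"cpu": 2})
    tuner = tune.Tuner(trainable)

..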
autoclass:: ray.tune.impl.tuner_internal.TunerInternal + :members: + + RayTrialExecutor ---------------- @@ -15,6 +22,7 @@ TrialRunner ----------- .. autoclass:: ray.tune.execution.trial_runner.TrialRunner + :members: .. _trial-docstring: @@ -22,6 +30,14 @@ Trial ----- .. autoclass:: ray.tune.experiment.trial.Trial + :members: + +FunctionTrainable +----------------- + +.. autoclass:: ray.tune.trainable.function_trainable.FunctionTrainable + +.. autofunction:: ray.tune.trainable.function_trainable.wrap_function Registry From 8c69afda58ddffecaf366b65760398b102cc22c8 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 18:35:35 -0800 Subject: [PATCH 20/24] Fix result grid section title Signed-off-by: Justin Yu --- doc/source/tune/api/api.rst | 2 +- doc/source/tune/api/result_grid.rst | 6 +++++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/doc/source/tune/api/api.rst b/doc/source/tune/api/api.rst index 38df69d4389f..6d2e3d36c4b7 100644 --- a/doc/source/tune/api/api.rst +++ b/doc/source/tune/api/api.rst @@ -14,12 +14,12 @@ on `Github`_. :maxdepth: 2 execution.rst + result_grid.rst trainable.rst search_space.rst suggestion.rst schedulers.rst stoppers.rst - result_grid.rst reporters.rst syncing.rst logging.rst diff --git a/doc/source/tune/api/result_grid.rst b/doc/source/tune/api/result_grid.rst index 2e6b043dc6cb..abda7956af8f 100644 --- a/doc/source/tune/api/result_grid.rst +++ b/doc/source/tune/api/result_grid.rst @@ -2,8 +2,11 @@ .. _result-grid-docstring: +Tune Experiment Results (tune.ResultGrid) +========================================= + ResultGrid (tune.ResultGrid) -============================ +---------------------------- .. currentmodule:: ray @@ -26,6 +29,7 @@ Result (air.Result) .. _exp-analysis-docstring: + ExperimentAnalysis (tune.ExperimentAnalysis) -------------------------------------------- From 79c2ccac533ab6ab2da805059816c801f00b2a77 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 23:42:36 -0800 Subject: [PATCH 21/24] Set noindex on the air package ref for now Signed-off-by: Justin Yu --- doc/source/ray-air/package-ref.rst | 25 ++------- doc/source/tune/api/integration.rst | 78 ++++++++++++++++------------- 2 files changed, 48 insertions(+), 55 deletions(-) diff --git a/doc/source/ray-air/package-ref.rst b/doc/source/ray-air/package-ref.rst index a1e90d55b2cc..2fa48650e45a 100644 --- a/doc/source/ray-air/package-ref.rst +++ b/doc/source/ray-air/package-ref.rst @@ -159,6 +159,7 @@ Training Session .. automodule:: ray.air.session :members: + :noindex: Trainer Configs ############### @@ -364,26 +365,8 @@ Reinforcement Learning (RLlib) .. _air-builtin-callbacks: -Monitoring Integrations +Integrations ~~~~~~~~~~~~~~~~~~~~~~~ -Comet -##### - -.. autoclass:: ray.air.integrations.comet.CometLoggerCallback - -Keras -##### - -.. autoclass:: ray.air.integrations.keras.Callback - :members: - -MLflow -###### - -.. autoclass:: ray.air.integrations.mlflow.MLflowLoggerCallback - -Weights and Biases -################## - -.. autoclass:: ray.air.integrations.wandb.WandbLoggerCallback +See :doc:`this API reference ` for AIR integrations with other libraries +such as Weights and Biases, MLFlow, Keras, and more. diff --git a/doc/source/tune/api/integration.rst b/doc/source/tune/api/integration.rst index 3565752e447c..0ad8ff86d2a6 100644 --- a/doc/source/tune/api/integration.rst +++ b/doc/source/tune/api/integration.rst @@ -1,12 +1,23 @@ .. 
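A quick sketch of how the ``ResultGrid`` APIs retitled above fit together (the ``"loss"`` metric name and ``trainable`` are placeholders):

.. code-block:: python

    from ray import tune

    tuner = tune.Tuner(
        trainable,
        param_space={"lr": tune.loguniform(1e-4, 1e-1)},
    )
    results = tuner.fit()  # returns a ResultGrid
    best = results.get_best_result(metric="loss", mode="min")
    print(best.config, best.metrics["loss"])

..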
_tune-integration: -External library integrations for Ray Tune (tune.integration) -============================================================= +External library integrations for Ray Tune +=========================================== + +.. TODO: Clean this up. Both tune.integration and air.integrations are +.. captured here. Most of the `tune.integration` can be deprecated soon. +.. XGBoost/LightGBM callbacks are no longer recommended - use their trainers instead +.. which will automatically report+checkpoint. +.. After PTL trainer is introduced, we can also deprecate that callback. .. currentmodule:: ray +.. _tune-monitoring-integrations: + +Tune Experiment Monitoring Integrations +---------------------------------------- + Comet (air.integrations.comet) -------------------------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ :ref:`See here for an example. ` @@ -14,39 +25,53 @@ Comet (air.integrations.comet) :toctree: doc/ ~air.integrations.comet.CometLoggerCallback - :noindex: -.. _tune-integration-keras: -Keras (tune.integration.keras) ------------------------------------------------------- +.. _tune-integration-mlflow: + +MLflow (air.integrations.mlflow) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:ref:`See here for an example. ` .. autosummary:: :toctree: doc/ - ~tune.integration.keras.TuneReportCallback - ~tune.integration.keras.TuneReportCheckpointCallback + ~air.integrations.mlflow.MLflowLoggerCallback + ~air.integrations.mlflow.setup_mlflow +.. _tune-integration-wandb: -.. _tune-integration-mlflow: +Weights and Biases (air.integrations.wandb) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -MLflow (air.integrations.mlflow) +:ref:`See here for an example. ` + +.. autosummary:: + :toctree: doc/ + + ~air.integrations.wandb.WandbLoggerCallback + ~air.integrations.wandb.setup_wandb + + +Integrations with ML Libraries -------------------------------- -:ref:`See here for an example. ` +.. _tune-integration-keras: + +Keras (air.integrations.keras) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autosummary:: :toctree: doc/ - ~air.integrations.mlflow.MLflowLoggerCallback - :noindex: - ~air.integrations.mlflow.setup_mlflow + ~air.integrations.keras.ReportCheckpointCallback .. _tune-integration-mxnet: MXNet (tune.integration.mxnet) ------------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autosummary:: :toctree: doc/ @@ -58,7 +83,7 @@ MXNet (tune.integration.mxnet) .. _tune-integration-pytorch-lightning: PyTorch Lightning (tune.integration.pytorch_lightning) ------------------------------------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autosummary:: :toctree: doc/ @@ -66,25 +91,10 @@ PyTorch Lightning (tune.integration.pytorch_lightning) ~tune.integration.pytorch_lightning.TuneReportCallback ~tune.integration.pytorch_lightning.TuneReportCheckpointCallback -.. _tune-integration-wandb: - -Weights and Biases (air.integrations.wandb) -------------------------------------------- - -:ref:`See here for an example. ` - -.. autosummary:: - :toctree: doc/ - - ~air.integrations.wandb.WandbLoggerCallback - :noindex: - ~air.integrations.wandb.setup_wandb - - .. _tune-integration-xgboost: XGBoost (tune.integration.xgboost) ----------------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. autosummary:: :toctree: doc/ @@ -96,7 +106,7 @@ XGBoost (tune.integration.xgboost) .. _tune-integration-lightgbm: LightGBM (tune.integration.lightgbm) ------------------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ .. 
autosummary:: :toctree: doc/ From eb3bf64415ecc540a700f6527d4e2e5b86e2f0e9 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Wed, 8 Feb 2023 23:43:48 -0800 Subject: [PATCH 22/24] Fix optuna link Signed-off-by: Justin Yu --- python/ray/tune/search/optuna/optuna_search.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/python/ray/tune/search/optuna/optuna_search.py b/python/ray/tune/search/optuna/optuna_search.py index b5d9ca80ded0..0c20fbb51abc 100644 --- a/python/ray/tune/search/optuna/optuna_search.py +++ b/python/ray/tune/search/optuna/optuna_search.py @@ -120,7 +120,7 @@ class OptunaSearch(Searcher): draw hyperparameter configurations. Defaults to ``MOTPESampler`` for multi-objective optimization with Optuna<2.9.0, and ``TPESampler`` in every other case. - See https://optuna.readthedocs.io/en/stable/reference/samplers.html + See https://optuna.readthedocs.io/en/stable/reference/samplers/index.html for available Optuna samplers. .. warning:: From aaffb053641f1bd288f98e38d9fdba54f23769a5 Mon Sep 17 00:00:00 2001 From: Justin Yu Date: Thu, 9 Feb 2023 17:05:26 -0800 Subject: [PATCH 23/24] Replace tune-sample-docs ref with tune-search-spaces Signed-off-by: Justin Yu --- doc/source/tune/api/search_space.rst | 2 - doc/source/tune/examples/tune-xgboost.ipynb | 2506 ++++++++--------- doc/source/tune/getting-started.rst | 2 +- doc/source/tune/key-concepts.rst | 4 +- .../tune/tutorials/tune-search-spaces.rst | 2 +- .../tutorials/tune_get_data_in_and_out.md | 2 +- 6 files changed, 1258 insertions(+), 1260 deletions(-) diff --git a/doc/source/tune/api/search_space.rst b/doc/source/tune/api/search_space.rst index 49fba30b2dba..21c2e7520f18 100644 --- a/doc/source/tune/api/search_space.rst +++ b/doc/source/tune/api/search_space.rst @@ -3,8 +3,6 @@ Tune Search Space API ===================== -.. _tune-sample-docs: - This section covers the functions you can use to define your search spaces. .. caution:: diff --git a/doc/source/tune/examples/tune-xgboost.ipynb b/doc/source/tune/examples/tune-xgboost.ipynb index b732c1879fb9..edf640b0b30c 100644 --- a/doc/source/tune/examples/tune-xgboost.ipynb +++ b/doc/source/tune/examples/tune-xgboost.ipynb @@ -1,1255 +1,1255 @@ { - "cells": [ - { - "cell_type": "markdown", - "id": "edce67b9", - "metadata": {}, - "source": [ - "# Tuning XGBoost hyperparameters with Ray Tune\n", - "\n", - "(tune-xgboost-ref)=\n", - "\n", - "XGBoost is currently one of the most popular machine learning algorithms. It performs\n", - "very well on a large selection of tasks, and was the key to success in many Kaggle\n", - "competitions.\n", - "\n", - "```{image} /images/xgboost_logo.png\n", - ":align: center\n", - ":alt: XGBoost\n", - ":target: https://xgboost.readthedocs.io/en/latest/\n", - ":width: 200px\n", - "```\n", - "\n", - "This tutorial will give you a quick introduction to XGBoost, show you how\n", - "to train an XGBoost model, and then guide you on how to optimize XGBoost\n", - "parameters using Tune to get the best performance. We tackle the following topics:\n", - "\n", - "```{contents}\n", - ":depth: 2\n", - "```\n", - "\n", - ":::{note}\n", - "To run this tutorial, you will need to install the following:\n", - "\n", - "```bash\n", - "$ pip install xgboost\n", - "```\n", - ":::\n", - "\n", - "## What is XGBoost\n", - "\n", - "XGBoost is an acronym for e**X**treme **G**radient **Boost**ing. Internally,\n", - "XGBoost uses [decision trees](https://en.wikipedia.org/wiki/Decision_tree). 
Instead\n", - "of training just one large decision tree, XGBoost and other related algorithms train\n", - "many small decision trees. The intuition behind this is that even though single\n", - "decision trees can be inaccurate and suffer from high variance,\n", - "combining the output of a large number of these weak learners can actually lead to\n", - "strong learner, resulting in better predictions and less variance.\n", - "\n", - ":::{figure} /images/tune-xgboost-ensemble.svg\n", - ":alt: Single vs. ensemble learning\n", - "\n", - "A single decision tree (left) might be able to get to an accuracy of 70%\n", - "for a binary classification task. By combining the output of several small\n", - "decision trees, an ensemble learner (right) might end up with a higher accuracy\n", - "of 90%.\n", - ":::\n", - "\n", - "Boosting algorithms start with a single small decision tree and evaluate how well\n", - "it predicts the given examples. When building the next tree, those samples that have\n", - "been misclassified before have a higher chance of being used to generate the tree.\n", - "This is useful because it avoids overfitting to samples that can be easily classified\n", - "and instead tries to come up with models that are able to classify hard examples, too.\n", - "Please see [here for a more thorough introduction to bagging and boosting algorithms](https://towardsdatascience.com/ensemble-methods-bagging-boosting-and-stacking-c9214a10a205).\n", - "\n", - "There are many boosting algorithms. In their core, they are all very similar. XGBoost\n", - "uses second-level derivatives to find splits that maximize the *gain* (the inverse of\n", - "the *loss*) - hence the name. In practice, there really is no drawback in using\n", - "XGBoost over other boosting algorithms - in fact, it usually shows the best performance.\n", - "\n", - "## Training a simple XGBoost classifier\n", - "\n", - "Let's first see how a simple XGBoost classifier can be trained. We'll use the\n", - "`breast_cancer`-Dataset included in the `sklearn` dataset collection. This is\n", - "a binary classification dataset. 
Given 30 different input features, our task is to\n", - "learn to identify subjects with breast cancer and those without.\n", - "\n", - "Here is the full code to train a simple XGBoost model:" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "77b3c71c", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Accuracy: 0.9650\n" - ] - } - ], - "source": [ - "import sklearn.datasets\n", - "import sklearn.metrics\n", - "from sklearn.model_selection import train_test_split\n", - "import xgboost as xgb\n", - "\n", - "\n", - "def train_breast_cancer(config):\n", - " # Load dataset\n", - " data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", - " # Split into train and test set\n", - " train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)\n", - " # Build input matrices for XGBoost\n", - " train_set = xgb.DMatrix(train_x, label=train_y)\n", - " test_set = xgb.DMatrix(test_x, label=test_y)\n", - " # Train the classifier\n", - " results = {}\n", - " bst = xgb.train(\n", - " config,\n", - " train_set,\n", - " evals=[(test_set, \"eval\")],\n", - " evals_result=results,\n", - " verbose_eval=False,\n", - " )\n", - " return results\n", - "\n", - "\n", - "if __name__ == \"__main__\":\n", - " results = train_breast_cancer(\n", - " {\"objective\": \"binary:logistic\", \"eval_metric\": [\"logloss\", \"error\"]}\n", - " )\n", - " accuracy = 1.0 - results[\"eval\"][\"error\"][-1]\n", - " print(f\"Accuracy: {accuracy:.4f}\")\n" - ] - }, - { - "cell_type": "markdown", - "id": "ec2a13f8", - "metadata": {}, - "source": [ - "As you can see, the code is quite simple. First, the dataset is loaded and split\n", - "into a `test` and `train` set. The XGBoost model is trained with `xgb.train()`.\n", - "XGBoost automatically evaluates metrics we specified on the test set. In our case\n", - "it calculates the *logloss* and the prediction *error*, which is the percentage of\n", - "misclassified examples. To calculate the accuracy, we just have to subtract the error\n", - "from `1.0`. Even in this simple example, most runs result\n", - "in a good accuracy of over `0.90`.\n", - "\n", - "Maybe you have noticed the `config` parameter we pass to the XGBoost algorithm. This\n", - "is a {class}`dict` in which you can specify parameters for the XGBoost algorithm. In this\n", - "simple example, the only parameters we passed are the `objective` and `eval_metric` parameters.\n", - "The value `binary:logistic` tells XGBoost that we aim to train a logistic regression model for\n", - "a binary classification task. You can find an overview over all valid objectives\n", - "[here in the XGBoost documentation](https://xgboost.readthedocs.io/en/latest/parameter.html#learning-task-parameters).\n", - "\n", - "## XGBoost Hyperparameters\n", - "\n", - "Even with the default settings, XGBoost was able to get to a good accuracy on the\n", - "breast cancer dataset. However, as in many machine learning algorithms, there are\n", - "many knobs to tune which might lead to even better performance. Let's explore some of\n", - "them below.\n", - "\n", - "### Maximum tree depth\n", - "\n", - "Remember that XGBoost internally uses many decision tree models to come up with\n", - "predictions. When training a decision tree, we need to tell the algorithm how\n", - "large the tree may get. 
The parameter for this is called the tree *depth*.\n", - "\n", - ":::{figure} /images/tune-xgboost-depth.svg\n", - ":align: center\n", - ":alt: Decision tree depth\n", - "\n", - "In this image, the left tree has a depth of 2, and the right tree a depth of 3.\n", - "Note that with each level, $2^{(d-1)}$ splits are added, where *d* is the depth\n", - "of the tree.\n", - ":::\n", - "\n", - "Tree depth is a property that concerns the model complexity. If you only allow short\n", - "trees, the models are likely not very precise - they underfit the data. If you allow\n", - "very large trees, the single models are likely to overfit to the data. In practice,\n", - "a number between `2` and `6` is often a good starting point for this parameter.\n", - "\n", - "XGBoost's default value is `3`.\n", - "\n", - "### Minimum child weight\n", - "\n", - "When a decision tree creates new leaves, it splits up the remaining data at one node\n", - "into two groups. If there are only few samples in one of these groups, it often\n", - "doesn't make sense to split it further. One of the reasons for this is that the\n", - "model is harder to train when we have fewer samples.\n", - "\n", - ":::{figure} /images/tune-xgboost-weight.svg\n", - ":align: center\n", - ":alt: Minimum child weight\n", - "\n", - "In this example, we start with 100 examples. At the first node, they are split\n", - "into 4 and 96 samples, respectively. In the next step, our model might find\n", - "that it doesn't make sense to split the 4 examples more. It thus only continues\n", - "to add leaves on the right side.\n", - ":::\n", - "\n", - "The parameter used by the model to decide if it makes sense to split a node is called\n", - "the *minimum child weight*. In the case of linear regression, this is just the absolute\n", - "number of nodes requried in each child. In other objectives, this value is determined\n", - "using the weights of the examples, hence the name.\n", - "\n", - "The larger the value, the more constrained the trees are and the less deep they will be.\n", - "This parameter thus also affects the model complexity. Values can range between 0\n", - "and infinity and are dependent on the sample size. For our ca. 500 examples in the\n", - "breast cancer dataset, values between `0` and `10` should be sensible.\n", - "\n", - "XGBoost's default value is `1`.\n", - "\n", - "### Subsample size\n", - "\n", - "Each decision tree we add is trained on a subsample of the total training dataset.\n", - "The probabilities for the samples are weighted according to the XGBoost algorithm,\n", - "but we can decide on which fraction of the samples we want to train each decision\n", - "tree on.\n", - "\n", - "Setting this value to `0.7` would mean that we randomly sample `70%` of the\n", - "training dataset before each training iteration.\n", - "\n", - "XGBoost's default value is `1`.\n", - "\n", - "### Learning rate / Eta\n", - "\n", - "Remember that XGBoost sequentially trains many decision trees, and that later trees\n", - "are more likely trained on data that has been misclassified by prior trees. In effect\n", - "this means that earlier trees make decisions for easy samples (i.e. those samples that\n", - "can easily be classified) and later trees make decisions for harder samples. It is then\n", - "sensible to assume that the later trees are less accurate than earlier trees.\n", - "\n", - "To address this fact, XGBoost uses a parameter called *Eta*, which is sometimes called\n", - "the *learning rate*. 
Don't confuse this with learning rates from gradient descent!\n", - "The original [paper on stochastic gradient boosting](https://www.sciencedirect.com/science/article/abs/pii/S0167947301000652)\n", - "introduces this parameter like so:\n", - "\n", - "$$\n", - "F_m(x) = F_{m-1}(x) + \\eta \\cdot \\gamma_{lm} \\textbf{1}(x \\in R_{lm})\n", - "$$\n", - "\n", - "This is just a complicated way to say that when we train we new decision tree,\n", - "represented by $\\gamma_{lm} \\textbf{1}(x \\in R_{lm})$, we want to dampen\n", - "its effect on the previous prediction $F_{m-1}(x)$ with a factor\n", - "$\\eta$.\n", - "\n", - "Typical values for this parameter are between `0.01` and `` 0.3` ``.\n", - "\n", - "XGBoost's default value is `0.3`.\n", - "\n", - "### Number of boost rounds\n", - "\n", - "Lastly, we can decide on how many boosting rounds we perform, which means how\n", - "many decision trees we ultimately train. When we do heavy subsampling or use small\n", - "learning rate, it might make sense to increase the number of boosting rounds.\n", - "\n", - "XGBoost's default value is `10`.\n", - "\n", - "### Putting it together\n", - "\n", - "Let's see how this looks like in code! We just need to adjust our `config` dict:" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "35073e88", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Accuracy: 0.9790\n" - ] - } - ], - "source": [ - "if __name__ == \"__main__\":\n", - " config = {\n", - " \"objective\": \"binary:logistic\",\n", - " \"eval_metric\": [\"logloss\", \"error\"],\n", - " \"max_depth\": 2,\n", - " \"min_child_weight\": 0,\n", - " \"subsample\": 0.8,\n", - " \"eta\": 0.2,\n", - " }\n", - " results = train_breast_cancer(config)\n", - " accuracy = 1.0 - results[\"eval\"][\"error\"][-1]\n", - " print(f\"Accuracy: {accuracy:.4f}\")\n" - ] - }, - { - "cell_type": "markdown", - "id": "69cf0c13", - "metadata": {}, - "source": [ - "The rest stays the same. Please note that we do not adjust the `num_boost_rounds` here.\n", - "The result should also show a high accuracy of over 90%.\n", - "\n", - "## Tuning the configuration parameters\n", - "\n", - "XGBoosts default parameters already lead to a good accuracy, and even our guesses in the\n", - "last section should result in accuracies well above 90%. However, our guesses were\n", - "just that: guesses. Often we do not know what combination of parameters would actually\n", - "lead to the best results on a machine learning task.\n", - "\n", - "Unfortunately, there are infinitely many combinations of hyperparameters we could try\n", - "out. Should we combine `max_depth=3` with `subsample=0.8` or with `subsample=0.9`?\n", - "What about the other parameters?\n", - "\n", - "This is where hyperparameter tuning comes into play. By using tuning libraries such as\n", - "Ray Tune we can try out combinations of hyperparameters. Using sophisticated search\n", - "strategies, these parameters can be selected so that they are likely to lead to good\n", - "results (avoiding an expensive *exhaustive search*). Also, trials that do not perform\n", - "well can be preemptively stopped to reduce waste of computing resources. Lastly, Ray Tune\n", - "also takes care of training these runs in parallel, greatly increasing search speed.\n", - "\n", - "Let's start with a basic example on how to use Tune for this. 
We just need to make\n", - "a few changes to our code-block:" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "ff856a82", - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "2022-07-22 15:52:52,004\tINFO services.py:1483 -- View the Ray dashboard at \u001b[1m\u001b[32mhttp://127.0.0.1:8268\u001b[39m\u001b[22m\n", - "2022-07-22 15:52:55,858\tWARNING function_trainable.py:619 -- Function checkpointing is disabled. This may result in unexpected behavior when using checkpointing features or certain schedulers. To enable, set the train function arguments to be `func(config, checkpoint_dir=None)`.\n" - ] + "cells": [ + { + "cell_type": "markdown", + "id": "edce67b9", + "metadata": {}, + "source": [ + "# Tuning XGBoost hyperparameters with Ray Tune\n", + "\n", + "(tune-xgboost-ref)=\n", + "\n", + "XGBoost is currently one of the most popular machine learning algorithms. It performs\n", + "very well on a large selection of tasks, and was the key to success in many Kaggle\n", + "competitions.\n", + "\n", + "```{image} /images/xgboost_logo.png\n", + ":align: center\n", + ":alt: XGBoost\n", + ":target: https://xgboost.readthedocs.io/en/latest/\n", + ":width: 200px\n", + "```\n", + "\n", + "This tutorial will give you a quick introduction to XGBoost, show you how\n", + "to train an XGBoost model, and then guide you on how to optimize XGBoost\n", + "parameters using Tune to get the best performance. We tackle the following topics:\n", + "\n", + "```{contents}\n", + ":depth: 2\n", + "```\n", + "\n", + ":::{note}\n", + "To run this tutorial, you will need to install the following:\n", + "\n", + "```bash\n", + "$ pip install xgboost\n", + "```\n", + ":::\n", + "\n", + "## What is XGBoost\n", + "\n", + "XGBoost is an acronym for e**X**treme **G**radient **Boost**ing. Internally,\n", + "XGBoost uses [decision trees](https://en.wikipedia.org/wiki/Decision_tree). Instead\n", + "of training just one large decision tree, XGBoost and other related algorithms train\n", + "many small decision trees. The intuition behind this is that even though single\n", + "decision trees can be inaccurate and suffer from high variance,\n", + "combining the output of a large number of these weak learners can actually lead to\n", + "strong learner, resulting in better predictions and less variance.\n", + "\n", + ":::{figure} /images/tune-xgboost-ensemble.svg\n", + ":alt: Single vs. ensemble learning\n", + "\n", + "A single decision tree (left) might be able to get to an accuracy of 70%\n", + "for a binary classification task. By combining the output of several small\n", + "decision trees, an ensemble learner (right) might end up with a higher accuracy\n", + "of 90%.\n", + ":::\n", + "\n", + "Boosting algorithms start with a single small decision tree and evaluate how well\n", + "it predicts the given examples. When building the next tree, those samples that have\n", + "been misclassified before have a higher chance of being used to generate the tree.\n", + "This is useful because it avoids overfitting to samples that can be easily classified\n", + "and instead tries to come up with models that are able to classify hard examples, too.\n", + "Please see [here for a more thorough introduction to bagging and boosting algorithms](https://towardsdatascience.com/ensemble-methods-bagging-boosting-and-stacking-c9214a10a205).\n", + "\n", + "There are many boosting algorithms. In their core, they are all very similar. 
XGBoost\n", + "uses second-level derivatives to find splits that maximize the *gain* (the inverse of\n", + "the *loss*) - hence the name. In practice, there really is no drawback in using\n", + "XGBoost over other boosting algorithms - in fact, it usually shows the best performance.\n", + "\n", + "## Training a simple XGBoost classifier\n", + "\n", + "Let's first see how a simple XGBoost classifier can be trained. We'll use the\n", + "`breast_cancer`-Dataset included in the `sklearn` dataset collection. This is\n", + "a binary classification dataset. Given 30 different input features, our task is to\n", + "learn to identify subjects with breast cancer and those without.\n", + "\n", + "Here is the full code to train a simple XGBoost model:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "77b3c71c", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Accuracy: 0.9650\n" + ] + } + ], + "source": [ + "import sklearn.datasets\n", + "import sklearn.metrics\n", + "from sklearn.model_selection import train_test_split\n", + "import xgboost as xgb\n", + "\n", + "\n", + "def train_breast_cancer(config):\n", + " # Load dataset\n", + " data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", + " # Split into train and test set\n", + " train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)\n", + " # Build input matrices for XGBoost\n", + " train_set = xgb.DMatrix(train_x, label=train_y)\n", + " test_set = xgb.DMatrix(test_x, label=test_y)\n", + " # Train the classifier\n", + " results = {}\n", + " bst = xgb.train(\n", + " config,\n", + " train_set,\n", + " evals=[(test_set, \"eval\")],\n", + " evals_result=results,\n", + " verbose_eval=False,\n", + " )\n", + " return results\n", + "\n", + "\n", + "if __name__ == \"__main__\":\n", + " results = train_breast_cancer(\n", + " {\"objective\": \"binary:logistic\", \"eval_metric\": [\"logloss\", \"error\"]}\n", + " )\n", + " accuracy = 1.0 - results[\"eval\"][\"error\"][-1]\n", + " print(f\"Accuracy: {accuracy:.4f}\")\n" + ] + }, + { + "cell_type": "markdown", + "id": "ec2a13f8", + "metadata": {}, + "source": [ + "As you can see, the code is quite simple. First, the dataset is loaded and split\n", + "into a `test` and `train` set. The XGBoost model is trained with `xgb.train()`.\n", + "XGBoost automatically evaluates metrics we specified on the test set. In our case\n", + "it calculates the *logloss* and the prediction *error*, which is the percentage of\n", + "misclassified examples. To calculate the accuracy, we just have to subtract the error\n", + "from `1.0`. Even in this simple example, most runs result\n", + "in a good accuracy of over `0.90`.\n", + "\n", + "Maybe you have noticed the `config` parameter we pass to the XGBoost algorithm. This\n", + "is a {class}`dict` in which you can specify parameters for the XGBoost algorithm. In this\n", + "simple example, the only parameters we passed are the `objective` and `eval_metric` parameters.\n", + "The value `binary:logistic` tells XGBoost that we aim to train a logistic regression model for\n", + "a binary classification task. You can find an overview over all valid objectives\n", + "[here in the XGBoost documentation](https://xgboost.readthedocs.io/en/latest/parameter.html#learning-task-parameters).\n", + "\n", + "## XGBoost Hyperparameters\n", + "\n", + "Even with the default settings, XGBoost was able to get to a good accuracy on the\n", + "breast cancer dataset. 
However, as in many machine learning algorithms, there are\n",
+ "many knobs to tune which might lead to even better performance. Let's explore some of\n",
+ "them below.\n",
+ "\n",
+ "### Maximum tree depth\n",
+ "\n",
+ "Remember that XGBoost internally uses many decision tree models to come up with\n",
+ "predictions. When training a decision tree, we need to tell the algorithm how\n",
+ "large the tree may get. The parameter for this is called the tree *depth*.\n",
+ "\n",
+ ":::{figure} /images/tune-xgboost-depth.svg\n",
+ ":align: center\n",
+ ":alt: Decision tree depth\n",
+ "\n",
+ "In this image, the left tree has a depth of 2, and the right tree a depth of 3.\n",
+ "Note that with each level, $2^{(d-1)}$ splits are added, where *d* is the depth\n",
+ "of the tree.\n",
+ ":::\n",
+ "\n",
+ "Tree depth is a property that concerns the model complexity. If you only allow short\n",
+ "trees, the models are likely not very precise - they underfit the data. If you allow\n",
+ "very large trees, the single models are likely to overfit to the data. In practice,\n",
+ "a number between `2` and `6` is often a good starting point for this parameter.\n",
+ "\n",
+ "XGBoost's default value is `3`.\n",
+ "\n",
+ "### Minimum child weight\n",
+ "\n",
+ "When a decision tree creates new leaves, it splits up the remaining data at one node\n",
+ "into two groups. If there are only a few samples in one of these groups, it often\n",
+ "doesn't make sense to split it further. One of the reasons for this is that the\n",
+ "model is harder to train when we have fewer samples.\n",
+ "\n",
+ ":::{figure} /images/tune-xgboost-weight.svg\n",
+ ":align: center\n",
+ ":alt: Minimum child weight\n",
+ "\n",
+ "In this example, we start with 100 examples. At the first node, they are split\n",
+ "into 4 and 96 samples, respectively. In the next step, our model might find\n",
+ "that it doesn't make sense to split the 4 examples more. It thus only continues\n",
+ "to add leaves on the right side.\n",
+ ":::\n",
+ "\n",
+ "The parameter used by the model to decide if it makes sense to split a node is called\n",
+ "the *minimum child weight*. In the case of linear regression, this is just the absolute\n",
+ "number of samples required in each child. In other objectives, this value is determined\n",
+ "using the weights of the examples, hence the name.\n",
+ "\n",
+ "The larger the value, the more constrained the trees are and the less deep they will be.\n",
+ "This parameter thus also affects the model complexity. Values can range between 0\n",
+ "and infinity and are dependent on the sample size. For our ca. 500 examples in the\n",
+ "breast cancer dataset, values between `0` and `10` should be sensible.\n",
+ "\n",
+ "XGBoost's default value is `1`.\n",
+ "\n",
+ "### Subsample size\n",
+ "\n",
+ "Each decision tree we add is trained on a subsample of the total training dataset.\n",
+ "The probabilities for the samples are weighted according to the XGBoost algorithm,\n",
+ "but we can decide on which fraction of the samples we want to train each decision\n",
+ "tree on.\n",
+ "\n",
+ "Setting this value to `0.7` would mean that we randomly sample `70%` of the\n",
+ "training dataset before each training iteration.\n",
+ "\n",
+ "XGBoost's default value is `1`.\n",
+ "\n",
+ "### Learning rate / Eta\n",
+ "\n",
+ "Remember that XGBoost sequentially trains many decision trees, and that later trees\n",
+ "are more likely trained on data that has been misclassified by prior trees. In effect\n",
+ "this means that earlier trees make decisions for easy samples (i.e. those samples that\n",
+ "can easily be classified) and later trees make decisions for harder samples. It is then\n",
+ "sensible to assume that the later trees are less accurate than earlier trees.\n",
+ "\n",
+ "To address this fact, XGBoost uses a parameter called *Eta*, which is sometimes called\n",
+ "the *learning rate*. Don't confuse this with learning rates from gradient descent!\n",
+ "The original [paper on stochastic gradient boosting](https://www.sciencedirect.com/science/article/abs/pii/S0167947301000652)\n",
+ "introduces this parameter like so:\n",
+ "\n",
+ "$$\n",
+ "F_m(x) = F_{m-1}(x) + \\eta \\cdot \\gamma_{lm} \\textbf{1}(x \\in R_{lm})\n",
+ "$$\n",
+ "\n",
+ "This is just a complicated way to say that when we train a new decision tree,\n",
+ "represented by $\\gamma_{lm} \\textbf{1}(x \\in R_{lm})$, we want to dampen\n",
+ "its effect on the previous prediction $F_{m-1}(x)$ with a factor\n",
+ "$\\eta$.\n",
+ "\n",
+ "Typical values for this parameter are between `0.01` and `0.3`.\n",
+ "\n",
+ "XGBoost's default value is `0.3`.\n",
+ "\n",
+ "### Number of boost rounds\n",
+ "\n",
+ "Lastly, we can decide on how many boosting rounds we perform, which means how\n",
+ "many decision trees we ultimately train. When we do heavy subsampling or use a small\n",
+ "learning rate, it might make sense to increase the number of boosting rounds.\n",
+ "\n",
+ "XGBoost's default value is `10`.\n",
+ "\n",
+ "### Putting it together\n",
+ "\n",
+ "Let's see what this looks like in code! We just need to adjust our `config` dict:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "35073e88",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Accuracy: 0.9790\n"
+ ]
+ }
+ ],
+ "source": [
+ "if __name__ == \"__main__\":\n",
+ " config = {\n",
+ " \"objective\": \"binary:logistic\",\n",
+ " \"eval_metric\": [\"logloss\", \"error\"],\n",
+ " \"max_depth\": 2,\n",
+ " \"min_child_weight\": 0,\n",
+ " \"subsample\": 0.8,\n",
+ " \"eta\": 0.2,\n",
+ " }\n",
+ " results = train_breast_cancer(config)\n",
+ " accuracy = 1.0 - results[\"eval\"][\"error\"][-1]\n",
+ " print(f\"Accuracy: {accuracy:.4f}\")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "69cf0c13",
+ "metadata": {},
+ "source": [
+ "The rest stays the same. Please note that we do not adjust the `num_boost_rounds` here.\n",
+ "The result should also show a high accuracy of over 90%.\n",
+ "\n",
+ "## Tuning the configuration parameters\n",
+ "\n",
+ "XGBoost's default parameters already lead to a good accuracy, and even our guesses in the\n",
+ "last section should result in accuracies well above 90%. However, our guesses were\n",
+ "just that: guesses. Often we do not know what combination of parameters would actually\n",
+ "lead to the best results on a machine learning task.\n",
+ "\n",
+ "Unfortunately, there are infinitely many combinations of hyperparameters we could try\n",
+ "out. Should we combine `max_depth=3` with `subsample=0.8` or with `subsample=0.9`?\n",
+ "What about the other parameters?\n",
+ "\n",
+ "This is where hyperparameter tuning comes into play. By using tuning libraries such as\n",
+ "Ray Tune we can try out combinations of hyperparameters. Using sophisticated search\n",
+ "strategies, these parameters can be selected so that they are likely to lead to good\n",
+ "results (avoiding an expensive *exhaustive search*). 
Also, trials that do not perform\n", + "well can be preemptively stopped to reduce waste of computing resources. Lastly, Ray Tune\n", + "also takes care of training these runs in parallel, greatly increasing search speed.\n", + "\n", + "Let's start with a basic example on how to use Tune for this. We just need to make\n", + "a few changes to our code-block:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "ff856a82", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2022-07-22 15:52:52,004\tINFO services.py:1483 -- View the Ray dashboard at \u001b[1m\u001b[32mhttp://127.0.0.1:8268\u001b[39m\u001b[22m\n", + "2022-07-22 15:52:55,858\tWARNING function_trainable.py:619 -- Function checkpointing is disabled. This may result in unexpected behavior when using checkpointing features or certain schedulers. To enable, set the train function arguments to be `func(config, checkpoint_dir=None)`.\n" + ] + }, + { + "data": { + "text/html": [ + "== Status ==
Current time: 2022-07-22 15:53:04 (running for 00:00:07.77)
Memory usage on this node: 10.5/16.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/4.57 GiB heap, 0.0/2.0 GiB objects
Result logdir: /Users/kai/ray_results/train_breast_cancer_2022-07-22_15-52-48
Number of trials: 10/10 (10 TERMINATED)
Trial name                       status      loc              eta          max_depth  min_child_weight  subsample  acc       iter  total time (s)
train_breast_cancer_f8669_00000  TERMINATED  127.0.0.1:48852  0.0069356    5          3                 0.823504   0.944056  1     0.0316169
train_breast_cancer_f8669_00001  TERMINATED  127.0.0.1:48857  0.00145619   6          3                 0.832947   0.958042  1     0.0328588
train_breast_cancer_f8669_00002  TERMINATED  127.0.0.1:48858  0.00108208   7          3                 0.987319   0.944056  1     0.0319381
train_breast_cancer_f8669_00003  TERMINATED  127.0.0.1:48859  0.00530429   8          2                 0.615691   0.923077  1     0.028388
train_breast_cancer_f8669_00004  TERMINATED  127.0.0.1:48860  0.000721843  8          1                 0.650973   0.958042  1     0.0299618
train_breast_cancer_f8669_00005  TERMINATED  127.0.0.1:48861  0.0074509    1          1                 0.738341   0.874126  1     0.0193682
train_breast_cancer_f8669_00006  TERMINATED  127.0.0.1:48862  0.0879882    8          2                 0.671576   0.944056  1     0.0267372
train_breast_cancer_f8669_00007  TERMINATED  127.0.0.1:48863  0.0765404    7          2                 0.708157   0.965035  1     0.0276129
train_breast_cancer_f8669_00008  TERMINATED  127.0.0.1:48864  0.000627649  6          1                 0.81121    0.951049  1     0.0310998
train_breast_cancer_f8669_00009  TERMINATED  127.0.0.1:48865  0.000383711  2          3                 0.990579   0.93007   1     0.0274954
" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2022-07-22 15:52:57,385\tINFO plugin_schema_manager.py:52 -- Loading the default runtime env schemas: ['/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/working_dir_schema.json', '/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/pip_schema.json'].\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Result for train_breast_cancer_f8669_00000:\n", + " date: 2022-07-22_15-53-00\n", + " done: true\n", + " experiment_id: 07d10c5f31e74133b53272b7ccf9c528\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.9440559440559441\n", + " node_ip: 127.0.0.1\n", + " pid: 48852\n", + " time_since_restore: 0.031616926193237305\n", + " time_this_iter_s: 0.031616926193237305\n", + " time_total_s: 0.031616926193237305\n", + " timestamp: 1658501580\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00000\n", + " warmup_time: 0.0027849674224853516\n", + " \n", + "Result for train_breast_cancer_f8669_00009:\n", + " date: 2022-07-22_15-53-04\n", + " done: true\n", + " experiment_id: bc0d5dd2d079432b859faac8a18928f0\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.9300699300699301\n", + " node_ip: 127.0.0.1\n", + " pid: 48865\n", + " time_since_restore: 0.027495384216308594\n", + " time_this_iter_s: 0.027495384216308594\n", + " time_total_s: 0.027495384216308594\n", + " timestamp: 1658501584\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00009\n", + " warmup_time: 0.005235910415649414\n", + " \n", + "Result for train_breast_cancer_f8669_00001:\n", + " date: 2022-07-22_15-53-04\n", + " done: true\n", + " experiment_id: 4b10d350d4374a0d9e7d0c3b1d4e3203\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.958041958041958\n", + " node_ip: 127.0.0.1\n", + " pid: 48857\n", + " time_since_restore: 0.032858848571777344\n", + " time_this_iter_s: 0.032858848571777344\n", + " time_total_s: 0.032858848571777344\n", + " timestamp: 1658501584\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00001\n", + " warmup_time: 0.004731178283691406\n", + " \n", + "Result for train_breast_cancer_f8669_00008:\n", + " date: 2022-07-22_15-53-04\n", + " done: true\n", + " experiment_id: 91c25cbbeb6f409d93e1d6537cb8e1ee\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.951048951048951\n", + " node_ip: 127.0.0.1\n", + " pid: 48864\n", + " time_since_restore: 0.031099796295166016\n", + " time_this_iter_s: 0.031099796295166016\n", + " time_total_s: 0.031099796295166016\n", + " timestamp: 1658501584\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00008\n", + " warmup_time: 0.003270864486694336\n", + " \n", + "Result for train_breast_cancer_f8669_00005:\n", + " date: 2022-07-22_15-53-04\n", + " done: true\n", + " experiment_id: d225b0fb59e14da7adba952456ccf1d5\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.8741258741258742\n", + " node_ip: 127.0.0.1\n", + " pid: 48861\n", + " time_since_restore: 0.01936817169189453\n", + " time_this_iter_s: 0.01936817169189453\n", + " time_total_s: 
0.01936817169189453\n", + " timestamp: 1658501584\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00005\n", + " warmup_time: 0.003901958465576172\n", + " \n", + "Result for train_breast_cancer_f8669_00004:\n", + " date: 2022-07-22_15-53-04\n", + " done: true\n", + " experiment_id: 322484af6ea5422f8aaf8ff6a91af4f7\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.958041958041958\n", + " node_ip: 127.0.0.1\n", + " pid: 48860\n", + " time_since_restore: 0.029961824417114258\n", + " time_this_iter_s: 0.029961824417114258\n", + " time_total_s: 0.029961824417114258\n", + " timestamp: 1658501584\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00004\n", + " warmup_time: 0.003547191619873047\n", + " \n", + "Result for train_breast_cancer_f8669_00002:\n", + " date: 2022-07-22_15-53-04\n", + " done: true\n", + " experiment_id: 3f588954160b42ce8ce200f68127ebcd\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.9440559440559441\n", + " node_ip: 127.0.0.1\n", + " pid: 48858\n", + " time_since_restore: 0.03193807601928711\n", + " time_this_iter_s: 0.03193807601928711\n", + " time_total_s: 0.03193807601928711\n", + " timestamp: 1658501584\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00002\n", + " warmup_time: 0.003523111343383789\n", + " \n", + "Result for train_breast_cancer_f8669_00003:\n", + " date: 2022-07-22_15-53-04\n", + " done: true\n", + " experiment_id: a39ea777ce2d4ebca51b3d7a4179dae5\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.9230769230769231\n", + " node_ip: 127.0.0.1\n", + " pid: 48859\n", + " time_since_restore: 0.028388023376464844\n", + " time_this_iter_s: 0.028388023376464844\n", + " time_total_s: 0.028388023376464844\n", + " timestamp: 1658501584\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00003\n", + " warmup_time: 0.0035560131072998047\n", + " \n", + "Result for train_breast_cancer_f8669_00006:\n", + " date: 2022-07-22_15-53-04\n", + " done: true\n", + " experiment_id: f97c6b9674854f8d89ec26ba58cc1618\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.9440559440559441\n", + " node_ip: 127.0.0.1\n", + " pid: 48862\n", + " time_since_restore: 0.026737213134765625\n", + " time_this_iter_s: 0.026737213134765625\n", + " time_total_s: 0.026737213134765625\n", + " timestamp: 1658501584\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00006\n", + " warmup_time: 0.003425121307373047\n", + " \n", + "Result for train_breast_cancer_f8669_00007:\n", + " date: 2022-07-22_15-53-04\n", + " done: true\n", + " experiment_id: ff172037065a4d55998ed72f51bdc5df\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " mean_accuracy: 0.965034965034965\n", + " node_ip: 127.0.0.1\n", + " pid: 48863\n", + " time_since_restore: 0.027612924575805664\n", + " time_this_iter_s: 0.027612924575805664\n", + " time_total_s: 0.027612924575805664\n", + " timestamp: 1658501584\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: f8669_00007\n", + " warmup_time: 0.0031311511993408203\n", + " \n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2022-07-22 15:53:04,846\tINFO tune.py:738 -- Total run time: 8.99 seconds (7.74 
seconds for the tuning loop).\n" + ] + } + ], + "source": [ + "import sklearn.datasets\n", + "import sklearn.metrics\n", + "\n", + "from ray import air, tune\n", + "from ray.air import session\n", + "\n", + "\n", + "def train_breast_cancer(config):\n", + " # Load dataset\n", + " data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", + " # Split into train and test set\n", + " train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)\n", + " # Build input matrices for XGBoost\n", + " train_set = xgb.DMatrix(train_x, label=train_y)\n", + " test_set = xgb.DMatrix(test_x, label=test_y)\n", + " # Train the classifier\n", + " results = {}\n", + " xgb.train(\n", + " config,\n", + " train_set,\n", + " evals=[(test_set, \"eval\")],\n", + " evals_result=results,\n", + " verbose_eval=False,\n", + " )\n", + " # Return prediction accuracy\n", + " accuracy = 1.0 - results[\"eval\"][\"error\"][-1]\n", + " session.report({\"mean_accuracy\": accuracy, \"done\": True})\n", + "\n", + "\n", + "if __name__ == \"__main__\":\n", + " config = {\n", + " \"objective\": \"binary:logistic\",\n", + " \"eval_metric\": [\"logloss\", \"error\"],\n", + " \"max_depth\": tune.randint(1, 9),\n", + " \"min_child_weight\": tune.choice([1, 2, 3]),\n", + " \"subsample\": tune.uniform(0.5, 1.0),\n", + " \"eta\": tune.loguniform(1e-4, 1e-1),\n", + " }\n", + " tuner = tune.Tuner(\n", + " train_breast_cancer,\n", + " tune_config=tune.TuneConfig(\n", + " num_samples=10,\n", + " ),\n", + " param_space=config,\n", + " )\n", + " results = tuner.fit()\n" + ] + }, + { + "cell_type": "markdown", + "id": "4999e858", + "metadata": {}, + "source": [ + "As you can see, the changes in the actual training function are minimal. Instead of\n", + "returning the accuracy value, we report it back to Tune using `session.report()`.\n", + "Our `config` dictionary only changed slightly. Instead of passing hard-coded\n", + "parameters, we tell Tune to choose values from a range of valid options. There are\n", + "a number of options we have here, all of which are explained in\n", + "{ref}`the Tune docs `.\n", + "\n", + "For a brief explanation, this is what they do:\n", + "\n", + "- `tune.randint(min, max)` chooses a random integer value between *min* and *max*.\n", + " Note that *max* is exclusive, so it will not be sampled.\n", + "- `tune.choice([a, b, c])` chooses one of the items of the list at random. Each item\n", + " has the same chance to be sampled.\n", + "- `tune.uniform(min, max)` samples a floating point number between *min* and *max*.\n", + " Note that *max* is exclusive here, too.\n", + "- `tune.loguniform(min, max, base=10)` samples a floating point number between *min* and *max*,\n", + " but applies a logarithmic transformation to these boundaries first. 
Thus, this makes\n", + " it easy to sample values from different orders of magnitude.\n", + "\n", + "The `num_samples=10` option we pass to the `TuneConfig()` means that we sample 10 different\n", + "hyperparameter configurations from this search space.\n", + "\n", + "The output of our training run coud look like this:\n", + "\n", + "```{code-block} bash\n", + ":emphasize-lines: 14\n", + "\n", + " Number of trials: 10/10 (10 TERMINATED)\n", + " +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+----------+--------+------------------+\n", + " | Trial name | status | loc | eta | max_depth | min_child_weight | subsample | acc | iter | total time (s) |\n", + " |---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+----------+--------+------------------|\n", + " | train_breast_cancer_b63aa_00000 | TERMINATED | | 0.000117625 | 2 | 2 | 0.616347 | 0.916084 | 1 | 0.0306492 |\n", + " | train_breast_cancer_b63aa_00001 | TERMINATED | | 0.0382954 | 8 | 2 | 0.581549 | 0.937063 | 1 | 0.0357082 |\n", + " | train_breast_cancer_b63aa_00002 | TERMINATED | | 0.000217926 | 1 | 3 | 0.528428 | 0.874126 | 1 | 0.0264609 |\n", + " | train_breast_cancer_b63aa_00003 | TERMINATED | | 0.000120929 | 8 | 1 | 0.634508 | 0.958042 | 1 | 0.036406 |\n", + " | train_breast_cancer_b63aa_00004 | TERMINATED | | 0.00839715 | 5 | 1 | 0.730624 | 0.958042 | 1 | 0.0389378 |\n", + " | train_breast_cancer_b63aa_00005 | TERMINATED | | 0.000732948 | 8 | 2 | 0.915863 | 0.958042 | 1 | 0.0382841 |\n", + " | train_breast_cancer_b63aa_00006 | TERMINATED | | 0.000856226 | 4 | 1 | 0.645209 | 0.916084 | 1 | 0.0357089 |\n", + " | train_breast_cancer_b63aa_00007 | TERMINATED | | 0.00769908 | 7 | 1 | 0.729443 | 0.909091 | 1 | 0.0390737 |\n", + " | train_breast_cancer_b63aa_00008 | TERMINATED | | 0.00186339 | 5 | 3 | 0.595744 | 0.944056 | 1 | 0.0343912 |\n", + " | train_breast_cancer_b63aa_00009 | TERMINATED | | 0.000950272 | 3 | 2 | 0.835504 | 0.965035 | 1 | 0.0348201 |\n", + " +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+----------+--------+------------------+\n", + "```\n", + "\n", + "The best configuration we found used `eta=0.000950272`, `max_depth=3`,\n", + "`min_child_weight=2`, `subsample=0.835504` and reached an accuracy of\n", + "`0.965035`.\n", + "\n", + "## Early stopping\n", + "\n", + "Currently, Tune samples 10 different hyperparameter configurations and trains a full\n", + "XGBoost on all of them. In our small example, training is very fast. However,\n", + "if training takes longer, a significant amount of computer resources is spent on trials\n", + "that will eventually show a bad performance, e.g. a low accuracy. It would be good\n", + "if we could identify these trials early and stop them, so we don't waste any resources.\n", + "\n", + "This is where Tune's *Schedulers* shine. A Tune `TrialScheduler` is responsible\n", + "for starting and stopping trials. Tune implements a number of different schedulers, each\n", + "described {ref}`in the Tune documentation `.\n", + "For our example, we will use the `AsyncHyperBandScheduler` or `ASHAScheduler`.\n", + "\n", + "The basic idea of this scheduler: We sample a number of hyperparameter configurations.\n", + "Each of these configurations is trained for a specific number of iterations.\n", + "After these iterations, only the best performing hyperparameters are retained. 
These\n", + "are selected according to some loss metric, usually an evaluation loss. This cycle is\n", + "repeated until we end up with the best configuration. With the `reduction_factor=2` we use\n", + "below, for example, roughly half of the trials survive each round.\n", + "\n", + "The `ASHAScheduler` needs to know three things:\n", + "\n", + "1. Which metric should be used to identify badly performing trials?\n", + "2. Should this metric be maximized or minimized?\n", + "3. How many iterations does each trial train for?\n", + "\n", + "There are more parameters, which are explained in the\n", + "{ref}`documentation `.\n", + "\n", + "Lastly, we have to report the loss metric to Tune. We do this with a `Callback` that\n", + "XGBoost accepts and calls after each evaluation round. Ray Tune comes\n", + "with {ref}`two XGBoost callbacks `\n", + "we can use for this. The `TuneReportCallback` just reports the evaluation\n", + "metrics back to Tune. The `TuneReportCheckpointCallback` also saves\n", + "checkpoints after each evaluation round. We will just use the latter in this\n", + "example so that we can retrieve the saved model later.\n", + "\n", + "The metrics from the `eval_metric` configuration setting are then automatically\n", + "reported to Tune via the callback. Here, the raw error will be reported, not the accuracy.\n", + "To display the best reached accuracy, we will invert it later.\n", + "\n", + "We will also load the best checkpointed model so that we can use it for predictions.\n", + "The best model is selected with respect to the `metric` and `mode` parameters we\n", + "pass to the `TuneConfig()`." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "d08b5b0a", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "== Status ==
Current time: 2022-07-22 16:56:01 (running for 00:00:10.38)
Memory usage on this node: 10.3/16.0 GiB
Using AsyncHyperBand: num_stopped=10\n", + "Bracket: Iter 8.000: -0.5107275277792991 | Iter 4.000: -0.5876629346317344 | Iter 2.000: -0.6544494184997531 | Iter 1.000: -0.6859214191253369
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/4.57 GiB heap, 0.0/2.0 GiB objects
Current best trial: c28a3_00003 with eval-logloss=0.38665050018083796 and parameters={'objective': 'binary:logistic', 'eval_metric': ['logloss', 'error'], 'max_depth': 2, 'min_child_weight': 3, 'subsample': 0.782626252548841, 'eta': 0.06385952388342125}
Result logdir: /Users/kai/ray_results/train_breast_cancer_2022-07-22_16-55-50
Number of trials: 10/10 (10 TERMINATED)
\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "
Trial name status loc eta max_depth min_child_weight subsample iter total time (s) eval-logloss eval-error
train_breast_cancer_c28a3_00000TERMINATED127.0.0.1:544160.0186954 2 2 0.516916 10 0.22218 0.571496 0.0629371
train_breast_cancer_c28a3_00001TERMINATED127.0.0.1:544400.0304404 8 2 0.745969 2 0.135674 0.650353 0.0629371
train_breast_cancer_c28a3_00002TERMINATED127.0.0.1:544410.0217157 8 3 0.764138 2 0.173076 0.658545 0.041958
train_breast_cancer_c28a3_00003TERMINATED127.0.0.1:544420.0638595 2 3 0.782626 10 0.281865 0.386651 0.041958
train_breast_cancer_c28a3_00004TERMINATED127.0.0.1:544430.00442794 7 2 0.792359 1 0.0270212 0.689577 0.0699301
train_breast_cancer_c28a3_00005TERMINATED127.0.0.1:544440.00222624 3 1 0.536331 1 0.0238512 0.691446 0.0839161
train_breast_cancer_c28a3_00006TERMINATED127.0.0.1:544450.000825129 1 1 0.82472 1 0.015312 0.692624 0.118881
train_breast_cancer_c28a3_00007TERMINATED127.0.0.1:544460.000770826 7 2 0.947268 1 0.0175898 0.692598 0.132867
train_breast_cancer_c28a3_00008TERMINATED127.0.0.1:544470.000429759 7 1 0.88524 1 0.0193739 0.692785 0.0559441
train_breast_cancer_c28a3_00009TERMINATED127.0.0.1:544480.0149863 2 1 0.722738 1 0.0165932 0.682266 0.111888


" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Result for train_breast_cancer_c28a3_00000:\n", + " date: 2022-07-22_16-55-55\n", + " done: false\n", + " eval-error: 0.08391608391608392\n", + " eval-logloss: 0.6790360066440556\n", + " experiment_id: 2a3189442db341519836a07fb2d65dd9\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54416\n", + " time_since_restore: 0.01624011993408203\n", + " time_this_iter_s: 0.01624011993408203\n", + " time_total_s: 0.01624011993408203\n", + " timestamp: 1658505355\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00000\n", + " warmup_time: 0.0035409927368164062\n", + " \n", + "Result for train_breast_cancer_c28a3_00000:\n", + " date: 2022-07-22_16-55-56\n", + " done: true\n", + " eval-error: 0.06293706293706294\n", + " eval-logloss: 0.5714958122560194\n", + " experiment_id: 2a3189442db341519836a07fb2d65dd9\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 10\n", + " node_ip: 127.0.0.1\n", + " pid: 54416\n", + " time_since_restore: 0.22218012809753418\n", + " time_this_iter_s: 0.007044076919555664\n", + " time_total_s: 0.22218012809753418\n", + " timestamp: 1658505356\n", + " timesteps_since_restore: 0\n", + " training_iteration: 10\n", + " trial_id: c28a3_00000\n", + " warmup_time: 0.0035409927368164062\n", + " \n", + "Result for train_breast_cancer_c28a3_00003:\n", + " date: 2022-07-22_16-56-01\n", + " done: false\n", + " eval-error: 0.08391608391608392\n", + " eval-logloss: 0.6472820101918041\n", + " experiment_id: 7ff6133237404b4ea4755b9f8cd114f2\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54442\n", + " time_since_restore: 0.023206233978271484\n", + " time_this_iter_s: 0.023206233978271484\n", + " time_total_s: 0.023206233978271484\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00003\n", + " warmup_time: 0.006722211837768555\n", + " \n", + "Result for train_breast_cancer_c28a3_00005:\n", + " date: 2022-07-22_16-56-01\n", + " done: true\n", + " eval-error: 0.08391608391608392\n", + " eval-logloss: 0.6914464114429234\n", + " experiment_id: 344762ab6d574b63a9374e19526d0510\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54444\n", + " time_since_restore: 0.02385115623474121\n", + " time_this_iter_s: 0.02385115623474121\n", + " time_total_s: 0.02385115623474121\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00005\n", + " warmup_time: 0.008936882019042969\n", + " \n", + "Result for train_breast_cancer_c28a3_00009:\n", + " date: 2022-07-22_16-56-01\n", + " done: true\n", + " eval-error: 0.11188811188811189\n", + " eval-logloss: 0.6822656309688008\n", + " experiment_id: 133901655fa64bf79f2dcc4e8e5e41b1\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54448\n", + " time_since_restore: 0.016593217849731445\n", + " time_this_iter_s: 0.016593217849731445\n", + " time_total_s: 0.016593217849731445\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00009\n", + " warmup_time: 0.004940032958984375\n", + " \n", + 
"Result for train_breast_cancer_c28a3_00007:\n", + " date: 2022-07-22_16-56-01\n", + " done: true\n", + " eval-error: 0.13286713286713286\n", + " eval-logloss: 0.6925980357023386\n", + " experiment_id: b4331027cbaf442ab905b2e51797dbbd\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54446\n", + " time_since_restore: 0.017589807510375977\n", + " time_this_iter_s: 0.017589807510375977\n", + " time_total_s: 0.017589807510375977\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00007\n", + " warmup_time: 0.003782033920288086\n", + " \n", + "Result for train_breast_cancer_c28a3_00006:\n", + " date: 2022-07-22_16-56-01\n", + " done: true\n", + " eval-error: 0.11888111888111888\n", + " eval-logloss: 0.6926244418104212\n", + " experiment_id: d3906de5943a4e05a4cc782382f67d24\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54445\n", + " time_since_restore: 0.015311956405639648\n", + " time_this_iter_s: 0.015311956405639648\n", + " time_total_s: 0.015311956405639648\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00006\n", + " warmup_time: 0.005506038665771484\n", + " \n", + "Result for train_breast_cancer_c28a3_00002:\n", + " date: 2022-07-22_16-56-01\n", + " done: false\n", + " eval-error: 0.04895104895104895\n", + " eval-logloss: 0.6752762102580571\n", + " experiment_id: a3645fc2d43145d88a1f5b7cc94df703\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54441\n", + " time_since_restore: 0.027367830276489258\n", + " time_this_iter_s: 0.027367830276489258\n", + " time_total_s: 0.027367830276489258\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00002\n", + " warmup_time: 0.0062830448150634766\n", + " \n", + "Result for train_breast_cancer_c28a3_00001:\n", + " date: 2022-07-22_16-56-01\n", + " done: false\n", + " eval-error: 0.07692307692307693\n", + " eval-logloss: 0.6698804135089154\n", + " experiment_id: 85766fe4d9fa482a91e396a8fd509a19\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54440\n", + " time_since_restore: 0.017169952392578125\n", + " time_this_iter_s: 0.017169952392578125\n", + " time_total_s: 0.017169952392578125\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00001\n", + " warmup_time: 0.006204843521118164\n", + " \n", + "Result for train_breast_cancer_c28a3_00008:\n", + " date: 2022-07-22_16-56-01\n", + " done: true\n", + " eval-error: 0.05594405594405594\n", + " eval-logloss: 0.692784742458717\n", + " experiment_id: 2c7d8bc38ad04536b1dec76819a2b3bf\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54447\n", + " time_since_restore: 0.01937389373779297\n", + " time_this_iter_s: 0.01937389373779297\n", + " time_total_s: 0.01937389373779297\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00008\n", + " warmup_time: 0.004342079162597656\n", + " \n", + "Result for train_breast_cancer_c28a3_00001:\n", + " date: 2022-07-22_16-56-01\n", + " done: true\n", + " eval-error: 0.06293706293706294\n", + " eval-logloss: 
0.6503534216980834\n", + " experiment_id: 85766fe4d9fa482a91e396a8fd509a19\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 2\n", + " node_ip: 127.0.0.1\n", + " pid: 54440\n", + " time_since_restore: 0.13567376136779785\n", + " time_this_iter_s: 0.11850380897521973\n", + " time_total_s: 0.13567376136779785\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 2\n", + " trial_id: c28a3_00001\n", + " warmup_time: 0.006204843521118164\n", + " \n", + "Result for train_breast_cancer_c28a3_00004:\n", + " date: 2022-07-22_16-56-01\n", + " done: true\n", + " eval-error: 0.06993006993006994\n", + " eval-logloss: 0.689577207281873\n", + " experiment_id: ef4fdc645c444112985b4957ab8a84e9\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 1\n", + " node_ip: 127.0.0.1\n", + " pid: 54443\n", + " time_since_restore: 0.027021169662475586\n", + " time_this_iter_s: 0.027021169662475586\n", + " time_total_s: 0.027021169662475586\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 1\n", + " trial_id: c28a3_00004\n", + " warmup_time: 0.0063669681549072266\n", + " \n", + "Result for train_breast_cancer_c28a3_00002:\n", + " date: 2022-07-22_16-56-01\n", + " done: true\n", + " eval-error: 0.04195804195804196\n", + " eval-logloss: 0.658545415301423\n", + " experiment_id: a3645fc2d43145d88a1f5b7cc94df703\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 2\n", + " node_ip: 127.0.0.1\n", + " pid: 54441\n", + " time_since_restore: 0.17307591438293457\n", + " time_this_iter_s: 0.1457080841064453\n", + " time_total_s: 0.17307591438293457\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 2\n", + " trial_id: c28a3_00002\n", + " warmup_time: 0.0062830448150634766\n", + " \n", + "Result for train_breast_cancer_c28a3_00003:\n", + " date: 2022-07-22_16-56-01\n", + " done: true\n", + " eval-error: 0.04195804195804196\n", + " eval-logloss: 0.38665050018083796\n", + " experiment_id: 7ff6133237404b4ea4755b9f8cd114f2\n", + " hostname: Kais-MacBook-Pro.local\n", + " iterations_since_restore: 10\n", + " node_ip: 127.0.0.1\n", + " pid: 54442\n", + " time_since_restore: 0.28186488151550293\n", + " time_this_iter_s: 0.03063178062438965\n", + " time_total_s: 0.28186488151550293\n", + " timestamp: 1658505361\n", + " timesteps_since_restore: 0\n", + " training_iteration: 10\n", + " trial_id: c28a3_00003\n", + " warmup_time: 0.006722211837768555\n", + " \n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2022-07-22 16:56:01,498\tINFO tune.py:738 -- Total run time: 10.53 seconds (10.37 seconds for the tuning loop).\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Best model parameters: {'objective': 'binary:logistic', 'eval_metric': ['logloss', 'error'], 'max_depth': 2, 'min_child_weight': 3, 'subsample': 0.782626252548841, 'eta': 0.06385952388342125}\n", + "Best model total accuracy: 0.9580\n" + ] + } + ], + "source": [ + "import sklearn.datasets\n", + "import sklearn.metrics\n", + "import os\n", + "from ray.tune.schedulers import ASHAScheduler\n", + "from sklearn.model_selection import train_test_split\n", + "import xgboost as xgb\n", + "\n", + "from ray import air, tune\n", + "from ray.air import session\n", + "from ray.tune.integration.xgboost import TuneReportCheckpointCallback\n", + "\n", + "\n", + "def train_breast_cancer(config: dict):\n", + " # This is a simple training function to 
be passed into Tune\n", + " # Load dataset\n", + " data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", + " # Split into train and test set\n", + " train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)\n", + " # Build input matrices for XGBoost\n", + " train_set = xgb.DMatrix(train_x, label=train_y)\n", + " test_set = xgb.DMatrix(test_x, label=test_y)\n", + " # Train the classifier, using the Tune callback\n", + " xgb.train(\n", + " config,\n", + " train_set,\n", + " evals=[(test_set, \"eval\")],\n", + " verbose_eval=False,\n", + " callbacks=[TuneReportCheckpointCallback(filename=\"model.xgb\")],\n", + " )\n", + "\n", + "\n", + "def get_best_model_checkpoint(results):\n", + " best_bst = xgb.Booster()\n", + " best_result = results.get_best_result()\n", + "\n", + " with best_result.checkpoint.as_directory() as best_checkpoint_dir:\n", + " best_bst.load_model(os.path.join(best_checkpoint_dir, \"model.xgb\"))\n", + " accuracy = 1.0 - best_result.metrics[\"eval-error\"]\n", + " print(f\"Best model parameters: {best_result.config}\")\n", + " print(f\"Best model total accuracy: {accuracy:.4f}\")\n", + " return best_bst\n", + "\n", + "\n", + "def tune_xgboost(smoke_test=False):\n", + " search_space = {\n", + " # You can mix constants with search space objects.\n", + " \"objective\": \"binary:logistic\",\n", + " \"eval_metric\": [\"logloss\", \"error\"],\n", + " \"max_depth\": tune.randint(1, 9),\n", + " \"min_child_weight\": tune.choice([1, 2, 3]),\n", + " \"subsample\": tune.uniform(0.5, 1.0),\n", + " \"eta\": tune.loguniform(1e-4, 1e-1),\n", + " }\n", + " # This will enable aggressive early stopping of bad trials.\n", + " scheduler = ASHAScheduler(\n", + " max_t=10, grace_period=1, reduction_factor=2 # 10 training iterations\n", + " )\n", + "\n", + " tuner = tune.Tuner(\n", + " train_breast_cancer,\n", + " tune_config=tune.TuneConfig(\n", + " metric=\"eval-logloss\",\n", + " mode=\"min\",\n", + " scheduler=scheduler,\n", + " num_samples=1 if smoke_test else 10,\n", + " ),\n", + " param_space=search_space,\n", + " )\n", + " results = tuner.fit()\n", + "\n", + " return results\n", + "\n", + "\n", + "if __name__ == \"__main__\":\n", + " import argparse\n", + "\n", + " parser = argparse.ArgumentParser()\n", + " parser.add_argument(\n", + " \"--smoke-test\", action=\"store_true\", help=\"Finish quickly for testing\"\n", + " )\n", + " args, _ = parser.parse_known_args()\n", + "\n", + " results = tune_xgboost(smoke_test=args.smoke_test)\n", + "\n", + " # Load the best model checkpoint.\n", + " best_bst = get_best_model_checkpoint(results)\n", + "\n", + " # You could now do further predictions with\n", + " # best_bst.predict(...)\n" + ] + }, + { + "cell_type": "markdown", + "id": "20732fe4", + "metadata": {}, + "source": [ + "The output of our run could look like this:\n", + "\n", + "```{code-block} bash\n", + ":emphasize-lines: 7\n", + "\n", + " Number of trials: 10/10 (10 TERMINATED)\n", + " +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+--------+------------------+----------------+--------------+\n", + " | Trial name | status | loc | eta | max_depth | min_child_weight | subsample | iter | total time (s) | eval-logloss | eval-error |\n", + " |---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+--------+------------------+----------------+--------------|\n", + " | train_breast_cancer_ba275_00000 | TERMINATED | | 
0.00205087 | 2 | 1 | 0.898391 | 10 | 0.380619 | 0.678039 | 0.090909 |\n", + " | train_breast_cancer_ba275_00001 | TERMINATED | | 0.000183834 | 4 | 3 | 0.924939 | 1 | 0.0228798 | 0.693009 | 0.111888 |\n", + " | train_breast_cancer_ba275_00002 | TERMINATED | | 0.0242721 | 7 | 2 | 0.501551 | 10 | 0.376154 | 0.54472 | 0.06993 |\n", + " | train_breast_cancer_ba275_00003 | TERMINATED | | 0.000449692 | 5 | 3 | 0.890212 | 1 | 0.0234981 | 0.692811 | 0.090909 |\n", + " | train_breast_cancer_ba275_00004 | TERMINATED | | 0.000376393 | 7 | 2 | 0.883609 | 1 | 0.0231569 | 0.692847 | 0.062937 |\n", + " | train_breast_cancer_ba275_00005 | TERMINATED | | 0.00231942 | 3 | 3 | 0.877464 | 2 | 0.104867 | 0.689541 | 0.083916 |\n", + " | train_breast_cancer_ba275_00006 | TERMINATED | | 0.000542326 | 1 | 2 | 0.578584 | 1 | 0.0213971 | 0.692765 | 0.083916 |\n", + " | train_breast_cancer_ba275_00007 | TERMINATED | | 0.0016801 | 1 | 2 | 0.975302 | 1 | 0.02226 | 0.691999 | 0.083916 |\n", + " | train_breast_cancer_ba275_00008 | TERMINATED | | 0.000595756 | 8 | 3 | 0.58429 | 1 | 0.0221152 | 0.692657 | 0.06993 |\n", + " | train_breast_cancer_ba275_00009 | TERMINATED | | 0.000357845 | 8 | 1 | 0.637776 | 1 | 0.022635 | 0.692859 | 0.090909 |\n", + " +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+--------+------------------+----------------+--------------+\n", + "\n", + "\n", + " Best model parameters: {'objective': 'binary:logistic', 'eval_metric': ['logloss', 'error'], 'max_depth': 7, 'min_child_weight': 2, 'subsample': 0.5015513240240503, 'eta': 0.024272050872920895}\n", + " Best model total accuracy: 0.9301\n", + "```\n", + "\n", + "As you can see, most trials were stopped after only a few iterations. Only the\n", + "two most promising trials were run for the full 10 iterations.\n", + "\n", + "You can also ensure that all available resources are being used as the scheduler\n", + "terminates trials, freeing them up. You can do this with the\n", + "`ResourceChangingScheduler`. You can find an example here:\n", + "{doc}`/tune/examples/includes/xgboost_dynamic_resources_example`.\n", + "\n", + "## Using fractional GPUs\n", + "\n", + "You can often accelerate your training by using GPUs in addition to CPUs. However,\n", + "you usually don't have as many GPUs as you have trials to run. For instance, if you\n", + "run 10 Tune trials in parallel, you usually don't have access to 10 separate GPUs.\n", + "\n", + "Tune supports *fractional GPUs*. This means that each task is assigned a fraction\n", + "of the GPU memory for training.
For 10 tasks, it could look like this:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7d1b20a3", + "metadata": {}, + "outputs": [], + "source": [ + "config = {\n", + " \"objective\": \"binary:logistic\",\n", + " \"eval_metric\": [\"logloss\", \"error\"],\n", + " \"tree_method\": \"gpu_hist\",\n", + " \"max_depth\": tune.randint(1, 9),\n", + " \"min_child_weight\": tune.choice([1, 2, 3]),\n", + " \"subsample\": tune.uniform(0.5, 1.0),\n", + " \"eta\": tune.loguniform(1e-4, 1e-1),\n", + "}\n", + "\n", + "tuner = tune.Tuner(\n", + " tune.with_resources(train_breast_cancer, resources={\"cpu\": 1, \"gpu\": 0.1}),\n", + " tune_config=tune.TuneConfig(\n", + " num_samples=10,\n", + " ),\n", + " param_space=config,\n", + ")\n", + "results = tuner.fit()\n" + ] + }, + { + "cell_type": "markdown", + "id": "ee131861", + "metadata": {}, + "source": [ + "Each task thus works with 10% of the available GPU memory. You also have to tell\n", + "XGBoost to use the `gpu_hist` tree method, so it knows it should use the GPU.\n", + "\n", + "## Conclusion\n", + "\n", + "You should now have a basic understanding of how to train XGBoost models and how\n", + "to tune their hyperparameters to yield the best results. In our simple example,\n", + "tuning the parameters didn't make a huge difference for the accuracy.\n", + "But in larger applications, intelligent hyperparameter tuning can make the\n", + "difference between a model that doesn't seem to learn at all, and a model\n", + "that outperforms all the others.\n", + "\n", + "## More XGBoost Examples\n", + "\n", + "- {doc}`/tune/examples/includes/xgboost_dynamic_resources_example`:\n", + " Trains a basic XGBoost model with Tune with the class-based API and a ResourceChangingScheduler, ensuring all resources are being used at all times.\n", + "\n", + "## Learn More\n", + "\n", + "- [XGBoost Hyperparameter Tuning - A Visual Guide](https://kevinvecmanis.io/machine%20learning/hyperparameter%20tuning/dataviz/python/2019/05/11/XGBoost-Tuning-Visual-Guide.html)\n", + "- [Notes on XGBoost Parameter Tuning](https://xgboost.readthedocs.io/en/latest/tutorials/param_tuning.html)\n", + "- [Doing XGBoost Hyperparameter Tuning the smart way](https://towardsdatascience.com/doing-xgboost-hyper-parameter-tuning-the-smart-way-part-1-of-2-f6d255a45dde)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "ray_dev_py38", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:05:16) \n[Clang 12.0.1 ]" + }, + "orphan": true, + "vscode": { + "interpreter": { + "hash": "265d195fda5292fe8f69c6e37c435a5634a1ed3b6799724e66a975f68fa21517" + } + } },
Current time: 2022-07-22 15:53:04 (running for 00:00:07.77)
Memory usage on this node: 10.5/16.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/4.57 GiB heap, 0.0/2.0 GiB objects
Result logdir: /Users/kai/ray_results/train_breast_cancer_2022-07-22_15-52-48
Number of trials: 10/10 (10 TERMINATED)
\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
Trial name status loc eta max_depth min_child_weight subsample acc iter total time (s)
train_breast_cancer_f8669_00000TERMINATED127.0.0.1:488520.0069356 5 3 0.8235040.944056 1 0.0316169
train_breast_cancer_f8669_00001TERMINATED127.0.0.1:488570.00145619 6 3 0.8329470.958042 1 0.0328588
train_breast_cancer_f8669_00002TERMINATED127.0.0.1:488580.00108208 7 3 0.9873190.944056 1 0.0319381
train_breast_cancer_f8669_00003TERMINATED127.0.0.1:488590.00530429 8 2 0.6156910.923077 1 0.028388
train_breast_cancer_f8669_00004TERMINATED127.0.0.1:488600.000721843 8 1 0.6509730.958042 1 0.0299618
train_breast_cancer_f8669_00005TERMINATED127.0.0.1:488610.0074509 1 1 0.7383410.874126 1 0.0193682
train_breast_cancer_f8669_00006TERMINATED127.0.0.1:488620.0879882 8 2 0.6715760.944056 1 0.0267372
train_breast_cancer_f8669_00007TERMINATED127.0.0.1:488630.0765404 7 2 0.7081570.965035 1 0.0276129
train_breast_cancer_f8669_00008TERMINATED127.0.0.1:488640.000627649 6 1 0.81121 0.951049 1 0.0310998
train_breast_cancer_f8669_00009TERMINATED127.0.0.1:488650.000383711 2 3 0.9905790.93007 1 0.0274954


" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "2022-07-22 15:52:57,385\tINFO plugin_schema_manager.py:52 -- Loading the default runtime env schemas: ['/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/working_dir_schema.json', '/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/pip_schema.json'].\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Result for train_breast_cancer_f8669_00000:\n", - " date: 2022-07-22_15-53-00\n", - " done: true\n", - " experiment_id: 07d10c5f31e74133b53272b7ccf9c528\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.9440559440559441\n", - " node_ip: 127.0.0.1\n", - " pid: 48852\n", - " time_since_restore: 0.031616926193237305\n", - " time_this_iter_s: 0.031616926193237305\n", - " time_total_s: 0.031616926193237305\n", - " timestamp: 1658501580\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00000\n", - " warmup_time: 0.0027849674224853516\n", - " \n", - "Result for train_breast_cancer_f8669_00009:\n", - " date: 2022-07-22_15-53-04\n", - " done: true\n", - " experiment_id: bc0d5dd2d079432b859faac8a18928f0\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.9300699300699301\n", - " node_ip: 127.0.0.1\n", - " pid: 48865\n", - " time_since_restore: 0.027495384216308594\n", - " time_this_iter_s: 0.027495384216308594\n", - " time_total_s: 0.027495384216308594\n", - " timestamp: 1658501584\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00009\n", - " warmup_time: 0.005235910415649414\n", - " \n", - "Result for train_breast_cancer_f8669_00001:\n", - " date: 2022-07-22_15-53-04\n", - " done: true\n", - " experiment_id: 4b10d350d4374a0d9e7d0c3b1d4e3203\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.958041958041958\n", - " node_ip: 127.0.0.1\n", - " pid: 48857\n", - " time_since_restore: 0.032858848571777344\n", - " time_this_iter_s: 0.032858848571777344\n", - " time_total_s: 0.032858848571777344\n", - " timestamp: 1658501584\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00001\n", - " warmup_time: 0.004731178283691406\n", - " \n", - "Result for train_breast_cancer_f8669_00008:\n", - " date: 2022-07-22_15-53-04\n", - " done: true\n", - " experiment_id: 91c25cbbeb6f409d93e1d6537cb8e1ee\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.951048951048951\n", - " node_ip: 127.0.0.1\n", - " pid: 48864\n", - " time_since_restore: 0.031099796295166016\n", - " time_this_iter_s: 0.031099796295166016\n", - " time_total_s: 0.031099796295166016\n", - " timestamp: 1658501584\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00008\n", - " warmup_time: 0.003270864486694336\n", - " \n", - "Result for train_breast_cancer_f8669_00005:\n", - " date: 2022-07-22_15-53-04\n", - " done: true\n", - " experiment_id: d225b0fb59e14da7adba952456ccf1d5\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.8741258741258742\n", - " node_ip: 127.0.0.1\n", - " pid: 48861\n", - " time_since_restore: 0.01936817169189453\n", - " time_this_iter_s: 0.01936817169189453\n", - " time_total_s: 
0.01936817169189453\n", - " timestamp: 1658501584\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00005\n", - " warmup_time: 0.003901958465576172\n", - " \n", - "Result for train_breast_cancer_f8669_00004:\n", - " date: 2022-07-22_15-53-04\n", - " done: true\n", - " experiment_id: 322484af6ea5422f8aaf8ff6a91af4f7\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.958041958041958\n", - " node_ip: 127.0.0.1\n", - " pid: 48860\n", - " time_since_restore: 0.029961824417114258\n", - " time_this_iter_s: 0.029961824417114258\n", - " time_total_s: 0.029961824417114258\n", - " timestamp: 1658501584\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00004\n", - " warmup_time: 0.003547191619873047\n", - " \n", - "Result for train_breast_cancer_f8669_00002:\n", - " date: 2022-07-22_15-53-04\n", - " done: true\n", - " experiment_id: 3f588954160b42ce8ce200f68127ebcd\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.9440559440559441\n", - " node_ip: 127.0.0.1\n", - " pid: 48858\n", - " time_since_restore: 0.03193807601928711\n", - " time_this_iter_s: 0.03193807601928711\n", - " time_total_s: 0.03193807601928711\n", - " timestamp: 1658501584\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00002\n", - " warmup_time: 0.003523111343383789\n", - " \n", - "Result for train_breast_cancer_f8669_00003:\n", - " date: 2022-07-22_15-53-04\n", - " done: true\n", - " experiment_id: a39ea777ce2d4ebca51b3d7a4179dae5\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.9230769230769231\n", - " node_ip: 127.0.0.1\n", - " pid: 48859\n", - " time_since_restore: 0.028388023376464844\n", - " time_this_iter_s: 0.028388023376464844\n", - " time_total_s: 0.028388023376464844\n", - " timestamp: 1658501584\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00003\n", - " warmup_time: 0.0035560131072998047\n", - " \n", - "Result for train_breast_cancer_f8669_00006:\n", - " date: 2022-07-22_15-53-04\n", - " done: true\n", - " experiment_id: f97c6b9674854f8d89ec26ba58cc1618\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.9440559440559441\n", - " node_ip: 127.0.0.1\n", - " pid: 48862\n", - " time_since_restore: 0.026737213134765625\n", - " time_this_iter_s: 0.026737213134765625\n", - " time_total_s: 0.026737213134765625\n", - " timestamp: 1658501584\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00006\n", - " warmup_time: 0.003425121307373047\n", - " \n", - "Result for train_breast_cancer_f8669_00007:\n", - " date: 2022-07-22_15-53-04\n", - " done: true\n", - " experiment_id: ff172037065a4d55998ed72f51bdc5df\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " mean_accuracy: 0.965034965034965\n", - " node_ip: 127.0.0.1\n", - " pid: 48863\n", - " time_since_restore: 0.027612924575805664\n", - " time_this_iter_s: 0.027612924575805664\n", - " time_total_s: 0.027612924575805664\n", - " timestamp: 1658501584\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: f8669_00007\n", - " warmup_time: 0.0031311511993408203\n", - " \n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "2022-07-22 15:53:04,846\tINFO tune.py:738 -- Total run time: 8.99 seconds (7.74 
seconds for the tuning loop).\n" - ] - } - ], - "source": [ - "import sklearn.datasets\n", - "import sklearn.metrics\n", - "\n", - "from ray import air, tune\n", - "from ray.air import session\n", - "\n", - "\n", - "def train_breast_cancer(config):\n", - " # Load dataset\n", - " data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", - " # Split into train and test set\n", - " train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)\n", - " # Build input matrices for XGBoost\n", - " train_set = xgb.DMatrix(train_x, label=train_y)\n", - " test_set = xgb.DMatrix(test_x, label=test_y)\n", - " # Train the classifier\n", - " results = {}\n", - " xgb.train(\n", - " config,\n", - " train_set,\n", - " evals=[(test_set, \"eval\")],\n", - " evals_result=results,\n", - " verbose_eval=False,\n", - " )\n", - " # Return prediction accuracy\n", - " accuracy = 1.0 - results[\"eval\"][\"error\"][-1]\n", - " session.report({\"mean_accuracy\": accuracy, \"done\": True})\n", - "\n", - "\n", - "if __name__ == \"__main__\":\n", - " config = {\n", - " \"objective\": \"binary:logistic\",\n", - " \"eval_metric\": [\"logloss\", \"error\"],\n", - " \"max_depth\": tune.randint(1, 9),\n", - " \"min_child_weight\": tune.choice([1, 2, 3]),\n", - " \"subsample\": tune.uniform(0.5, 1.0),\n", - " \"eta\": tune.loguniform(1e-4, 1e-1),\n", - " }\n", - " tuner = tune.Tuner(\n", - " train_breast_cancer,\n", - " tune_config=tune.TuneConfig(\n", - " num_samples=10,\n", - " ),\n", - " param_space=config,\n", - " )\n", - " results = tuner.fit()\n" - ] - }, - { - "cell_type": "markdown", - "id": "4999e858", - "metadata": {}, - "source": [ - "As you can see, the changes in the actual training function are minimal. Instead of\n", - "returning the accuracy value, we report it back to Tune using `session.report()`.\n", - "Our `config` dictionary only changed slightly. Instead of passing hard-coded\n", - "parameters, we tell Tune to choose values from a range of valid options. There are\n", - "a number of options we have here, all of which are explained in\n", - "{ref}`the Tune docs `.\n", - "\n", - "For a brief explanation, this is what they do:\n", - "\n", - "- `tune.randint(min, max)` chooses a random integer value between *min* and *max*.\n", - " Note that *max* is exclusive, so it will not be sampled.\n", - "- `tune.choice([a, b, c])` chooses one of the items of the list at random. Each item\n", - " has the same chance to be sampled.\n", - "- `tune.uniform(min, max)` samples a floating point number between *min* and *max*.\n", - " Note that *max* is exclusive here, too.\n", - "- `tune.loguniform(min, max, base=10)` samples a floating point number between *min* and *max*,\n", - " but applies a logarithmic transformation to these boundaries first. 
Thus, this makes\n", - " it easy to sample values from different orders of magnitude.\n", - "\n", - "The `num_samples=10` option we pass to the `TuneConfig()` means that we sample 10 different\n", - "hyperparameter configurations from this search space.\n", - "\n", - "The output of our training run coud look like this:\n", - "\n", - "```{code-block} bash\n", - ":emphasize-lines: 14\n", - "\n", - " Number of trials: 10/10 (10 TERMINATED)\n", - " +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+----------+--------+------------------+\n", - " | Trial name | status | loc | eta | max_depth | min_child_weight | subsample | acc | iter | total time (s) |\n", - " |---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+----------+--------+------------------|\n", - " | train_breast_cancer_b63aa_00000 | TERMINATED | | 0.000117625 | 2 | 2 | 0.616347 | 0.916084 | 1 | 0.0306492 |\n", - " | train_breast_cancer_b63aa_00001 | TERMINATED | | 0.0382954 | 8 | 2 | 0.581549 | 0.937063 | 1 | 0.0357082 |\n", - " | train_breast_cancer_b63aa_00002 | TERMINATED | | 0.000217926 | 1 | 3 | 0.528428 | 0.874126 | 1 | 0.0264609 |\n", - " | train_breast_cancer_b63aa_00003 | TERMINATED | | 0.000120929 | 8 | 1 | 0.634508 | 0.958042 | 1 | 0.036406 |\n", - " | train_breast_cancer_b63aa_00004 | TERMINATED | | 0.00839715 | 5 | 1 | 0.730624 | 0.958042 | 1 | 0.0389378 |\n", - " | train_breast_cancer_b63aa_00005 | TERMINATED | | 0.000732948 | 8 | 2 | 0.915863 | 0.958042 | 1 | 0.0382841 |\n", - " | train_breast_cancer_b63aa_00006 | TERMINATED | | 0.000856226 | 4 | 1 | 0.645209 | 0.916084 | 1 | 0.0357089 |\n", - " | train_breast_cancer_b63aa_00007 | TERMINATED | | 0.00769908 | 7 | 1 | 0.729443 | 0.909091 | 1 | 0.0390737 |\n", - " | train_breast_cancer_b63aa_00008 | TERMINATED | | 0.00186339 | 5 | 3 | 0.595744 | 0.944056 | 1 | 0.0343912 |\n", - " | train_breast_cancer_b63aa_00009 | TERMINATED | | 0.000950272 | 3 | 2 | 0.835504 | 0.965035 | 1 | 0.0348201 |\n", - " +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+----------+--------+------------------+\n", - "```\n", - "\n", - "The best configuration we found used `eta=0.000950272`, `max_depth=3`,\n", - "`min_child_weight=2`, `subsample=0.835504` and reached an accuracy of\n", - "`0.965035`.\n", - "\n", - "## Early stopping\n", - "\n", - "Currently, Tune samples 10 different hyperparameter configurations and trains a full\n", - "XGBoost on all of them. In our small example, training is very fast. However,\n", - "if training takes longer, a significant amount of computer resources is spent on trials\n", - "that will eventually show a bad performance, e.g. a low accuracy. It would be good\n", - "if we could identify these trials early and stop them, so we don't waste any resources.\n", - "\n", - "This is where Tune's *Schedulers* shine. A Tune `TrialScheduler` is responsible\n", - "for starting and stopping trials. Tune implements a number of different schedulers, each\n", - "described {ref}`in the Tune documentation `.\n", - "For our example, we will use the `AsyncHyperBandScheduler` or `ASHAScheduler`.\n", - "\n", - "The basic idea of this scheduler: We sample a number of hyperparameter configurations.\n", - "Each of these configurations is trained for a specific number of iterations.\n", - "After these iterations, only the best performing hyperparameters are retained. 
These\n", - "are selected according to some loss metric, usually an evaluation loss. This cycle is\n", - "repeated until we end up with the best configuration.\n", - "\n", - "The `ASHAScheduler` needs to know three things:\n", - "\n", - "1. Which metric should be used to identify badly performing trials?\n", - "2. Should this metric be maximized or minimized?\n", - "3. How many iterations does each trial train for?\n", - "\n", - "There are more parameters, which are explained in the\n", - "{ref}`documentation `.\n", - "\n", - "Lastly, we have to report the loss metric to Tune. We do this with a `Callback` that\n", - "XGBoost accepts and calls after each evaluation round. Ray Tune comes\n", - "with {ref}`two XGBoost callbacks `\n", - "we can use for this. The `TuneReportCallback` just reports the evaluation\n", - "metrics back to Tune. The `TuneReportCheckpointCallback` also saves\n", - "checkpoints after each evaluation round. We will just use the latter in this\n", - "example so that we can retrieve the saved model later.\n", - "\n", - "These parameters from the `eval_metrics` configuration setting are then automatically\n", - "reported to Tune via the callback. Here, the raw error will be reported, not the accuracy.\n", - "To display the best reached accuracy, we will inverse it later.\n", - "\n", - "We will also load the best checkpointed model so that we can use it for predictions.\n", - "The best model is selected with respect to the `metric` and `mode` parameters we\n", - "pass to the `TunerConfig()`." - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "id": "d08b5b0a", - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "== Status ==
Current time: 2022-07-22 16:56:01 (running for 00:00:10.38)
Memory usage on this node: 10.3/16.0 GiB
Using AsyncHyperBand: num_stopped=10\n", - "Bracket: Iter 8.000: -0.5107275277792991 | Iter 4.000: -0.5876629346317344 | Iter 2.000: -0.6544494184997531 | Iter 1.000: -0.6859214191253369
Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/4.57 GiB heap, 0.0/2.0 GiB objects
Current best trial: c28a3_00003 with eval-logloss=0.38665050018083796 and parameters={'objective': 'binary:logistic', 'eval_metric': ['logloss', 'error'], 'max_depth': 2, 'min_child_weight': 3, 'subsample': 0.782626252548841, 'eta': 0.06385952388342125}
Result logdir: /Users/kai/ray_results/train_breast_cancer_2022-07-22_16-55-50
Number of trials: 10/10 (10 TERMINATED)
\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "\n", - "
Trial name status loc eta max_depth min_child_weight subsample iter total time (s) eval-logloss eval-error
train_breast_cancer_c28a3_00000TERMINATED127.0.0.1:544160.0186954 2 2 0.516916 10 0.22218 0.571496 0.0629371
train_breast_cancer_c28a3_00001TERMINATED127.0.0.1:544400.0304404 8 2 0.745969 2 0.135674 0.650353 0.0629371
train_breast_cancer_c28a3_00002TERMINATED127.0.0.1:544410.0217157 8 3 0.764138 2 0.173076 0.658545 0.041958
train_breast_cancer_c28a3_00003TERMINATED127.0.0.1:544420.0638595 2 3 0.782626 10 0.281865 0.386651 0.041958
train_breast_cancer_c28a3_00004TERMINATED127.0.0.1:544430.00442794 7 2 0.792359 1 0.0270212 0.689577 0.0699301
train_breast_cancer_c28a3_00005TERMINATED127.0.0.1:544440.00222624 3 1 0.536331 1 0.0238512 0.691446 0.0839161
train_breast_cancer_c28a3_00006TERMINATED127.0.0.1:544450.000825129 1 1 0.82472 1 0.015312 0.692624 0.118881
train_breast_cancer_c28a3_00007TERMINATED127.0.0.1:544460.000770826 7 2 0.947268 1 0.0175898 0.692598 0.132867
train_breast_cancer_c28a3_00008TERMINATED127.0.0.1:544470.000429759 7 1 0.88524 1 0.0193739 0.692785 0.0559441
train_breast_cancer_c28a3_00009TERMINATED127.0.0.1:544480.0149863 2 1 0.722738 1 0.0165932 0.682266 0.111888


" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Result for train_breast_cancer_c28a3_00000:\n", - " date: 2022-07-22_16-55-55\n", - " done: false\n", - " eval-error: 0.08391608391608392\n", - " eval-logloss: 0.6790360066440556\n", - " experiment_id: 2a3189442db341519836a07fb2d65dd9\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54416\n", - " time_since_restore: 0.01624011993408203\n", - " time_this_iter_s: 0.01624011993408203\n", - " time_total_s: 0.01624011993408203\n", - " timestamp: 1658505355\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00000\n", - " warmup_time: 0.0035409927368164062\n", - " \n", - "Result for train_breast_cancer_c28a3_00000:\n", - " date: 2022-07-22_16-55-56\n", - " done: true\n", - " eval-error: 0.06293706293706294\n", - " eval-logloss: 0.5714958122560194\n", - " experiment_id: 2a3189442db341519836a07fb2d65dd9\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 10\n", - " node_ip: 127.0.0.1\n", - " pid: 54416\n", - " time_since_restore: 0.22218012809753418\n", - " time_this_iter_s: 0.007044076919555664\n", - " time_total_s: 0.22218012809753418\n", - " timestamp: 1658505356\n", - " timesteps_since_restore: 0\n", - " training_iteration: 10\n", - " trial_id: c28a3_00000\n", - " warmup_time: 0.0035409927368164062\n", - " \n", - "Result for train_breast_cancer_c28a3_00003:\n", - " date: 2022-07-22_16-56-01\n", - " done: false\n", - " eval-error: 0.08391608391608392\n", - " eval-logloss: 0.6472820101918041\n", - " experiment_id: 7ff6133237404b4ea4755b9f8cd114f2\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54442\n", - " time_since_restore: 0.023206233978271484\n", - " time_this_iter_s: 0.023206233978271484\n", - " time_total_s: 0.023206233978271484\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00003\n", - " warmup_time: 0.006722211837768555\n", - " \n", - "Result for train_breast_cancer_c28a3_00005:\n", - " date: 2022-07-22_16-56-01\n", - " done: true\n", - " eval-error: 0.08391608391608392\n", - " eval-logloss: 0.6914464114429234\n", - " experiment_id: 344762ab6d574b63a9374e19526d0510\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54444\n", - " time_since_restore: 0.02385115623474121\n", - " time_this_iter_s: 0.02385115623474121\n", - " time_total_s: 0.02385115623474121\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00005\n", - " warmup_time: 0.008936882019042969\n", - " \n", - "Result for train_breast_cancer_c28a3_00009:\n", - " date: 2022-07-22_16-56-01\n", - " done: true\n", - " eval-error: 0.11188811188811189\n", - " eval-logloss: 0.6822656309688008\n", - " experiment_id: 133901655fa64bf79f2dcc4e8e5e41b1\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54448\n", - " time_since_restore: 0.016593217849731445\n", - " time_this_iter_s: 0.016593217849731445\n", - " time_total_s: 0.016593217849731445\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00009\n", - " warmup_time: 0.004940032958984375\n", - " \n", - 
"Result for train_breast_cancer_c28a3_00007:\n", - " date: 2022-07-22_16-56-01\n", - " done: true\n", - " eval-error: 0.13286713286713286\n", - " eval-logloss: 0.6925980357023386\n", - " experiment_id: b4331027cbaf442ab905b2e51797dbbd\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54446\n", - " time_since_restore: 0.017589807510375977\n", - " time_this_iter_s: 0.017589807510375977\n", - " time_total_s: 0.017589807510375977\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00007\n", - " warmup_time: 0.003782033920288086\n", - " \n", - "Result for train_breast_cancer_c28a3_00006:\n", - " date: 2022-07-22_16-56-01\n", - " done: true\n", - " eval-error: 0.11888111888111888\n", - " eval-logloss: 0.6926244418104212\n", - " experiment_id: d3906de5943a4e05a4cc782382f67d24\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54445\n", - " time_since_restore: 0.015311956405639648\n", - " time_this_iter_s: 0.015311956405639648\n", - " time_total_s: 0.015311956405639648\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00006\n", - " warmup_time: 0.005506038665771484\n", - " \n", - "Result for train_breast_cancer_c28a3_00002:\n", - " date: 2022-07-22_16-56-01\n", - " done: false\n", - " eval-error: 0.04895104895104895\n", - " eval-logloss: 0.6752762102580571\n", - " experiment_id: a3645fc2d43145d88a1f5b7cc94df703\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54441\n", - " time_since_restore: 0.027367830276489258\n", - " time_this_iter_s: 0.027367830276489258\n", - " time_total_s: 0.027367830276489258\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00002\n", - " warmup_time: 0.0062830448150634766\n", - " \n", - "Result for train_breast_cancer_c28a3_00001:\n", - " date: 2022-07-22_16-56-01\n", - " done: false\n", - " eval-error: 0.07692307692307693\n", - " eval-logloss: 0.6698804135089154\n", - " experiment_id: 85766fe4d9fa482a91e396a8fd509a19\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54440\n", - " time_since_restore: 0.017169952392578125\n", - " time_this_iter_s: 0.017169952392578125\n", - " time_total_s: 0.017169952392578125\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00001\n", - " warmup_time: 0.006204843521118164\n", - " \n", - "Result for train_breast_cancer_c28a3_00008:\n", - " date: 2022-07-22_16-56-01\n", - " done: true\n", - " eval-error: 0.05594405594405594\n", - " eval-logloss: 0.692784742458717\n", - " experiment_id: 2c7d8bc38ad04536b1dec76819a2b3bf\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54447\n", - " time_since_restore: 0.01937389373779297\n", - " time_this_iter_s: 0.01937389373779297\n", - " time_total_s: 0.01937389373779297\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00008\n", - " warmup_time: 0.004342079162597656\n", - " \n", - "Result for train_breast_cancer_c28a3_00001:\n", - " date: 2022-07-22_16-56-01\n", - " done: true\n", - " eval-error: 0.06293706293706294\n", - " eval-logloss: 
0.6503534216980834\n", - " experiment_id: 85766fe4d9fa482a91e396a8fd509a19\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 2\n", - " node_ip: 127.0.0.1\n", - " pid: 54440\n", - " time_since_restore: 0.13567376136779785\n", - " time_this_iter_s: 0.11850380897521973\n", - " time_total_s: 0.13567376136779785\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 2\n", - " trial_id: c28a3_00001\n", - " warmup_time: 0.006204843521118164\n", - " \n", - "Result for train_breast_cancer_c28a3_00004:\n", - " date: 2022-07-22_16-56-01\n", - " done: true\n", - " eval-error: 0.06993006993006994\n", - " eval-logloss: 0.689577207281873\n", - " experiment_id: ef4fdc645c444112985b4957ab8a84e9\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 1\n", - " node_ip: 127.0.0.1\n", - " pid: 54443\n", - " time_since_restore: 0.027021169662475586\n", - " time_this_iter_s: 0.027021169662475586\n", - " time_total_s: 0.027021169662475586\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 1\n", - " trial_id: c28a3_00004\n", - " warmup_time: 0.0063669681549072266\n", - " \n", - "Result for train_breast_cancer_c28a3_00002:\n", - " date: 2022-07-22_16-56-01\n", - " done: true\n", - " eval-error: 0.04195804195804196\n", - " eval-logloss: 0.658545415301423\n", - " experiment_id: a3645fc2d43145d88a1f5b7cc94df703\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 2\n", - " node_ip: 127.0.0.1\n", - " pid: 54441\n", - " time_since_restore: 0.17307591438293457\n", - " time_this_iter_s: 0.1457080841064453\n", - " time_total_s: 0.17307591438293457\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 2\n", - " trial_id: c28a3_00002\n", - " warmup_time: 0.0062830448150634766\n", - " \n", - "Result for train_breast_cancer_c28a3_00003:\n", - " date: 2022-07-22_16-56-01\n", - " done: true\n", - " eval-error: 0.04195804195804196\n", - " eval-logloss: 0.38665050018083796\n", - " experiment_id: 7ff6133237404b4ea4755b9f8cd114f2\n", - " hostname: Kais-MacBook-Pro.local\n", - " iterations_since_restore: 10\n", - " node_ip: 127.0.0.1\n", - " pid: 54442\n", - " time_since_restore: 0.28186488151550293\n", - " time_this_iter_s: 0.03063178062438965\n", - " time_total_s: 0.28186488151550293\n", - " timestamp: 1658505361\n", - " timesteps_since_restore: 0\n", - " training_iteration: 10\n", - " trial_id: c28a3_00003\n", - " warmup_time: 0.006722211837768555\n", - " \n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "2022-07-22 16:56:01,498\tINFO tune.py:738 -- Total run time: 10.53 seconds (10.37 seconds for the tuning loop).\n" - ] - }, - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Best model parameters: {'objective': 'binary:logistic', 'eval_metric': ['logloss', 'error'], 'max_depth': 2, 'min_child_weight': 3, 'subsample': 0.782626252548841, 'eta': 0.06385952388342125}\n", - "Best model total accuracy: 0.9580\n" - ] - } - ], - "source": [ - "import sklearn.datasets\n", - "import sklearn.metrics\n", - "import os\n", - "from ray.tune.schedulers import ASHAScheduler\n", - "from sklearn.model_selection import train_test_split\n", - "import xgboost as xgb\n", - "\n", - "from ray import air, tune\n", - "from ray.air import session\n", - "from ray.tune.integration.xgboost import TuneReportCheckpointCallback\n", - "\n", - "\n", - "def train_breast_cancer(config: dict):\n", - " # This is a simple training function to 
be passed into Tune\n", - " # Load dataset\n", - " data, labels = sklearn.datasets.load_breast_cancer(return_X_y=True)\n", - " # Split into train and test set\n", - " train_x, test_x, train_y, test_y = train_test_split(data, labels, test_size=0.25)\n", - " # Build input matrices for XGBoost\n", - " train_set = xgb.DMatrix(train_x, label=train_y)\n", - " test_set = xgb.DMatrix(test_x, label=test_y)\n", - " # Train the classifier, using the Tune callback\n", - " xgb.train(\n", - " config,\n", - " train_set,\n", - " evals=[(test_set, \"eval\")],\n", - " verbose_eval=False,\n", - " callbacks=[TuneReportCheckpointCallback(filename=\"model.xgb\")],\n", - " )\n", - "\n", - "\n", - "def get_best_model_checkpoint(results):\n", - " best_bst = xgb.Booster()\n", - " best_result = results.get_best_result()\n", - "\n", - " with best_result.checkpoint.as_directory() as best_checkpoint_dir:\n", - " best_bst.load_model(os.path.join(best_checkpoint_dir, \"model.xgb\"))\n", - " accuracy = 1.0 - best_result.metrics[\"eval-error\"]\n", - " print(f\"Best model parameters: {best_result.config}\")\n", - " print(f\"Best model total accuracy: {accuracy:.4f}\")\n", - " return best_bst\n", - "\n", - "\n", - "def tune_xgboost(smoke_test=False):\n", - " search_space = {\n", - " # You can mix constants with search space objects.\n", - " \"objective\": \"binary:logistic\",\n", - " \"eval_metric\": [\"logloss\", \"error\"],\n", - " \"max_depth\": tune.randint(1, 9),\n", - " \"min_child_weight\": tune.choice([1, 2, 3]),\n", - " \"subsample\": tune.uniform(0.5, 1.0),\n", - " \"eta\": tune.loguniform(1e-4, 1e-1),\n", - " }\n", - " # This will enable aggressive early stopping of bad trials.\n", - " scheduler = ASHAScheduler(\n", - " max_t=10, grace_period=1, reduction_factor=2 # 10 training iterations\n", - " )\n", - "\n", - " tuner = tune.Tuner(\n", - " train_breast_cancer,\n", - " tune_config=tune.TuneConfig(\n", - " metric=\"eval-logloss\",\n", - " mode=\"min\",\n", - " scheduler=scheduler,\n", - " num_samples=1 if smoke_test else 10,\n", - " ),\n", - " param_space=search_space,\n", - " )\n", - " results = tuner.fit()\n", - "\n", - " return results\n", - "\n", - "\n", - "if __name__ == \"__main__\":\n", - " import argparse\n", - "\n", - " parser = argparse.ArgumentParser()\n", - " parser.add_argument(\n", - " \"--smoke-test\", action=\"store_true\", help=\"Finish quickly for testing\"\n", - " )\n", - " args, _ = parser.parse_known_args()\n", - "\n", - " results = tune_xgboost(smoke_test=args.smoke_test)\n", - "\n", - " # Load the best model checkpoint.\n", - " best_bst = get_best_model_checkpoint(results)\n", - "\n", - " # You could now do further predictions with\n", - " # best_bst.predict(...)\n" - ] - }, - { - "cell_type": "markdown", - "id": "20732fe4", - "metadata": {}, - "source": [ - "The output of our run could look like this:\n", - "\n", - "```{code-block} bash\n", - ":emphasize-lines: 7\n", - "\n", - " Number of trials: 10/10 (10 TERMINATED)\n", - " +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+--------+------------------+----------------+--------------+\n", - " | Trial name | status | loc | eta | max_depth | min_child_weight | subsample | iter | total time (s) | eval-logloss | eval-error |\n", - " |---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+--------+------------------+----------------+--------------|\n", - " | train_breast_cancer_ba275_00000 | TERMINATED | | 
0.00205087 | 2 | 1 | 0.898391 | 10 | 0.380619 | 0.678039 | 0.090909 |\n", - " | train_breast_cancer_ba275_00001 | TERMINATED | | 0.000183834 | 4 | 3 | 0.924939 | 1 | 0.0228798 | 0.693009 | 0.111888 |\n", - " | train_breast_cancer_ba275_00002 | TERMINATED | | 0.0242721 | 7 | 2 | 0.501551 | 10 | 0.376154 | 0.54472 | 0.06993 |\n", - " | train_breast_cancer_ba275_00003 | TERMINATED | | 0.000449692 | 5 | 3 | 0.890212 | 1 | 0.0234981 | 0.692811 | 0.090909 |\n", - " | train_breast_cancer_ba275_00004 | TERMINATED | | 0.000376393 | 7 | 2 | 0.883609 | 1 | 0.0231569 | 0.692847 | 0.062937 |\n", - " | train_breast_cancer_ba275_00005 | TERMINATED | | 0.00231942 | 3 | 3 | 0.877464 | 2 | 0.104867 | 0.689541 | 0.083916 |\n", - " | train_breast_cancer_ba275_00006 | TERMINATED | | 0.000542326 | 1 | 2 | 0.578584 | 1 | 0.0213971 | 0.692765 | 0.083916 |\n", - " | train_breast_cancer_ba275_00007 | TERMINATED | | 0.0016801 | 1 | 2 | 0.975302 | 1 | 0.02226 | 0.691999 | 0.083916 |\n", - " | train_breast_cancer_ba275_00008 | TERMINATED | | 0.000595756 | 8 | 3 | 0.58429 | 1 | 0.0221152 | 0.692657 | 0.06993 |\n", - " | train_breast_cancer_ba275_00009 | TERMINATED | | 0.000357845 | 8 | 1 | 0.637776 | 1 | 0.022635 | 0.692859 | 0.090909 |\n", - " +---------------------------------+------------+-------+-------------+-------------+--------------------+-------------+--------+------------------+----------------+--------------+\n", - "\n", - "\n", - " Best model parameters: {'objective': 'binary:logistic', 'eval_metric': ['logloss', 'error'], 'max_depth': 7, 'min_child_weight': 2, 'subsample': 0.5015513240240503, 'eta': 0.024272050872920895}\n", - " Best model total accuracy: 0.9301\n", - "```\n", - "\n", - "As you can see, most trials have been stopped only after a few iterations. Only the\n", - "two most promising trials were run for the full 10 iterations.\n", - "\n", - "You can also ensure that all available resources are being used as the scheduler\n", - "terminates trials, freeing them up. This can be done through the\n", - "`ResourceChangingScheduler`. An example of this can be found here:\n", - "{doc}`/tune/examples/includes/xgboost_dynamic_resources_example`.\n", - "\n", - "## Using fractional GPUs\n", - "\n", - "You can often accelerate your training by using GPUs in addition to CPUs. However,\n", - "you usually don't have as many GPUs as you have trials to run. For instance, if you\n", - "run 10 Tune trials in parallel, you usually don't have access to 10 separate GPUs.\n", - "\n", - "Tune supports *fractional GPUs*. This means that each task is assigned a fraction\n", - "of the GPU memory for training. 
 - "## Using fractional GPUs\n",
 - "\n",
 - "You can often accelerate your training by using GPUs in addition to CPUs. However,\n",
 - "you usually don't have as many GPUs as you have trials to run. For instance, if you\n",
 - "run 10 Tune trials in parallel, you usually don't have access to 10 separate GPUs.\n",
 - "\n",
 - "Tune supports *fractional GPUs*. This means that each task is assigned a fraction\n",
 - "of the GPU memory for training. For 10 tasks, this could look like this:"
 - ]
 - },
 - {
 - "cell_type": "code",
 - "execution_count": null,
 - "id": "7d1b20a3",
 - "metadata": {},
 - "outputs": [],
 - "source": [
 - "config = {\n",
 - "    \"objective\": \"binary:logistic\",\n",
 - "    \"eval_metric\": [\"logloss\", \"error\"],\n",
 - "    \"tree_method\": \"gpu_hist\",\n",
 - "    \"max_depth\": tune.randint(1, 9),\n",
 - "    \"min_child_weight\": tune.choice([1, 2, 3]),\n",
 - "    \"subsample\": tune.uniform(0.5, 1.0),\n",
 - "    \"eta\": tune.loguniform(1e-4, 1e-1),\n",
 - "}\n",
 - "\n",
 - "tuner = tune.Tuner(\n",
 - "    tune.with_resources(train_breast_cancer, resources={\"cpu\": 1, \"gpu\": 0.1}),\n",
 - "    tune_config=tune.TuneConfig(\n",
 - "        num_samples=10,\n",
 - "    ),\n",
 - "    param_space=config,\n",
 - ")\n",
 - "results = tuner.fit()\n"
 - ]
 - },
 - {
 - "cell_type": "markdown",
 - "id": "ee131861",
 - "metadata": {},
 - "source": [
 - "Each task thus works with 10% of the available GPU memory. You also have to tell\n",
 - "XGBoost to use the `gpu_hist` tree method, so it knows it should use the GPU.\n",
 - "\n",
 - "## Conclusion\n",
 - "\n",
 - "You should now have a basic understanding of how to train XGBoost models and how\n",
 - "to tune their hyperparameters to yield the best results. In our simple example,\n",
 - "tuning the parameters didn't make a huge difference in accuracy.\n",
 - "But in larger applications, intelligent hyperparameter tuning can make the\n",
 - "difference between a model that doesn't seem to learn at all, and a model\n",
 - "that outperforms all the others.\n",
 - "\n",
 - "## More XGBoost Examples\n",
 - "\n",
 - "- {doc}`/tune/examples/includes/xgboost_dynamic_resources_example`:\n",
 - "  Trains a basic XGBoost model with Tune using the class-based API and a ResourceChangingScheduler, ensuring all resources are used at all times.\n",
 - "\n",
 - "## Learn More\n",
 - "\n",
 - "- [XGBoost Hyperparameter Tuning - A Visual Guide](https://kevinvecmanis.io/machine%20learning/hyperparameter%20tuning/dataviz/python/2019/05/11/XGBoost-Tuning-Visual-Guide.html)\n",
 - "- [Notes on XGBoost Parameter Tuning](https://xgboost.readthedocs.io/en/latest/tutorials/param_tuning.html)\n",
 - "- [Doing XGBoost Hyperparameter Tuning the smart way](https://towardsdatascience.com/doing-xgboost-hyper-parameter-tuning-the-smart-way-part-1-of-2-f6d255a45dde)"
 - ]
 - }
 - ],
 - "metadata": {
 - "kernelspec": {
 - "display_name": "ray_dev_py38",
 - "language": "python",
 - "name": "python3"
 - },
 - "language_info": {
 - "codemirror_mode": {
 - "name": "ipython",
 - "version": 3
 - },
 - "file_extension": ".py",
 - "mimetype": "text/x-python",
 - "name": "python",
 - "nbconvert_exporter": "python",
 - "pygments_lexer": "ipython3",
 - "version": "3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:05:16) \n[Clang 12.0.1 ]"
 - },
 - "orphan": true,
 - "vscode": {
 - "interpreter": {
 - "hash": "265d195fda5292fe8f69c6e37c435a5634a1ed3b6799724e66a975f68fa21517"
 - }
 - }
 - },
 - "nbformat": 4,
 - "nbformat_minor": 5
 -}
 + "nbformat": 4,
 + "nbformat_minor": 5
 +}
\ No newline at end of file
diff --git a/doc/source/tune/getting-started.rst b/doc/source/tune/getting-started.rst
index 624fe86d5409..14e303b2dae3 100644
--- a/doc/source/tune/getting-started.rst
+++ b/doc/source/tune/getting-started.rst
@@ -73,7 +73,7 @@ make sure that the function is :ref:`serializable by Ray `.
:start-after: __train_func_begin__
:end-before: __train_func_end__
-Let's run one trial by calling :ref:`Tuner.fit ` and :ref:`randomly sample `
+Let's run one trial by calling :ref:`Tuner.fit ` and :ref:`randomly sample `
from a uniform distribution for learning rate and momentum.
.. literalinclude:: /../../python/ray/tune/tests/tutorial.py
diff --git a/doc/source/tune/key-concepts.rst b/doc/source/tune/key-concepts.rst
index 40944fe73a49..0606566d8444 100644
--- a/doc/source/tune/key-concepts.rst
+++ b/doc/source/tune/key-concepts.rst
@@ -81,10 +81,10 @@ how these values are sampled (e.g. from a uniform distribution or a
normal distribution). Tune offers various functions to define search spaces and
sampling methods.
-:ref:`You can find the documentation of these search space definitions here `.
+:ref:`You can find the documentation of these search space definitions here `.
Here's an example covering all search space functions. Again,
-:ref:`here is the full explanation of all these functions `.
+:ref:`here is the full explanation of all these functions `.
.. literalinclude:: doc_code/key_concepts.py
:language: python
diff --git a/doc/source/tune/tutorials/tune-search-spaces.rst b/doc/source/tune/tutorials/tune-search-spaces.rst
index 3ae9df9493cb..bb2f441b7ecd 100644
--- a/doc/source/tune/tutorials/tune-search-spaces.rst
+++ b/doc/source/tune/tutorials/tune-search-spaces.rst
@@ -16,7 +16,7 @@ Thereby, you can either use the ``tune.grid_search`` primitive to use grid searc
results = tuner.fit()
-Or you can use one of the random sampling primitives to specify distributions (:ref:`tune-sample-docs`):
+Or you can use one of the random sampling primitives to specify distributions (:doc:`/tune/api/search_space`):
.. code-block:: python
diff --git a/doc/source/tune/tutorials/tune_get_data_in_and_out.md b/doc/source/tune/tutorials/tune_get_data_in_and_out.md
index 2366e3d3b3dd..6585366114f2 100644
--- a/doc/source/tune/tutorials/tune_get_data_in_and_out.md
+++ b/doc/source/tune/tutorials/tune_get_data_in_and_out.md
@@ -62,7 +62,7 @@ Objects from the outer scope of the `training_function` will also be automatical
TL;DR - use the `param_space` argument to specify small, serializable constants and variables.
```
-The first way of passing inputs into Trainables is the [*search space*](tune-key-concepts-search-spaces) (it may also be called *parameter space* or *config*). In the Trainable itself, it maps to the `config` dict passed in as an argument to the function. You define the search space using the `param_space` argument of the `Tuner`. The search space is a dict and may be composed of [*distributions*](), which will sample a different value for each Trial, or of constant values. The search space may be composed of nested dictionaries, and those in turn can have distributions as well.
+The first way of passing inputs into Trainables is the [*search space*](tune-key-concepts-search-spaces) (it may also be called *parameter space* or *config*). In the Trainable itself, it maps to the `config` dict passed in as an argument to the function. You define the search space using the `param_space` argument of the `Tuner`. The search space is a dict and may be composed of [*distributions*](), which will sample a different value for each Trial, or of constant values. The search space may be composed of nested dictionaries, and those in turn can have distributions as well.
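For illustration, a minimal sketch of such a search space (assuming a `training_function` trainable as in this guide; the parameter names and values here are hypothetical):

```python
from ray import tune

tuner = tune.Tuner(
    training_function,
    param_space={
        # Constants are passed to every trial unchanged.
        "num_epochs": 10,
        # Distributions are sampled anew for each trial.
        "lr": tune.loguniform(1e-4, 1e-1),
        # Nested dicts can contain distributions as well.
        "model": {"hidden_size": tune.choice([32, 64, 128])},
    },
)
results = tuner.fit()
```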
```{warning}
Each value in the search space will be saved directly in the Trial metadata.
This means that every value in the search space **must** be serializable and take up a small amount of memory.
From 31ece63acc2afd006860573a24ce6e9b26409450 Mon Sep 17 00:00:00 2001
From: Justin Yu
Date: Thu, 9 Feb 2023 17:26:50 -0800
Subject: [PATCH 24/24] Revert sklearn.rst change

Signed-off-by: Justin Yu

---
 doc/source/tune/api/sklearn.rst | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/doc/source/tune/api/sklearn.rst b/doc/source/tune/api/sklearn.rst
index d34c0b6d2888..3bb733dc8187 100644
--- a/doc/source/tune/api/sklearn.rst
+++ b/doc/source/tune/api/sklearn.rst
@@ -10,27 +10,13 @@ TuneGridSearchCV

 .. currentmodule:: ray.tune.sklearn

-.. autosummary::
-    :toctree: doc/
-
-    TuneGridSearchCV
-    TuneGridSearchCV.fit
-    TuneGridSearchCV.score
-    TuneGridSearchCV.score_samples
-    TuneGridSearchCV.get_params
-    TuneGridSearchCV.set_params
+.. autoclass:: TuneGridSearchCV
+    :members:

 .. _tunesearchcv-docs:

 TuneSearchCV
 ------------

-.. autosummary::
-    :toctree: doc/
-
-    TuneSearchCV
-    TuneSearchCV.fit
-    TuneSearchCV.score
-    TuneSearchCV.score_samples
-    TuneSearchCV.get_params
-    TuneSearchCV.set_params
+.. autoclass:: TuneSearchCV
+    :members: