[docs/air] Fix up some minor docstrings (ray-project#28361)
richardliaw authored and justinvyu committed Sep 14, 2022
1 parent 52036e7 commit d5db148
Showing 9 changed files with 68 additions and 67 deletions.
4 changes: 2 additions & 2 deletions doc/source/ray-overview/ray-libraries.rst
@@ -19,7 +19,7 @@ Dask |dask|

Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love. Dask uses existing Python APIs and data structures to make it easy to switch from Numpy, Pandas, and Scikit-learn to their Dask-powered equivalents.

[`Link to integration <../data/dask-on-ray.html>`__]
[:ref:`Link to integration <dask-on-ray>`]

Flambe |flambe|
---------------
@@ -74,7 +74,7 @@ MARS |mars|

Mars is a tensor-based unified framework for large-scale data computation which scales Numpy, Pandas and Scikit-learn. Mars can scale in to a single machine, and scale out to a cluster with thousands of machines.

[`Link to integration <../data/mars-on-ray.html>`__]
[:ref:`Link to integration <mars-on-ray>`]

Modin |modin|
-------------
4 changes: 2 additions & 2 deletions doc/source/rllib/rllib-training.rst
@@ -748,9 +748,9 @@ Here is an example of the basic usage (for a more complete example, see `custom_
.. note::

It's recommended that you run RLlib algorithms with :doc:`Tune <../tune/index>`, for easy experiment management and visualization of results. Just set ``"run": ALG_NAME, "env": ENV_NAME`` in the experiment config.
It's recommended that you run RLlib algorithms with :ref:`Ray Tune <tune-main>`, for easy experiment management and visualization of results. Just set ``"run": ALG_NAME, "env": ENV_NAME`` in the experiment config.

All RLlib algorithms are compatible with the :ref:`Tune API <tune-60-seconds>`. This enables them to be easily used in experiments with :doc:`Tune <../tune/index>`. For example, the following code performs a simple hyperparam sweep of PPO:
All RLlib algorithms are compatible with the :ref:`Tune API <tune-60-seconds>`. This enables them to be easily used in experiments with :ref:`Ray Tune <tune-main>`. For example, the following code performs a simple hyperparam sweep of PPO:

.. code-block:: python
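The sweep code above is collapsed in this view. A minimal sketch of such a PPO hyperparameter sweep, assuming the classic ``tune.run`` API (the environment, stopping criterion, and learning rates are illustrative):

.. code-block:: python

    from ray import tune

    # Grid-search the learning rate for PPO on CartPole and stop each trial
    # after a fixed number of training iterations (values are illustrative).
    tune.run(
        "PPO",
        stop={"training_iteration": 5},
        config={
            "env": "CartPole-v1",
            "num_workers": 1,
            "lr": tune.grid_search([0.01, 0.001, 0.0001]),
        },
    )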
2 changes: 1 addition & 1 deletion doc/source/tune/examples/tune-pytorch-lightning.ipynb
@@ -488,7 +488,7 @@
"id": "ca050dfa",
"metadata": {},
"source": [
"You can also specify {doc}`fractional GPUs for Tune <../../ray-core/tasks/using-ray-with-gpus>`,\n",
"You can also specify {ref}`fractional GPUs for Tune <tune-parallelism>`,\n",
"allowing multiple trials to share GPUs and thus increase concurrency under resource constraints.\n",
"While the `gpus_per_trial` passed into\n",
"Tune is a decimal value, the `gpus` passed into the `pl.Trainer` should still be an integer.\n",
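A minimal sketch of requesting fractional GPUs per trial, assuming the ``tune.run``/``resources_per_trial`` API (the training function and values are illustrative):

.. code-block:: python

    from ray import tune

    def train_fn(config):
        # Each trial is allotted half a GPU; the framework inside the trial
        # must still be configured to respect that memory budget.
        tune.report(loss=0.0)

    # Two trials can share one physical GPU concurrently.
    tune.run(train_fn, num_samples=2, resources_per_trial={"gpu": 0.5})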
5 changes: 4 additions & 1 deletion python/ray/air/checkpoint.py
@@ -91,7 +91,8 @@ class Checkpoint:
be used to create checkpoint objects
(e.g. ``Checkpoint.from_directory()``).
*Other implementation notes:*
**Other implementation notes:**
When converting between different checkpoint formats, it is guaranteed
that a full round trip of conversions (e.g. directory --> dict -->
obj ref --> directory) will recover the original checkpoint data.
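A minimal sketch of that round-trip guarantee (the directory path is illustrative):

.. code-block:: python

    from ray.air.checkpoint import Checkpoint

    original = Checkpoint.from_directory("/tmp/my_checkpoint")  # illustrative path
    as_dict = original.to_dict()              # directory -> dict
    restored = Checkpoint.from_dict(as_dict)  # dict -> checkpoint
    restored_dir = restored.to_directory()    # back to a directory with the same data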
@@ -488,6 +489,8 @@ def as_directory(self) -> Iterator[str]:
Example:
.. code-block:: python
with checkpoint.as_directory() as checkpoint_dir:
# Do some read-only processing of files within checkpoint_dir
pass
66 changes: 31 additions & 35 deletions python/ray/air/config.py
@@ -273,52 +273,48 @@ class DatasetConfig:
``datasets`` argument. Users have the opportunity to selectively override these
configs by passing the ``dataset_config`` argument. Trainers can also define user
customizable values (e.g., XGBoostTrainer doesn't support streaming ingest).
Args:
fit: Whether to fit preprocessors on this dataset. This can be set on at most
one dataset at a time. True by default for the "train" dataset only.
split: Whether the dataset should be split across multiple workers.
True by default for the "train" dataset only.
required: Whether to raise an error if the Dataset isn't provided by the user.
False by default.
transform: Whether to transform the dataset with the fitted preprocessor.
This must be enabled at least for the dataset that is fit.
True by default.
use_stream_api: Whether the dataset should be streamed into memory using
pipelined reads. When enabled, get_dataset_shard() returns DatasetPipeline
instead of Dataset. The amount of memory to use is controlled
by `stream_window_size`. False by default.
stream_window_size: Configure the streaming window size in bytes.
A good value is something like 20% of object store memory.
If set to -1, then an infinite window size will be used (similar to
bulk ingest). This only has an effect if use_stream_api is set.
Set to 1.0 GiB by default.
global_shuffle: Whether to enable global shuffle (per pipeline window
in streaming mode). Note that this is an expensive all-to-all operation,
and most likely you want to use local shuffle instead.
See https://docs.ray.io/en/master/data/faq.html and
https://docs.ray.io/en/master/ray-air/check-ingest.html.
False by default.
randomize_block_order: Whether to randomize the iteration order over blocks.
The main purpose of this is to prevent data fetching hotspots in the
cluster when running many parallel workers / trials on the same data.
We recommend enabling it always. True by default.
"""

# TODO(ekl) could we unify DataParallelTrainer and Trainer so the same data ingest
# strategy applies to all Trainers?

# Whether to fit preprocessors on this dataset. This can be set on at most one
# dataset at a time.
# True by default for the "train" dataset only.
fit: Optional[bool] = None

# Whether the dataset should be split across multiple workers.
# True by default for the "train" dataset only.
split: Optional[bool] = None

# Whether to raise an error if the Dataset isn't provided by the user.
# False by default.
required: Optional[bool] = None

# Whether to transform the dataset with the fitted preprocessor. This must be
# enabled at least for the dataset that is fit.
# True by default.
transform: Optional[bool] = None

# Whether the dataset should be streamed into memory using pipelined reads.
# When enabled, get_dataset_shard() returns DatasetPipeline instead of Dataset.
# The amount of memory to use is controlled by `stream_window_size`.
# False by default.
use_stream_api: Optional[bool] = None

# Configure the streaming window size in bytes. A good value is something like
# 20% of object store memory. If set to -1, then an infinite window size will be
# used (similar to bulk ingest). This only has an effect if use_stream_api is set.
# Set to 1.0 GiB by default.
stream_window_size: Optional[float] = None

# Whether to enable global shuffle (per pipeline window in streaming mode). Note
# that this is an expensive all-to-all operation, and most likely you want to use
# local shuffle instead. See https://docs.ray.io/en/master/data/faq.html and
# https://docs.ray.io/en/master/air/check-ingest.html.
# False by default.
global_shuffle: Optional[bool] = None

# Whether to randomize the iteration order over blocks. The main purpose of this
# is to prevent data fetching hotspots in the cluster when running many parallel
# workers / trials on the same data. We recommend enabling it always.
# True by default.
randomize_block_order: Optional[bool] = None

def __repr__(self):
@@ -353,7 +349,7 @@ def merge(
"""Merge two given DatasetConfigs, the second taking precedence.
Raises:
ValueError if validation fails on the merged configs.
ValueError: if validation fails on the merged configs.
"""
has_wildcard = WILDCARD_KEY in a
result = a.copy()
8 changes: 4 additions & 4 deletions python/ray/train/base_trainer.py
@@ -41,10 +41,10 @@ class BaseTrainer(abc.ABC):
Note: The base ``BaseTrainer`` class cannot be instantiated directly. Only
one of its subclasses can be used.
How does a trainer work?
**How does a trainer work?**
- First, initialize the Trainer. The initialization runs locally,
so heavyweight setup should not be done in __init__.
so heavyweight setup should not be done in ``__init__``.
- Then, when you call ``trainer.fit()``, the Trainer is serialized
and copied to a remote Ray actor. The following methods are then
called in sequence on the remote actor.
@@ -301,7 +301,7 @@ def preprocess_datasets(self) -> None:
def training_loop(self) -> None:
"""Loop called by fit() to run training and report results to Tune.
Note: this method runs on a remote process.
.. note:: This method runs on a remote process.
``self.datasets`` have already been preprocessed by ``self.preprocessor``.
@@ -311,7 +311,7 @@ def training_loop(self) -> None:
Example:
.. code-block: python
.. code-block:: python
from ray.train.trainer import BaseTrainer
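The docstring example above is collapsed in this view; a minimal sketch of a custom trainer, assuming the Ray 2.0 ``BaseTrainer`` and ``session.report`` APIs (the reported metric is illustrative):

.. code-block:: python

    from ray.air import session
    from ray.train.trainer import BaseTrainer

    class MyTrainer(BaseTrainer):
        def training_loop(self):
            # Runs on a remote actor; self.datasets have already been
            # preprocessed by self.preprocessor at this point.
            for i in range(3):
                session.report({"score": i})

    result = MyTrainer().fit()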
4 changes: 2 additions & 2 deletions python/ray/train/gbdt_trainer.py
@@ -68,9 +68,9 @@ def _convert_scaling_config_to_ray_params(

@DeveloperAPI
class GBDTTrainer(BaseTrainer):
"""Common logic for gradient-boosting decision tree (GBDT) frameworks
like XGBoost-Ray and LightGBM-Ray.
"""Abstract class for scaling gradient-boosting decision tree (GBDT) frameworks.
Inherited by XGBoostTrainer and LightGBMTrainer.
Args:
datasets: Ray Datasets to use for training and validation. Must include a
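A minimal sketch of one concrete subclass in use, assuming the Ray 2.0 ``XGBoostTrainer`` API (the toy dataset and parameters are illustrative):

.. code-block:: python

    import ray
    from ray.air.config import ScalingConfig
    from ray.train.xgboost import XGBoostTrainer

    train_ds = ray.data.from_items([{"x": float(i), "y": i % 2} for i in range(32)])
    trainer = XGBoostTrainer(
        scaling_config=ScalingConfig(num_workers=2),
        label_column="y",
        params={"objective": "binary:logistic"},
        datasets={"train": train_ds},
    )
    result = trainer.fit()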
39 changes: 20 additions & 19 deletions python/ray/train/predictor.py
@@ -40,8 +40,9 @@ class PredictorNotSerializableException(RuntimeError):
class Predictor(abc.ABC):
"""Predictors load models from checkpoints to perform inference.
Note: The base ``Predictor`` class cannot be instantiated directly. Only one of
its subclasses can be used.
.. note::
The base ``Predictor`` class cannot be instantiated directly. Only one of
its subclasses can be used.
**How does a Predictor work?**
@@ -50,27 +51,27 @@ class Predictor(abc.ABC):
When the ``predict`` method is called the following occurs:
- The input batch is converted into a pandas DataFrame. Tensor input (like a
``np.ndarray``) will be converted into a single column Pandas Dataframe.
- If there is a :ref:`Preprocessor <air-preprocessor-ref>` saved in the provided
:ref:`Checkpoint <air-checkpoint-ref>`, the preprocessor will be used to
transform the DataFrame.
- The transformed DataFrame will be passed to the model for inference (via the
``predictor._predict_pandas`` method).
- The predictions will be outputted by ``predict`` in the same type as the
original input.
- The input batch is converted into a pandas DataFrame. Tensor input (like a
``np.ndarray``) will be converted into a single column Pandas Dataframe.
- If there is a :ref:`Preprocessor <air-preprocessor-ref>` saved in the provided
:ref:`Checkpoint <air-checkpoint-ref>`, the preprocessor will be used to
transform the DataFrame.
- The transformed DataFrame will be passed to the model for inference (via the
``predictor._predict_pandas`` method).
- The predictions will be outputted by ``predict`` in the same type as the
original input.
**How do I create a new Predictor?**
To implement a new Predictor for your particular framework, you should subclass
the base ``Predictor`` and implement the following two methods:
1. ``_predict_pandas``: Given a pandas.DataFrame input, return a
pandas.DataFrame containing predictions.
2. ``from_checkpoint``: Logic for creating a Predictor from an
:ref:`AIR Checkpoint <air-checkpoint-ref>`.
3. Optionally ``_predict_arrow`` for better performance when working with
tensor data to avoid extra copies from Pandas conversions.
1. ``_predict_pandas``: Given a pandas.DataFrame input, return a
pandas.DataFrame containing predictions.
2. ``from_checkpoint``: Logic for creating a Predictor from an
:ref:`AIR Checkpoint <air-checkpoint-ref>`.
3. Optionally ``_predict_arrow`` for better performance when working with
tensor data to avoid extra copies from Pandas conversions.
"""

def __init__(self, preprocessor: Optional[Preprocessor] = None):
Expand Down Expand Up @@ -141,8 +142,8 @@ def predict(self, data: DataBatchType, **kwargs) -> DataBatchType:
directly to ``_predict_pandas``.
Returns:
DataBatchType: Prediction result. The return type will be the same as the
input type.
DataBatchType:
Prediction result. The return type will be the same as the input type.
"""
data_df = convert_batch_type_to_pandas(data, self._cast_tensor_columns)

3 changes: 2 additions & 1 deletion python/ray/tune/tuner.py
@@ -226,7 +226,8 @@ def fit(self) -> ResultGrid:
to resume.
Raises:
RayTaskError when the exception happens in trainable else TuneError.
RayTaskError: If the user-provided trainable raises an exception.
TuneError: General Ray Tune error.
"""

if not self._is_ray_client:
