[Data] Add docstring to explain Dataset.deserialize_lineage (#47203)
Users had questions about how serialization and deserialization work
under the hood for Dataset. This change adds a brief note to the
docstrings explaining that pickle is used.

Signed-off-by: Cheng Su <[email protected]>
Co-authored-by: Scott Lee <[email protected]>
c21 and scottjlee authored Aug 21, 2024
1 parent b433de7 commit c01d524
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions python/ray/data/dataset.py
@@ -4777,8 +4777,8 @@ def serialize_lineage(self) -> bytes:
  futures, to bytes that can be stored and later deserialized, possibly on a
  different cluster.
- Note that this will drop all computed data, and that everything is
- recomputed from scratch after deserialization.
+ Note that this uses pickle and will drop all computed data, and that everything
+ is recomputed from scratch after deserialization.
  Use :py:meth:`Dataset.deserialize_lineage` to deserialize the serialized
  bytes returned from this method into a Dataset.
@@ -4866,8 +4866,8 @@ def deserialize_lineage(serialized_ds: bytes) -> "Dataset":
  """
  Deserialize the provided lineage-serialized Dataset.
- This assumes that the provided serialized bytes were serialized using
- :py:meth:`Dataset.serialize_lineage`.
+ This uses pickle, and assumes that the provided serialized bytes were
+ serialized using :py:meth:`Dataset.serialize_lineage`.
  Examples:
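
The updated docstrings say that lineage serialization uses pickle and drops all computed data, so results are recomputed after deserialization. The sketch below illustrates that idea with the standard library only. It is not Ray's implementation: `LazyPipeline`, `apply`, `take_all`, and the `_OPS` table are hypothetical names invented for this example, and only the method names `serialize_lineage` and `deserialize_lineage` mirror the ones in the diff.

```python
import pickle

# Hypothetical operation table (not part of Ray). Storing operations as plain
# (name, argument) tuples keeps the lineage trivially picklable.
_OPS = {
    "add": lambda row, arg: row + arg,
    "mul": lambda row, arg: row * arg,
}


class LazyPipeline:
    """Hypothetical stand-in for a lazily evaluated dataset."""

    def __init__(self, source, ops=None):
        self.source = list(source)   # input rows
        self.ops = list(ops or [])   # the lineage: how to derive the output
        self.cache = None            # computed data, never serialized

    def apply(self, name, arg):
        # Record the operation instead of running it.
        return LazyPipeline(self.source, self.ops + [(name, arg)])

    def take_all(self):
        # Compute on first use, then reuse the cached result.
        if self.cache is None:
            rows = self.source
            for name, arg in self.ops:
                rows = [_OPS[name](row, arg) for row in rows]
            self.cache = rows
        return self.cache

    def serialize_lineage(self):
        # Pickle only the recipe; the cache is deliberately left out.
        return pickle.dumps({"source": self.source, "ops": self.ops})

    @staticmethod
    def deserialize_lineage(blob):
        # Rebuild the pipeline with an empty cache.
        state = pickle.loads(blob)
        return LazyPipeline(state["source"], state["ops"])


pipeline = LazyPipeline(range(3)).apply("add", 1).apply("mul", 10)
computed = pipeline.take_all()                      # runs the operations
blob = pipeline.serialize_lineage()                 # bytes, no computed rows
restored = LazyPipeline.deserialize_lineage(blob)   # cache is empty again
recomputed = restored.take_all()                    # recomputed from scratch
```

Because `pickle.loads` can execute arbitrary code while unpickling, bytes handled this way should only be deserialized when they come from a trusted source.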
