Commit

[Datasets] Raise error message if user calls Dataset.__iter__ (ray-project#30575)

New users might try `for item in dataset` and get confused when they receive the default error message. This PR adds a more descriptive error that points users towards Dataset.take or Dataset.map_batches.

Signed-off-by: tmynn <[email protected]>
bveeramani authored and tamohannes committed Jan 25, 2023
1 parent 81a1032 commit 133e4d1
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions python/ray/data/dataset.py
@@ -4206,6 +4206,13 @@ def __len__(self) -> int:
             "This may be an expensive operation."
         )

+    def __iter__(self):
+        raise TypeError(
+            "`Dataset` objects aren't iterable. To iterate records, call "
+            "`ds.iter_rows()` or `ds.iter_batches()`. For more information, read "
+            "https://docs.ray.io/en/latest/data/consuming-datasets.html."
+        )
+
     def _block_num_rows(self) -> List[int]:
         get_num_rows = cached_remote_fn(_get_num_rows)
         return ray.get([get_num_rows.remote(b) for b in self.get_internal_block_refs()])
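
The effect of the added `__iter__` guard can be sketched without Ray. The classes below (`PlainDataset`, `GuardedDataset`) and the helper `iteration_error` are hypothetical stand-ins, not part of Ray's API; they only illustrate how a class that defines `__iter__` to raise `TypeError` replaces Python's generic "object is not iterable" message with one that names an explicit iteration method.

```python
class PlainDataset:
    """Hypothetical stand-in with no __iter__, to show Python's default error."""

    def __init__(self, rows):
        self._rows = list(rows)


class GuardedDataset(PlainDataset):
    """Hypothetical stand-in that mirrors the guard added in this commit."""

    def __iter__(self):
        # Raise a descriptive error instead of relying on the default message.
        raise TypeError(
            "`GuardedDataset` objects aren't iterable. To iterate records, "
            "call `ds.iter_rows()`."
        )

    def iter_rows(self):
        # Explicit iteration entry point (illustrative, not Ray's implementation).
        return iter(self._rows)


def iteration_error(ds):
    """Return the TypeError message raised by `for item in ds`, or None."""
    try:
        for _ in ds:
            pass
    except TypeError as exc:
        return str(exc)
    return None


default_msg = iteration_error(PlainDataset([1, 2, 3]))
guarded_msg = iteration_error(GuardedDataset([1, 2, 3]))
rows = list(GuardedDataset([1, 2, 3]).iter_rows())
```

In this sketch, `default_msg` holds the interpreter's generic message, while `guarded_msg` points the caller at `iter_rows()`; iteration through the explicit method still works as usual.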