-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[docs] Document using a different separator for read_csv #27850
Conversation
2b4353a
to
cce97d4
Compare
>>> # because by default read_csv only reads .csv files. | ||
>>> from pyarrow import csv | ||
>>> parse_options = csv.ParseOptions(delimiter="\\t") | ||
>>> ray.data.read_csv( # doctest: +SKIP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be skipped at the moment which is very very sad -- tracked in #27853
Signed-off-by: Philipp Moritz <[email protected]>
Signed-off-by: Philipp Moritz <[email protected]>
Signed-off-by: Philipp Moritz <[email protected]>
Signed-off-by: Philipp Moritz <[email protected]>
3e7f01c
to
1945998
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@c21 take a look?
Co-authored-by: matthewdeng <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pcmoritz! Looks good except several minor comments.
python/ray/data/read_api.py
Outdated
>>> # Read files that use a different delimiter. The partition_filter=None is needed here | ||
>>> # because by default read_csv only reads .csv files. | ||
>>> from pyarrow import csv | ||
>>> parse_options = csv.ParseOptions(delimiter="\\t") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be delimiter="\t"
right? I tested it out in #27738 (comment) .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that before and then it didn't render correctly in the docs, let me try again :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so the right solution that makes this work in all cases (both raw docstring and readthedocs) seems to be to put an r
before the doc string and not escape this, which I did now :)
@@ -0,0 +1,150 @@ | |||
5.1 3.5 1.4 0.2 setosa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to define the schema at the first line:
sepal.length sepal.width petal.length petal.width variety
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I fixed that now :)
6.3 2.5 5.0 1.9 virginica | ||
6.5 3.0 5.2 2.0 virginica | ||
6.2 3.4 5.4 2.3 virginica | ||
5.9 3.0 5.1 1.8 virginica |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: one new line at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Thanks for all the feedback, this should be ready to merge now, @c21 can you approve when you get a chance? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @pcmoritz, LGTM!
…#27850) See discussion in ray-project#27738 Co-authored-by: matthewdeng <[email protected]> Signed-off-by: Zhi Lin <[email protected]>
…#27850) See discussion in ray-project#27738 Co-authored-by: matthewdeng <[email protected]> Signed-off-by: ilee300a <[email protected]>
Why are these changes needed?
See discussion in #27738
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.