[MINOR][DOCS] JSON APIs related documentation fixes #17602
-`JSON Lines <http://jsonlines.org/>`_(newline-delimited JSON) is supported by default.
-For JSON (one record per file), set the `wholeFile` parameter to ``true``.
+`JSON Lines <http://jsonlines.org/>`_ (newline-delimited JSON) is supported by default.
+For JSON (one record per file), set the ``wholeFile`` parameter to ``true``.
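The difference between the two input layouts named in this docstring can be illustrated without Spark. The following is a minimal sketch using only Python's standard `json` module; `parse_json_lines` and `parse_whole_file` are hypothetical helper names introduced here, not Spark APIs, and the sketch does not claim to mirror how Spark's reader is implemented.

```python
import json


def parse_json_lines(text):
    """Parse newline-delimited JSON: one complete JSON value per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def parse_whole_file(text):
    """Parse text that holds a single JSON record, possibly spanning many lines."""
    return [json.loads(text)]


# JSON Lines input: every line is a self-contained record.
json_lines_text = '{"name": "Alice", "age": 30}\n{"name": "Bob", "age": 25}\n'

# "One record per file" input: a single record pretty-printed over several lines.
whole_file_text = '{\n  "name": "Alice",\n  "age": 30\n}\n'

records = parse_json_lines(json_lines_text)
single = parse_whole_file(whole_file_text)
```

Feeding the pretty-printed text to the line-by-line parser fails on its first line (`{` alone is not valid JSON), which is why a separate mode is needed for such files.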
-   * Loads a JSON file (<a href="http://jsonlines.org/">JSON Lines text format or
-   * newline-delimited JSON</a>) and returns the result as a `DataFrame`.
+   * Loads a JSON file and returns the results as a `DataFrame`.
    *
82aadaa to 9043f01
@@ -634,7 +634,9 @@ def saveAsTable(self, name, format=None, mode=None, partitionBy=None, **options)

     @since(1.4)
     def json(self, path, mode=None, compression=None, dateFormat=None, timestampFormat=None):
-        """Saves the content of the :class:`DataFrame` in JSON format at the specified path.
+        """Saves the content of the :class:`DataFrame` in JSON format
+        (`JSON Lines text format or newline-delimited JSON <http://jsonlines.org/>`_) at the
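To make the output format named in the new docstring concrete, here is a small sketch of what "JSON Lines text format" means on the writing side: each record is serialized as one compact JSON object on its own line. It uses only the Python standard library; `write_json_lines` is a hypothetical helper for illustration and is not part of `DataFrameWriter`.

```python
import io
import json


def write_json_lines(rows, out):
    """Write each row as one compact JSON object followed by a newline."""
    for row in rows:
        out.write(json.dumps(row, separators=(",", ":")))
        out.write("\n")


rows = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]

# An in-memory buffer stands in for the output file in this sketch.
buffer = io.StringIO()
write_json_lines(rows, buffer)
output = buffer.getvalue()
```

Because every line is a complete JSON value, the output can be split on newlines and parsed record by record.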
@@ -883,7 +883,7 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession

 <div data-lang="scala" markdown="1">
 Spark SQL can automatically infer the schema of a JSON dataset and load it as a `Dataset[Row]`.
-This conversion can be done using `SparkSession.read.json()` on either an RDD of String,
+This conversion can be done using `SparkSession.read.json()` on either a `Dataset[String]`,
@@ -897,7 +897,7 @@ For a regular multi-line JSON file, set the `wholeFile` option to `true`.

 <div data-lang="java" markdown="1">
 Spark SQL can automatically infer the schema of a JSON dataset and load it as a `Dataset<Row>`.
-This conversion can be done using `SparkSession.read().json()` on either an RDD of String,
+This conversion can be done using `SparkSession.read().json()` on either a `Dataset<String>`,
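The guide text in these hunks says a schema can be inferred from a collection of JSON strings. As a rough illustration of that idea only, the sketch below collects the field names and Python value types seen across a list of JSON strings. It is not Spark's inference algorithm; `infer_schema` is a hypothetical helper, and it assumes every string holds a top-level JSON object.

```python
import json


def infer_schema(json_strings):
    """Map each field name to the sorted Python type names observed for it."""
    schema = {}
    for text in json_strings:
        record = json.loads(text)  # assumption: each string is a JSON object
        for field, value in record.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return {field: sorted(types) for field, types in sorted(schema.items())}


# Records need not share the same fields; the result is the union of all fields.
lines = ['{"name": "Alice", "age": 30}', '{"name": "Bob", "city": "Seoul"}']
schema = infer_schema(lines)
```

A field that is missing from some records still appears in the result, which matches the intuition that the inferred schema must cover every record in the dataset.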
Test build #75689 has finished for PR 17602 at commit
Test build #75691 has finished for PR 17602 at commit
Test build #75690 has finished for PR 17602 at commit
Test build #75692 has finished for PR 17602 at commit
Merged to master.
Thank you @srowen.
## What changes were proposed in this pull request?

This PR proposes corrections related to JSON APIs as below:

- Rendering links in Python documentation
- Replacing `RDD` with `Dataset` in the programming guide
- Adding the missing description of JSON Lines consistently in `DataFrameReader.json` in the Python API
- De-duplicating a little bit of `DataFrameReader.json` in the Scala/Java API

## How was this patch tested?

Manually built the documentation via `jekyll build`. Corresponding snapshots will be left on the code.

Note that there are currently Javadoc8 breaks in several places. These are proposed to be handled in apache#17477, so this PR does not fix them.

Author: hyukjinkwon <[email protected]>

Closes apache#17602 from HyukjinKwon/minor-json-documentation.