[Python] pyarrow.json.read_json when read indent json file with report error #40912

Open
FlyTOmeLight opened this issue Mar 30, 2024 · 1 comment

Comments

FlyTOmeLight commented Mar 30, 2024

Describe the bug, including details regarding any error messages, version, and platform.

pyarrow version: 14.0.2

import pyarrow.json as pajson
pajson.read_json("indent.json")

    pajson.read_json("indent.json")
  File "pyarrow/_json.pyx", line 308, in pyarrow._json.read_json
  File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: JSON parse error: Column() changed from object to string in row 0

I write indent.json with json.dump(raw_data, fp, ensure_ascii=False, indent=4).
When I then read it with pajson.read_json, the error above is reported. I would like to know how to fix it.
Here is the JSON file that fails:
wrong.json

Component(s)

Python

martsec commented Apr 10, 2024

As far as I am aware, Arrow only supports reading line-delimited JSON files (see docs and note).

That said, there seem to be a couple of options that could help with reading your JSON: https://arrow.apache.org/docs/python/generated/pyarrow.json.ParseOptions.html#pyarrow.json.ParseOptions

newlines_in_values : bool, optional (default False)

Whether objects may be printed across multiple lines (for example pretty printed). If false, input must end with an empty line.
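
The two routes mentioned above can be sketched roughly as follows. This is a minimal sketch, not a confirmed fix: it assumes the file holds either a single JSON object or a list of objects (the layout of wrong.json is not shown in this thread), and the helper names to_json_lines and read_with_arrow are made up for illustration. The first helper uses only the standard library to rewrite the file as line-delimited JSON. The second passes newlines_in_values to pyarrow.json.ParseOptions, which may be enough on its own if the file is a sequence of pretty-printed objects rather than one top-level array.

```python
import json


def to_json_lines(src_path, dst_path):
    """Rewrite a JSON file (one object or a list of objects) as line-delimited JSON.

    Hypothetical helper: returns the number of records written.
    """
    with open(src_path, encoding="utf-8") as src:
        data = json.load(src)
    # Assumption: a top-level list means "one record per element".
    records = data if isinstance(data, list) else [data]
    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            # json.dumps without indent keeps each record on a single line.
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)


def read_with_arrow(path, multiline=False):
    """Read a JSON file with pyarrow, optionally allowing multi-line objects."""
    # Imported lazily so to_json_lines works even where pyarrow is not installed.
    import pyarrow.json as pajson

    options = pajson.ParseOptions(newlines_in_values=multiline)
    return pajson.read_json(path, parse_options=options)
```

Usage would look something like to_json_lines("indent.json", "lines.json") followed by read_with_arrow("lines.json"), or read_with_arrow("indent.json", multiline=True) to try the parse option directly. Alternatively, dropping indent=4 from json.dump and writing one object per line avoids the conversion step altogether.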

kou changed the title from "[Python]pyarrow.json.read_json when read indent json file with report error" to "[Python] pyarrow.json.read_json when read indent json file with report error" on Apr 10, 2024