-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix broken JSON encoding #729
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and tests pass locally. I made some nitpicking suggestions, but I don't insist on any of them :).
test/util/test_jsonencoder.py
Outdated
"np_bool": np.bool(True), | ||
"np_int8": np.int8(1), | ||
"np_uint8": np.uint8(2), | ||
"np_int16": np.int16(3), | ||
"np_uint16": np.uint8(4), | ||
"np_int32": np.int32(5), | ||
"np_uint32": np.uint32(6), | ||
"np_int64": np.int64(7), | ||
"np_uint64": np.uint64(8), | ||
"np_float32": np.float32(9.1), | ||
"np_float64": np.float64(9.2), | ||
"py_bool": True, | ||
"py_int": 11, | ||
"py_float": 12.3, | ||
"py_str": "Hallo", | ||
"py_null": None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quotation marks should be single per current xcube dev guide (or use dict(...)
to avoid quotes entirely in the keys).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quotation marks should be single
Right.
test/util/test_jsonencoder.py
Outdated
"np_bool": bool(np.bool(True)), | ||
"np_int8": int(np.int8(1)), | ||
"np_uint8": int(np.uint8(2)), | ||
"np_int16": int(np.int16(3)), | ||
"np_uint16": int(np.uint8(4)), | ||
"np_int32": int(np.int32(5)), | ||
"np_uint32": int(np.uint32(6)), | ||
"np_int64": int(np.int64(7)), | ||
"np_uint64": int(np.uint64(8)), | ||
"np_float32": float(np.float32(9.1)), | ||
"np_float64": float(np.float64(9.2)), | ||
"py_bool": True, | ||
"py_int": 11, | ||
"py_float": 12.3, | ||
"py_str": "Hallo", | ||
"py_null": None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, single quote marks / dict(...)
preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.
xcube/util/jsonencoder.py
Outdated
|
||
def default(self, obj): | ||
if hasattr(obj, 'dtype') and hasattr(obj, 'ndim'): | ||
if obj.ndim == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to chain this condition onto the if
above with another and
rather than adding another indentation level. Python guarantees short-circuit evaluation in this case, so we're safe from AttributeError
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - see my code comment.
xcube/util/jsonencoder.py
Outdated
if np.issubdtype(obj.dtype, np.bool): | ||
return bool(obj) | ||
if np.issubdtype(obj.dtype, np.integer): | ||
return int(obj) | ||
elif np.issubdtype(obj.dtype, np.floating): | ||
return float(obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can see, if
and elif
are functionally equivalent for these conditions, since they're mutually exclusive and every clause exits the function. I don't have a stylistic preference for one or the other, but I would prefer them all to be the same :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.
xcube/util/jsonencoder.py
Outdated
elif np.issubdtype(obj.dtype, np.floating): | ||
return float(obj) | ||
else: | ||
return str(obj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure which types are covered by the else
case, but I guess it can't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just discovered that my implementation is possibly wrong because JSONEncoder.default(self, obj)
raises TypeError
by default. Adding extra test...
FYI @TonioF, I need to merge today |
Codecov ReportBase: 92.45% // Head: 92.47% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #729 +/- ##
==========================================
+ Coverage 92.45% 92.47% +0.01%
==========================================
Files 321 323 +2
Lines 30763 30881 +118
==========================================
+ Hits 28443 28556 +113
- Misses 2320 2325 +5
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests pass, and code makes sense to me.
My one requested change just involves adding a TODO comment for a possible useful expansion, so ignore it if you're in a hurry.
xcube/util/jsonencoder.py
Outdated
converted_obj = {k: to_json_value(v) for k, v in obj.items()} | ||
if any(converted_obj[k] is not obj[k] for k in obj.keys()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keys can be non-serializable too, so converted_obj = {to_json_value(k): to_json_value(v)...
would be useful. And then of course we'd also need to check whether the keys are identical in the if
statement on the next line.
No need to implement now if it's working for our current needs, but a TODO comment might be appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
converted_obj = {to_json_value(k): to_json_value(v)...
is wrong because keys must be strings in JSON. I agree to change that, but later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review revised to "Approve" after discussion in chat.
xarray >= 2022.6
andzarr >= 2.11
to ensure sparse Zarr datasets can be written usingdataset.to_zarr(store)
. (xcube to write sparse zarrs #688)xcube.util.jsonencoder
that offers the classNumpyJSONEncoder
used to serialize numpy-like scalar values to JSON. It also offers the functionto_json_value()
to convert Python objects into JSON-serializable versions. The new functionality is required to ensure dataset attributes that are JSON-serializable. For example, the latest version of therioxarray
package generates a_FillValue
attribute with datatypenp.uint8
.Closes #688.
EDIT
It is likely that other places where we
json.dump()
are affected too! In particular, where numeric metadata attributes are serialized such as_FillValue
,add_offset
,valid_max
, etc.Checklist:
New/modified features documented indocs/source/*
CHANGES.md