refactor: Unify all json.(loads|dumps) usage to utils.json #28702

Merged
merged 2 commits into from
May 28, 2024

Conversation

eyalezer
Contributor

SUMMARY

Second phase of the json migration to use the new utils.json module

The first phase, which created the utils.json module (#28522), is complete. This second phase consolidates all json usage in the codebase and moves it onto the new module.

During this phase:

  • Refactored every place where json was used directly so that it goes through the json utils module.
  • Added and fixed tests so that they remain compatible with these changes.
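
To illustrate the idea of routing every call through one module, here is a minimal sketch of what such a wrapper could look like. It is not the actual contents of superset/utils/json.py: it wraps the standard library json rather than simplejson, and the `_nan_to_none` helper and the exact signatures are assumptions made for this example. Only the names `dumps`, `loads`, `json_iso_dttm_ser`, and `ignore_nan` come from the snippet quoted in this PR.

```python
import json as _stdlib_json  # assumption: a real wrapper might wrap simplejson instead
import math
from datetime import date, datetime
from typing import Any, Callable, Optional


def json_iso_dttm_ser(obj: Any) -> str:
    """Serialize dates and datetimes as ISO 8601 strings (illustrative only)."""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    raise TypeError(f"Unserializable object {obj!r} of type {type(obj).__name__}")


def _nan_to_none(value: Any) -> Any:
    """Hypothetical helper: recursively replace NaN/Infinity floats with None."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: _nan_to_none(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [_nan_to_none(item) for item in value]
    return value


def dumps(
    obj: Any,
    default: Optional[Callable[[Any], Any]] = None,
    ignore_nan: bool = True,
    **kwargs: Any,
) -> str:
    """Single entry point for serialization, so behavior can be changed in one place."""
    if ignore_nan:
        obj = _nan_to_none(obj)
    # allow_nan=False makes the stdlib raise instead of emitting the invalid token NaN.
    return _stdlib_json.dumps(obj, default=default, allow_nan=not ignore_nan, **kwargs)


def loads(text: str, **kwargs: Any) -> Any:
    """Single entry point for deserialization."""
    return _stdlib_json.loads(text, **kwargs)


payload = {"when": datetime(2024, 5, 28, 21, 48), "ratio": float("nan")}
print(dumps(payload, default=json_iso_dttm_ser))
```

With a wrapper like this, call sites import the project module instead of json, and a change such as different NaN handling only has to be made once.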

@eyalezer eyalezer requested a review from a team as a code owner May 24, 2024 16:15
@github-actions github-actions bot added risk:db-migration PRs that require a DB migration api Related to the REST API labels May 24, 2024
- json_utils.dumps(
-     payload, default=json_utils.json_iso_dttm_ser, ignore_nan=True
- ),
+ json.dumps(payload, default=json.json_iso_dttm_ser, ignore_nan=True),

Check warning

Code scanning / CodeQL

Information exposure through an exception Medium

Stack trace information
flows to this location and may be exposed to an external user.
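
For context on this class of warning, the usual mitigation is to log the exception server-side and return only a generic message to the client. The sketch below shows that pattern in isolation; `error_response_body` and the shape of the JSON body are hypothetical and are not taken from the Superset code that CodeQL flagged.

```python
import json
import logging
import uuid

logger = logging.getLogger(__name__)


def error_response_body(exc: Exception) -> str:
    """Hypothetical helper: build a JSON error body that leaks no exception details."""
    error_id = uuid.uuid4().hex
    # The full exception, including its traceback, goes to the server-side log only.
    logger.error("Unhandled error %s", error_id, exc_info=exc)
    # The client receives a generic message plus an id that can be matched to the log.
    return json.dumps({"error": "An internal error occurred.", "error_id": error_id})
```

The error id lets an operator correlate a user report with the logged traceback without exposing the traceback itself.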

codecov bot commented May 24, 2024

Codecov Report

Attention: Patch coverage is 68.04734%, with 54 lines in your changes missing coverage. Please review.

Project coverage is 83.47%. Comparing base (76d897e) to head (774d3d1).
Report is 233 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| superset/extensions/pylint.py | 0.00% | 23 Missing ⚠️ |
| superset/commands/dataset/export.py | 25.00% | 3 Missing ⚠️ |
| superset/commands/dashboard/export.py | 33.33% | 2 Missing ⚠️ |
| superset/views/chart/views.py | 33.33% | 2 Missing ⚠️ |
| superset/charts/data/api.py | 75.00% | 1 Missing ⚠️ |
| superset/commands/chart/export.py | 50.00% | 1 Missing ⚠️ |
| superset/commands/chart/importers/v1/utils.py | 50.00% | 1 Missing ⚠️ |
| superset/commands/database/export.py | 50.00% | 1 Missing ⚠️ |
| superset/commands/database/validate.py | 50.00% | 1 Missing ⚠️ |
| superset/commands/query/export.py | 50.00% | 1 Missing ⚠️ |
| ... and 18 more | | |
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #28702       +/-   ##
===========================================
+ Coverage   60.48%   83.47%   +22.98%     
===========================================
  Files        1931      523     -1408     
  Lines       76236    37575    -38661     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31365    -14749     
+ Misses      28017     6210    -21807     
+ Partials     2105        0     -2105     
| Flag | Coverage Δ | |
|---|---|---|
| hive | 49.01% <56.21%> (-0.16%) | ⬇️ |
| javascript | ? | |
| mysql | 77.12% <66.27%> (?) | |
| postgres | 77.23% <66.86%> (?) | |
| presto | 53.56% <58.57%> (-0.25%) | ⬇️ |
| python | 83.47% <68.04%> (+19.98%) | ⬆️ |
| sqlite | 76.68% <66.86%> (?) | |
| unit | 58.94% <53.84%> (+1.32%) | ⬆️ |


@eyalezer eyalezer force-pushed the json_utils branch 2 times, most recently from 85e6fab to f697f25 Compare May 24, 2024 19:55
@eyalezer
Contributor Author

eyalezer commented May 24, 2024

@mistercrunch - here's the second part of the refactor. As expected, this is a huge PR: 232 files changed.

@mistercrunch
Member

OMG glad to see this, happy to help fast-merge this since it'll conflict with everything else otherwise.

Oh, as a follow-up, or maybe something we want to bundle here: @betodealmeida mentioned that there's a fairly easy way to add a linting rule that prevents people from writing a plain import json and forces them to go through the wrappers in superset/utils/json.py.

@eyalezer
Contributor Author

I'll rebase it quickly so it doesn't pick up more conflicts.

Regarding the linter, it's a great idea, and it looks feasible, for example by adding a custom mypy plugin, but more research is needed here.

@betodealmeida
Member

@eyalezer I have this PR out, it's a pylint rule: #26803

We could do something similar for json.

@eyalezer
Contributor Author

@betodealmeida - Awesome, so it's even easier than I thought. I'll look into it now and test it.

@eyalezer
Contributor Author

@mistercrunch - Rebased before it's too late
@betodealmeida - Thanks for the reference

  • Added another commit with a pylint rule that flags any direct import of json or simplejson; tested and working as expected.
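
The kind of check such a rule performs can be sketched with the standard library alone. The example below uses `ast` as a stand-in for a pylint checker, so it is not the rule added in this PR; `BANNED_MODULES` and `find_banned_json_imports` are names invented for the illustration.

```python
import ast
from typing import List, Tuple

# Assumption for this sketch: these are the modules the rule disallows.
BANNED_MODULES = {"json", "simplejson"}


def find_banned_json_imports(source: str) -> List[Tuple[int, str]]:
    """Return (line number, module) for each direct import of a banned module."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Handles "import json" and "import simplejson as sj".
            for alias in node.names:
                if alias.name.split(".")[0] in BANNED_MODULES:
                    hits.append((node.lineno, alias.name))
        elif isinstance(node, ast.ImportFrom):
            # Handles "from json import dumps"; relative imports (level > 0) are skipped.
            # "from superset.utils import json" has module "superset.utils", so it passes.
            if node.level == 0 and node.module and node.module.split(".")[0] in BANNED_MODULES:
                hits.append((node.lineno, node.module))
    return sorted(hits)
```

A real pylint checker would hook the same two node types (imports and from-imports) and emit a message instead of collecting tuples.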

@mistercrunch
Member

Amazing, this is a massive refactor that should make everything JSON-related much more manageable. Interestingly, Python's standard library json module is essentially simplejson (see https://stackoverflow.com/questions/712791/what-are-the-differences-between-json-and-simplejson-python-modules), but the standalone simplejson package is typically ahead of the version bundled with Python. Also, having all JSON handling in one place lets us consider alternatives such as https://pypi.org/project/ujson/ and make changes like the one that triggered this refactor (improved UTF-8 support and error handling) centrally.

@mistercrunch mistercrunch merged commit 07b2449 into apache:master May 28, 2024
35 checks passed
@eyalezer eyalezer deleted the json_utils branch May 28, 2024 21:48
@eyalezer
Contributor Author

It's interesting that you brought it up. After I finished refactoring all the json.(loads|dumps) calls to use the utils.json module, one of the first things I did was check whether ujson actually provides any significant performance improvement for Superset. It seems to work fine, but I haven't tested it thoroughly enough to measure the improvement accurately. If anyone is interested in giving it a try, I still have the branch available here: https://github.com/eyalezer/superset/tree/ujson.
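
For anyone who wants a quick first measurement, a rough micro-benchmark along these lines might be a starting point. It is only a sketch: the payload is synthetic and not representative of Superset workloads, `time_round_trip` is a name made up here, and ujson is treated as an optional dependency that may not be installed.

```python
import json
import timeit

try:
    import ujson  # optional third-party package; skipped if unavailable
except ImportError:
    ujson = None


def time_round_trip(module, payload, number=200) -> float:
    """Seconds taken for `number` dumps+loads round trips using the given module."""
    return timeit.timeit(lambda: module.loads(module.dumps(payload)), number=number)


# Synthetic payload: an assumption for the sketch, not real query results.
sample = {"rows": [{"id": i, "name": f"row-{i}", "value": i * 0.5} for i in range(500)]}

print(f"json:  {time_round_trip(json, sample):.4f}s")
if ujson is not None:
    print(f"ujson: {time_round_trip(ujson, sample):.4f}s")
```

Numbers from a loop like this only indicate serialization cost in isolation; end-to-end request timings would be needed to judge the effect on Superset itself.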

@geido
Member

geido commented May 29, 2024

Hey @eyalezer this is great. If you haven't, feel free to join the Apache Superset Slack. I'd be happy to help if you wish to contribute more to the project!

Labels
api Related to the REST API risk:db-migration PRs that require a DB migration size/XXL

4 participants