Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sqllab): Force trino client async execution #24859

Merged
merged 1 commit into from
Sep 6, 2023

Conversation

giftig
Copy link
Contributor

@giftig giftig commented Aug 1, 2023

SUMMARY

We are currently unable to stop trino queries, because the underlying trino client blocks until the query completes, and doesn't make the query ID or any other info available in the meantime. Unfortunately it doesn't look like they plan to change that any time soon, either.

Make the following changes:

  • Add a new method execute_with_cursor to db_engine_spec which combines execute with handle_cursor, factoring it out of the one place it's used, deep in the execute query logic.
  • Make handle_cursor poll the cursor for query ID, as it's going to be populated asynchronously. Add warnings that the trino impl will require using execute_with_cursor. Currently nothing is directly calling handle_cursor, with the one original call eliminated.
  • Override execute_with_cursor for the trino engine and execute the two tasks in parallel to allow us to poll for the query ID while the query is still blocking.

TESTING INSTRUCTIONS

Test that trino queries can be stopped in SQL Lab, that database errors from trino are still reported correctly, and that queries in other engines still run / stop correctly.

ADDITIONAL INFORMATION

  • Has associated issue: Trino queries cannot be stopped in SQL Lab #24858
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov
Copy link

codecov bot commented Aug 1, 2023

Codecov Report

Merging #24859 (b87caf6) into master (7f9b038) will decrease coverage by 0.08%.
Report is 3 commits behind head on master.
The diff coverage is 83.72%.

❗ Current head b87caf6 differs from pull request most recent head 4919b58. Consider uploading reports for the commit 4919b58 to get more accurate results

@@            Coverage Diff             @@
##           master   #24859      +/-   ##
==========================================
- Coverage   69.00%   68.93%   -0.08%     
==========================================
  Files        1904     1904              
  Lines       74105    74136      +31     
  Branches     8193     8194       +1     
==========================================
- Hits        51137    51105      -32     
- Misses      20847    20912      +65     
+ Partials     2121     2119       -2     
Flag Coverage Δ
hive 54.11% <32.43%> (-0.03%) ⬇️
mysql 79.17% <32.43%> (-0.05%) ⬇️
postgres ?
presto 54.01% <32.43%> (-0.03%) ⬇️
python 83.20% <83.78%> (-0.17%) ⬇️
sqlite ?
unit 55.02% <83.78%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...ins/legacy-preset-chart-nvd3/src/transformProps.js 7.14% <0.00%> (-0.27%) ⬇️
superset/db_engine_specs/trino.py 82.55% <79.31%> (-1.19%) ⬇️
...dashboard/components/AddSliceCard/AddSliceCard.tsx 68.88% <100.00%> (+5.25%) ⬆️
superset/db_engine_specs/base.py 89.40% <100.00%> (-1.62%) ⬇️
superset/sql_lab.py 86.25% <100.00%> (-0.16%) ⬇️

... and 17 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@giftig
Copy link
Contributor Author

giftig commented Aug 1, 2023

Tests are passing with a small modification to the way the mock is used, and I've just pushed a fix for the small formatting issue black corrected in the tests I modified, so build should now pass.

I'm looking at adding a unit test for this scenario in test_trino.py but the test scenario here will be quite fiddly / timing-based because I need to call execute_with_cursor and then prove that query_id has been found and populated prior to that call returning, which means my test will need a threaded solution similar to what I've done in the actual code, and my execute mock call will have to wait for an arbitrary period of time; there's potential to slow down tests and introduce flakiness based on timing issues with that approach, so I'm not keen.

Existing test coverage covers trino's handle_cursor, along with the new call to handle_with_cursor in sql_lab, but there's no coverage for trino's handle_with_cursor specifically. Perhaps simpler coverage of the trino method will suffice; i.e. make sure that we at least make the expected calls on and fetch the appropriate query_id from the cursor eventually, and not worrying about the timing of that. That won't fully test the scenario I've fixed here, but will at least provide basic coverage of the method.

@giftig giftig force-pushed the issue-24585 branch 2 times, most recently from 8463d02 to ea2fef9 Compare August 1, 2023 15:02
@giftig
Copy link
Contributor Author

giftig commented Aug 1, 2023

Added a basic coverage unit test as described above.

@john-bodley
Copy link
Member

@nytai @villebro would you mind reviewing this? I know @villebro was working with the cancel query logic in the past and I suspect this was previously tested with Trino—though I may be wrong.

@john-bodley john-bodley requested a review from nytai August 1, 2023 16:52
We are currently unable to stop trino queries, because the underlying
trino client blocks until the query completes, and doesn't make the
query ID or any other info available in the meantime. Unfortunately it
doesn't look like they plan to change that any time soon, either.

Make the following changes:
  - Add a new method execute_with_cursor to db_engine_spec which
    combines execute with handle_cursor, factoring it out of the one
    place it's used, deep in the execute query logic.
  - Make handle_cursor poll the cursor for query ID, as it's going to
    be populated asynchronously. Add warnings that the trino impl will
    require using execute_with_cursor. Currently nothing is directly
    calling handle_cursor, with the one original call eliminated.
  - Override execute_with_cursor for the trino engine and execute the
    two tasks in parallel to allow us to poll for the query ID
    while the query is still blocking.
@dungdm93
Copy link
Contributor

dungdm93 commented Aug 2, 2023

After discuss with @giftig and find out this issues is caused by
https://github.com/trinodb/trino-python-client/blob/d6047a80c1a772cb8240f4a81df810b0d48af4df/trino/client.py#L819-L821

I'm gonna ask Trino community about this to figure out it's should be fix in Trino or Superset side.

@giftig
Copy link
Contributor Author

giftig commented Aug 3, 2023

To share the feedback from the trino team via @dungdm93:

Because before this change lot of people didn't use the API correctly and assumed that DDL queries for example didn't need fetch methods to be called.
One option is to have a thread pool within the DB-API client itself on which the queries execute so that Cursor.execute can return eagerly but that's not something we have considered since no one has complained about the change until now.
Also it was an easy solution to fix a bug we had about not implementing protocol correctly - see trinodb/trino-python-client#220

So it doesn't sound like there are immediate plans to fix the issue in the client, though there's a suggestion how it might be done.

Currently the lack of that feature in the client causes a bug in superset so imo the way forward it to merge this PR to fix the issue and to follow up and try to get a more direct fix merged into the trino client. We'll then have to upgrade the trino client in superset, and my understanding from @villebro is we'd prefer to stay a couple of versions behind the latest, so we can assume some lead time to catch up with changes we make there. Once we've upgraded we can revert this workaround.

Thoughts @dungdm93 @villebro ?

@giftig
Copy link
Contributor Author

giftig commented Aug 11, 2023

Sorry to nag @villebro @dungdm93 but can we discuss any remaining concerns with this PR and try to get this moving? I understand there's an alternate fix being proposed by @dungdm93 in parallel, but I think we all agree that will take quite a while to be ready to go into master, and in the meantime this remains a fairly impactful bug. I'd like to get this workaround merged into master in the meantime, with the understanding it can be reverted / dropped from future releases once we've managed to upgrade to a trino client release containing @dungdm93 's fix.

Happy to address any comments which come up on this PR in the meantime, and talk through any reasoning / the testing I did as needed.

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - we can revisit this once the Trino client adds non-blocking functionality.

@villebro villebro merged commit cfda30c into apache:master Sep 6, 2023
29 checks passed
darwinsubramaniam pushed a commit to darwinsubramaniam/superset that referenced this pull request Sep 7, 2023
@giftig giftig deleted the issue-24585 branch September 7, 2023 08:36
@michael-s-molina michael-s-molina added the v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch label Sep 7, 2023
michael-s-molina pushed a commit that referenced this pull request Sep 7, 2023
@rusackas rusackas added the 2.1.2 label Sep 8, 2023
sebastiankruk added a commit to sebastiankruk/superset that referenced this pull request Sep 9, 2023
* fix: Issue apache#24493; Resolved report selection menu in chart and dashboard page (apache#25157)

* fix: DML failures in SQL Lab (apache#25190)

* fix: All values being selected in Select (apache#25202)

* docs: fix wrong type in PREFERRED_DATABASES example (apache#25200)

Signed-off-by: cmontemuino <[email protected]>

* docs: add CVEs for 2.1.1 (apache#25206)

* chore: back port 2.1.1 doc changes (apache#25165)

* feat(sqllab): Show sql in the current result (apache#24787)

* docs(FAQ): add answer re: necessary specs, copy-edit existing answer (apache#24992)

* fix: `is_select` (apache#25189)

* fix: Cypress test to force mouseover (apache#25209)

* fix(sqllab): Force trino client async execution (apache#24859)

* fix: granularity_sqla and GENERIC_CHART_AXES (apache#25213)

* chore: Convert deckgl class components to functional (apache#25177)

* fix: Cypress test to force mouseover (follow-up) (apache#25223)

* fix(docs): Fixing a typo in README.md (apache#25216)

* chore(read_csv): remove deprecated argument (apache#25226)

* chore(trino): remove unnecessary index checks (apache#25211)

---------

Signed-off-by: cmontemuino <[email protected]>
Co-authored-by: Sandeep Patel <[email protected]>
Co-authored-by: Hugh A. Miles II <[email protected]>
Co-authored-by: Michael S. Molina <[email protected]>
Co-authored-by: Carlos M <[email protected]>
Co-authored-by: Daniel Vaz Gaspar <[email protected]>
Co-authored-by: Elizabeth Thompson <[email protected]>
Co-authored-by: JUST.in DO IT <[email protected]>
Co-authored-by: Sam Firke <[email protected]>
Co-authored-by: Beto Dealmeida <[email protected]>
Co-authored-by: Rob Moore <[email protected]>
Co-authored-by: Kamil Gabryjelski <[email protected]>
Co-authored-by: yousoph <[email protected]>
Co-authored-by: Ville Brofeldt <[email protected]>
villebro added a commit to villebro/superset that referenced this pull request Oct 5, 2023
jinghua-qa added a commit to preset-io/superset that referenced this pull request Oct 10, 2023
michael-s-molina pushed a commit that referenced this pull request Oct 13, 2023
saghatelian added a commit to 10webio/superset that referenced this pull request Oct 23, 2023
* fix: is_select with UNION (apache#25290)

(cherry picked from commit bb002d6)

* fix: Add explicit ON DELETE CASCADE for dashboard_roles (apache#25320)

(cherry picked from commit d54e827)

* fix(chart): Supporting custom SQL as temporal x-axis column with filter (apache#25126)

Co-authored-by: Kamil Gabryjelski <[email protected]>

* fix: Use RLS clause instead of ID for cache key (apache#25229)

(cherry picked from commit fba66c6)

* fix: Improve the reliability of alerts & reports (apache#25239)

(cherry picked from commit f672d5d)

* fix: DashboardRoles cascade operation (apache#25349)

(cherry picked from commit a971a28)

* fix: datetime with timezone excel export (apache#25318)

Co-authored-by: Michael S. Molina <[email protected]>
(cherry picked from commit 5ebcd2a)

* fix: Workaround for Cypress ECONNRESET error (apache#25399)

(cherry picked from commit d76ff39)

* fix(sqllab): invalid persisted tab state (apache#25308) (apache#25398)

* fix: Rename on_delete parameter to ondelete (apache#25424)

(cherry picked from commit 893b45f)

* fix: preventing save button from flickering in SQL Lab (apache#25106)

(cherry picked from commit 296ff17)

* fix: chart import (apache#25425)

(cherry picked from commit a4d8f36)

* fix: swagger UI CSP error (apache#25368)

(cherry picked from commit 1716b9f)

* fix: smarter date formatter (apache#25404)

(cherry picked from commit f0080f9)

* fix(sqllab): invalid start date (apache#25437)

* fix(nativeFilters): Speed up native filters by removing unnecessary rerenders (apache#25282)

Co-authored-by: JUST.in DO IT <[email protected]>
(cherry picked from commit a0eeb4d)

* fix(SqlLab): make icon placement even (apache#25372)

(cherry picked from commit 11b49a6)

* fix: Duplicate items when pasting into Select (apache#25447)

(cherry picked from commit 7cf96cd)

* fix: update the SQLAlchemy model definition at json column for Log table (apache#25445)

(cherry picked from commit e83a76a)

* fix(helm chart): set chart appVersion to 3.0.0 (apache#25373)

* fix(mysql): handle string typed decimal results (apache#24241)

(cherry picked from commit 7eab59a)

* fix: Styles not loading because of faulty CSP setting (apache#25468)

(cherry picked from commit 0cebffd)

* fix(sqllab): error with lazy_gettext for tab titles (apache#25469)

(cherry picked from commit ddde178)

* fix: Address Mypy issue which is causing CI to fail (apache#25494)

(cherry picked from commit 36ed617)

* chore: Adds 3.0.1 CHANGELOG

* fix: Unable to sync columns when database or dataset name contains `+` (apache#25390)

(cherry picked from commit dbe0838)

* fix(sqllab): Broken query containing 'children' (apache#25490)

(cherry picked from commit b92957e)

* chore: Expand error detail on screencapture (apache#25519)

(cherry picked from commit ba541e8)

* fix: tags permissions error message (apache#25516)

(cherry picked from commit 50b0816)

* fix: Apply normalization to all dttm columns (apache#25147)

(cherry picked from commit 58fcd29)

* fix: REST API CSRF exempt list (apache#25590)

(cherry picked from commit 549abb5)

* fix(RLS): Fix Info Tooltip + Button Alignment on RLS Modal (apache#25400)

(cherry picked from commit a6d0e6f)

* fix: thubmnails loading - Talisman default config (apache#25486)

(cherry picked from commit 52f631a)

* fix(Presto): catch DatabaseError when testing Presto views (apache#25559)

Co-authored-by: Rui Zhao <[email protected]>
(cherry picked from commit be3714e)

* fix(Charts): Set max row limit + removed the option to use an empty row limit value (apache#25579)

(cherry picked from commit f556ef5)

* fix(window): unavailable localStorage and sessionStorage (apache#25599)

* fix: finestTemporalGrainFormatter (apache#25618)

(cherry picked from commit 62bffaf)

* fix: revert fix(sqllab): Force trino client async execution (apache#24859) (apache#25541)

(cherry picked from commit e56e0de)

* chore: Updates 3.0.1 CHANGELOG

* fix(sqllab): Mistitled for new tab after rename (apache#25523)

(cherry picked from commit a520124)

* fix(sqllab): template validation error within comments (apache#25626)

(cherry picked from commit b370c66)

* fix: avoid 500 errors with SQLLAB_BACKEND_PERSISTENCE (apache#25553)

(cherry picked from commit 99f79f5)

* fix(import): Make sure query context is overwritten for overwriting imports (apache#25493)

(cherry picked from commit a0a0d80)

* fix: permalink save/overwrites in explore (apache#25112)

Co-authored-by: Elizabeth Thompson <[email protected]>
(cherry picked from commit e58a3ab)

* fix(header navlinks): link navlinks to path prefix (apache#25495)

(cherry picked from commit 51c56dd)

* fix: improve upload ZIP file validation (apache#25658)

* fix: warning of nth-child (apache#23638)

(cherry picked from commit 16cc089)

* fix(dremio): Fixes issue with Dremio SQL generation for Charts with Series Limit (apache#25657)

(cherry picked from commit be82657)

---------

Co-authored-by: Beto Dealmeida <[email protected]>
Co-authored-by: John Bodley <[email protected]>
Co-authored-by: Zef Lin <[email protected]>
Co-authored-by: Kamil Gabryjelski <[email protected]>
Co-authored-by: Jack Fragassi <[email protected]>
Co-authored-by: Michael S. Molina <[email protected]>
Co-authored-by: JUST.in DO IT <[email protected]>
Co-authored-by: Jack <[email protected]>
Co-authored-by: Daniel Vaz Gaspar <[email protected]>
Co-authored-by: Stepan <[email protected]>
Co-authored-by: Corbin Bullard <[email protected]>
Co-authored-by: Gyuil Han <[email protected]>
Co-authored-by: Celalettin Calis <[email protected]>
Co-authored-by: Ville Brofeldt <[email protected]>
Co-authored-by: ʈᵃᵢ <[email protected]>
Co-authored-by: Michael S. Molina <[email protected]>
Co-authored-by: mapledan <[email protected]>
Co-authored-by: Igor Khrol <[email protected]>
Co-authored-by: Rui Zhao <[email protected]>
Co-authored-by: Fabien <[email protected]>
Co-authored-by: Hugh A. Miles II <[email protected]>
Co-authored-by: OskarNS <[email protected]>
justinpark added a commit that referenced this pull request Nov 7, 2023
justinpark added a commit that referenced this pull request Nov 7, 2023
cccs-rc pushed a commit to CybercentreCanada/superset that referenced this pull request Mar 6, 2024
cccs-rc pushed a commit to CybercentreCanada/superset that referenced this pull request Mar 6, 2024
@mistercrunch mistercrunch added 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 3.1.0 labels Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L v3.0 Label added by the release manager to track PRs to be included in the 3.0 branch 🍒 3.0.0 🍒 3.0.1 🍒 3.0.2 🍒 3.0.3 🍒 3.0.4 🚢 3.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants