fix: % replace in values_for_column #28271

Merged
merged 2 commits into from
Apr 30, 2024

Conversation

betodealmeida
Member

@betodealmeida betodealmeida commented Apr 29, 2024

SUMMARY

When we run a query we replace %% (in certain SQLAlchemy dialects) with %:

    if engine.dialect.identifier_preparer._double_percents:  # noqa
        sql = sql.replace("%%", "%")

These dialects use pyformat (%(name)s) or format (%s) to parameterize arguments passed to the cursor. Normally we'd need to pass %% for literal percent signs, but the current behavior for most databases (Postgres, Presto, MySQL, Hive, Druid) is to accept unescaped percent symbols when no parameters are passed — which is the case for the calls Superset does to cursor.execute.
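
To make the escaping rule concrete, here is a minimal sketch that uses Python's own `%` operator as a stand-in for driver-side interpolation. The `render` helper is hypothetical and only mimics the behavior described above (interpolate when parameters are passed, leave the string alone otherwise); real drivers do their own quoting and may differ in detail.

```python
def render(sql, params=None):
    """Hypothetical stand-in for a pyformat driver's cursor.execute.

    Assumption: the driver only interpolates when parameters are passed,
    which is the behavior described above for most databases.
    """
    if params is None:
        # No parameters: the SQL is sent as-is, so a bare % survives.
        return sql
    # Parameters passed: %% collapses to a literal %, %(name)s is filled in.
    return sql % params


# With parameters, a literal percent sign has to be written as %%.
escaped = "SELECT * FROM t WHERE name LIKE '%%a' AND id = %(id)s"
with_params = render(escaped, {"id": 1})

# Without parameters, the unescaped form passes through untouched.
unescaped = "SELECT * FROM t WHERE name LIKE '%a'"
without_params = render(unescaped)
```

In this sketch both calls end up with a single `%` in the `LIKE` pattern, which is why a query that still contains `%%` when no parameters are passed reaches the database with the doubled percent sign intact.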

This logic is missing from values_for_column, which results in queries that are valid but, for some reason, extremely slow in Druid:

SELECT * FROM t WHERE name LIKE '%%a';  -- slow

While the expected query is fast:

SELECT * FROM t WHERE name LIKE '%a';  -- fast

This PR adds the same behavior to the values_for_column method.
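
As a rough illustration of the behavior being added (not the actual diff), the replacement can be factored into a small helper. The name `unescape_percents` and the `fake_engine` stub are assumptions made for this sketch; the stub stands in for a SQLAlchemy engine so the example runs without a database.

```python
from types import SimpleNamespace


def unescape_percents(sql, engine):
    """Hypothetical helper: collapse %% to % for dialects that double percents."""
    preparer = engine.dialect.identifier_preparer
    # getattr with a default is a defensive choice for this sketch only;
    # the snippet in the summary reads the attribute directly.
    if getattr(preparer, "_double_percents", False):
        return sql.replace("%%", "%")
    return sql


def fake_engine(double_percents):
    """Stub exposing only the attribute path the helper reads."""
    preparer = SimpleNamespace(_double_percents=double_percents)
    return SimpleNamespace(dialect=SimpleNamespace(identifier_preparer=preparer))


compiled = "SELECT * FROM t WHERE name LIKE '%%a'"
fast = unescape_percents(compiled, fake_engine(True))
unchanged = unescape_percents(compiled, fake_engine(False))
```

For a dialect that doubles percents, `fast` holds the single-`%` query shown above; for any other dialect the SQL is returned untouched.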

Note that while drivers accept unescaped percent symbols when no parameters are passed to the cursor, this behavior is undocumented in the DB API 2.0 spec. Ideally we wouldn't do this replacement in Superset, and would instead:

  1. In SQL Lab we'd escape % by replacing it with %% before sending the query to the DB.
  2. When going from Explore to SQL Lab we'd unescape %% back to %.

This way Superset would not rely on undocumented behavior. But that fix is more complicated and requires a lot of testing.
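
The two steps of that alternative could look roughly like the sketch below. Both function names are hypothetical, and the naive `str.replace` calls assume the SQL contains no real placeholders such as `%(name)s`; a real implementation would need to handle those, which is part of why the fix needs more testing.

```python
def escape_for_driver(sql):
    """Step 1 (hypothetical): double every % before sending SQL Lab queries."""
    return sql.replace("%", "%%")


def unescape_for_editor(sql):
    """Step 2 (hypothetical): collapse %% to % when moving from Explore to SQL Lab."""
    return sql.replace("%%", "%")


user_sql = "SELECT * FROM t WHERE name LIKE '%a'"
sent_to_db = escape_for_driver(user_sql)
shown_in_editor = unescape_for_editor(sent_to_db)
```

With this approach the string handed to the driver always uses the escaped form, and the user still sees and edits the single-`%` version.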


@eschutho eschutho merged commit fe37d91 into master Apr 30, 2024
27 checks passed
@michael-s-molina michael-s-molina added the v4.0 label May 1, 2024
jzhao62 pushed a commit to jzhao62/superset that referenced this pull request May 16, 2024
eschutho pushed a commit that referenced this pull request Jun 5, 2024
@mistercrunch mistercrunch added the 🍒 4.0.1, 🍒 4.0.2, and 🏷️ bot labels Jul 24, 2024
@rusackas rusackas deleted the fix-values-for-column branch September 27, 2024 21:04
Labels
🏷️ bot, preset-io, size/L, v4.0, 🍒 4.0.1, 🍒 4.0.2

4 participants