-
Notifications
You must be signed in to change notification settings - Fork 13.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for period character in table names #7453
Conversation
Codecov Report
@@ Coverage Diff @@
## master #7453 +/- ##
==========================================
+ Coverage 65.17% 65.23% +0.05%
==========================================
Files 433 433
Lines 21428 21433 +5
Branches 2360 2358 -2
==========================================
+ Hits 13966 13981 +15
+ Misses 7342 7332 -10
Partials 120 120
Continue to review full report at Codecov.
|
@cgivre please take a look at this PR. If this works out this should make merging your Drill PR slightly easier. |
@villebro I'll take a look this weekend. This pretty much will solve the issues with the Drill integration. |
@@ -279,33 +280,32 @@ def convert_dttm(cls, target_type, dttm): | |||
return "'{}'".format(dttm.strftime('%Y-%m-%d %H:%M:%S')) | |||
|
|||
@classmethod | |||
def fetch_result_sets(cls, db, datasource_type): | |||
"""Returns a list of tables [schema1.table1, schema2.table2, ...] | |||
def get_all_datasource_names(cls, db, datasource_type: str) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason db
wasn't type annotated is because it caused a circular import (models.core
already has a reference to db_engine_specs
). This should probably be refactored, but is outside the scope of this PR.
@villebro I tried this out and it worked really well even without the Drill PR! I think pretty much the only thing that Drill needs now for the integration is the time grains! Thanks for your help with this. |
Thanks for verifying that this works @cgivre . @mistercrunch @john-bodley @betodealmeida would really appreciate help reviewing this, as I think this is one of the last hurdles to being able to support engines that rely on the presence of non-standard characters in schema/table names. |
Kind reminder to committers that this is pending review, would be great to get this reviewed/merged so Drill integration can be finalized (is blocked by this PR). |
@villebro my main comment which is somewhat related to #7490 is given that the cluster/schema/table name construct can be quite complicated and historically we've often flatten these names into a single string (and then split or used regular expressions to extract the components) whether we should move towards using a class (possibly a |
Good point @john-bodley , having a dedicated class with proper parsing/formatting functionality would probably be a good idea. Do you feel this should be addressed as part of this PR, or start a new PR for that? |
I think a separate PR is fine as it’s probably a large change. |
CATEGORY
SUMMARY
In SQL Lab, table names are currently assumed to follow the following convention:
schema.table
ortable
This is handled in the frontend by assuming that a period in the table name always implies a separator between schema and table name. Since there is no standardized way for SQLAlchemy inspectors to return table names (some return
schema.table
, others onlytable
), they can be in either format. Since some databases (at least Apache Drill and Postgres) support periods in schema and table names, and their respective inspectors don't return schema prefixed table names, this causes problems when querying tables, asTableSelector
strips away everything before the period character. This PR moves this logic from the frontend to the backend, and makes it possible to configure this behavior per engine.This proposal changes table name handling in the following way:
try_remove_schema_from_table_name
is added todb_engine_specs
(defaults toTrue
). By default, whenTrue
,get_table_names()
andget_view_names()
checks if a table name starts with the schema name followed by a period, and if so, removes the schema name from the table name. Example:schema.table
becomestable
, whiletable
remains unchanged.{'schema': 'schema_name', 'table': 'table_name'}
when handed to the frontend. Previously they were either of the formattable
orschema.table
. This removes any ambiguity in the frontend.schema.table
.table
only in the dropdown.Filtering also supports this, i.e. when no schema is chosen, the filter substring makes the comparison assuming that the table name is
schema.table
, making it possible to include the schema name in the filter string.SCREENSHOTS
When no schema is chosen, table names are displayed as
schema.table
:If a schema is selected, only the table name is shown (in this case the table name is
test.table
, nottable
in schematest
):TEST PLAN
Tested locally on Postgres and sqlite. Js unit tests updated to correspond to new data structures and python unit tests added to test table name fetching.
ADDITIONAL INFORMATION
REVIEWERS
@cgivre