Autocomplete in the table browser in SQL lab is broken - Fix part 2 #7770

khtruong · 2019-06-24T20:10:12Z

SUMMARY

This is the complete fix for autocomplete in the table browser in SQL lab. We use the createFilterOptions API from the react-select-fast-filter-options library for autocomplete. The API expects a list of option objects where its properties value and label are strings. The bug is that we were passing objects to the property value, which caused the algorithm to not work properly. Now we explicitly pass strings to the API.

Two additional changes I added:

So this entire component is buggy and doesn't work as expected. We have one function that will add the option if it is missing to the list with just the table name. I believe the original intent was to display something while we were fetching the tables. This led to a few bugs:

We may be adding a table that no longer exists to the list, which doesn't make sense. So when the user selects this table, it just breaks silently because the table doesn't exist.
Even if the table exists, we did not populate this option with the schema and so when we make a request to get the table, again it breaks silently because we get the table with just its name and no schema information.

I feel that if we really wanted to display something while we fetch the tables, we should do it properly with tests. So I just removed the buggy code for now and thought we can revisit it later if that is a feature we really want.

I had to update the unit tests to return promises because if you do not return a promise, Jest just calls the promise and continues onto the next test. This caused false positives. There were two tests that are broken and need to be fixed. I filed TableSelector handle error tests do not run correctly #7768 to track this so that this PR did not contain unrelated changes.

TEST PLAN

Manually and extended unit tests

REVIEWERS

@mistercrunch @betodealmeida @DiggidyDave @xtinec

betodealmeida · 2019-06-25T16:53:35Z

superset/assets/src/components/TableSelector.jsx

@@ -191,12 +199,6 @@ export default class TableSelector extends React.PureComponent {
      this.onChange();
    });
  }
-  addOptionIfMissing(options, value) {


@khtruong we don't need this anymore?

Oh I'm sorry. I totally forgot to mention in my description. So this entire component is buggy and doesn't work as expected. What this function does is that if the option is missing, then we add it to the list with just the table name. I believe the original intent was to display something while we were fetching the tables. This led to a few bugs:

We may be adding a table that no longer exists to the list, which doesn't make sense. So when the user selects this table, it just breaks silently because the table doesn't exist.

Even if the table exists, we did not populate this option with the schema and so when we make a request to get the table, again it breaks silently because we get the table with just its name and no schema information.

I felt that if we really wanted to display something while we fetch the tables, we should do it properly with tests. So I just removed the buggy code for now and thought we can revisit it later if that is a feature we really want.

The component is still buggy but it's less buggy. I'll add this explanation to the description. Let me know your thoughts.

Thanks, make sense!

superset/assets/spec/javascripts/components/TableSelector_spec.jsx

codecov-io · 2019-06-25T20:14:11Z

Codecov Report

❗ No coverage uploaded for pull request base (master@15426fe). Click here to learn what that means.
The diff coverage is 50%.

@@            Coverage Diff            @@
##             master    #7770   +/-   ##
=========================================
  Coverage          ?   65.89%           
=========================================
  Files             ?      459           
  Lines             ?    22007           
  Branches          ?     2417           
=========================================
  Hits              ?    14502           
  Misses            ?     7384           
  Partials          ?      121

Impacted Files	Coverage Δ
superset/views/core.py	`73.07% <0%> (ø)`
superset/assets/src/components/TableSelector.jsx	`84.16% <100%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 15426fe...000ccf6. Read the comment docs.

mistercrunch

Note that <TableSelector /> is also used in the <TableEditor />, where it's used in somewhat different ways. If a chart points to a table and that physical table gets dropped, we still need to see it there.

Really in that case we should have some sort of red exclamation point with a tooltip highlighting that fact.

mistercrunch · 2019-06-26T05:57:15Z

Another element of complexity that I was dealing with when I generalized that component (for <TableEditor />) is the [optional, see this.props.database.allow_multi_schema_metadata_fetch] feature where if the schema isn't selected you can in theory search (as in autocomplete) across all tables across all schemas. This feature makes the code opaque and depends on eager caching techniques in the backend (I think at Airbnb there's a cron refreshing that cache from the catalog every 5).

Clearly searching/ across schemas is super useful, but adds another layer of complexity on top or an already heavy component.

mistercrunch · 2019-06-26T06:04:27Z

A bit more context. I think recently in #7297, there was an effort to move from using "fully-qualified" table names as in {schema}.{tableName} to using objects instead as in {schema, tableName}. I think that notation was the way that we dealt with schemas when allow_multi_schema_metadata_fetch == true. The driver for the change was that Apache Drill had tables name with dots in them, and that was throwing off some if the dot-spliting logic...

villebro · 2019-06-26T06:12:43Z

The breaking change @mistercrunch is referring to was introduced in #7453 , where fully qualified table names were replaced by objects. It seems the consequences were more far reaching than were understood at the time..

khtruong · 2019-06-26T16:47:24Z

Note that <TableSelector /> is also used in the <TableEditor />, where it's used in somewhat different ways. If a chart points to a table and that physical table gets dropped, we still need to see it there.

Really in that case we should have some sort of red exclamation point with a tooltip highlighting that fact.

I didn't see TableEditor but I did see TableSelector in DataSourceEditor and verified this fix there. There was also another issue when a user explores a chart based off a query. The query name would then be added to the list of tables, which doesn't make sense.

I could file a Github to track the ideal experience where there is some sort of red bang with a tooltip letting the user know what the physical table was. Otherwise putting addOptionIfMissing brings back all the issues highlighted earlier. Let me know what you think.

khtruong · 2019-06-26T17:04:35Z

@villebro @mistercrunch Thanks for the backstory! Wow, I can see that there was a lot of work done to fix the previous issues. One question: Will there be a regression if the option value is not specifically an object {table: ..., schema: ...}? I did not use fully-qualified table names. I just pulled out table and schema into separate properties in the parent object and it seems fine. Am I missing something?

villebro · 2019-06-26T18:14:27Z

@khtruong I think any solution that makes it possible to unambiguously separate schema from table name should be fine. IIRC one option that was proposed was to keep using a fully qualified string, but apply escaping where necessary. I also remember that @john-bodley is/was working on introducing the concept of a cluster, i.e. one level of abstraction above schema (cluster.schema.table), and who knows what some future engine brings along (I can see some cloud DW introducing region, too).. So perhaps good to leave room for future expansion, while at the same time trying to keep it as simple as possible. Which might be a tall order 😄

khtruong · 2019-06-26T23:36:50Z

@mistercrunch Let me know if you prefer a specific change. I think my changes will not bring back issues you saw when we used the fully qualified names in the past because the table name and schema name are still separate properties. And I can file a Github issue to track the ideal experience where there is some sort of red bang with a tooltip letting the user know if the physical table is missing.

mistercrunch · 2019-06-28T15:27:56Z

I just wanted to make sure you have all the context for the complexity in this area. I'm ok with the change if there are no regressions across the board.

The caching feature (from cross schema table name autocomplete) I mentioned seems brittle, and someone at Airbnb (where they use the feature) might want to confirm that this works there.

khtruong · 2019-06-28T18:28:26Z

I just wanted to make sure you have all the context for the complexity in this area. I'm ok with the change if there are no regressions across the board.

The caching feature (from cross schema table name autocomplete) I mentioned seems brittle, and someone at Airbnb (where they use the feature) might want to confirm that this works there.

Gotcha! I don't think this should affect the caching feature since that is a backend operation and this change is mostly frontend changes and does not modify the schema/table name itself.

I have filed issue #7798 as well.

villebro · 2019-06-28T18:37:05Z

If it's of any help I'll be happy to run this PR through some testing, as the Presto + Pular and Drill issues that were the basis for my PRs that were mentioned here are still fresh in my memory. Will be able to do a thorough testing during the weekend.

khtruong · 2019-06-28T18:43:16Z

If it's of any help I'll be happy to run this PR through some testing, as the Presto + Pular and Drill issues that were the basis for my PRs that were mentioned here are still fresh in my memory. Will be able to do a thorough testing during the weekend.

That'd be awesome @villebro ! Thanks!

villebro

Functionally this works 100 % with all difficult edge cases I used to validate the Presto+Pulsar and Drill PRs. Apart from aligning the object structures across the frontend and backend this LGTM. Incidentally, what is the role of title, as it seems to always have the same value as label?

villebro · 2019-06-29T11:25:01Z

superset/assets/src/components/TableSelector.jsx

+        value: o.value.table,
+        schema: o.value.schema,
+        label: o.label,
+        title: o.label,


To avoid confusion down the line I think it might be a good idea to align what is returned by the backend with this new structure. Something along these lines (didn't test so might contain typos):

diff --git a/superset/assets/spec/javascripts/sqllab/fixtures.js b/superset/assets/spec/javascripts/sqllab/fixtures.js index 99e740c3..2b737fef 100644 --- a/superset/assets/spec/javascripts/sqllab/fixtures.js +++ b/superset/assets/spec/javascripts/sqllab/fixtures.js @@ -343,16 +343,22 @@ export const databases = { export const tables = { options: [ { - value: { schema: 'main', table: 'birth_names' }, + value: 'birth_names', + schema: 'main', label: 'birth_names', + title: 'birth_names', }, { - value: { schema: 'main', table: 'energy_usage' }, + value: 'energy_usage', + schema: 'main', label: 'energy_usage', + title: 'energy_usage', }, { - value: { schema: 'main', table: 'wb_health_population' }, + value: 'wb_health_population', + schema: 'main', label: 'wb_health_population', + title: 'wb_health_population', }, ], }; diff --git a/superset/assets/src/components/TableSelector.jsx b/superset/assets/src/components/TableSelector.jsx index 4c9ef7a9..dc5d075c 100644 --- a/superset/assets/src/components/TableSelector.jsx +++ b/superset/assets/src/components/TableSelector.jsx @@ -108,10 +108,10 @@ export default class TableSelector extends React.PureComponent { `${encodeURIComponent(this.props.schema)}/${encodeURIComponent(input)}`), }).then(({ json }) => { const options = json.options.map(o => ({ - value: o.value.table, - schema: o.value.schema, + value: o.value, + schema: o.schema, label: o.label, - title: o.label, + title: o.title, })); return ({ options }); }); @@ -138,10 +138,10 @@ export default class TableSelector extends React.PureComponent { return SupersetClient.get({ endpoint }) .then(({ json }) => { const options = json.options.map(o => ({ - value: o.value.table, - schema: o.value.schema, + value: o.value, + schema: o.schema, label: o.label, - title: o.label, + title: o.title, })); this.setState(() => ({ filterOptions: createFilterOptions({ options }), diff --git a/superset/views/core.py b/superset/views/core.py index 06727f7a..8355ec39 100755 --- a/superset/views/core.py +++ b/superset/views/core.py @@ -1627,14 +1627,16 @@ class Superset(BaseSupersetView): max_tables = max_items * len(tables) // total_items max_views = max_items * len(views) // total_items - def get_datasource_value(ds_name: utils.DatasourceName) -> Dict[str, str]: - return {'schema': ds_name.schema, 'table': ds_name.table} - - table_options = [{'value': get_datasource_value(tn), - 'label': get_datasource_label(tn)} + table_options = [{'value': tn.table, + 'schema': tn.schema, + 'label': get_datasource_label(tn), + 'title': get_datasource_label(tn)} for tn in tables[:max_tables]] - table_options.extend([{'value': get_datasource_value(vn), - 'label': f'[view] {get_datasource_label(vn)}'} + table_options.extend([{'value': vn.table, + 'schema': vn.schema, + 'label': f'[view] {get_datasource_label(vn)}', + 'title': f'[view] {get_datasource_label(vn)}', + } for vn in views[:max_views]]) payload = { 'tableLength': len(tables) + len(views),

Sounds good! I'll update the code. As for the title, I am not sure what the history is on that attribute and why we need it.

villebro · 2019-06-29T11:28:32Z

superset/assets/src/components/TableSelector.jsx

+            value: o.value.table,
+            schema: o.value.schema,
+            label: o.label,
+            title: o.label,


Same as above

villebro · 2019-06-29T16:48:28Z

One problem that @mistercrunch pointed out is that if value always contains only the table name, it will not be possible to search over the fully qualified name if no schema is selected multi schema fetch is enabled. However, if we add a property table with only the table name, and provide fully qualified names in value when no schema is selected (otherwise just table name) similar to how label is now populated, searching over both schema and table will be possible.

I think these changes go slightly beyond the original intent of this PR, so I propose we merge this as is, and I can put together a new PR that tackles this stuff I've just proposed (should be fairly quick).

khtruong · 2019-07-01T17:40:49Z

One problem that @mistercrunch pointed out is that if value always contains only the table name, it will not be possible to search over the fully qualified name if no schema is selected multi schema fetch is enabled. However, if we add a property table with only the table name, and provide fully qualified names in value when no schema is selected (otherwise just table name) similar to how label is now populated, searching over both schema and table will be possible.

I think these changes go slightly beyond the original intent of this PR, so I propose we merge this as is, and I can put together a new PR that tackles this stuff I've just proposed (should be fairly quick).

Got it. That would be awesome if you have time. Otherwise, we can file an issue to track it for the future.

fix: table autocomplete and update unit tests

ce1551b

pull-request-size bot added the size/L label Jun 24, 2019

fix: linting issues

0f59383

betodealmeida reviewed Jun 25, 2019

View reviewed changes

graceguo-supercat reviewed Jun 25, 2019

View reviewed changes

superset/assets/spec/javascripts/components/TableSelector_spec.jsx Outdated Show resolved Hide resolved

mistercrunch requested changes Jun 26, 2019

View reviewed changes

fix: disable tests properly

8642af4

empty commit

2537fee

villebro requested changes Jun 29, 2019

View reviewed changes

villebro approved these changes Jun 29, 2019

View reviewed changes

fix: align structure across fe and be

3e97acf

mistercrunch approved these changes Jul 1, 2019

View reviewed changes

Merge remote-tracking branch 'apache/master' into VIZ-713-v2

000ccf6

betodealmeida merged commit 963dce6 into apache:master Jul 1, 2019

mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.34.0 labels Feb 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Autocomplete in the table browser in SQL lab is broken - Fix part 2 #7770

Autocomplete in the table browser in SQL lab is broken - Fix part 2 #7770

khtruong commented Jun 24, 2019 •

edited

Loading

betodealmeida Jun 25, 2019

khtruong Jun 25, 2019 •

edited

Loading

betodealmeida Jun 26, 2019

codecov-io commented Jun 25, 2019 •

edited

Loading

mistercrunch left a comment

mistercrunch commented Jun 26, 2019

mistercrunch commented Jun 26, 2019 •

edited

Loading

villebro commented Jun 26, 2019

khtruong commented Jun 26, 2019

khtruong commented Jun 26, 2019

villebro commented Jun 26, 2019

khtruong commented Jun 26, 2019

mistercrunch commented Jun 28, 2019

khtruong commented Jun 28, 2019

villebro commented Jun 28, 2019

khtruong commented Jun 28, 2019

villebro left a comment

villebro Jun 29, 2019

khtruong Jul 1, 2019

villebro Jun 29, 2019

villebro commented Jun 29, 2019

khtruong commented Jul 1, 2019

Autocomplete in the table browser in SQL lab is broken - Fix part 2 #7770

Autocomplete in the table browser in SQL lab is broken - Fix part 2 #7770

Conversation

khtruong commented Jun 24, 2019 • edited Loading

CATEGORY

SUMMARY

TEST PLAN

REVIEWERS

betodealmeida Jun 25, 2019

Choose a reason for hiding this comment

khtruong Jun 25, 2019 • edited Loading

Choose a reason for hiding this comment

betodealmeida Jun 26, 2019

Choose a reason for hiding this comment

codecov-io commented Jun 25, 2019 • edited Loading

Codecov Report

mistercrunch left a comment

Choose a reason for hiding this comment

mistercrunch commented Jun 26, 2019

mistercrunch commented Jun 26, 2019 • edited Loading

villebro commented Jun 26, 2019

khtruong commented Jun 26, 2019

khtruong commented Jun 26, 2019

villebro commented Jun 26, 2019

khtruong commented Jun 26, 2019

mistercrunch commented Jun 28, 2019

khtruong commented Jun 28, 2019

villebro commented Jun 28, 2019

khtruong commented Jun 28, 2019

villebro left a comment

Choose a reason for hiding this comment

villebro Jun 29, 2019

Choose a reason for hiding this comment

khtruong Jul 1, 2019

Choose a reason for hiding this comment

villebro Jun 29, 2019

Choose a reason for hiding this comment

villebro commented Jun 29, 2019

khtruong commented Jul 1, 2019

khtruong commented Jun 24, 2019 •

edited

Loading

khtruong Jun 25, 2019 •

edited

Loading

codecov-io commented Jun 25, 2019 •

edited

Loading

mistercrunch commented Jun 26, 2019 •

edited

Loading