Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: switching out ConnectorRegistry references for DatasourceDAO #20149

Closed
wants to merge 25 commits into from

Conversation

hughhhh
Copy link
Member

@hughhhh hughhhh commented May 20, 2022

SUMMARY

Switching out all references of ConnectorRegistry to DatasourceDAO. In the PR, we are focusing on just removing the references and moving towards allowing other Datasources to power charts.

One issue is the circular dependencies that are blocking us from fully removing the logic to allow the app to run. So i've moved that code out the ConnectorRegistry for now and will prioritize a bigger refactor to stop the import issues in superset moving forward.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@hughhhh hughhhh changed the title Connreg to datadao 2 chore: switching out ConnectorRegistry references for DatasourceDAO May 20, 2022
@@ -282,7 +286,7 @@ def request_access(self) -> FlaskResponse:
datasource_id = request.args.get("datasource_id")
datasource_type = request.args.get("datasource_type")
if datasource_id and datasource_type:
ds_class = ConnectorRegistry.sources.get(datasource_type)
ds_class = DatasourceDAO.sources.get(datasource_type)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource_type) here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DatasourceDAO.sources should be able to take either a string or the DatasourceType instance, so I think either works here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of course you'll get an error either way if the datasource_type is not valid.

@@ -2535,7 +2539,7 @@ def fetch_datasource_metadata(self) -> FlaskResponse: # pylint: disable=no-self
"""

datasource_id, datasource_type = request.args["datasourceKey"].split("__")
datasource = ConnectorRegistry.get_datasource(
datasource = DatasourceDAO.get_datasource(
datasource_type,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource.type)

@@ -74,7 +75,7 @@ def save(self) -> FlaskResponse:
datasource_id = datasource_dict.get("id")
datasource_type = datasource_dict.get("type")
database_id = datasource_dict["database"].get("id")
orm_datasource = ConnectorRegistry.get_datasource(
orm_datasource = DatasourceDAO.get_datasource(
datasource_type, datasource_id, db.session
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource.type)

@@ -132,7 +132,7 @@ def test_query_cache_key_changes_when_datasource_is_updated(self):
cache_key_original = query_context.query_cache_key(query_object)

# make temporary change and revert it to refresh the changed_on property
datasource = ConnectorRegistry.get_datasource(
datasource = DatasourceDAO.get_datasource(
datasource_type=payload["datasource"]["type"],
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource.type)

@@ -156,7 +156,7 @@ def test_query_cache_key_changes_when_metric_is_updated(self):
payload = get_query_context("birth_names")

# make temporary change and revert it to refresh the changed_on property
datasource = ConnectorRegistry.get_datasource(
datasource = DatasourceDAO.get_datasource(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource.type)

@@ -407,16 +408,19 @@ def export_dashboards( # pylint: disable=too-many-locals
id_ = target.get("datasetId")
if id_ is None:
continue
datasource = ConnectorRegistry.get_datasource_by_id(session, id_)
# todo(phillip): get datasource type for this method
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this comment

# pylint: disable=import-outside-toplevel
from superset.datasource.dao import DatasourceDAO

return DatasourceDAO.sources[self.datasource_type]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use DatasourceType(datasource_type)

@@ -83,13 +83,14 @@ def invalidate(self) -> Response:
return self.response_400(message=str(error))
datasource_uids = set(datasources.get("datasource_uids", []))
for ds in datasources.get("datasources", []):
ds_obj = ConnectorRegistry.get_datasource_by_name(
ds_obj = DatasourceDAO.get_datasource_by_name(
session=db.session,
datasource_type=ds.get("datasource_type"),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update DatasourceType(datasource_type)

@hughhhh hughhhh force-pushed the connreg-to-datadao_2 branch 3 times, most recently from c9cd671 to 6a85df3 Compare May 20, 2022 21:04
@pull-request-size pull-request-size bot added size/XL and removed size/L labels Jun 5, 2022
@hughhhh hughhhh force-pushed the connreg-to-datadao_2 branch 5 times, most recently from 821da1c to ac2d8aa Compare June 5, 2022 19:04
@hughhhh hughhhh force-pushed the connreg-to-datadao_2 branch 2 times, most recently from efb667e to 051c26c Compare June 5, 2022 23:09
@hughhhh
Copy link
Member Author

hughhhh commented Jun 6, 2022

I left a comment about moving sqlatable specific lookups to the model or a sqlatable dao

Decided to move all the functions into the model

@@ -75,7 +76,7 @@ def run(self) -> Optional[str]:
new_state: TemporaryExploreState = {
"owner": owner,
"datasource_id": datasource_id,
"datasource_type": datasource_type,
"datasource_type": DatasourceType(datasource_type),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this one different from the above one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def get_datasource(
cls, session: Session, datasource_type: DatasourceType, datasource_id: int
) -> Datasource:
if datasource_type not in cls.sources:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're going to want to accept the string value as well here.
I don't think "table" in cls.sources is going to work for example.

Copy link
Member Author

@hughhhh hughhhh Jun 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently casting all the values going into the get_datasource function like this:

DatasourceDAO.get_datasource(
  db.session, DatasourceType(datasource["type"]), int(datasource["id"])
)

So moving forward i think we should push for the enums since it's easy to cast via just wrapping it in a DatasourceType() constructor. If we allow string we'd have to add logic to handle/report whenever the user sending us invalid values.

I'm okay with putting the Union[str, DatasourceType] but don't think it takes much work to just wrap the string with the DatasourceType constructor before passing it into the function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also just wrap source = cls.sources[datasource_type] in a try/catch block. I just think there's too much room for error here across the application with people who may not know that they can't pass in a string. This should work if datasource_type is either a string or the enum key.

set
)
session = self.get_session
all_datasources = SqlaTable.get_all_sqlatables_datasources(session)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, but any reason why you wouldn't name this method get_all_datasources?

Copy link
Member

@eschutho eschutho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most are small questions, but if datasource_type not in cls.sources: could cause problems if a string is passed in. I personally think we could allow both types, but we could prob go either way. It's just going to be difficult to enforce with typing.


BASE_URL = "https://github.com/apache-superset/examples-data/blob/master/"

misc_dash_slices: Set[str] = set() # slices assembled in a 'Misc Chart' dashboard


def get_table_connector_registry() -> Any:
return ConnectorRegistry.sources["table"]
return DatasourceDAO.sources[DatasourceType.TABLE]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use the SqlaTable model directly since this is the only option here?

cls, session: Session, datasource_type: DatasourceType, datasource_id: int
) -> Datasource:
if datasource_type not in cls.sources:
raise DatasourceTypeNotSupportedError()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

notes: set this exception to return 422


@classmethod
def get_datasource(
cls, session: Session, datasource_type: DatasourceType, datasource_id: int
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go with a Union[str, DatasourceType] and add test to verify works both and will throw 422

@hughhhh
Copy link
Member Author

hughhhh commented Jun 14, 2022

Moving here #20380

@hughhhh hughhhh closed this Jun 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants