-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ingestion/trino): Add sibling support in ingestion #9853
feat(ingestion/trino): Add sibling support in ingestion #9853
Conversation
class TrinoConfig(BasicSQLAlchemyConfig): | ||
# defaults | ||
scheme: str = Field(default="trino", description="", hidden_from_docs=True) | ||
|
||
catalog_to_connector_details: Dict[str, ConnectorDetail] = Field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make sure we're using the terms catalog and connector in a way that's consistent with https://trino.io/docs/current/overview/concepts.html#data-sources
The docs for this are also pretty unclear - terms like "three tier connector" are not commonplace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if its still unclear
catalog_name, ConnectorDetail() | ||
) | ||
|
||
if connector_platform_details: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why can't we just always try to generate this relationship?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because there can be some trino connector platform which datahub doesn't support yet.
Eg: https://trino.io/docs/current/connector/hudi.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few small things, but mostly looks good
""" | ||
).strip() | ||
res = connection.execute(sql.text(query)) | ||
catalog_connector_dict = {row.catalog_name: row.connector_name for row in res} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
split this up into two methods - one to generate catalog_connector_dict, and the second doing the lookup
the lru_cache annotation should be on the first one
platform_instance=connector_details.platform_instance, | ||
env=connector_details.env, | ||
) | ||
elif connector_details.connector_database: # else connector is three tier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add an else clause that reports a warning that the connector_database is missing
) | ||
connector_platform_name = KNOWN_CONNECTOR_PLATFORM_MAPPING.get( | ||
connector_details.connector_platform | ||
if connector_details.connector_platform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
connector_details.connector_platform or connector_name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Checklist