feat(lineage source): add fine grained lineage support #7904
approach looks pretty reasonable
metadata-ingestion/src/datahub/ingestion/source/metadata/lineage.py
…ge.py Co-authored-by: Harshal Sheth <[email protected]>
…n/datahub into aseem-update-file-lineage-source
nits around the defaults but otherwise lgtm
@@ -49,9 +53,44 @@ def type_must_be_supported(cls, v: str) -> str:
        return v


class FineGrainedLineageConfig(ConfigModel):
    upstreamType: str = "NONE"
probably want a default of FIELD_SET
class FineGrainedLineageConfig(ConfigModel):
    upstreamType: str = "NONE"
    upstreams: Optional[List[str]]
    downstreamType: str
should default this to FIELD
    downstreamType: str
    downstreams: Optional[List[str]]
    transformOperation: Optional[str]
    confidenceScore: Optional[float]
default to 1.0
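Taken together, the three review nits above would give the config roughly the shape below. This is only a minimal sketch: the class name `FineGrainedLineageConfigSketch` is hypothetical, and a plain dataclass stands in for the PR's `ConfigModel` base so the snippet runs without DataHub installed. The `None` defaults on the optional fields and the field identifiers in the usage lines are also assumptions, not taken from the PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FineGrainedLineageConfigSketch:
    """Stand-in for the PR's FineGrainedLineageConfig(ConfigModel).

    A dataclass is used purely for illustration; the real class in
    lineage.py subclasses ConfigModel.
    """

    # Reviewer: "probably want a default of FIELD_SET"
    upstreamType: str = "FIELD_SET"
    # Assumption: optional lists default to None in this sketch.
    upstreams: Optional[List[str]] = None
    # Reviewer: "should default this to FIELD"
    downstreamType: str = "FIELD"
    downstreams: Optional[List[str]] = None
    transformOperation: Optional[str] = None
    # Reviewer: "default to 1.0"
    confidenceScore: Optional[float] = 1.0


# Usage: only the field lists need to be supplied; the types and the
# confidence score fall back to the suggested defaults.
# The identifiers below are hypothetical placeholders, not real URNs.
cfg = FineGrainedLineageConfigSketch(
    upstreams=["example_upstream_field"],
    downstreams=["example_downstream_field"],
)
print(cfg.upstreamType, cfg.downstreamType, cfg.confidenceScore)
```

With defaults like these, a lineage file entry that lists only upstream and downstream fields would not need to repeat the type or confidence values.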
Waiting on #8135 for docs to get fixed
Doc build will be fixed in the linked PR.
please excuse me hijacking this PR - I'm just wondering if there's a roadmap entry to allow querying the transformOperation via GraphQL and/or visualising it in the UI? We have a case for this specific feature where our users would like to see the actual operation in the UI to better understand the lineage. Could someone maybe point me in the right direction if this is planned already? Otherwise happy to try and contribute back.
@githendrik we don't have any concrete plans for it, but I think it probably does make sense to do that. Surfacing it via GraphQL feels like a no-brainer, and we'd definitely accept a PR there. I'm less sure where we'd want to surface it in the UI - might be worth thinking about more / mocking. The reason is that it's an "edge annotation", but there's a bunch of other edge annotations that we might also want to support in the future, e.g. the actual SQL logic, confidence scores on our lineage generation, etc.