[Spike] Explore integration with Dagster #3180

Open
astrojuanlu opened this issue Oct 16, 2023 · 6 comments
Labels
Issue: Feature Request New feature or improvement to existing feature

Comments

@astrojuanlu
Member

Description

I have heard from several data people that they're happy with Dagster, which is probably the only "modern", widely used orchestrator that is not mentioned in our docs.

There was an upstream request to add a Kedro integration to Dagster (dagster-io/dagster#2062), but it's unclear what came of it.

@astrojuanlu added the "Issue: Feature Request" and "Component: Documentation 📄" labels on Oct 16, 2023
@stichbury
Contributor

I'm not clear what this ticket is for. Is it documentation along the lines of #2817?

@datajoely
Contributor

I think it involves more of a spike to work out how it would actually work. I think Flyte (LFAI), Dagster and Metaflow all fall into the modern orchestrator space, which isn't served by Kedro. I would also push for us to address some of the fundamentals outlined in #3094 before doing this.

@stichbury
Contributor

Thanks! But in that case, it's not a docs ticket so I'll remove the label.

@stichbury removed the "Component: Documentation 📄" label on Oct 16, 2023
@astrojuanlu
Member Author

Thanks both - yeah initially I thought about it as a docs ticket (even though the phrasing didn't match) but you're right, this should be a spike first.

And good point @datajoely on looking at Flyte and Metaflow too (let's call them Tier 3), although both have about 0.1x the PyPI downloads of Dagster, so I wouldn't consider them on the same level of adoption. For reference, Dagster and Prefect (Tier 2) have about the same number of downloads, and each has about 0.05x the downloads of Airflow (Tier 1). Kedro lies between Tier 2 and Tier 3 at the moment.

@astrojuanlu changed the title from "Explore integration with Dagster" to "[Spike] Explore integration with Dagster" on Oct 16, 2023
@datajoely
Contributor

Aligned - I also think Dagster is closer to Kedro than the others in terms of granularity. In recent years they've really invested in their dbt integration, and perhaps we can take inspiration from how they've done that.

@MatthiasRoels

I never explored Dagster as much as I should have, but I really like the idea of software-defined assets. However, Dagster looks complicated, as it has many concepts to understand. I'm also not sure how individual tasks run (especially in a Kubernetes context).

4 participants