Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new operator for ClickHouse #10893

Closed
bharatnc opened this issue Sep 12, 2020 · 17 comments
Closed

Add new operator for ClickHouse #10893

bharatnc opened this issue Sep 12, 2020 · 17 comments
Assignees
Labels
area:providers good first issue kind:new provider request label to mark request for adding new provider

Comments

@bharatnc
Copy link

bharatnc commented Sep 12, 2020

Description

Add a new operator for ClickHouse.

Use case / motivation

Use case similar to any other operators like Postgres, MySQL etc for working with ClickHouse.

Related Issues

As far as I have looked, there is already a plugin that can be installed.

But would be nice to have this part of airflow alongside other available operators.

If this feature sounds acceptable, then I can work on adding this to airflow (please feel free to assign this issue to me). However, I'm not sure if a similar feature has already been requested before (as far as I have searched through the issues, I didn't find one). Thank you!

@bharatnc bharatnc added the kind:feature Feature Requests label Sep 12, 2020
@boring-cyborg
Copy link

boring-cyborg bot commented Sep 12, 2020

Thanks for opening your first issue here! Be sure to follow the issue template!

@potiuk
Copy link
Member

potiuk commented Sep 12, 2020

Sure thing ! Go ahead @bharatnc

@mik-laj
Copy link
Member

mik-laj commented Sep 12, 2020

Should we use DB API? Thanks to this, we won't have to implement operators for this software, but we will be able to use generic operators for SQL.

@potiuk
Copy link
Member

potiuk commented Sep 12, 2020

I think there is a value in using specific operators for each DB. There is not a lot of code and DBAPI is designed rather as a very low-level API that usually is implemented by driver specific for the Database. This is how most of the DB operators work and allow to implement specific features of each operator with the API's that are much easier to use.

@bharatnc
Copy link
Author

bharatnc commented Sep 12, 2020

Thank you! I haven't taken a look at DB generic API but sounds like what @potiuk describes is true and applicable here. The generic DB API will work for basic queries but a lot of operations and other queries used in Clickhouse have rather specific syntax / functionality unique to that database. The overlap of functionality b/w generic DB API operations and Clickhouse operations are rather not very extensive. So I think it would be better to have a separate operator for this use case so that operations can be made more extensive in future. Do let me know if I am correct or if I've completely missed the intention here.

@potiuk
Copy link
Member

potiuk commented Sep 12, 2020

Regardless if we would like to implement DBAPI also for ClickHouse it is not a good idea to do it now IMHO. If we decide to do that - we should do it as a separate PR and see how much of the commonalities we can extract at all. But I really do not think adding db api when adding new database (and much different than the regular one) is a good idea now. @mik-laj let us know if you think otherwise, but please go ahead @bharatnc without it.

@eladkal
Copy link
Contributor

eladkal commented Mar 18, 2022

For anyone who wants to pick this task:
Python driver that can be used: https://github.com/mymarilyn/clickhouse-driver
Reference for PR that added another influxdb provider #17068

@eladkal eladkal added kind:new provider request label to mark request for adding new provider and removed kind:feature Feature Requests labels Mar 18, 2022
@pateash
Copy link
Contributor

pateash commented Apr 1, 2022

@eladkal, I would like to pick this up.

@subkanthi
Copy link
Contributor

subkanthi commented May 12, 2022

Is this being worked on, happy to contribute to the Async version

@eladkal
Copy link
Contributor

eladkal commented May 12, 2022

The issue here is that this means adding new provider and right now we are discussing in the mailing list about approch to new providers
https://lists.apache.org/thread/nvfc75kj2w1tywvvkw8ho5wkx1dcvgrn

Maybe you can share your thoughts on this in the mailing list?

@pateash
Copy link
Contributor

pateash commented May 17, 2022

Sorry for late response, I was OOH
As far as I understood from the conversation with @eladkal, that there is a wider conversation going on regarding whether we should add new providers or not.
I had code ready but decided not to raise a PR.
If we want to add a new provider I have changes ready.

@potiuk
Copy link
Member

potiuk commented May 19, 2022

Yeah. We want to make sure we do the right thing - see the discussion. We will resume the discussion in the devlist after the Airflow Summit (and conversations during the Summit are most welcome).

@potiuk
Copy link
Member

potiuk commented Jun 27, 2022

Hello @pateash @subkanthi @bharatnc It took a bit long but I think we got to a consensus regarding new providers, and we also propose some kind of mix-governance aproach, where (among the others) the stakeholders for the future providers (which we are going to technically split to separate repositories soon) should take a bit more responsibility for maintenance:

#24680

If that does not scare you away, and you still want to add the provider to Airflow community providers, feel free to make a PR after we merge #24672 - we are going to change the way how we keep depdencies for providers in order to prepare them to separate to different repository.

Let us know what you want to do, either close the issue or make PR and rebase, and lead it to completion after #24672 is merged.

@pateash
Copy link
Contributor

pateash commented Jun 29, 2022

thanks @potiuk, this makes sense.

I will raise the PR after #24672 is merged.

@potiuk
Copy link
Member

potiuk commented Jul 4, 2022

It's merged.

@pateash
Copy link
Contributor

pateash commented Jul 5, 2022

It's merged.

Thanks @potiuk,
Let me have a look.

@eladkal
Copy link
Contributor

eladkal commented Apr 21, 2023

Policy for adding new provider is explained at https://github.com/apache/airflow/blob/main/PROVIDERS.rst

I'm going to close this issue as the process of adding a provider require investment and commitments from the requester thus it's unlikely that someone will just pick up this request and implement it so no reason to keep this issue open.

@eladkal eladkal closed this as not planned Won't fix, can't repro, duplicate, stale Apr 21, 2023
@eladkal eladkal mentioned this issue Apr 20, 2024
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers good first issue kind:new provider request label to mark request for adding new provider
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants