Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move sources to datafusion-table-providers #43

Closed
backkem opened this issue Aug 22, 2024 · 5 comments
Closed

Move sources to datafusion-table-providers #43

backkem opened this issue Aug 22, 2024 · 5 comments

Comments

@backkem
Copy link
Collaborator

backkem commented Aug 22, 2024

I wanted to open a conversation around the concerns of datafusion-federation vs datafusion-table-providers. Historically, I created the /sources folder in datafusion-federation to show off the federation agains a couple remotes. However, it seems like datafusion-table-providers would be the better place for theseto live. Therefore, I propose to:

  • remove the sources from this repo
  • rewrite the examples in this repo using datafusion-table-providers.
  • Since this leaves datafusion-federation-sql with relatively little code: move this code to the datafusion-federation crate itself, behind a sql feature.

A couple of points to iron out:

  • Do we agree with the overall approach?
  • Is there sufficient appetite to move the flight-sql package (including executor & server ) to datafusion-table-providers or do we leave this in the datafusion-federation tree?

cc @phillipleblanc

@phillipleblanc
Copy link
Contributor

phillipleblanc commented Aug 22, 2024

Yeah, I agree with the approach above. I will submit some PRs for improvements we've made to datafusion-federation in our fork, aiming for this weekend or early next week to queue those up - which should make it smoother to integration with the table providers in our crate.

RE: flight-sql there is another implementation of FlightSQL contributed here: apache/datafusion#11938 (comment) / https://github.com/datafusion-contrib/datafusion-table-provider-flight. I've reached out to see if they are interested in moving it to the datafusion-table-providers repo. It currently doesn't support federation though. I would say let's leave flight-sql in this repo for now, and revisit depending on what happens with datafusion-table-provider-flight.

@backkem
Copy link
Collaborator Author

backkem commented Aug 22, 2024

Ok, great.

For now we'll move the SQL Flight pieces as follows:

  • Move the server side to a datafusion-flight-sql-server crate in this repo and publish it to cargo. We can also add some examples with authentication.
  • Move the client side to a datafusion-flight-sql-table-provider crate in this repo. We'll not published to cargo and look to merge it with datafusion-contrib/datafusion-table-provider-flight once we know where it ends up.

I'm still wondering on what the best place would be for the Datafusion-backed Arrow SQL Flight server-side implementation. It has a relation to the federation concept. However, the server-side implementation has no direct dependency on the federation logic (since that's client side). IDK if there is interest in the SQL Flight server in Datafusion core itself? The SQL Flight protocol seems quite "native" to the ecosystem. A canonical Datafusion-based implementation/framework seems sensible. However, I know there are sensitivities around broadening the scope of Datafusion core. cc @alamb ?

@ccciudatu
Copy link

@phillipleblanc @backkem datafusion-contrib/datafusion-table-providers#76

@phillipleblanc
Copy link
Contributor

The examples in this crate can be rewritten to use https://crates.io/crates/datafusion-table-providers now

@backkem
Copy link
Collaborator Author

backkem commented Sep 20, 2024

Closing with only #62 remaining.

@backkem backkem closed this as completed Sep 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants