Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add example for FunctionFactory #9482

Merged
merged 7 commits into from
Mar 10, 2024
Merged

Conversation

milenkovicm
Copy link
Contributor

@milenkovicm milenkovicm commented Mar 6, 2024

Which issue does this PR close?

Closes #9448.

Rationale for this change

Explained in #9448. It covers example for functionality added in #9304 & #9333

What changes are included in this PR?

Example demonstrating functionality

Are these changes tested?

Yes

Are there any user-facing changes?

No

@milenkovicm
Copy link
Contributor Author

milenkovicm commented Mar 6, 2024

@alamb when you get chance 😀 thanks ! Do we need more documentation for it ?

Another follow up functionality can be to expose functions in system catalog and/or using DESCRIBE FUNCTION

@milenkovicm milenkovicm changed the title Add example for [FunctionFactory] Add example for FunctionFactory Mar 6, 2024
@github-actions github-actions bot added the core Core DataFusion crate label Mar 7, 2024
@alamb
Copy link
Contributor

alamb commented Mar 8, 2024

Thanks @milenkovicm -- I will review this over the next day or two. I am sorry I am behind on reviews and catching up now

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @milenkovicm -- this feature is SOO cool.

Can you just imagine a library of FunctionHandlers so people could pick from a list of types of functions they wanted to support:

  • I want support for pytorch models"
  • I want to support SQL macros
  • Oh, and also I want to support WASM

A few imports, a cargo build command and off they go. 🤩

I pushed some word smithing comments up to your branch, as well as some comments that are more in the "explain it line by line" style -- let me know what you think

We can revert them if you prefer. What you had was great and I just got excited as I had it checked out locally

@@ -826,11 +914,22 @@ async fn create_scalar_function_from_sql_statement() -> Result<()> {
.await
.is_err());

ctx.sql("select better_add(2.0, 2.0)").await?.show().await?;
let result = ctx
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


#[tokio::main]
async fn main() -> Result<()> {
let runtime_config = RuntimeConfig::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example showed how awkward it was to create the appropriate SessionState, so I added a convenience methods on SessionContext to make it easier

@@ -54,6 +54,7 @@ cargo run --example csv_sql
- [`deserialize_to_struct.rs`](examples/deserialize_to_struct.rs): Convert query results into rust structs using serde
- [`expr_api.rs`](examples/expr_api.rs): Create, execute, simplify and analyze `Expr`s
- [`flight_sql_server.rs`](examples/flight/flight_sql_server.rs): Run DataFusion as a standalone process and execute SQL queries from JDBC clients
- [`function_factory.rs`](examples/function_factory.rs): Register `CREATE FUNCTION` handler to implement SQL macros
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the example


ctx.sql(sql).await?.show().await?;

// Invoke f2, and we expect to see 1 + (1 + 2) = 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed some of the other things that didn't really help teach people how to use this feature any more (they were more like ensuring all features got covered) to keep it shorter

internal_err!("This function should not get invoked!")
}

/// The simplify function is called to simply a call such as `f2(2)`. This
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to add some comments inline that explained what the functions were doing with more words, but hopefully for someone who was seeing if this worked for their usecase

@milenkovicm
Copy link
Contributor Author

Thanks a lot @alamb, your additions totally make sense.
One question for you, is there a plan to add simplify to other function types?

Also, inspired by simplification you've done in SessionContext, maybe we should add companion object to FunctionFactory so it can register configuration as well. At the moment its a bit of work to register configuration:

https://github.com/milenkovicm/torchfusion/blob/634c2a4e39b81e1969671db8faea50b96bc43c7f/src/lib.rs#L101

let mut session_config = SessionConfig::new().with_information_schema(true);

session_config
      .options_mut()
      .extensions
      // register torch factory configuration
      .insert(TorchConfig::default());

it would be much better if config could be registered when ctx.with_function_factory(...) called ... but that could be a follow up PR

@alamb
Copy link
Contributor

alamb commented Mar 10, 2024

One question for you, is there a plan to add simplify to other function types?

Excellent idea -- I have filed

@alamb
Copy link
Contributor

alamb commented Mar 10, 2024

it would be much better if config could be registered when ctx.with_function_factory(...) called ... but that could be a follow up PR

I filed a ticket to discuss this idea: #9529

@alamb
Copy link
Contributor

alamb commented Mar 10, 2024

Thanks again @milenkovicm

@alamb alamb merged commit 6710e6d into apache:main Mar 10, 2024
23 checks passed
@milenkovicm milenkovicm deleted the create_function_example branch April 12, 2024 09:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add example for CREATE FUNCTION & FunctionFactory
2 participants