Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary traits and wrapper types from the query crate #3884

Closed
6 tasks done
evenyag opened this issue May 8, 2024 · 5 comments
Closed
6 tasks done

Remove unnecessary traits and wrapper types from the query crate #3884

evenyag opened this issue May 8, 2024 · 5 comments
Labels
A-query Involves code in query path good first issue Good for newcomers

Comments

@evenyag
Copy link
Contributor

evenyag commented May 8, 2024

What type of enhancement is this?

Refactor

What does the enhancement do?

We defined some wrapper types and traits for the query engine

Most implementations simply forward requests to datafusion. Since we are highly coupled with datafusion and have no plan to support another query engine, we could remove these types.

Implementation challenges

For Expr:

/// Central struct of query API.
/// Represent logical expressions such as `A + 1`, or `CAST(c1 AS int)`.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
pub struct Expr {
df_expr: DfExpr,
}

We could simply remove the Expr and re-export datafusion's Expr

pub use datafusion_expr::expr::Expr;

The LogicalPlan only has one variant so we can replace it with datafusion's LogicalPlan. But we also need to refactor methods of the enum so they can take datafusion's LogicalPlan as input.

#[derive(Clone, Debug)]
pub enum LogicalPlan {
DfPlan(DfLogicalPlan),
}

We could also remove adapters like the DfPhysicalPlanAdapter if we remove the PhysicalPlan.

Many places use Expr, LogicalPlan, and PhysicalPlan. We might need to refactor them in different PRs.

@evenyag evenyag added good first issue Good for newcomers A-query Involves code in query path labels May 8, 2024
@evenyag evenyag changed the title Remove unnecessary traits and wrapper types for the query engine Remove unnecessary traits and wrapper types from the query crate May 8, 2024
@sunheyi6
Copy link

@evenyag i want to pr LogicalPlan part.I took a rough look. Is it to delete the query::plan file and replace all query::plan::LogicalPlan with datafusion_expr::LogicalPlan?

@evenyag
Copy link
Contributor Author

evenyag commented Jul 16, 2024

@evenyag i want to pr LogicalPlan part.I took a rough look. Is it to delete the query::plan file and replace all query::plan::LogicalPlan with datafusion_expr::LogicalPlan?

We can still keep the query::plan file as we already implemented some methods for LogicalPlan. We may need to implement them as helper functions that take datafusion's plan as a parameter.

@sunheyi6
Copy link

@evenyag i want to pr LogicalPlan part.I took a rough look. Is it to delete the query::plan file and replace all query::plan::LogicalPlan with datafusion_expr::LogicalPlan?

We can still keep the query::plan file as we already implemented some methods for LogicalPlan. We may need to implement them as helper functions that take datafusion's plan as a parameter.

Could you please help me check if this modification meets your means?

before:

    pub fn schema(&self) -> Result<Schema> {
        match self {
            Self::DfPlan(plan) => {
                let df_schema = plan.schema();
                df_schema
                    .clone()
                    .try_into()
                    .context(ConvertDatafusionSchemaSnafu)
            }
        }
    }

after:

    pub fn schema(&self, plan: &DfLogicalPlan) -> Result<Schema> {
        let df_schema = plan.schema();
        df_schema.clone()
            .try_into()
            .context(ConvertDatafusionSchemaSnafu)
    }

@evenyag
Copy link
Contributor Author

evenyag commented Jul 16, 2024

You should remove &self

fn plan_schema(plan: &DfLogicalPlan) -> Result<Schema> {
    let df_schema = plan.schema();
    df_schema.clone()
        .try_into()
        .context(ConvertDatafusionSchemaSnafu)
}

I also renamed the function to plan_schema() to indicate it returns a schema for a plan.

@leaf-potato
Copy link
Contributor

leaf-potato commented Jul 23, 2024

@evenyag Remove the remaining 3 traits:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-query Involves code in query path good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants