Document LogicalPlan tree node transformations
alamb committed Apr 9, 2024
1 parent eb05741 commit 6bea98c
Showing 4 changed files with 59 additions and 22 deletions.
39 changes: 21 additions & 18 deletions datafusion/core/src/lib.rs
Expand Up @@ -296,11 +296,15 @@
//! A [`LogicalPlan`] is a Directed Acyclic Graph (DAG) of other
//! [`LogicalPlan`]s, each potentially containing embedded [`Expr`]s.
//!
//! [`Expr`]s can be rewritten using the [`TreeNode`] API and simplified using
//! [`ExprSimplifier`]. Examples of working with and executing `Expr`s can be found in the
//! [`expr_api`.rs] example
//! `LogicalPlan`s can be rewritten with the [`TreeNode`] API; see the
//! [`tree_node module`] for more details.
//!
//! [`Expr`]s can also be rewritten with the [`TreeNode`] API and simplified using
//! [`ExprSimplifier`]. Examples of working with and executing `Expr`s can be
//! found in the [`expr_api`.rs] example.
//!
//! [`TreeNode`]: datafusion_common::tree_node::TreeNode
//! [`tree_node module`]: datafusion_expr::logical_plan::tree_node
//! [`ExprSimplifier`]: crate::optimizer::simplify_expressions::ExprSimplifier
//! [`expr_api`.rs]: https://github.com/apache/arrow-datafusion/blob/main/datafusion-examples/examples/expr_api.rs
//!
Expand Down Expand Up @@ -460,12 +464,23 @@
//! [`RecordBatchReader`]: arrow::record_batch::RecordBatchReader
//! [`Array`]: arrow::array::Array

/// DataFusion crate version
pub const DATAFUSION_VERSION: &str = env!("CARGO_PKG_VERSION");

extern crate core;
extern crate sqlparser;

// re-export dependencies from arrow-rs to minimize version maintenance for crate users
pub use arrow;
#[cfg(feature = "parquet")]
pub use parquet;

// Backwards compatibility
pub use common::config;
// Reexport testing macros for compatibility
pub use datafusion_common::assert_batches_eq;
pub use datafusion_common::assert_batches_sorted_eq;

/// DataFusion crate version
pub const DATAFUSION_VERSION: &str = env!("CARGO_PKG_VERSION");

pub mod catalog;
pub mod dataframe;
pub mod datasource;
Expand All @@ -477,11 +492,6 @@ pub mod prelude;
pub mod scalar;
pub mod variable;

// re-export dependencies from arrow-rs to minimize version maintenance for crate users
pub use arrow;
#[cfg(feature = "parquet")]
pub use parquet;

// re-export DataFusion sub-crates at the top level. Use `pub use *`
// so that the contents of the subcrates appear in rustdocs
// for details, see https://github.com/apache/arrow-datafusion/issues/6648
Expand All @@ -496,9 +506,6 @@ pub mod common {
}
}

// Backwards compatibility
pub use common::config;

// NB datafusion execution is re-exported in the `execution` module

/// re-export of [`datafusion_expr`] crate
Expand All @@ -521,10 +528,6 @@ pub mod physical_plan {
pub use datafusion_physical_plan::*;
}

// Reexport testing macros for compatibility
pub use datafusion_common::assert_batches_eq;
pub use datafusion_common::assert_batches_sorted_eq;

/// re-export of [`datafusion_sql`] crate
pub mod sql {
pub use datafusion_sql::*;
Expand Down
2 changes: 1 addition & 1 deletion datafusion/expr/src/logical_plan/mod.rs
Expand Up @@ -22,7 +22,7 @@ pub mod dml;
mod extension;
mod plan;
mod statement;
mod tree_node;
pub mod tree_node;

pub use builder::{
build_join_schema, table_scan, union, wrap_projection_for_join_if_necessary,
Expand Down
20 changes: 19 additions & 1 deletion datafusion/expr/src/logical_plan/plan.rs
Expand Up @@ -68,6 +68,11 @@ pub use datafusion_common::{JoinConstraint, JoinType};
/// an output relation (table) with a (potentially) different
/// schema. A plan represents a dataflow tree where data flows
/// from leaves up to the root to produce the query result.
///
/// # See also:
/// * [`tree_node`]: visiting and rewriting API
///
/// [`tree_node`]: crate::logical_plan::tree_node
#[derive(Clone, PartialEq, Eq, Hash)]
pub enum LogicalPlan {
/// Evaluates an arbitrary list of expressions (essentially a
Expand Down Expand Up @@ -238,7 +243,10 @@ impl LogicalPlan {
}

/// Returns all expressions (non-recursively) evaluated by the current
/// logical plan node. This does not include expressions in any children
/// logical plan node. This does not include expressions in any children.
///
/// Note this method `clone`s all the expressions. When possible, the
/// [`tree_node`] API should be used instead of this API.
///
/// The returned expressions do not necessarily represent or even
/// contribute to the output schema of this node. For example,
Expand All @@ -248,6 +256,8 @@ impl LogicalPlan {
/// The expressions do contain all the columns that are used by this plan,
/// so if there are columns not referenced by these expressions then
/// DataFusion's optimizer attempts to optimize them away.
///
/// [`tree_node`]: crate::logical_plan::tree_node
pub fn expressions(self: &LogicalPlan) -> Vec<Expr> {
let mut exprs = vec![];
self.apply_expressions(|e| {
Expand Down Expand Up @@ -773,10 +783,16 @@ impl LogicalPlan {
/// Returns a new `LogicalPlan` based on `self` with inputs and
/// expressions replaced.
///
/// Note this method creates an entirely new node, which requires a large
/// amount of cloning. When possible, the [`tree_node`] API should be used
/// instead of this API.
///
/// The exprs correspond to the same order of expressions returned
/// by [`Self::expressions`]. This function is used by optimizers
/// to rewrite plans using the following pattern:
///
/// [`tree_node`]: crate::logical_plan::tree_node
///
/// ```text
/// let new_inputs = optimize_children(..., plan, props);
///
Expand Down Expand Up @@ -1352,6 +1368,7 @@ macro_rules! handle_transform_recursion_up {
}

impl LogicalPlan {
/// Visits a plan similarly to [`Self::visit`], but including embedded subqueries.
pub fn visit_with_subqueries<V: TreeNodeVisitor<Node = Self>>(
&self,
visitor: &mut V,
Expand All @@ -1365,6 +1382,7 @@ impl LogicalPlan {
.visit_parent(|| visitor.f_up(self))
}

/// Rewrites a plan similarly to [`Self::rewrite`], but including embedded subqueries.
pub fn rewrite_with_subqueries<R: TreeNodeRewriter<Node = Self>>(
self,
rewriter: &mut R,
Expand Down
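
To illustrate what "including embedded subqueries" means for the two methods documented in this hunk, here is a minimal, self-contained sketch in plain Rust. `ToyPlan`, `ToyExpr`, `leaf`, and `names` are hypothetical names invented for this illustration; they are not DataFusion types, and the real methods take `TreeNodeVisitor` / `TreeNodeRewriter` implementations rather than closures. The order in which subqueries and inputs are walked is also an assumption of the sketch.

```rust
/// Hypothetical expression type: a plain column or an embedded subquery plan.
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    Column(String),
    Subquery(Box<ToyPlan>),
}

/// Hypothetical plan node: a name, its expressions, and its direct inputs.
#[derive(Debug, Clone, PartialEq)]
struct ToyPlan {
    name: String,
    exprs: Vec<ToyExpr>,
    inputs: Vec<ToyPlan>,
}

impl ToyPlan {
    /// Pre-order walk over this node and its inputs only.
    /// Plans hidden inside expressions are never reached.
    fn visit(&self, f: &mut impl FnMut(&ToyPlan)) {
        f(self);
        for input in &self.inputs {
            input.visit(&mut *f);
        }
    }

    /// Pre-order walk that also descends into plans embedded in expressions.
    fn visit_with_subqueries(&self, f: &mut impl FnMut(&ToyPlan)) {
        f(self);
        for expr in &self.exprs {
            if let ToyExpr::Subquery(plan) = expr {
                plan.visit_with_subqueries(&mut *f);
            }
        }
        for input in &self.inputs {
            input.visit_with_subqueries(&mut *f);
        }
    }
}

/// Helper (hypothetical): a node with no expressions and no inputs.
fn leaf(name: &str) -> ToyPlan {
    ToyPlan { name: name.to_string(), exprs: vec![], inputs: vec![] }
}

/// Helper (hypothetical): collect node names in visiting order.
fn names(plan: &ToyPlan, with_subqueries: bool) -> Vec<String> {
    let mut out = Vec::new();
    let mut record = |p: &ToyPlan| out.push(p.name.clone());
    if with_subqueries {
        plan.visit_with_subqueries(&mut record);
    } else {
        plan.visit(&mut record);
    }
    out
}
```

For a `Filter` node whose expressions embed a subquery over `t2` and whose only input scans `t1`, `names(&plan, false)` collects just the filter and the `t1` scan, while `names(&plan, true)` additionally reaches the `t2` scan inside the expression.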

20 changes: 18 additions & 2 deletions datafusion/expr/src/logical_plan/tree_node.rs
Expand Up @@ -15,8 +15,24 @@
// specific language governing permissions and limitations
// under the License.

//! Tree node implementation for logical plan

//! [`TreeNode`] based visiting and rewriting for [`LogicalPlan`]s
//!
//! Visiting (read only) APIs:
//! * [`LogicalPlan::visit`]: recursively visit the node and all of its inputs
//! * [`LogicalPlan::visit_with_subqueries`]: recursively visit the node and all of its inputs, including subqueries
//! * [`LogicalPlan::apply_children`]: recursively visit all inputs of this node
//! * [`LogicalPlan::apply_with_subqueries`]: recursively visit all inputs and embedded subqueries.
//! * [`LogicalPlan::apply_expressions`]: (non recursively) visit all expressions of this node
//!
//! Rewriting (update) APIs:
//! * [`LogicalPlan::rewrite`]: recursively rewrite the node and all of its inputs
//! * [`LogicalPlan::rewrite_with_subqueries`]: recursively rewrite the node and all of its inputs, including subqueries
//! * [`LogicalPlan::map_children`]: recursively rewrite all inputs of this node
//! * [`LogicalPlan::map_expressions`]: (non recursively) rewrite all expressions of this node
//!
//! (Re)creation APIs (these require substantial cloning and thus are slow):
//! * [`LogicalPlan::with_new_exprs`]: Create a new plan with different expressions
//! * [`LogicalPlan::expressions`]: Return a copy of the expressions of this node
use crate::LogicalPlan;

use datafusion_common::tree_node::{
Expand Down
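
The module docs in this hunk contrast the rewriting APIs with the "(re)creation" APIs that require substantial cloning. The sketch below shows that difference on a toy tree in plain Rust. `MiniPlan` and the functions `uppercase_all` and `uppercase_all_cloning` are hypothetical and exist only for this illustration; the method names merely echo the ones listed in the docs, and their signatures here are simplified assumptions, not DataFusion's.

```rust
/// Hypothetical plan node, not DataFusion's `LogicalPlan`.
#[derive(Debug, Clone, PartialEq)]
struct MiniPlan {
    name: String,
    exprs: Vec<String>,
    inputs: Vec<MiniPlan>,
}

impl MiniPlan {
    /// Clone-based accessor: copies every expression of this node.
    fn expressions(&self) -> Vec<String> {
        self.exprs.clone()
    }

    /// Clone-based re-creation: builds a brand-new node from new parts.
    fn with_new_exprs(&self, exprs: Vec<String>, inputs: Vec<MiniPlan>) -> MiniPlan {
        MiniPlan { name: self.name.clone(), exprs, inputs }
    }

    /// Ownership-based rewrite of this node's expressions (non-recursive).
    /// The node is consumed, so nothing is cloned.
    fn map_expressions(self, f: impl FnMut(String) -> String) -> MiniPlan {
        MiniPlan {
            name: self.name,
            exprs: self.exprs.into_iter().map(f).collect(),
            inputs: self.inputs,
        }
    }

    /// Ownership-based rewrite of this node's direct inputs.
    fn map_children(self, f: impl FnMut(MiniPlan) -> MiniPlan) -> MiniPlan {
        MiniPlan {
            name: self.name,
            exprs: self.exprs,
            inputs: self.inputs.into_iter().map(f).collect(),
        }
    }
}

/// Rewrite by taking ownership: each node is moved and updated in place.
fn uppercase_all(plan: MiniPlan) -> MiniPlan {
    plan.map_expressions(|e| e.to_uppercase())
        .map_children(uppercase_all)
}

/// Rewrite by re-creation: every expression and node name is cloned.
fn uppercase_all_cloning(plan: &MiniPlan) -> MiniPlan {
    let exprs = plan
        .expressions()
        .into_iter()
        .map(|e| e.to_uppercase())
        .collect();
    let inputs = plan.inputs.iter().map(uppercase_all_cloning).collect();
    plan.with_new_exprs(exprs, inputs)
}
```

Both functions produce the same tree. The ownership-based version moves existing allocations through the rewrite, while the clone-based version copies each expression before transforming it, which is the cost the doc comments warn about.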
