Add support for f ∘ g #317

Merged 3 commits on Jan 4, 2022
Changes from 2 commits
12 changes: 10 additions & 2 deletions src/parsing.jl
@@ -59,19 +59,27 @@ function replace_dotted!(membernames, e)
     Expr(:., x_new, y_new)
 end
 
+composed_or_symbol(x) = false
+composed_or_symbol(x::Symbol) = true
+function composed_or_symbol(x::Expr)
+    x.head == :call &&
+        x.args[1] == :∘ &&
+        all(composed_or_symbol, x.args[2:end])
+end
+
 is_simple_non_broadcast_call(x) = false
 function is_simple_non_broadcast_call(expr::Expr)
     expr.head == :call &&
         length(expr.args) >= 2 &&
-        expr.args[1] isa Symbol &&
+        composed_or_symbol(expr.args[1]) &&
         all(a -> get_column_expr(a) !== nothing, expr.args[2:end])
 end
 
 is_simple_broadcast_call(x) = false
 function is_simple_broadcast_call(expr::Expr)
     expr.head == :. &&
         length(expr.args) == 2 &&
-        expr.args[1] isa Symbol &&
+        composed_or_symbol(expr.args[1]) &&
         expr.args[2] isa Expr &&
         expr.args[2].head == :tuple &&
         all(a -> get_column_expr(a) !== nothing, expr.args[2].args)
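To illustrate what the `composed_or_symbol` check in src/parsing.jl accepts, here is a minimal standalone sketch. It copies the predicate from the diff so that it runs without loading DataFramesMeta; the names `f`, `g`, `h`, `a`, `b`, and `x` are placeholder symbols used only for illustration, and the snippet is not part of the PR.

```julia
# Standalone copy of the predicate from the diff (no DataFramesMeta needed).
composed_or_symbol(x) = false
composed_or_symbol(x::Symbol) = true
function composed_or_symbol(x::Expr)
    x.head == :call &&
        x.args[1] == :∘ &&
        all(composed_or_symbol, x.args[2:end])
end

# A call whose callee is a composition parses as a nested :call expression.
call_ex = :((f ∘ g)(a, b))
callee = call_ex.args[1]          # the expression `f ∘ g`

# A broadcast call keeps the callee in args[1] and the arguments in a :tuple.
dot_ex = :((f ∘ g).(a, b))

@assert call_ex.head == :call
@assert callee == :(f ∘ g)
@assert composed_or_symbol(callee)
@assert composed_or_symbol(:f)                 # a bare symbol is still accepted
@assert composed_or_symbol(:(f ∘ g ∘ h))       # nested compositions recurse
@assert !composed_or_symbol(:(f(x) ∘ g))       # an operand that is itself a call is rejected
@assert !composed_or_symbol(:(x -> x + 1))     # anonymous functions are rejected
@assert dot_ex.head == :. && dot_ex.args[2].head == :tuple
```

The recursion means any chain built only from symbols and `∘` is accepted, while anything else in the callee position falls back to the generic `false` method.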
48 changes: 42 additions & 6 deletions test/function_compilation.jl
@@ -4,16 +4,18 @@ using Test
 using DataFramesMeta
 
 @testset "function_compilation" begin
+    @eval begin
+        df = DataFrame(a = [1], b = [2])
+
+        testfun(x, y) = x .* y
+        testdotfun(x, y) = x * y
+        testnt(x) = (c = x,)
+    end
+
     # Lazy way of making sure all functions are pre-compiled.
     # @eval prevents julia from caching the intermediate anonymous functions.
     for _ in 1:2
         @eval begin
-            df = DataFrame(a = [1], b = [2])
-
-            testfun(x, y) = x .* y
-            testdotfun(x, y) = x * y
-            testnt(x) = (c = x,)
-
             @test @select(df, :c = :a + :b) == DataFrame(c = [3])
 
             fasttime = @timed @select(df, :c = :a + :b)
@@ -187,4 +189,38 @@ using DataFramesMeta
         end
     end
 end
+
+@testset "composed compilation" begin
+    @eval begin
Member:
same here - I do not understand the benefit of @eval?

Collaborator Author:
As mentioned above, @eval is necessary because we need to evaluate in global scope to prevent caching the anonymous function in the method table.

+        df = DataFrame(a = [1], b = [2])
+
+        f(x) = identity(x)
+        g(x, y) = x + y
+
+        df_wide = DataFrame(rand(10, 1000), :auto)
+    end
+
+    for _ in 1:2
+        @eval begin
+            @test @select(df, :y = (f ∘ g)(:a, :b)).y == [3]
Member:
Do you think testing timing is enough? Maybe also generated code should be tested?

Collaborator Author:
We don't check generated code anywhere else. I think this should be enough.

Collaborator Author:
Never mind, it's easy to add. Added. Will merge after tests.


+
+            fasttime = @timed @select(df, :y = (f ∘ g)(:a, :b))
+            slowtime = @timed select(df, [:a, :b] => ((a, b) -> (f ∘ g)(a, b)) => :y )
+            (slowtime[2] > fasttime[2]) || @warn("Slow compilation")
+
+            @test @select(df, :y = (f ∘ g).(:a, :b)).y == [3]
+
+            fasttime = @timed @select(df, :y = (f ∘ g).(:a, :b))
+            slowtime = @timed select(df, [:a, :b] => ((a, b) -> (f ∘ g).(a, b)) => :y )
+            (slowtime[2] > fasttime[2]) || @warn("Slow compilation")
+
+            fasttime = @timed @rselect df_wide :y = (sum ∘ skipmissing)(AsTable(:))
+            slowtime = @timed select(df_wide, AsTable(:) => ByRow(t -> (sum ∘ skipmissing)(t)) => :y)
+
+            (slowtime[2] > fasttime[2]) || @warn("Slow compilation")
+        end
+    end
+
+end
+
 end # module