Add @byrow attempt 2 #250

Merged
merged 25 commits into from
Jun 16, 2021
3 changes: 1 addition & 2 deletions docs/Project.toml
@@ -1,6 +1,5 @@
[deps]
DataFramesMeta = "1313f7d8-7da2-5740-9ea0-a2ca25f37964"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"

[compat]
Documenter = "0.25"
Documenter = "0.27"
1 change: 1 addition & 0 deletions docs/src/api/api.md
@@ -2,4 +2,5 @@

```@autodocs
Modules = [DataFramesMeta]
Private = false
```
58 changes: 58 additions & 0 deletions docs/src/index.md
@@ -18,6 +18,7 @@ In addition, DataFramesMeta provides
convenient syntax
* `@eachrow` and `@eachrow!` for looping through rows in data frame, again with high performance and
convenient syntax.
* `@byrow` for applying functions to each row of a data frame (only supported inside other macros).
* `@linq`, for piping the above macros together, similar to [magrittr](https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html)'s
`%>%` in R.

@@ -268,6 +269,63 @@ df2 = @eachrow df begin
end
```

## Row-wise transformations with `@byrow`
Review comment (Member):

Mention @byrow at the top of the file?

Rather than starting the section with technical details, it would be more user-friendly to say what @byrow does first, then show examples, and only then mention ByRow and the fact that @byrow isn't a real macro.

Review comment (Member):

I think the second paragraph still applies: it would be nice to start with a sentence or two saying that @byrow allows writing code that is applied to each row instead of having to vectorize it.


`@byrow` provides a convenient syntax to apply operations by-row,
without having to vectorize manually.

DataFrames.jl provides the function wrapper `ByRow`. `ByRow(f)(x, y)`
is roughly equivalent to `f.(x, y)`. DataFramesMeta.jl allows users
to construct expressions using `ByRow` function wrapper with the
syntax `@byrow`.
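
As a rough illustration of the `ByRow` wrapper on its own, the sketch below uses only DataFrames.jl. The vectors `x` and `y` and the data frame `df` are made up for this example and do not come from the package documentation.

```julia
using DataFrames

# Made-up inputs for illustration.
x = [1, 2, 3]
y = [10, 20, 30]

# ByRow(f) applied to whole columns calls f once per row ...
byrow_result = ByRow(+)(x, y)

# ... which yields the same values as broadcasting f over the columns.
broadcast_result = (+).(x, y)

# The same wrapper used inside a DataFrames.jl transformation:
# the anonymous function receives one element of :x at a time.
df = DataFrame(x = [1, 2, 1])
df2 = transform(df, :x => ByRow(x -> x == 1) => :y)
```

Here `df2` gains a Boolean column `y` that is computed element by element, without the function itself having to be written in vectorized form.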

`@byrow` is not a "real" macro and cannot be used outside of
DataFramesMeta.jl macros. However, its behavior within DataFramesMeta.jl
macros should be indistinguishable from that of externally defined macros.
Thought of as a macro, `@byrow` accepts a single argument and
creates an anonymous function wrapped in `ByRow`. For example,

```julia
@transform(df, @byrow y = :x == 1 ? true : false)
```

is equivalent to

```julia
transform(df, :x => ByRow(x -> x == 1 ? true : false) => :y)
```

The following macros accept `@byrow`:

* `@transform` and `@transform!`, `@select`, `@select!`, and `@combine`.
`@byrow` can be used in the left hand side of expressions, e.g.
`@select(df, @byrow z = :x * :y)`.
* `@where` and `@orderby`, with syntax of the form `@where(df, @byrow :x > :y)`
* `@with`, where the anonymous function created by `@with` is wrapped in
`ByRow`, as in `@with(df, @byrow :x * :y)`.

To avoid writing `@byrow` multiple times when performing multiple
operations, you can place `@byrow` at the beginning of a block of
operations. All transformations in the block will then operate by row.

```julia
julia> @where df @byrow begin
           :a > 1
           :b < 5
       end
1×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     2      4
```
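
For comparison, a block like the one above can be approximated in plain DataFrames.jl with `subset` and `ByRow`. This is only a sketch: the contents of `df` are hypothetical (the source does not show them), and it assumes the conditions in the block are combined with a logical AND.

```julia
using DataFrames

# Hypothetical data; the source does not show the contents of `df`.
df = DataFrame(a = [1, 2, 3], b = [6, 4, 7])

# Keep only rows where both conditions hold, evaluated one row at a time.
kept = subset(df, [:a, :b] => ByRow((a, b) -> a > 1 && b < 5))
```

With this made-up data only the row with `a = 2` and `b = 4` satisfies both conditions.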

`@byrow` can be used inside macros that accept `GroupedDataFrame`s.
However, as with `ByRow` in DataFrames.jl, functions used with
`@byrow` do not take the grouping into account. For example,
`@transform(df, @byrow y = f(:x))` and
`@transform(groupby(df, :g), @byrow y = f(:x))` give the same result.
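
The grouping behavior can be checked with `ByRow` directly in DataFrames.jl. In this sketch the data frame, the grouping column `g`, and the function `f` are all made up for illustration.

```julia
using DataFrames

# Made-up data and function for illustration.
df = DataFrame(g = [1, 1, 2, 2], x = [1, 2, 3, 4])
f(x) = x^2

# Row-wise transformation: the grouping makes no difference to the values.
ungrouped = transform(df, :x => ByRow(f) => :y)
grouped = transform(groupby(df, :g), :x => ByRow(f) => :y)

# Contrast: a function that is not wrapped in ByRow sees each group separately.
sum_all = transform(df, :x => sum => :s)
sum_by_group = transform(groupby(df, :g), :x => sum => :s)
```

The `y` columns of `ungrouped` and `grouped` hold the same values, while the `s` columns of `sum_all` and `sum_by_group` differ because `sum` is applied per group in the second call.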

## Working with column names programmatically with `cols`

DataFramesMeta provides the special syntax `cols` for referring to
3 changes: 2 additions & 1 deletion src/DataFramesMeta.jl
@@ -8,7 +8,8 @@ using Reexport
export @with, @where, @orderby, @transform, @by, @combine, @select,
@transform!, @select!,
@eachrow, @eachrow!,
@byrow, @byrow!, @based_on # deprecated
@byrow,
@based_on # deprecated


global const DATAFRAMES_GEQ_22 = isdefined(DataFrames, :pretty_table) ? true : false
22 changes: 0 additions & 22 deletions src/eachrow.jl
@@ -70,28 +70,6 @@ function eachrow_helper(df, body, deprecation_warning)
end
end

"""
@byrow!(d, expr)

Deprecated version of `@eachrow`, see: [`@eachrow`](@ref)

Acts the exact same way. It does not change the input argument `d` in-place.
"""
macro byrow!(df, body)
esc(eachrow_helper(df, body, true))
end

"""
@byrow(d, expr)

Deprecated version of `@eachrow`, see: [`@eachrow`](@ref)

Acts the exact same way.
"""
macro byrow(d, body)
esc(eachrow_helper(d, body, true))
end

"""
@eachrow(df, body)
