Benchmark recommenders with fit! optimization and refactoring #50

Merged
merged 15 commits into from
Feb 27, 2022

@takuti takuti commented Feb 22, 2022

Relates to #26

Latest benchmark result:

```julia
julia> using BenchmarkTools
julia> using Recommendation
julia> data = load_movielens_100k()
julia> recommender = MostPopular(data)
julia> @btime fit!(recommender)
  32.667 ms (2976 allocations: 59.84 KiB)
```

whereas the result originally was:

```julia
julia> @btime fit!(recommender)
  58.305 ms (2322995 allocations: 47.77 MiB)
```
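
A drop from roughly 2.3 million allocations to under 3,000 is what one would expect when per-element temporaries are removed from a hot loop. As a rough illustration only, here is a self-contained sketch of that general idea; `popularity_naive` and `popularity_inplace!` are hypothetical stand-ins written for this note and are not the code in `src/baseline/most_popular.jl`. It assumes a dense rating matrix where `0` means "unrated".

```julia
# Hypothetical stand-ins; NOT the package's actual MostPopular implementation.

# Allocation-heavy: copies every column and grows the result one push at a time.
function popularity_naive(R::AbstractMatrix)
    scores = Float64[]
    for j in axes(R, 2)
        col = R[:, j]                       # slicing copies the column
        push!(scores, count(!iszero, col))  # result vector is resized repeatedly
    end
    return scores
end

# Allocation-light: one preallocated output, columns read through views.
function popularity_inplace!(scores::Vector{Float64}, R::AbstractMatrix)
    for j in axes(R, 2)
        scores[j] = count(!iszero, view(R, :, j))
    end
    return scores
end

# Toy 3 users x 3 items rating matrix (assumption: 0 = unrated).
R = Float64[5 0 3; 0 0 4; 1 2 0]
scores = zeros(size(R, 2))

# Run both once first so compilation is not counted.
popularity_naive(R)
popularity_inplace!(scores, R)

@show popularity_naive(R) == popularity_inplace!(scores, R)
@show @allocated popularity_naive(R)
@show @allocated popularity_inplace!(scores, R)
```

Both functions return the same counts; only the number of bytes reported by `@allocated` should differ, and the exact figures depend on the Julia version.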

codecov-commenter commented Feb 22, 2022

Codecov Report

Merging #50 (fbd2049) into master (40a64dc) will decrease coverage by 0.66%.
The diff coverage is 83.92%.

@@            Coverage Diff             @@
##           master      #50      +/-   ##
==========================================
- Coverage   79.29%   78.63%   -0.67%     
==========================================
  Files          23       23              
  Lines         739      716      -23     
==========================================
- Hits          586      563      -23     
  Misses        153      153              

| Impacted Files | Coverage Δ |
|---|---|
| src/datasets.jl | 26.04% <0.00%> (ø) |
| src/base_recommender.jl | 100.00% <100.00%> (ø) |
| src/baseline/co_occurrence.jl | 100.00% <100.00%> (ø) |
| src/baseline/item_mean.jl | 100.00% <100.00%> (ø) |
| src/baseline/most_popular.jl | 100.00% <100.00%> (ø) |
| src/baseline/threshold_percentage.jl | 100.00% <100.00%> (ø) |
| src/baseline/user_mean.jl | 100.00% <100.00%> (ø) |
| src/data_accessor.jl | 100.00% <100.00%> (ø) |
| src/evaluation/cross_validation.jl | 100.00% <100.00%> (ø) |
| src/evaluation/evaluate.jl | 100.00% <100.00%> (ø) |

... and 5 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 40a64dc...fbd2049.

`fit!` performance (before vs. after):

CoOccurrence [needs improvement]
  Before 108.658 ms (4658986 allocations: 96.15 MiB)
  After  181.440 ms (1589204 allocations: 36.40 MiB)

ItemMean
  Before 98.345 ms (5500291 allocations: 96.25 MiB)
  After  50.346 ms (3180166 allocations: 48.54 MiB)

UserMean
  Before 129.993 ms (5870962 allocations: 101.84 MiB)
  After  88.428 ms (5488286 allocations: 83.76 MiB)

ThresholdPercentage
  Before 124.990 ms (5500295 allocations: 168.93 MiB)
  After  117.950 ms (3176480 allocations: 121.13 MiB)
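
To make the before/after pairs easier to compare, the ratios can be computed directly from the timings listed above. This is just arithmetic on the reported numbers; `speedup` and `timings_ms` are names introduced here for illustration.

```julia
# (before, after) timings in milliseconds, copied from the list above.
timings_ms = [
    "CoOccurrence"        => (108.658, 181.440),
    "ItemMean"            => (98.345, 50.346),
    "UserMean"            => (129.993, 88.428),
    "ThresholdPercentage" => (124.990, 117.950),
]

# A ratio above 1 means `fit!` got faster; below 1 means it got slower.
speedup(before, after) = before / after

for (name, (before, after)) in timings_ms
    println(rpad(name, 22), round(speedup(before, after), digits=2), "x")
end
```

CoOccurrence is the only entry whose ratio falls below 1, which matches its "need improvement" tag: allocations went down, but wall-clock time went up.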

```julia
using Recommendation
using BenchmarkTools

data = load_movielens_100k();
recommender = CoOccurrence(data, 1);
@btime fit!(recommender);
```

Before 176.925 ms (1589204 allocations: 36.40 MiB)
After 51.136 ms (1589140 allocations: 24.48 MiB)

```julia
data = DataAccessor(reshape(collect(1:1000), 20, 50))
recommender = UserKNN(data)
@btime fit!(recommender)
```

Before 796.582 μs (8632 allocations: 1.55 MiB)
After 565.204 μs (6537 allocations: 760.14 KiB)

No significant change in `predict` for now, but added a line so that
`k` cannot exceed the total number of users.
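
A guard of this kind usually amounts to clamping `k` before selecting neighbors. The sketch below shows the idea in isolation; `top_k_neighbors` is a hypothetical helper whose name and signature are assumptions, not part of the package's API.

```julia
# Hypothetical helper; NOT the package's `predict` code.
# Returns the indices of the `k` largest similarities, never asking for
# more neighbors than there are candidates.
function top_k_neighbors(similarities::AbstractVector{<:Real}, k::Integer)
    k = min(k, length(similarities))   # guard: clamp k to the candidate count
    return partialsortperm(similarities, 1:k, rev=true)
end

sims = [0.1, 0.9, 0.4]
@show top_k_neighbors(sims, 2)    # the two most similar candidates
@show top_k_neighbors(sims, 20)   # k is clamped to 3, so all candidates are returned
```

The same clamp applies unchanged when the candidates are items rather than users.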

```julia
data = load_movielens_100k()
recommender = ItemKNN(data)
@btime fit!(recommender, adjusted_cosine=true)
```

Before (did not finish within several minutes)
After 17.048 s (33367232 allocations: 20.75 GiB)

No significant change in `predict` for now, but added a line so that
`k` cannot exceed the total number of items.
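
For readers unfamiliar with the `adjusted_cosine` flag used above: adjusted cosine similarity is commonly defined as the cosine similarity between two items after each user's mean rating has been subtracted. The sketch below follows that textbook definition on a dense users x items matrix with no missing ratings. `adjusted_cosine_similarity` is a hypothetical helper and makes no claim about how `ItemKNN` computes it or handles unrated entries.

```julia
using Statistics: mean
using LinearAlgebra: dot, norm

# Hypothetical helper illustrating the textbook definition;
# assumes a dense users x items matrix with no missing ratings.
function adjusted_cosine_similarity(R::AbstractMatrix{<:Real}, i::Integer, j::Integer)
    centered = R .- mean(R, dims=2)    # subtract each user's mean rating
    a = view(centered, :, i)
    b = view(centered, :, j)
    denom = norm(a) * norm(b)
    return denom == 0 ? 0.0 : dot(a, b) / denom
end

# Toy matrix: 3 users (rows) x 3 items (columns).
R = Float64[5 3 1; 4 2 3; 1 5 3]
@show adjusted_cosine_similarity(R, 1, 1)   # an item compared with itself
@show adjusted_cosine_similarity(R, 1, 2)
```

Recomputing the centered matrix on every call is wasteful and is done here only to keep the example short; a real implementation would center once and reuse the result.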
e.g., `n_user` -> `n_users`, `n_factor` -> `n_factors`

takuti commented Feb 26, 2022

Quick benchmark results on a local laptop:

MacBook Pro (16-inch, 2019)
2.6 GHz 6-Core Intel Core i7
16 GB 2667 MHz DDR4

Baselines:

julia> using Recommendation
julia> using BenchmarkTools
julia> data = load_movielens_100k();

julia> recommender = UserMean(data);
julia> @btime fit!(recommender);

  88.811 ms (5489223 allocations: 83.77 MiB)

julia> recommender = ItemMean(data);
julia> @btime fit!(recommender);

  51.546 ms (3180166 allocations: 48.54 MiB)

julia> recommender = MostPopular(data);
julia> @btime fit!(recommender);

  35.052 ms (2976 allocations: 59.84 KiB)

julia> recommender = ThresholdPercentage(data, 3.0);
julia> @btime fit!(recommender);

  110.403 ms (3176482 allocations: 121.13 MiB)

julia> recommender = CoOccurrence(data, 1);
julia> @btime fit!(recommender);

  51.660 ms (1589140 allocations: 24.48 MiB)

Model-based:

julia> recommender = UserKNN(data, 20, true);
julia> @btime fit!(recommender);

  9.414 s (125878781 allocations: 2.91 GiB)

julia> recommender = ItemKNN(data);
julia> @btime fit!(recommender, adjusted_cosine=true);

  16.051 s (33367232 allocations: 20.75 GiB)

julia> recommender = SVD(data);
julia> @btime fit!(recommender);

  343.518 ms (1638163 allocations: 76.85 MiB)

julia> recommender = MatrixFactorization(data);
julia> @btime fit!(recommender, shuffled=false);

  28.784 s (329080643 allocations: 32.32 GiB)

julia> recommender = FactorizationMachines(data);
julia> @btime fit!(recommender, shuffled=false);
  138.807 s (1993371437 allocations: 215.60 GiB)

@takuti takuti changed the title Benchmark recommenders and optimize fit! Benchmark recommenders with fit! optimization and refactoring Feb 26, 2022
@takuti takuti merged commit 37cc748 into master Feb 27, 2022
@takuti takuti deleted the benchmark branch February 27, 2022 03:27