Benchmark recommenders with `fit!` optimization and refactoring #50

takuti · 2022-02-22T14:57:13Z

Relates to #26

Latest benchmark result:

```julia
julia> using BenchmarkTools
julia> using Recommendation
julia> data = load_movielens_100k()
julia> recommender = MostPopular(data)
julia> @btime fit!(recommender)
  32.667 ms (2976 allocations: 59.84 KiB)
```

whereas the result originally was:

```julia
julia> @btime fit!(recommender)
  58.305 ms (2322995 allocations: 47.77 MiB)
```
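For readers wondering where an allocation drop of this size can come from, here is a minimal sketch. It is not the code in this PR: the matrix `R` and both function names are hypothetical, and it assumes unobserved entries are stored as zeros. It only contrasts building a temporary index vector per item with counting through a predicate over a view.

```julia
# Hypothetical stand-in for a user-item matrix; zeros mark unobserved entries.
R = [5.0 0.0 3.0;
     0.0 0.0 4.0;
     1.0 2.0 0.0;
     4.0 0.0 5.0]

# Allocating style: copies each column and builds an index vector for it.
popularity_alloc(R) = [length(findall(x -> x > 0, R[:, j])) for j in axes(R, 2)]

# Lighter style: count with a predicate over a view, so no per-column temporaries.
function popularity_count(R::AbstractMatrix)
    scores = zeros(Int, size(R, 2))
    for j in axes(R, 2)
        scores[j] = count(!iszero, view(R, :, j))
    end
    return scores
end

# Both styles agree on this non-negative example.
@assert popularity_alloc(R) == popularity_count(R) == [3, 1, 3]
```

Note that `!iszero` and `x -> x > 0` only agree when all stored values are non-negative and not `NaN`, so the swap is safe for rating data of that kind but not in general.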

codecov-commenter · 2022-02-22T14:59:03Z

Codecov Report

Merging #50 (fbd2049) into master (40a64dc) will decrease coverage by 0.66%.
The diff coverage is 83.92%.

@@            Coverage Diff             @@
##           master      #50      +/-   ##
==========================================
- Coverage   79.29%   78.63%   -0.67%     
==========================================
  Files          23       23              
  Lines         739      716      -23     
==========================================
- Hits          586      563      -23     
  Misses        153      153

Impacted Files	Coverage Δ
src/datasets.jl	`26.04% <0.00%> (ø)`
src/base_recommender.jl	`100.00% <100.00%> (ø)`
src/baseline/co_occurrence.jl	`100.00% <100.00%> (ø)`
src/baseline/item_mean.jl	`100.00% <100.00%> (ø)`
src/baseline/most_popular.jl	`100.00% <100.00%> (ø)`
src/baseline/threshold_percentage.jl	`100.00% <100.00%> (ø)`
src/baseline/user_mean.jl	`100.00% <100.00%> (ø)`
src/data_accessor.jl	`100.00% <100.00%> (ø)`
src/evaluation/cross_validation.jl	`100.00% <100.00%> (ø)`
src/evaluation/evaluate.jl	`100.00% <100.00%> (ø)`
... and 5 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

`fit!` performance (before vs. after):

| Recommender | Before | After |
| --- | --- | --- |
| CoOccurrence [need improvement] | 108.658 ms (4658986 allocations: 96.15 MiB) | 181.440 ms (1589204 allocations: 36.40 MiB) |
| ItemMean | 98.345 ms (5500291 allocations: 96.25 MiB) | 50.346 ms (3180166 allocations: 48.54 MiB) |
| UserMean | 129.993 ms (5870962 allocations: 101.84 MiB) | 88.428 ms (5488286 allocations: 83.76 MiB) |
| ThresholdPercentage | 124.990 ms (5500295 allocations: 168.93 MiB) | 117.950 ms (3176480 allocations: 121.13 MiB) |

```
using Recommendation
using BenchmarkTools
data = load_movielens_100k();
recommender = CoOccurrence(data, 1);
fit!(recommender);
```

Before: 176.925 ms (1589204 allocations: 36.40 MiB)

After: 51.136 ms (1589140 allocations: 24.48 MiB)

```
data = DataAccessor(reshape(collect(1:1000), 20, 50))
recommender = UserKNN(data)
fit!(recommender)
```

Before: 796.582 μs (8632 allocations: 1.55 MiB)

After: 565.204 μs (6537 allocations: 760.14 KiB)

No significant change in `predict` for now, but add a line to keep `k` from exceeding the total number of users.

```
data = load_movielens_100k()
recommender = ItemKNN(data)
@btime fit!(recommender, adjusted_cosine=true)
```

Before: did not finish within minutes

After: 17.048 s (33367232 allocations: 20.75 GiB)

No significant change in `predict` for now, but add a line to keep `k` from exceeding the total number of items.
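The guard on `k` mentioned for the kNN recommenders can be illustrated roughly as follows. This is a sketch under assumptions, not the line added in the PR: `top_k_neighbors` is a hypothetical helper, and it assumes neighbors are picked from a plain vector of similarity scores.

```julia
# Hypothetical helper: indices of the k largest similarities,
# with k clamped to the number of available candidates.
function top_k_neighbors(similarities::AbstractVector{<:Real}, k::Integer)
    k = min(k, length(similarities))  # guard: k must not exceed the candidate count
    return partialsortperm(similarities, 1:k, rev=true)
end

sims = [0.2, 0.9, 0.5]
top_k_neighbors(sims, 2)   # indices of the two most similar candidates
top_k_neighbors(sims, 10)  # k is clamped to 3 rather than indexing out of range
```

Without the `min`, asking for more neighbors than there are users (or items) would request a range outside the vector.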


takuti · 2022-02-26T19:18:02Z

Quick benchmark results on a local laptop:

MacBook Pro (16-inch, 2019)
2.6 GHz 6-Core Intel Core i7
16 GB 2667 MHz DDR4

Baselines:

```julia
julia> using Recommendation
julia> using BenchmarkTools
julia> data = load_movielens_100k();

julia> recommender = UserMean(data);
julia> @btime fit!(recommender);
  88.811 ms (5489223 allocations: 83.77 MiB)

julia> recommender = ItemMean(data);
julia> @btime fit!(recommender);
  51.546 ms (3180166 allocations: 48.54 MiB)

julia> recommender = MostPopular(data);
julia> @btime fit!(recommender);
  35.052 ms (2976 allocations: 59.84 KiB)

julia> recommender = ThresholdPercentage(data, 3.0);
julia> @btime fit!(recommender);
  110.403 ms (3176482 allocations: 121.13 MiB)

julia> recommender = CoOccurrence(data, 1);
julia> @btime fit!(recommender);
  51.660 ms (1589140 allocations: 24.48 MiB)
```

Model-based:

```julia
julia> recommender = UserKNN(data, 20, true);
julia> @btime fit!(recommender);
  9.414 s (125878781 allocations: 2.91 GiB)

julia> recommender = ItemKNN(data);
julia> @btime fit!(recommender, adjusted_cosine=true);
  16.051 s (33367232 allocations: 20.75 GiB)

julia> recommender = SVD(data);
julia> @btime fit!(recommender);
  343.518 ms (1638163 allocations: 76.85 MiB)

julia> recommender = MatrixFactorization(data);
julia> @btime fit!(recommender, shuffled=false);
  28.784 s (329080643 allocations: 32.32 GiB)

julia> recommender = FactorizationMachines(data);
julia> @btime fit!(recommender, shuffled=false);
  138.807 s (1993371437 allocations: 215.60 GiB)
```

takuti added 14 commits February 23, 2022 05:57

Use !iszero instead of x->x>0

04e2a38

Refactor ThresholdPercentage and UserMean with comments

a3b2d9c

Remove unnecessary copy operation of SVD input matrix

d8b3e06

Reformat MF recommender

35bd5c3

Find non-zero indices before MF/FM SGD loop

360976b

Reformat FMs fit! args

044b404

Rename k to n_factor / n_neighbor for readability

2d15fd8

Make sample shuffling in SGD configurable

ddac660

Make all n_xxx variable names plural

7d89596

e.g., `n_user` -> `n_users`, `n_factor` -> `n_factors`

Set similarity to 0 if no co-occurred items between users

fbd2049
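Two of the commits above ("Find non-zero indices before MF/FM SGD loop" and the configurable shuffling) describe a pattern that can be sketched as follows. This is an illustration only, not the package's implementation: `sgd_mf`, its keyword names other than `shuffled`, and the toy matrix are all hypothetical, and zeros are assumed to mark unobserved entries.

```julia
using Random

# Hypothetical SGD matrix factorization sketch.
function sgd_mf(R::AbstractMatrix; n_factors=2, n_epochs=50, lr=0.01, reg=0.01,
                shuffled=true, rng=MersenneTwister(42))
    n_users, n_items = size(R)
    P = 0.1 .* randn(rng, n_users, n_factors)
    Q = 0.1 .* randn(rng, n_items, n_factors)

    # Collect the observed (non-zero) positions once, instead of
    # re-scanning the whole matrix in every epoch.
    observed = findall(!iszero, R)

    for _ in 1:n_epochs
        # Shuffling is optional, so runs can be made order-deterministic.
        shuffled && shuffle!(rng, observed)
        for idx in observed
            u, i = Tuple(idx)
            err = R[u, i] - sum(P[u, f] * Q[i, f] for f in 1:n_factors)
            for f in 1:n_factors
                p, q = P[u, f], Q[i, f]
                P[u, f] += lr * (err * q - reg * p)
                Q[i, f] += lr * (err * p - reg * q)
            end
        end
    end
    return P, Q
end

R = [5.0 0.0 3.0;
     0.0 4.0 0.0;
     1.0 0.0 2.0]
P, Q = sgd_mf(R; shuffled=false)
```

Passing `shuffled=false`, as in the benchmark calls for `MatrixFactorization` and `FactorizationMachines`, keeps the sample order fixed, which makes timings easier to compare between runs.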

takuti changed the title ~~Benchmark recommenders and optimize fit!~~ Benchmark recommenders with fit! optimization and refactoring Feb 26, 2022

takuti merged commit 37cc748 into master Feb 27, 2022

takuti deleted the benchmark branch February 27, 2022 03:27
