Prepare for cross validation-based benchmarking #60

Merged
merged 19 commits into from
Apr 3, 2022
Conversation

@takuti takuti commented Mar 21, 2022

Review and tweak cross_validation and evaluate for #26

codecov-commenter commented Mar 22, 2022

Codecov Report

Merging #60 (79a757a) into master (3d7ed2e) will increase coverage by 0.09%.
The diff coverage is 98.33%.

@@            Coverage Diff             @@
##           master      #60      +/-   ##
==========================================
+ Coverage   80.14%   80.24%   +0.09%     
==========================================
  Files          26       26              
  Lines         801      815      +14     
==========================================
+ Hits          642      654      +12     
- Misses        159      161       +2     
Impacted Files Coverage Δ
src/metrics/base.jl 0.00% <0.00%> (ø)
src/metrics/ranking.jl 95.08% <94.44%> (-1.15%) ⬇️
src/base_recommender.jl 96.00% <100.00%> (ø)
src/baseline/co_occurrence.jl 100.00% <100.00%> (ø)
src/baseline/item_mean.jl 100.00% <100.00%> (ø)
src/baseline/most_popular.jl 100.00% <100.00%> (ø)
src/baseline/threshold_percentage.jl 100.00% <100.00%> (ø)
src/baseline/user_mean.jl 100.00% <100.00%> (ø)
src/data_accessor.jl 100.00% <100.00%> (ø)
src/evaluation/cross_validation.jl 100.00% <100.00%> (ø)
... and 11 more

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3d7ed2e...79a757a.

`evaluate()` unnecessarily made predictions for all user-item pairs.
The comparison must be made only between truth and predictions.
The list of item-score tuples returned from `recommend` is already sorted by
score when a recommender is evaluated by a ranking metric.
`truth` must be a ranked list of observed items for correct evaluation.
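The truth-vs-pred comparison described above can be sketched as follows. This is an illustrative Python sketch, not the package's actual Julia implementation; `recall_at_k` is a hypothetical name, and `pred` is assumed to be already sorted by score as the commit notes:

```python
def recall_at_k(truth, pred, k):
    """Fraction of truly observed items recovered in the top-k predictions."""
    if not truth:
        return 0.0  # empty truth list: nothing to recover
    top_k = pred[:k]  # pred is assumed to be sorted by score already
    hits = sum(1 for item in top_k if item in truth)
    return hits / len(truth)

truth = [3, 1, 7]       # ranked list of observed items
pred = [1, 5, 3, 2, 7]  # item IDs sorted by predicted score
print(recall_at_k(truth, pred, 3))  # 2 of 3 observed items appear in the top-3
```

Because the metric only consults `truth` and the top of `pred`, predicting scores for every user-item pair is wasted work, which is the inefficiency the commit removes from `evaluate()`.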
Cross validation involves some randomness, so it may occasionally return a very
poor or very good result.
Adjust the cross validation test cases to increase the probability of seeing
an empty `truth` list.
If `n` equals the number of samples, `n`-fold CV is the same as
LOOCV.
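The equivalence is easy to see from how fold indices are built. A minimal Python sketch (the real package splits data in Julia; `kfold_indices` is a hypothetical helper):

```python
def kfold_indices(n_samples, n_folds):
    """Split sample indices 0..n_samples-1 into n_folds disjoint test folds."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        # spread the remainder over the first few folds
        size = fold_size + (1 if i < remainder else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds

# With n_folds == n_samples, every test fold holds exactly one sample,
# which is leave-one-out cross validation (LOOCV).
print(kfold_indices(5, 5))  # [[0], [1], [2], [3], [4]]
```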
Top-k recommendation for every single user is costly. It is
recommended to parallelize whenever possible.
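Since each user's top-k list is computed independently, the per-user loop parallelizes trivially. A hedged Python sketch of the idea (the package itself is Julia; `recommend_top_k`, its scoring rule, and `recommend_all` are illustrative placeholders):

```python
from concurrent.futures import ThreadPoolExecutor

def recommend_top_k(user_id, k=10):
    """Placeholder for an expensive per-user top-k recommendation."""
    # hypothetical scoring: score every candidate item, keep the top-k
    scores = [((user_id * item) % 97, item) for item in range(100)]
    return [item for _, item in sorted(scores, reverse=True)[:k]]

def recommend_all(user_ids, k=10, workers=4):
    # each user's top-k is independent, so the loop parallelizes trivially;
    # map preserves the input order of user_ids
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: recommend_top_k(u, k), user_ids))
```

A process pool (or, in Julia, threads or `pmap`) would suit CPU-bound scoring better; a thread pool is used here only to keep the sketch self-contained.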
@takuti takuti changed the title Benchmark with all {recommender, metric, dataset} pairs Prepare for cross validation-based benchmarking Apr 3, 2022
by checking the size of test samples
@takuti takuti merged commit 6082408 into master Apr 3, 2022
@takuti takuti deleted the notebook branch April 3, 2022 14:54