Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement of BinaryThresholdPredictor to handle outlier detection models #400

Merged
merged 10 commits into from
Sep 15, 2021

Conversation

ablaom
Copy link
Member

@ablaom ablaom commented Sep 7, 2021

This PR continues #399 and:

  • Expands the types of probabilistic detectors that can be wrapped by BinaryThresholdPredictor, so as to include new outlier detector types added to MLJModelInterface here.
  • (breaking) Changes the field name wrapped_model to model, for consistency with the wrappers TunedModel and IteratedModel. I haven't kept backwards compatibility as a I don't think this is going to effect many users yet, but I am tagging the release as minor (breaking). @OkonSamuel may want to comment.
  • Removes a check and test that are redundant. From the original test code:
function MMI.predict(model::BinaryThresholdPredictor, fitresult, X)
   yhat = MMI.predict(model.wrapped_model, fitresult[1], X)
   length(yhat.prob_given_ref) == 2 || begin
   # Due to resampling it's possible for Predicted `AbstractVector{<:UnivariateFinite}`
   # to contain one class. Hence the need for the following warning
   @warn "Predicted `AbstractVector{<:UnivariateFinite}`"*
       " contains only 1 class. Hence predictions will only contain this class "*
       "irrrespective of the set `threshold` "

Actually, it is impossible for a correctly implemented probalistic classifier to predict UnivariateFinite having a different number of classes from the target on which it was trained, because the pool of the UnivariateFinite should always coincide with the pool of the target on which the model was trained and resampling does not effect the pool of the training target, as in this example:

julia> y = categorical([1, 1, 1, 2]);

julia> y[1:2]
2-element CategoricalArray{Int64,1,UInt32}:
 1
 1

julia> levels(y[1:2])
2-element Vector{Int64}:
 1
 2

cc @davnn

@OkonSamuel Be great if you can have a look over this. Do you when you might get a chance, assuming your'e willing?

@codecov-commenter
Copy link

codecov-commenter commented Sep 7, 2021

Codecov Report

Merging #400 (4adc491) into dev (2f67f4b) will decrease coverage by 0.53%.
The diff coverage is 94.23%.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev     #400      +/-   ##
==========================================
- Coverage   75.56%   75.02%   -0.54%     
==========================================
  Files          12       12              
  Lines         937      981      +44     
==========================================
+ Hits          708      736      +28     
- Misses        229      245      +16     
Impacted Files Coverage Δ
src/builtins/ThresholdPredictors.jl 95.45% <94.23%> (-4.55%) ⬇️
src/metadata.jl 79.41% <0.00%> (-2.31%) ⬇️
src/builtins/Transformers.jl 96.37% <0.00%> (-0.17%) ⬇️
src/utilities.jl 0.00% <0.00%> (ø)
src/registry/src/Registry.jl 0.00% <0.00%> (ø)
src/registry/src/check_registry.jl 0.00% <0.00%> (ø)
src/loading.jl 62.35% <0.00%> (+0.44%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2f67f4b...4adc491. Read the comment docs.

@ablaom
Copy link
Member Author

ablaom commented Sep 15, 2021

@davnn @OkonSamuel

Wanting to move forward with this, I have added backwards compatibility to BinaryThresholdPredictor (with a depwarn for attempts to access using wrapped_model instead of model) and down-graded the new version number to a patch.

@ablaom ablaom merged commit 1568695 into dev Sep 15, 2021
This was referenced Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants