Add new traits to support better MLJ machine type checks and composite models #18

Merged
merged 3 commits into from
Aug 17, 2021

@ablaom ablaom commented Aug 17, 2021

Add the following traits with specified fallbacks:

- fit_data_scitype = Unknown - the scitype of data in fit(model, verbosity, data...) calls (and hence in machine(model, data...) calls). For example, for a supervised classification model, the value might be something like Tuple{Table(Continuous),AbstractVector{Finite}}. If the model additionally supported weights, the value might be

Union{Tuple{Table(Continuous),AbstractVector{Finite}},
      Tuple{Table(Continuous),AbstractVector{Finite},AbstractVector{Continuous}}}

For current abstract Model subtypes, this trait would have a fallback inferred from existing traits. Adding it allows models to have one or more custom fit signatures, and allows for a better fallback check for data bound to machines.

- predict_scitype = Unknown - a guaranteed upper bound for the scitype of the output of the predict method, as distinguished from target_scitype, which is the scitype of the training "target". For supervised models these only agree in the Deterministic case, not the Probabilistic case, which generally involves Density or Sampleable scitypes. For example, a probabilistic classifier might get the value AbstractVector{Density{<:Finite}}. For existing Model subtypes, this is to be inferred as far as possible from the target_scitype and prediction_type traits.

- transform_scitype = Unknown, inverse_transform_scitype = Unknown - analogous to predict_scitype above, but for the output of the transform and inverse_transform methods.

- abstract_type = Any - this will generally be the direct supertype of the model. So, for example, a probabilistic classifier will get the value Probabilistic.
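
To make the intended use of these traits concrete, here is a minimal, self-contained Julia sketch. It is not the code in this PR: every type below is a stand-in defined locally (the real scitypes come from ScientificTypesBase.jl), the three model types and the helper data_is_compatible are hypothetical, and Table{Continuous} plays the role of Table(Continuous). The sketch writes AbstractVector{<:Finite} rather than AbstractVector{Finite} so that the subtype test succeeds for concrete element scitypes.

```julia
# Stand-ins for scientific types (hypothetical; not the ScientificTypesBase.jl definitions).
struct Unknown end
abstract type Continuous end
abstract type Finite end
abstract type Multiclass <: Finite end
abstract type Table{K} end        # plays the role of Table(Continuous)
abstract type Density{S} end

# Stand-ins for abstract model types.
abstract type Model end
abstract type Probabilistic <: Model end

# Three hypothetical models.
struct PlainClassifier <: Probabilistic end
struct WeightedClassifier <: Probabilistic end
struct OpaqueModel <: Model end     # declares no traits at all

# Fallbacks, as listed in the description above.
fit_data_scitype(::Type{<:Model}) = Unknown
predict_scitype(::Type{<:Model}) = Unknown
abstract_type(::Type{<:Model}) = Any

# Convenience methods so the traits can be called on instances.
fit_data_scitype(model::Model) = fit_data_scitype(typeof(model))
predict_scitype(model::Model) = predict_scitype(typeof(model))
abstract_type(model::Model) = abstract_type(typeof(model))

# Per-model overloads.
const XY  = Tuple{Table{Continuous}, AbstractVector{<:Finite}}
const XYW = Tuple{Table{Continuous}, AbstractVector{<:Finite}, AbstractVector{Continuous}}

fit_data_scitype(::Type{PlainClassifier}) = XY
fit_data_scitype(::Type{WeightedClassifier}) = Union{XY, XYW}   # weights optional
predict_scitype(::Type{PlainClassifier}) = AbstractVector{<:Density{<:Finite}}
abstract_type(::Type{PlainClassifier}) = Probabilistic

# Hypothetical check of the kind a machine constructor could run:
# `S` is the scitype of the data tuple the user wants to bind to `model`.
function data_is_compatible(model::Model, S::Type)
    T = fit_data_scitype(model)
    T === Unknown && return true    # nothing declared, so nothing to check
    return S <: T
end

S2 = Tuple{Table{Continuous}, AbstractVector{Multiclass}}
S3 = Tuple{Table{Continuous}, AbstractVector{Multiclass}, AbstractVector{Continuous}}

println(data_is_compatible(PlainClassifier(), S2))
println(data_is_compatible(PlainClassifier(), S3))
println(data_is_compatible(WeightedClassifier(), S3))
```

Because Tuple types are covariant in Julia, a single subtype test covers every argument of a fit signature, and a Union of tuple types expresses alternative signatures such as "with or without weights". A model that leaves the trait at Unknown simply opts out of the check.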

codecov-commenter commented Aug 17, 2021

Codecov Report

Merging #18 (5e6426f) into dev (d70ee0c) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##               dev       #18   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            1         1           
  Lines           54        59    +5     
=========================================
+ Hits            54        59    +5     

Impacted Files              Coverage Δ
src/StatisticalTraits.jl    100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
