Once implemented (#197), the counterfactual accuracy evaluation needs to be tested out against a variety of datasets/models in order to surface any issues or potential improvements. This can be run as a data analysis project using the datasets in the repo and added to the `examples` dir. This can be considered done with at least 2-3 dataset/model pairs, but more is fine.
Based on these results, we would also like to determine a protocol for making an automated pass/fail decision based on this evaluation, i.e. a rule of thumb for what a "good" result looks like. Otherwise, if the evaluation is found not to be a good fit for such a protocol, this should be documented.
- Road-testing against multiple modelling problems
- Protocol for automated decision-making, if appropriate
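
For illustration, here is a minimal sketch of what an automated pass/fail rule might look like, assuming the evaluation yields a single counterfactual-accuracy score per dataset/model pair. The `EvalResult` shape, the `passes` helper, and the 0.8 threshold are all hypothetical placeholders, not the repo's actual API; settling on (or ruling out) a rule like this is exactly what this issue is meant to determine.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    # Hypothetical container for one dataset/model pair's evaluation output.
    dataset: str
    model: str
    counterfactual_accuracy: float  # score in [0, 1]; shape assumed, not confirmed


def passes(result: EvalResult, threshold: float = 0.8) -> bool:
    """Placeholder rule of thumb: pass if the score clears a fixed threshold.

    The threshold value is an assumption for illustration only; the road-testing
    in this issue should tell us whether any fixed cutoff is defensible.
    """
    return result.counterfactual_accuracy >= threshold


# Example over a few (made-up) dataset/model pairs:
results = [
    EvalResult("dataset_a", "model_x", 0.91),
    EvalResult("dataset_b", "model_y", 0.74),
]
for r in results:
    print(r.dataset, r.model, "PASS" if passes(r) else "FAIL")
```

If the road-testing shows the scores are not comparable across datasets (e.g. the achievable accuracy varies widely by problem), a fixed threshold like this would be a poor fit, and per the above that finding should be documented instead.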