Releases: EducationalTestingService/rsmtool
RSMTool 12.0.0
What's Changed
- Python 3.8 and 3.9 are no longer supported since the SKLL dependency was updated to v5.0.1.
- Remove rsmextra and special sections by @desilinguist in #676
- Add type hints (part 1) by @desilinguist in #677
- Add type hints (Part 2) by @desilinguist in #679
- Add type hints (Part 3) by @desilinguist in #681
- Add Type Hints (Part 4) by @desilinguist in #682
- Add Type Hints (Final Part) by @desilinguist in #683
- Fix pandas-related warnings by @desilinguist in #684
- Separate runtime and dev dependencies by @desilinguist in #685
Full Changelog: v11.3.0...v12.0.0
RSMTool 11.3.0
💡 New features 💡
- Add section ordering for `rsmexplain` by @desilinguist in #667
- Update intermediate files notebook section for readability by @desilinguist in #670
🛠️ Bugfixes & Improvements 🛠️
- Update SHAP to the latest version 0.44.0 by @damien2012eng in #664
- Unpin dependencies and fix minor issues by @desilinguist in #666
- Fix new warnings & remove manual suppressions by @desilinguist in #668
- Refactor CLI tests to modernize codecov by @desilinguist in #669
- Pin numpy to < 2 by @desilinguist in #673
🙏🏽 Contributions & Code Reviews 🙏🏽
@damien2012eng
@desilinguist
@Frost45
@mulhod
@tamarl08
Full Changelog: v11.2.0...v11.3.0
RSMTool 11.2.0
💡 New features 💡
- Add sections for W&B logging to make information easier to find by @tamarl08 in #659
- Add configuration option to disable truncation of outliers by @damien2012eng in #661
🛠️ Bugfixes & Improvements 🛠️
🙏🏽 Contributions & Code Reviews 🙏🏽
@damien2012eng
@desilinguist
@mulhod
@tamarl08
Full Changelog: v11.1.1...v11.2.0
RSMTool 11.1.1
💡 New features 💡
- Add a new human-human confusion matrix for double-scored data by @mulhod in #649
- Allow parallelization of grid search when using SKLL models in `rsmtool` by @tamarl08 in #650
🛠️ Bugfixes & Improvements 🛠️
- Update pre-commit checks by @desilinguist in #647
- Enhance wandb logging of evaluation metrics by @tamarl08 in #651
- Fix warnings in reports by @tamarl08 in #654
🙏🏽 Contributions & Code Reviews 🙏🏽
@damien2012eng
@desilinguist
@mulhod
@tamarl08
@tazin-afrin
Full Changelog: v11.0.1...v11.1.1
RSMTool 11.0.1
💡 New features 💡
- New rsmexplain plots by @damien2012eng in #603
- Full W&B integration to allow logging of output artifacts and report by @tamarl08 in #617, #620, #621, #623, #627
- Add FAQ page to documentation by @desilinguist in #622
- Add support for Python 3.11 by @desilinguist in #628
- Add support for output files when auto-generating configurations by @desilinguist in #640
- Enhancements to fast_predict by @mulhod in #632
- NOTE: The `.model` files produced by `rsmtool` are no longer SKLL model files. They are serialized `rsmtool.Modeler` objects. This change should be transparent to the users if the only places they use the `.model` files are with `rsmpredict` and `rsmexplain`. However, if those files are used outside of RSMTool and expected to contain SKLL learners, then the following change is needed: users would now need to use the `Modeler.load_from_file()` method to load the `.model` file produced by `rsmtool` and then access the SKLL learner via the `.learner` attribute.
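As a minimal sketch of the migration described in the note above, the helper below loads a `.model` file through `Modeler.load_from_file()` and returns the `.learner` attribute. The helper name `load_skll_learner` and the file-existence check are illustrative additions, and the import path `rsmtool.modeler` is an assumption (the note itself only mentions `rsmtool.Modeler`), so it may need adjusting for your installed version.

```python
from pathlib import Path


def load_skll_learner(model_path):
    """Load an rsmtool `.model` file and return the SKLL learner it wraps.

    `load_skll_learner` is a hypothetical helper, not part of RSMTool.
    """
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"No model file at {path}")

    # Imported lazily so that this helper can be defined without rsmtool
    # installed. The module path is an assumption; the release note only
    # refers to the class as `rsmtool.Modeler`.
    from rsmtool.modeler import Modeler

    # Per the release note: load the serialized Modeler object, then
    # access the underlying SKLL learner via its `.learner` attribute.
    modeler = Modeler.load_from_file(str(path))
    return modeler.learner
```

Code that previously treated the `.model` file as a SKLL model file would instead call something like `load_skll_learner("path/to/experiment.model")`, where the path is a placeholder for a file produced by `rsmtool`.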
🛠️ Bugfixes & Improvements 🛠️
- Migrate nose to nose2 by @damien2012eng in #610
- Upgrade `shap` by @desilinguist in #612
- Use example IDs when specifying `sample_ids` by @desilinguist in #613
- Expect `scale_with` value of 'raw' in rsmeval by @tamarl08 in #614
- Fix `update_files` for `nose2`. by @desilinguist in #616
- Fix bug in wording of what will be highlighted for disattenuated correlation by @mulhod in #594
- Pin skll version in doc requirements by @tamarl08 in #619
- Remove unnecessary warnings in HTML reports. by @desilinguist in #624
- Include system information in RSMExplain reports by @desilinguist in #633
- Suppress alt text warnings when generating reports by @desilinguist in #634
- Fix W&B tests and add to CI builds. by @desilinguist in #637
- Switch to `ruff` for pre-commit checks by @desilinguist in #639
- Fix test dir usage in test_wandb by @tamarl08 in #642
🙏🏽 Contributions & Code Reviews 🙏🏽
Full Changelog: v10.0.0...v11.0.1
RSMTool 10.0.0
This is a major new release! It includes new functionality as well as updated dependencies.
💡 New features 💡
Dependencies
- SHAP is now a required dependency. It is currently pinned to 0.41.0, but we plan to keep RSMTool updated with the latest SHAP versions as they are released.
- NumPy has been pinned to <= 1.23.5 since SHAP 0.41.0 does not work with NumPy 1.24.x.
RSMExplain
- Added new command-line utility `rsmexplain` to generate an explanation report for an existing rsmtool experiment. Under the hood, `rsmexplain` leverages SHapley Additive exPlanations produced by `shap`.
- Added comprehensive documentation on how to run `rsmexplain`.
- Added support for automated and interactive configuration generation for `rsmexplain`.
- Added comprehensive functional tests for `rsmexplain`.
More reliable notebook merging
- Updated `rsmtool.reporter.merge_notebooks()` to use `nbconvert` and `nbformat` APIs instead of the JSON-based hack that was being used before.
🛠️ Bugfixes & Improvements 🛠️
- Use `legend_handles` instead of the deprecated `legendHandles` attribute for matplotlib to avoid deprecation warnings in notebooks.
- Minor documentation fixes in various places.
Contributions from: @damien2012eng, @desilinguist, @tamarl08, @dblandan, and @mulhod!
RSMTool 9.1.1
What's Changed
- Remove `np.warnings` from `fairness_utils.py`. by @desilinguist in #580
- Add new `fast_predict()` API method for prediction by @desilinguist in #581
- Convert all formatted strings to f-strings and add pre-commit with flynt by @desilinguist in #584
- Integrate black and run on all files by @desilinguist in #586
- Add isort, pydocstyle, flake8 as pre-commit checks by @desilinguist in #587
- Restore and increase test coverage by @desilinguist in #589
- Update contributing docs & remove extraneous whitespace by @desilinguist in #590
- Update SKLL to v3.2.0 by @desilinguist in #591
- Release v9.1.1 by @desilinguist in #592
Full Changelog: v9.0.1...v9.1.1
v9.0.1
What's Changed
This is a minor bugfix release.
- Delete the `stable` branch by @desilinguist in #573
- Disallow negative confidence intervals in fairness plots since they cause new versions of `pandas` to break by @desilinguist in #574
- Add workaround for broken SVGs in `nbconvert` by overriding `clean_html` by @desilinguist in #575
- Fix bug for integer IDs when using `rsmxval` by @desilinguist in #577
- Update SKLL dependency to v3.1.0 by @desilinguist in #578
Full Changelog: v9.0.0...v9.0.1
RSMTool 9.0
This is a major new release. It includes new functionality and breaking changes to the API as well as to dependencies.
⚡️ RSMTool 9.0 is incompatible with previous versions ⚡️
💡 New features 💡
Dependencies
- RSMTool is now compatible with SKLL v3.0 and, therefore, scikit-learn v1.0.2.
- RSMTool now supports Python 3.10, in addition to 3.8 and 3.9. Python 3.7 is no longer supported.
- tqdm is now a required dependency.
Native cross-validation support
- Add native support for cross-validation experiments to RSMTool. Using a single train-test split may lead to biased estimates of performance since those estimates will depend on the specific characteristics of that split. However, using cross-validation instead can provide more accurate estimates of scoring model performance since those estimates are averaged over multiple train-test splits that are randomly selected based on the data.
- Add new command-line utility `rsmxval` to run cross-validation experiments. Under the hood, it leverages the RSMTool API functions `run_experiment()`, `run_evaluation()`, and `run_summary()` to generate multiple useful reports for the users.
- Add support for automated configuration generation to `rsmxval` in both batch and interactive mode.
- Add comprehensive documentation on how to run cross-validation experiments.
- Add comprehensive functional tests for cross-validation.
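To illustrate why averaging over several train-test splits gives a steadier estimate than a single split, here is a small standard-library sketch of k-fold cross-validation. It is not how `rsmxval` is implemented: the function names, the toy scores, and the "predict the training mean" baseline are all made up for illustration.

```python
import random
import statistics


def k_fold_indices(n_items, n_folds, seed=0):
    """Shuffle item indices and deal them into n_folds disjoint test folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_validate(scores, n_folds=5, seed=0):
    """Return the per-fold mean absolute error of a mean-score baseline."""
    fold_errors = []
    for test_idx in k_fold_indices(len(scores), n_folds, seed):
        held_out = set(test_idx)
        train = [s for i, s in enumerate(scores) if i not in held_out]
        test = [scores[i] for i in test_idx]
        # Toy "model": always predict the mean of the training scores.
        prediction = statistics.mean(train)
        fold_errors.append(statistics.mean([abs(s - prediction) for s in test]))
    return fold_errors


# Made-up scores on a 1-6 scale, for illustration only.
toy_scores = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 2, 3, 4, 5, 3]
errors = cross_validate(toy_scores, n_folds=5)
print("per-fold MAE:", [round(e, 3) for e in errors])
print("averaged MAE:", round(statistics.mean(errors), 3))
```

The per-fold errors differ from one another because each depends on which items landed in that fold; the average across folds is the quantity a cross-validation report would summarize.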
API Changes
- Add two new logging functions in `rsmtool.utils.logging`. These are only meant to be used by RSMTool developers, not users.
- Factor out the code that was used to write a dataframe to disk into a separate utility method `DataWriter.write_frame_to_disk()` so that it can also be used by `rsmxval`. This can prove useful to advanced RSMTool users as well.
- Add new cross-validation specific utility functions to `rsmtool.utils.cross_validation`.
- Convert several class or static methods in various classes to instance methods in order to allow for passing and using an optional logger instance.
- Tweak the `check_scaled_coefficients()` test utility function to take the output directory as an argument instead of taking an experiment name to allow its usage for `rsmxval` functional tests.
🛠 Bugfixes & Improvements 🛠
- Fix the behavior of the `use_thumbnails` option in RSMTool configuration files. It was generating both the thumbnail as well as the full-sized figure due to the behavior of Matplotlib’s `savefig()`. The solution was to turn off interactive plotting in all header notebooks.
- Replace deprecated methods and keywords in RSMTool code as recommended by the latest versions of pandas, numpy, and scikit-learn.
- Fix several duplicate target warnings when compiling the documentation. Make sure included RST files have an extension of `.rst.inc` so that they are not compiled twice. Turn all web links into anonymous references so that there are no conflicts with the same target names.
- Make feature boxplots for subgroups in reports more flexible in terms of the number of features. Specifically, if the experiment has more than 150 features, no boxplots are shown. Previously this limit was 30. In addition, the message that the boxplots have been omitted is displayed more prominently when it happens. Finally, if the number of features is > 30 but <= 150, a new message asking the user to enable thumbnails is shown.
- Update Gitlab CI plan to use Python 3.8 and Azure Pipelines to use Python 3.10. Add new cross-validation tests to both CI plans.
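The feature-count thresholds for subgroup boxplots described above can be summarized in a short sketch. The function name and the returned labels are hypothetical and only encode the limits stated in the release note (30 and 150); they do not reflect RSMTool's actual notebook code.

```python
def boxplot_display_mode(num_features):
    """Classify a feature count using the thresholds from the release note.

    The labels are illustrative only:
      "omitted"         -> more than 150 features, no boxplots are shown
      "thumbnails_hint" -> 31 to 150 features, a message suggests enabling thumbnails
      "standard"        -> 30 features or fewer
    """
    if num_features > 150:
        return "omitted"
    if num_features > 30:
        return "thumbnails_hint"
    return "standard"


for n in (10, 30, 31, 150, 151):
    print(n, boxplot_display_mode(n))
```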
RSMTool 8.1.2
This is a bugfix release.
- Update the code for compatibility with `pandas 1.3.0` and `scikit-learn 0.24.2`.