Releases: sebp/scikit-survival
v0.17.2
This release fixes several issues with packaging scikit-survival.
Bug fixes
- Added backward support for gcc-c++ by @navashiva (#255).
- Do not install C/C++ and Cython source files.
- Add packaging to build requirements in pyproject.toml.
- Exclude generated API docs from source distribution.
- Add Python 3.10 to classifiers.
Documentation
- Use permutation_importance from sklearn instead of eli5.
- Build documentation with Sphinx 4.4.0.
- Fix missing documentation for classes in sksurv.meta.
New Contributors
- @navashiva made their first contribution in #255
Full Changelog: v0.17.1...v0.17.2
v0.17.1
This release adds support for Python 3.10.
Full Changelog: v0.17.0...v0.17.1
v0.17.0
This release adds support for scikit-learn 1.0, which includes support for feature names. If you pass a pandas dataframe to fit, the estimator will set a feature_names_in_ attribute containing the feature names. When a dataframe is passed to predict, it is checked that the column names are consistent with those passed to fit. See the scikit-learn release highlights for details.
Bug fixes
- Fix a variety of build problems with LLVM (#243).
Enhancements
- Add support for feature_names_in_ and n_features_in_ to all estimators and transforms.
- Add sksurv.preprocessing.OneHotEncoder.get_feature_names_out.
- Update bundled version of Eigen to 3.3.9.
Backwards incompatible changes
- Drop min_impurity_split parameter from sksurv.ensemble.GradientBoostingSurvivalAnalysis.
- base_estimators and meta_estimator attributes of sksurv.meta.Stacking do not contain fitted models anymore, use estimators_ and final_estimator_, respectively.
Deprecations
- The normalize parameter of sksurv.linear_model.IPCRidge is deprecated and will be removed in a future version. Instead, use a scikit-learn pipeline: make_pipeline(StandardScaler(with_mean=False), IPCRidge()).
v0.16.0
This release adds support for changing the evaluation metric that is used in estimators’ score method. This is particularly useful for hyper-parameter optimization using scikit-learn’s GridSearchCV. You can now use sksurv.metrics.as_concordance_index_ipcw_scorer, sksurv.metrics.as_cumulative_dynamic_auc_scorer, or sksurv.metrics.as_integrated_brier_score_scorer to adjust the score method to your needs. A detailed example is available in the User Guide.
Moreover, this release adds sksurv.ensemble.ExtraSurvivalTrees to fit an ensemble of randomized survival trees, and improves the speed of sksurv.compare.compare_survival() significantly. The documentation has been extended by a section on the time-dependent Brier score.
Bug fixes
- Columns are dropped in sksurv.column.encode_categorical() despite allow_drop=False (#199).
- Ensure sksurv.column.categorical_to_numeric() always returns series with int64 dtype.
Enhancements
- Add sksurv.ensemble.ExtraSurvivalTrees ensemble (#195).
- Faster speed for sksurv.compare.compare_survival() (#215).
- Add wrapper classes sksurv.metrics.as_concordance_index_ipcw_scorer, sksurv.metrics.as_cumulative_dynamic_auc_scorer, and sksurv.metrics.as_integrated_brier_score_scorer to override the default score method of estimators (#192).
- Remove use of deprecated numpy dtypes.
- Remove use of inplace in pandas’ set_categories.
Documentation
- Remove comments and code suggesting log-transforming times prior to training Survival SVM (#203).
- Add documentation for max_samples parameter to sksurv.ensemble.ExtraSurvivalTrees and sksurv.ensemble.RandomSurvivalForest (#217).
- Add section on time-dependent Brier score (#220).
- Add section on using alternative metrics for hyper-parameter optimization.
v0.15.0
This release adds support for scikit-learn 0.24 and Python 3.9. scikit-survival now requires at least pandas 0.25 and scikit-learn 0.24. Moreover, if sksurv.ensemble.GradientBoostingSurvivalAnalysis or sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis are fit with loss='coxph', predict_cumulative_hazard_function and predict_survival_function are now available. sksurv.metrics.cumulative_dynamic_auc now supports evaluating time-dependent predictions, for instance for a sksurv.ensemble.RandomSurvivalForest as illustrated in the User Guide.
Bug fixes
- Allow passing pandas data frames to all fit and predict methods (#148).
- Allow sparse matrices to be passed to sksurv.ensemble.GradientBoostingSurvivalAnalysis.predict.
- Fix example in user guide using GridSearchCV to determine alphas for CoxnetSurvivalAnalysis (#186).
Enhancements
- Add score method to sksurv.meta.Stacking, sksurv.meta.EnsembleSelection, and sksurv.meta.EnsembleSelectionRegressor (#151).
- Add support for predict_cumulative_hazard_function and predict_survival_function to sksurv.ensemble.GradientBoostingSurvivalAnalysis and sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis if the model was fit with loss='coxph'.
- Add support for time-dependent predictions to sksurv.metrics.cumulative_dynamic_auc. See the User Guide for an example (#134).
Backwards incompatible changes
- The score method of sksurv.linear_model.IPCRidge, sksurv.svm.FastSurvivalSVM, and sksurv.svm.FastKernelSurvivalSVM (if rank_ratio is smaller than 1) now converts predictions on log(time) scale to risk scores prior to computing the concordance index.
- Support for cvxpy and cvxopt solver in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM has been dropped. The default solver is now ECOS, which was used by cvxpy (the previous default) internally. Therefore, results should be identical.
- Dropped the presort argument from sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
- The X_idx_sorted argument in sksurv.tree.SurvivalTree.fit has been deprecated in scikit-learn 0.24 and has no effect now.
- predict_cumulative_hazard_function and predict_survival_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree now return an array of sksurv.functions.StepFunction objects by default. Use return_array=True to get the old behavior.
- Support for Python 3.6 has been dropped.
- Increase minimum supported versions of dependencies. We now require:

  | Package | Minimum Version |
  | --- | --- |
  | Pandas | 0.25.0 |
  | scikit-learn | 0.24.0 |
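
The change to the score method listed above matters because a model that predicts log(time) assigns large values to low-risk samples, while the concordance index expects large values for high-risk samples. The following pure-Python sketch illustrates the effect. It is not scikit-survival's implementation: harrell_c is a hypothetical helper, the data are made up, and negating the prediction is assumed here as the conversion to a risk score.

```python
def harrell_c(event, time, risk):
    """Concordance index: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable if sample i had an event and time[i] < time[j].
    Ties in the risk score count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # censored samples cannot anchor a comparable pair
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Made-up data: observed times, event indicators, and predicted log(time).
event = [True, True, False, True]
time = [2.0, 5.0, 6.0, 9.0]
pred_log_time = [0.5, 1.7, 1.9, 1.5]

# Assumed conversion: a longer predicted time means a lower risk.
as_risk = [-p for p in pred_log_time]

c_converted = harrell_c(event, time, as_risk)
c_unconverted = harrell_c(event, time, pred_log_time)
```

With these values, scoring the raw log(time) predictions yields the complement of the concordance index obtained after conversion, which is why the conversion step is needed before scoring.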
v0.14.0
This release features a complete overhaul of the documentation. It features a new visual design, and the inclusion of several interactive notebooks in the User Guide.
In addition, it includes important bug fixes. It fixes several bugs in sksurv.linear_model.CoxnetSurvivalAnalysis where predict, predict_survival_function, and predict_cumulative_hazard_function returned wrong values if features of the training data were not centered. Moreover, the score function of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis will now correctly compute the concordance index if loss='ipcwls' or loss='squared'.
Bug fixes
- sksurv.column.standardize() modified data in-place. Data is now always copied.
- sksurv.column.standardize() works with integer numpy arrays now.
- sksurv.column.standardize() used biased standard deviation for numpy arrays (ddof=0), but unbiased standard deviation for pandas objects (ddof=1). It always uses ddof=1 now. Therefore, the output, if the input is a numpy array, will differ from that of previous versions.
- Fixed sksurv.linear_model.CoxnetSurvivalAnalysis.predict_survival_function() and sksurv.linear_model.CoxnetSurvivalAnalysis.predict_cumulative_hazard_function(), which returned wrong values if features of training data were not already centered. This adds an offset_ attribute that accounts for non-centered data and is added to the predicted risk score. Therefore, the outputs of predict, predict_survival_function, and predict_cumulative_hazard_function will be different to previous versions for non-centered data (#139).
- Rescale coefficients of sksurv.linear_model.CoxnetSurvivalAnalysis if normalize=True.
- Fix score function of sksurv.ensemble.ComponentwiseGradientBoostingSurvivalAnalysis and sksurv.ensemble.GradientBoostingSurvivalAnalysis if loss='ipcwls' or loss='squared' is used. Previously, it returned 1.0 - true_cindex.
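
To see why the ddof change to sksurv.column.standardize() alters results for numpy input, the sketch below compares the two standard deviations using only Python's statistics module. It is an illustration of the arithmetic with made-up values, not the library's code.

```python
import statistics

values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Biased estimate: divides the sum of squared deviations by n (ddof=0).
biased = statistics.pstdev(values)
# Unbiased estimate: divides by n - 1 (ddof=1).
unbiased = statistics.stdev(values)

mean = statistics.fmean(values)
z_ddof0 = [(v - mean) / biased for v in values]
z_ddof1 = [(v - mean) / unbiased for v in values]
```

Because the unbiased estimate is always the larger of the two, standardized values computed with ddof=1 are slightly closer to zero than those computed with ddof=0. The gap shrinks as the number of samples grows.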
Enhancements
- Add sksurv.show_versions() that prints the version of all dependencies.
- Add support for pandas 1.1.
- Include interactive notebooks in documentation on readthedocs.
- Add user guide on penalized Cox models.
- Add user guide on gradient boosted models.
v0.13.1
This release fixes warnings that were introduced with 0.13.0.
Bug fixes
- Explicitly pass return_array=True in sksurv.tree.SurvivalTree.predict to avoid FutureWarning.
- Fix error when fitting sksurv.tree.SurvivalTree with non-float dtype for time (#127).
- Fix RuntimeWarning: invalid value encountered in true_divide in sksurv.nonparametric.kaplan_meier_estimator.
- Fix PendingDeprecationWarning about use of matrix when fitting sksurv.svm.FastSurvivalSVM if optimizer is PRSVM or simple.
v0.13.0
The highlights of this release include the addition of sksurv.metrics.brier_score and sksurv.metrics.integrated_brier_score and compatibility with scikit-learn 0.23.
predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree can now return an array of sksurv.functions.StepFunction, similar to sksurv.linear_model.CoxPHSurvivalAnalysis, by specifying return_array=False. This will be the default behavior starting with 0.14.0.
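
To make the difference between the two return types concrete, here is a minimal stand-in for a step function object. ToyStepFunction is a hypothetical class written for this illustration and is not sksurv.functions.StepFunction. It assumes a right-continuous function that rejects time points before the first breakpoint, and it accepts either a single time point or several at once.

```python
from bisect import bisect_right


class ToyStepFunction:
    """Right-continuous step function: f(t) = y[i] for x[i] <= t < x[i + 1]."""

    def __init__(self, x, y):
        if len(x) != len(y) or not x:
            raise ValueError("x and y must be non-empty and of equal length")
        if any(a >= b for a, b in zip(x, x[1:])):
            raise ValueError("x must be strictly increasing")
        self.x = list(x)
        self.y = list(y)

    def _evaluate_one(self, t):
        # Index of the last breakpoint that is <= t.
        i = bisect_right(self.x, t) - 1
        if i < 0:
            raise ValueError(f"{t} lies before the first breakpoint {self.x[0]}")
        return self.y[i]

    def __call__(self, t):
        # A single number returns a single value, a sequence returns a list.
        if isinstance(t, (int, float)):
            return self._evaluate_one(t)
        return [self._evaluate_one(v) for v in t]


# One function per sample, instead of one row of a 2D array per sample.
surv_fns = [
    ToyStepFunction([1.0, 3.0, 7.0], [0.9, 0.6, 0.2]),
    ToyStepFunction([1.0, 3.0, 7.0], [0.8, 0.5, 0.1]),
]

at_five = [fn(5.0) for fn in surv_fns]
grid = surv_fns[0]([1.0, 2.5, 3.0, 10.0])
```

Returning function objects lets callers query any time point without knowing the time grid in advance, whereas a plain array is tied to the time points seen during training.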
Note that this release fixes a bug in estimating inverse probability of censoring weights (IPCW), which will affect all estimators relying on IPCW.
Enhancements
- Make build system compatible with PEP-517/518.
- Added sksurv.metrics.brier_score and sksurv.metrics.integrated_brier_score (#101).
- sksurv.functions.StepFunction can now be evaluated at multiple points in a single call.
- Update documentation on usage of predict_survival_function and predict_cumulative_hazard_function (#118).
- The default value of alpha_min_ratio of sksurv.linear_model.CoxnetSurvivalAnalysis will now depend on the n_samples/n_features ratio. If n_samples > n_features, the default value is 0.0001. If n_samples <= n_features, the default value is 0.01.
- Add support for scikit-learn 0.23 (#119).
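
The rule for the default alpha_min_ratio stated in the list above can be written as a small function. default_alpha_min_ratio is a hypothetical helper used only to restate the rule and is not part of scikit-survival.

```python
def default_alpha_min_ratio(n_samples, n_features):
    """Restate the documented default: 0.0001 if n_samples > n_features, else 0.01."""
    return 0.0001 if n_samples > n_features else 0.01


tall = default_alpha_min_ratio(n_samples=500, n_features=20)
wide = default_alpha_min_ratio(n_samples=20, n_features=500)
square = default_alpha_min_ratio(n_samples=50, n_features=50)
```

Note that the boundary case n_samples == n_features falls under the second branch and gets 0.01.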
Deprecations
- predict_survival_function and predict_cumulative_hazard_function of sksurv.ensemble.RandomSurvivalForest and sksurv.tree.SurvivalTree will return an array of sksurv.functions.StepFunction in the future (as sksurv.linear_model.CoxPHSurvivalAnalysis does). For the old behavior, use return_array=True.
Bug fixes
- Fix deprecation of importing joblib via sklearn.
- Fix estimation of censoring distribution for tied times with events. When estimating the censoring distribution, by specifying reverse=True when calling sksurv.nonparametric.kaplan_meier_estimator, we now consider events to occur before censoring. For tied time points with an event, those with an event are not considered at risk anymore and subtracted from the denominator of the Kaplan-Meier estimator. The change affects all functions relying on inverse probability of censoring weights.
- Throw an exception when trying to estimate c-index from uncomparable data (#117).
- Estimators in sksurv.svm will now throw an exception when trying to fit a model to data with uncomparable pairs.
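
The tie-handling fix for the censoring distribution described above can be illustrated with a small pure-Python Kaplan-Meier sketch. This is a textbook-style reconstruction from the description, not the code in sksurv.nonparametric. censoring_survival and its events_first flag are hypothetical names, and the data are made up so that one event and one censored sample share the same time point.

```python
from fractions import Fraction


def censoring_survival(time, event, events_first=True):
    """Kaplan-Meier estimate of the censoring distribution G(t).

    Censoring is treated as the outcome of interest. Returns a list of
    (t, G(t)) pairs, one per unique time point.
    """
    n_at_risk = len(time)
    g = Fraction(1)
    curve = []
    for t in sorted(set(time)):
        n_events = sum(1 for ti, ei in zip(time, event) if ti == t and ei)
        n_censored = sum(1 for ti, ei in zip(time, event) if ti == t and not ei)
        # With events_first, samples that had an event at t leave the risk
        # set before the censored samples at t are counted.
        denominator = n_at_risk - n_events if events_first else n_at_risk
        if n_censored:
            g *= 1 - Fraction(n_censored, denominator)
        curve.append((t, g))
        n_at_risk -= n_events + n_censored
    return curve


# Time point 2 is tied: one sample has an event, one is censored.
time = [1, 2, 2, 3]
event = [True, True, False, False]

fixed = censoring_survival(time, event, events_first=True)
naive = censoring_survival(time, event, events_first=False)
```

At the tied time point the two variants disagree (1/2 versus 2/3 here), so every inverse probability of censoring weight derived from G(t) after that point changes as well.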
v0.12.0
This release adds support for scikit-learn 0.22, thereby dropping support for older versions. Moreover, the regularization strength of the ridge penalty in sksurv.linear_model.CoxPHSurvivalAnalysis can now be set per feature. If you want one or more features to enter the model unpenalized, set the corresponding penalty weights to zero. Finally, sklearn.pipeline.Pipeline will now be automatically patched to add support for predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
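
As a rough illustration of what a per-feature ridge penalty means, the sketch below evaluates a penalty term with one weight per coefficient. ridge_penalty is a hypothetical helper, and the factor of 0.5 is an assumed convention. The exact scaling used inside sksurv.linear_model.CoxPHSurvivalAnalysis may differ.

```python
def ridge_penalty(coef, alpha):
    """Penalty 0.5 * sum_j alpha[j] * coef[j] ** 2 (the 0.5 is an assumed scaling).

    alpha may be a single number, applied to all features, or one weight
    per feature.
    """
    if isinstance(alpha, (int, float)):
        alpha = [alpha] * len(coef)
    if len(alpha) != len(coef):
        raise ValueError("need one penalty weight per coefficient")
    return 0.5 * sum(a * b * b for a, b in zip(alpha, coef))


coef = [2.0, -1.0, 3.0]

uniform = ridge_penalty(coef, 1.0)
# A weight of zero leaves the third feature unpenalized.
third_unpenalized = ridge_penalty(coef, [1.0, 1.0, 0.0])
```

Setting a weight to zero removes that coefficient's contribution to the penalty entirely, which is how a feature can enter the model unpenalized.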
Deprecations
- Add scikit-learn's deprecation of presort in sksurv.tree.SurvivalTree and sksurv.ensemble.GradientBoostingSurvivalAnalysis.
- Add warning that default alpha_min_ratio in sksurv.linear_model.CoxnetSurvivalAnalysis will depend on the ratio of the number of samples to the number of features in the future (#41).
Enhancements
- Add references to API doc of sksurv.ensemble.GradientBoostingSurvivalAnalysis (#91).
- Add support for pandas 1.0 (#100).
- Add ccp_alpha parameter for Minimal Cost-Complexity Pruning to sksurv.ensemble.GradientBoostingSurvivalAnalysis.
- Patch sklearn.pipeline.Pipeline to add support for predict_cumulative_hazard_function and predict_survival_function if the underlying estimator supports it.
- Allow per-feature regularization for sksurv.linear_model.CoxPHSurvivalAnalysis (#102).
- Clarify API docs of sksurv.metrics.concordance_index_censored (#96).
v0.11
This release adds sksurv.tree.SurvivalTree and sksurv.ensemble.RandomSurvivalForest, which are based on the log-rank split criterion. It also adds the OSQP solver as an option to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM, which will replace the now deprecated cvxpy and cvxopt options in a future release.
This release removes support for sklearn 0.20 and requires sklearn 0.21.
Deprecations
- The cvxpy and cvxopt options for solver in sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM are deprecated and will be removed in a future version. Choosing osqp is the preferred option now.
Enhancements
- Add support for pandas 0.25.
- Add OSQP solver option to sksurv.svm.MinlipSurvivalAnalysis and sksurv.svm.HingeLossSurvivalSVM, which has no additional dependencies.
- Fix issue when using cvxpy 1.0.16 or later.
- Explicitly specify utf-8 encoding when reading README.rst (#89).
- Add sksurv.tree.SurvivalTree and sksurv.ensemble.RandomSurvivalForest (#90).
Bug fixes
- Exclude Cython-generated files from source distribution because they are not forward compatible.