-
Notifications
You must be signed in to change notification settings - Fork 216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug in nonparametric.py when calling IPCRidge #322
Comments
Could you please provide a self-contained minimal example, so I could re-produce the error? |
Here is the snippet and I am attaching the data file too. from sksurv.linear_model import IPCRidge
estimator_R = IPCRidge(alpha = 0.5,fit_intercept=True)
df_y=XY.iloc[:,:2].to_records(index=False)
df_x = XY.iloc[:,2:]
estimator_R.fit(data_x_numeric,data_y) error message: ----------------------------------------------------------------
[XY.csv](https://github.com/sebp/scikit-survival/files/9975602/XY.csv)
-----------
AssertionError Traceback (most recent call last)
Input In [113], in <cell line: 5>()
3 df_y=XY.iloc[:,:2].to_records(index=False)
4 df_x = XY.iloc[:,2:]
----> 5 estimator_R.fit(data_x_numeric,data_y)
File /opt/homebrew/Caskroom/miniforge/base/envs/teaching_env/lib/python3.10/site-packages/sksurv/linear_model/aft.py:90, in IPCRidge.fit(self, X, y)
72 """Build an accelerated failure time model.
73
74 Parameters
(...)
86 self
87 """
88 event, time = check_array_survival(X, y)
---> 90 weights = ipc_weights(event, time)
91 super().fit(X, np.log(time), sample_weight=weights)
93 return self
File /opt/homebrew/Caskroom/miniforge/base/envs/teaching_env/lib/python3.10/site-packages/sksurv/nonparametric.py:325, in ipc_weights(event, time)
322 idx = np.searchsorted(unique_time, time[event])
323 Ghat = p[idx]
--> 325 assert (Ghat > 0).all()
327 weights = np.zeros(time.shape[0])
328 weights[event] = 1.0 / Ghat
AssertionError:
|
I experience a similar problem, but the fix is not to to flip the reverse parameter. Then the IPCW is no longer using the censoring distribution, but the survival distribution (see doi:10.1016/j.jbi.2016.03.009), therefore not working correctly. I have found that Ghat becomes zero for example, when there are ties at the last time point between event and censors. Example with data: from sksurv.nonparametric import kaplan_meier_estimator, ipc_weights
import numpy as np
time = np.array([1,1,1,2,2,2,3,3,3, 4,4])
event = np.array([1,1,1, 0,0,0, 1,1,0, 1,1]).astype('bool')
# works
ipc_weights(event, time)
event[-2:] = [0,0]
# works
ipc_weights(event, time)
event[-2:] = [0,1]
# crashes and burns
ipc_weights(event, time) The problem is, that the estimated censoring estimator is 0 at this time. Technically this makes sense and the paper linked doesn't cover this case. My fix currently is to say that the events in this case are censored (which of course is wrong, especially if there is time between the event and event registration), but I am not sure, what the right solution would be and if IPCW handles this case at all. |
Thanks for the explanation! Note that setting I checked the R package prodlim, which returns the same estimates for Ghat:
From the theory, I would say that it makes sense that Ghat is zero, because if there is censoring at the last time point, the distribution drops to zero. This is equivalent to what happens when estimating the survival function and there is an event at the last time point. That said, there should be a better error message pointing to this fact. I haven't looked that the paper @Finesim97 linked to yet, they might offer a suggestion for this special case. |
Describe the bug
Running IPCRidge hangs with the following message
assert (Ghat > 0).all()
and nothing after. I found that changing the option 'reverse = False' as shown down below in kaplan_meier_estimator in the function ipc_weights in the file nonparametric.py corrects the mistake.
Error message:
Code Sample to Reproduce the Bug
used code:
Here is what I changed in the nonparametric.py in the line
unique_time, p = kaplan_meier_estimator(event, time, reverse=False) -- changed True to False
Machine and packages versions used:
The text was updated successfully, but these errors were encountered: