
Alpha Behavior #41

Open · dex314 opened this issue Aug 1, 2018 · 14 comments · May be fixed by #484

dex314 commented Aug 1, 2018

When I run a model like this example:
mod = coxnet.CoxnetSurvivalAnalysis(n_alphas=30, l1_ratio=1.0)
There are times (and it could be data specific) where the number of alphas in the coefficient path returned by the model is less than the n_alphas I specified. For example, it often stops at 5 alphas deep.
The paths might have 15 variables > 0 at 5 alphas deep, which is fine. The strange thing I am seeing is this: let's say I set n_alphas=20 on the same data set. I end up getting more variables > 0 along the path (and it still stops at 5 alphas deep). Or, vice versa, if I set n_alphas=40 on the same data set, I end up getting fewer variables > 0 along the path, and once again the algorithm automatically stops at 5 alphas deep. (I'm referring to the parameters as variables.)

I'm assuming this is a bug, as in my past experience with elastic nets the alpha sequence decreases exponentially toward some minimum, and I should see more variables > 0 as I move forward along the alpha curve, closer to that minimum. So if I see 15 variables at the 30th alpha, then I should see fewer than 15 at earlier alphas in the sequence, and the reverse.

Could there be some ratio somewhere that is picking up a similarly named global variable and confusing the alpha parameters in the elastic net?
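
Roughly what I am running, as a sketch (X and y are placeholders for my feature matrix and structured survival array, so this is illustrative rather than a self-contained reproduction):

import numpy as np
from sksurv.linear_model import coxnet

# X: (n_samples, n_features) feature matrix; y: structured array of (event, time)
for n_alphas in (20, 30, 40):
    mod = coxnet.CoxnetSurvivalAnalysis(n_alphas=n_alphas, l1_ratio=1.0)
    mod.fit(X, y)
    # coef_ has one column per alpha actually fitted; I expected n_alphas
    # columns, but the second dimension stays at 5 regardless
    print(n_alphas, mod.coef_.shape, np.count_nonzero(mod.coef_[:, -1]))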

plpxsk (Contributor) commented Aug 2, 2018

Can you clarify the phrase "5 alphas deep" a bit? I'm not exactly sure what it means. Thanks!

dex314 (Author) commented Aug 2, 2018

By 5 alphas deep, I mean that the coefficient path output (model.coef_) is of shape (M x 5), where M is the number of parameters in the regression and 5 is the alpha depth. I would have expected an output with a shape of (M x n_alphas).
In reference to my issue, of those M parameters I am seeing, for example, the following:

  • coxnet.CoxnetSurvivalAnalysis(n_alphas=30, l1_ratio=1.0) gives 15 params != 0
  • coxnet.CoxnetSurvivalAnalysis(n_alphas=20, l1_ratio=1.0) gives 20 params != 0
  • coxnet.CoxnetSurvivalAnalysis(n_alphas=40, l1_ratio=1.0) gives 10 params != 0

but in each instance, the model.coef_ output is (M x 5).

Thank you for replying and I apologize if there is some caveat I am missing.

sebp (Owner) commented Aug 4, 2018

I think you are mixing different concepts:

  1. There is the grid of alpha values, which is determined by your data, n_alphas, and alpha_min_ratio. The maximum alpha is chosen, based on your dataset, such that all variables have a coefficient of zero. The next step is to determine the minimum alpha, which is alpha_min_ratio * alpha_max. Finally, n_alphas different values from alpha_max to alpha_min are chosen, equally spaced on a log scale. Therefore, when you modify n_alphas, alpha_max and alpha_min remain the same, but the alphas in between change (see the sketch after this list).
  2. It can happen that optimization stops early if max_iter has been reached. The coefficients of the remaining alpha values will not be updated, and a convergence warning will be displayed.
  3. Usually, the number of non-zero coefficients increases as alpha decreases. This is not a strict requirement, though: in certain situations where features interact with each other, a coefficient can go back to zero.
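
As a rough sketch of point 1 (not the exact library code; alpha_max below stands in for the data-derived value at which all coefficients are zero):

import numpy as np

def alpha_grid(alpha_max, n_alphas=100, alpha_min_ratio=0.0001):
    # n_alphas values from alpha_max down to alpha_min_ratio * alpha_max,
    # equally spaced on a log scale
    alpha_min = alpha_min_ratio * alpha_max
    return np.logspace(np.log10(alpha_max), np.log10(alpha_min), n_alphas)

print(alpha_grid(1.5, n_alphas=5))    # endpoints are fixed by the data ...
print(alpha_grid(1.5, n_alphas=30))   # ... only the spacing in between changes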

dex314 (Author) commented Aug 6, 2018

Yes, I understand how the alpha grid works and I agree with everything you said above. I think I may not be explaining it very well, and it may be one of those one-off issues with the data I am working with.
The fit is not reporting any messages or errors regarding convergence, and the way I have built elastic nets in the past is exactly as you describe (calculate the min and max, then log-scale between them over 100 alphas).
I had assumed that if I specified n_alphas = 30 I would get a matrix of M parameters by 30. Likewise, if I specified n_alphas = 40 I should get an M x 40 matrix of coefficients with nearly identical paths to the n_alphas = 30 model, but converging or diverging over the next 10 alphas.
It is confusing me why I am getting more non-zero coefficients with n_alphas=10 (nearly the full solution, actually) and significantly fewer with n_alphas=40 (very sparse), with both coefficient outputs being M x 5. It's entirely possible it's the data I am working with as well.
I thought perhaps there was a special method or criterion you had in place that just was not displaying any messages, and that you might know off the top of your head.

sebp (Owner) commented Aug 6, 2018

One possible issue I could imagine when using n_alphas = 10 instead of n_alphas = 100 is that the gaps between adjacent alpha values are larger. Hence, moving from one alpha to the next will result in large updates. The algorithm does not perform step-size optimization, so it is possible for updates to overshoot and miss the actual minimum. I would recommend using a relatively dense list of alpha values.

If you want to double-check, you can try R's glmnet package, which implements the elastic net as well.
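
For example (a sketch; the numeric range below is only illustrative, and in practice you would base it on the alphas_ of an initial fit), you can either increase n_alphas or pass an explicit dense grid via the alphas parameter:

import numpy as np
from sksurv.linear_model import coxnet

# a dense grid of 100 alphas, equally spaced on a log scale (range is illustrative)
alphas = np.logspace(np.log10(1.5), np.log10(0.0015), 100)
mod = coxnet.CoxnetSurvivalAnalysis(alphas=alphas, l1_ratio=1.0)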

dex314 (Author) commented Aug 7, 2018

This was a good idea. I checked the same data using glmnet with family='cox'. It stopped at an alpha depth of 52, and the paths look similar when you specify n_alphas=5. When I try something higher like n_alphas=10, the paths look similar, but the Python code doesn't return the rest of the coefficient matrix and I can't figure out why. For Python the parameter is n_alphas, and for R it's nlambda. I've seen alpha and lambda interchanged in the past; I'm not confusing these in this instance as they relate to your code, am I?

Here are the Python paths:
[coefficient path plot from scikit-survival]

Here is glmnet in R:
[coefficient path plot from glmnet]

sebp (Owner) commented Aug 9, 2018

You are correct, glmnet's nlambda corresponds to n_alphas.

Are you saying that scikit-survival does not return the full path of 10 alphas, but glmnet does? Could you please plot the individual estimates as dots in the plots, in addition to lines?

dex314 (Author) commented Aug 10, 2018

Yes, in this particular instance it is not returning the full path of alphas, no matter what n_alphas I specify. I know when I first opened the issue I was all over the place, but this is definitely the main point of my confusion. It makes me think something is inadvertently defaulting somewhere within the code, as I did not change anything in the code itself.

GLMNET:
[coefficient path plot]

SKSURV:
[coefficient path plot]

plpxsk (Contributor) commented Aug 20, 2018

One quick thought: in the sksurv model call, can you check alpha_min_ratio and perhaps decrease it? Then you might get the longer paths seen in glmnet.
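
Something along these lines (a sketch; the value is just an example to experiment with, given that 0.0001 is the current default):

from sksurv.linear_model import coxnet

# try an explicit, smaller alpha_min_ratio so the generated grid reaches smaller alphas
mod = coxnet.CoxnetSurvivalAnalysis(n_alphas=100, l1_ratio=1.0, alpha_min_ratio=1e-5)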

dex314 (Author) commented Aug 29, 2018

Sorry for the delay in responding. I tried your suggestion, but it did not work. It seems like a unique issue, and in the end, with the way glmnet is built, I can still get a sparse solution relative to the shorter paths. Additionally, the selected variables seem intuitive and appropriate.

hermidalc (Contributor) commented Mar 14, 2020

I have a feeling this might be similar to the issue or confusion I’ve been having related to alphas that I commented on in #47.

I’ve found that Coxnet will silently not use all the alphas down the autogenerated sequence once the alpha values get too small, but it won’t raise any warnings or errors during the fit.

For example, it might calculate an alpha max of 1.5 from the data, and with alpha_min_ratio set to 0.01 it will create an alphas_ sequence of n_alphas values from 1.5 down to 0.015. When it does the fit it typically doesn’t use all the alphas down the sequence, and this seems to be normal behavior. It doesn’t show any convergence warnings.

I only realized this when I was trying to do model selection/CV based on the gist example and got Numerical error... consider increasing alpha errors when fitting individual alphas taken from the sequence autogenerated by the initial fit on the data.

@dex314 I would consider increasing alpha_min_ratio so that the alphas in the sequence don’t become too small; maybe you will then see that it uses more of them and more alphas show up in coef_.

Also @sebp, it might be a good idea to have two default values for alpha_min_ratio like glmnet has: 1e-4 when n_features < n_samples, and 1e-2 when n_features > n_samples.
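
Something like this for choosing the default (a sketch of the glmnet-style rule, not the current sksurv code):

def default_alpha_min_ratio(n_samples, n_features):
    # mirror glmnet: 1e-4 when there are more samples than features, 1e-2 otherwise
    return 1e-4 if n_samples > n_features else 1e-2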

sebp (Owner) commented Mar 20, 2020

> Also @sebp, it might be a good idea to have two default values for alpha_min_ratio like glmnet has: 1e-4 when n_features < n_samples, and 1e-2 when n_features > n_samples.

That's a good idea. Would you be able to provide a pull request with this change?

sebp added a commit that referenced this issue Apr 11, 2020
The default value of alpha_min_ratio will depend on the sample size relative to the number of features in 0.13. If `n_samples > n_features`, the current default value 0.0001 will be used. If `n_samples < n_features`, 0.01 will be used instead.

See #41 (comment)

sebp (Owner) commented Apr 11, 2020

The default value for alpha_min_ratio will depend on n_features and n_samples in a future release. I added a warning to notify users about this change (see commit dfd645e).

Sann5 commented Oct 9, 2024

Hello scikit-survival team. Thanks for the awesome package. Is there a solution for this yet? Essentially, I would like Coxnet not to drop the small alphas in my regularization paths, i.e. to fit models for all the alphas I specify in the alphas parameter.

Sometimes I will specify 100 alphas and it will only use the first 6. Funnily enough, it is always exactly 6. I can provide example code if needed.

Sann5 linked a pull request (#484) on Oct 21, 2024 that will close this issue.