-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-18705][ML][DOC] Update user guide to reflect one pass solver for L1 and elastic-net #16139
Conversation
Test build #69661 has finished for PR 16139 at commit
|
Unlike the original dataset which can only be stored in a distributed system, | ||
these statistics can be loaded into memory on a single machine if the number of features is relatively small, and then we can solve the objective function through Cholesky factorization on the driver. | ||
This objective function has an analytic solution and it requires only one pass over the data to collect necessary statistics to solve. For an | ||
$n \times m$ data matrix, these statistics require only $O(m^2)$ storage and so can be stored on a single machine when $n$ (the number of features) is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$n$ (the number of features)
-> $m$ (the number of features)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, thanks!
Test build #69671 has finished for PR 16139 at commit
|
ping @yanboliang |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sethah Thanks for working on this, I left some comments.
@@ -59,17 +59,22 @@ Given $n$ weighted observations $(w_i, a_i, b_i)$: | |||
|
|||
The number of features for each observation is $m$. We use the following weighted least squares formulation: | |||
`\[ | |||
minimize_{x}\frac{1}{2} \sum_{i=1}^n \frac{w_i(a_i^T x -b_i)^2}{\sum_{k=1}^n w_k} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2 | |||
\min_{\mathbf{x}}\frac{1}{2} \sum_{i=1}^n \frac{w_i(\mathbf{a}_i^T \mathbf{x} -b_i)^2}{\sum_{k=1}^n w_k} + \frac{1}{2}\frac{\lambda}{\delta}\sum_{j=1}^m(\sigma_{j} x_{j})^2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This formulation is out of date, we should update it to include L1/elasticNet regularization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch. Done.
|
||
WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization and standardization. | ||
In order to make the normal equation approach efficient, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead. | ||
Spark ML currently supports two types of solvers for the normal equations: Cholesky factorization and Quasi-Newton methods (L-BFGS/OWL-QN). Cholesky factorization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ML
-> MLlib
, "Spark ML" is not an official name of the component.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Spark ML currently supports two types of solvers for the normal equations: Cholesky factorization and Quasi-Newton methods (L-BFGS/OWL-QN). Cholesky factorization | ||
depends on a positive definite covariance matrix (e.g. columns of the data matrix must be linearly independent) and will fail if this condition is violated. Quasi-Newton methods | ||
are still capable of providing a reasonable solution even when the covariance matrix is not positive definite, so the normal equation solver can also fall back to | ||
Quasi-Newton methods in this case. This fallback is currently always enabled for the `LinearRegression` estimator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and GeneralizedLinearRegression
estimator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
are still capable of providing a reasonable solution even when the covariance matrix is not positive definite, so the normal equation solver can also fall back to | ||
Quasi-Newton methods in this case. This fallback is currently always enabled for the `LinearRegression` estimator. | ||
|
||
`WeightedLeastSquares` supports L1, L2, and elastic-net regularization and provides options to enable or disable regularization and standardization. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding following clarification should be more clear?
- For L2 or no regularization, Cholesky solver is the default choice and will fall back to Quasi-Newton solver if the covariance matrix is not positive definite.
- For L1/elasticNet regularization, Quasi-Newton solver is the default and only choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a note about it. It's a bit unclear to me who the audience is here. Since WLS is private, this seems more informational than anything. So I just mentioned that L1 has no analytical solution and requires QN solver. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the current note suffices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The audience here is developers/contributors, and the current change is perfectly OK.
Test build #69766 has finished for PR 16139 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except for the one nit.
WeightedLeastSquares only supports L2 regularization and provides options to enable or disable regularization and standardization. | ||
In order to make the normal equation approach efficient, WeightedLeastSquares requires that the number of features be no more than 4096. For larger problems, use L-BFGS instead. | ||
Spark MLlib currently supports two types of solvers for the normal equations: Cholesky factorization and Quasi-Newton methods (L-BFGS/OWL-QN). Cholesky factorization | ||
depends on a positive definite covariance matrix (e.g. columns of the data matrix must be linearly independent) and will fail if this condition is violated. Quasi-Newton methods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"e.g." -> "i.e."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
Test build #69838 has finished for PR 16139 at commit
|
LGTM, merged into master and branch-2.1. Thanks for all. |
…or L1 and elastic-net ## What changes were proposed in this pull request? WeightedLeastSquares now supports L1 and elastic net penalties and has an additional solver option: QuasiNewton. The docs are updated to reflect this change. ## How was this patch tested? Docs only. Generated documentation to make sure Latex looks ok. Author: sethah <[email protected]> Closes #16139 from sethah/SPARK-18705. (cherry picked from commit 8225361) Signed-off-by: Yanbo Liang <[email protected]>
…or L1 and elastic-net ## What changes were proposed in this pull request? WeightedLeastSquares now supports L1 and elastic net penalties and has an additional solver option: QuasiNewton. The docs are updated to reflect this change. ## How was this patch tested? Docs only. Generated documentation to make sure Latex looks ok. Author: sethah <[email protected]> Closes apache#16139 from sethah/SPARK-18705.
…or L1 and elastic-net ## What changes were proposed in this pull request? WeightedLeastSquares now supports L1 and elastic net penalties and has an additional solver option: QuasiNewton. The docs are updated to reflect this change. ## How was this patch tested? Docs only. Generated documentation to make sure Latex looks ok. Author: sethah <[email protected]> Closes apache#16139 from sethah/SPARK-18705.
What changes were proposed in this pull request?
WeightedLeastSquares now supports L1 and elastic net penalties and has an additional solver option: QuasiNewton. The docs are updated to reflect this change.
How was this patch tested?
Docs only. Generated documentation to make sure Latex looks ok.