DotProduct Kernel not non-negative definite after rounding errors #8252
Comments
Thanks for reporting the issue and providing a dataset. A full code example that breaks really helps someone trying to fix it.
Here is code to reproduce; the result is the LinAlgError from the Cholesky decomposition described in the issue:
```python
# Python 2 (urllib2), matching the reported environment.
import csv
import urllib2
from sklearn import gaussian_process
import sklearn.gaussian_process.kernels as kernels

# Load the dataset that triggers the failure.
f = urllib2.urlopen('http://pims.structuralbiology.eu/X.csv')
X = [[float(i) for i in row] for row in csv.reader(f, delimiter=',')]
f.close()

model = gaussian_process.GaussianProcessRegressor(
    kernel=kernels.DotProduct(),
    optimizer='fmin_l_bfgs_b',
    random_state=None
)
model.fit(X, [0.0] * len(X))
```
I can reproduce with a conda environment using openblas (although I got a complaint about the 140-th leading minor rather than the 114-th). I cannot reproduce the error with a conda environment using mkl.

For the record, readability counts, a lot! Please use triple back-quotes, aka fenced code blocks, to format error messages and code snippets. Also, it does not seem to matter in this particular case, but setting random_state to an integer (rather than None) will make the behaviour deterministic.
I think that the underlying problem is that the Cholesky decomposition assumes a non-negative definite matrix.
Yes, you correctly say that the Cholesky decomposition assumes a non-negative definite matrix. Any valid kernel for GPR must be non-negative definite. From the point of view of abstract mathematics, DotProduct meets this criterion, so scikit-learn correctly offers it in the kernels package. The defect here is that when I put these together as above, rounding errors create some negative eigenvalues, so a mathematically correct regression fails due to numerical issues.

The defect is not in the Cholesky decomposition; it is rounding errors in DotProduct. It is not sufficient for this kernel to calculate the individual dot products to the available precision: to be usable for GPR, it must also ensure that the resulting matrix as a whole is non-negative definite. I appreciate that this might be difficult.

There is a possible workaround, which is to supply an identity kernel, one that has a covariance of 0 between different entries and 1 on the diagonal. Then people who hit this issue could add in a small multiple of the identity kernel.
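A minimal sketch of that workaround, assuming current scikit-learn, where WhiteKernel already behaves as a scaled identity term on the diagonal of the training kernel matrix (the noise_level value here is illustrative, not a recommendation):

```python
from sklearn import gaussian_process
import sklearn.gaussian_process.kernels as kernels

# A small diagonal term alongside DotProduct; "fixed" bounds keep the
# jitter constant during hyperparameter optimization.
kernel = kernels.DotProduct() + kernels.WhiteKernel(
    noise_level=1e-10, noise_level_bounds="fixed")
model = gaussian_process.GaussianProcessRegressor(kernel=kernel)
```

Note that GaussianProcessRegressor also exposes an alpha parameter (default 1e-10) that adds a value of the same kind to the diagonal of the kernel matrix during fitting, without changing the kernel itself.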
```python
import numpy as np
# A positive *semi*-definite matrix with a zero eigenvalue is rejected:
np.linalg.cholesky([[1., 0.], [0., 0.]])  # LinAlgError: Matrix is not positive definite
```
This seems like a reasonable work-around although I am not an expert in Gaussian processes at all.
@lesteve has identified another defect. Kernels must be positive semidefinite, i.e. have no negative eigenvalues, but they can have zero eigenvalues. There are implementations of a Cholesky-style factorization that can handle this, and that is what is required for GPR. Just to be clear, the identity kernel I suggest is not implemented yet; this is a request to implement it. It would also supply a workaround for the issue @lesteve has identified. It is analogous to adding one to a series before taking reciprocals, to avoid divide-by-zero errors.
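Such an identity kernel is not part of scikit-learn; the hypothetical IdentityKernel below is only a sketch of what the request amounts to, written against the documented Kernel interface (__call__, diag, is_stationary):

```python
import numpy as np
from sklearn.gaussian_process.kernels import Kernel


class IdentityKernel(Kernel):
    """Hypothetical kernel: covariance 1 on the diagonal, 0 elsewhere."""

    def __init__(self):
        # No hyperparameters, so there is nothing to optimize.
        pass

    def __call__(self, X, Y=None, eval_gradient=False):
        X = np.asarray(X)
        if Y is None:
            K = np.eye(X.shape[0])
        else:
            # Distinct inputs are treated as uncorrelated.
            K = np.zeros((X.shape[0], np.asarray(Y).shape[0]))
        if eval_gradient:
            # Empty gradient stack, since there are no hyperparameters.
            return K, np.empty((K.shape[0], K.shape[1], 0))
        return K

    def diag(self, X):
        return np.ones(np.asarray(X).shape[0])

    def is_stationary(self):
        return True
```

Used as kernels.DotProduct() + 1e-10 * IdentityKernel(), it would add the same kind of diagonal jitter as the WhiteKernel approach sketched above.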
It seems like …
ping @jmetzen who reimplemented the gaussian_process subpackage for 0.18. He may have an informed opinion about this.
Thanks for reporting this. @lesteve is right that adding a small value to the diagonal of the kernel matrix is a reasonable workaround.
This would make sense indeed.
@chrishmorris does using a small term on the diagonal solve your problem?
Yes, thank you! |
OK great, thanks for following up. I opened #8384 to improve the error message. |
I also suggest that in http://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html, where it says [the current wording], this is modified to read [the proposed wording].
PR welcome! Your change amounts to adding this (quite hard to figure out visually; next time it is probably better to post a diff or directly open a PR), right?

@jmetzen what do you think?
Yes, adding this to the docstring would make sense. It can be added to #8384.
Description
GPR will fail in the Cholesky decomposition if the kernel matrix has negative eigenvalues. The Cholesky decomposition itself is numerically stable, but rounding errors in the kernel can produce such eigenvalues, and this occurs with the DotProduct kernel.
Steps/Code to Reproduce
```python
from sklearn import gaussian_process
import sklearn.gaussian_process.kernels as kernels

# X is the dataset linked under "Actual Results"; Y is a target
# vector of matching length (e.g. [0.0] * len(X)).
model = gaussian_process.GaussianProcessRegressor(
    kernel=kernels.DotProduct(),
    optimizer='fmin_l_bfgs_b',
    random_state=None)
model.fit(X, Y)
```
Expected Results
Mathematically speaking, DotProduct is a valid kernel, i.e. symmetric and non-negative definite: for any coefficient vector c, cᵀKc = σ₀²(Σᵢcᵢ)² + ‖Xᵀc‖² ≥ 0. So the fit should succeed.
Actual Results
A LinAlgError may be thrown in the Cholesky decomposition. The error is not in that routine; rather, the value of kernels.DotProduct()(X) can have small negative eigenvalues after rounding errors.
An example of values for which this fails is at http://pims.structuralbiology.eu/X.csv
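The effect can also be seen without the linked dataset. A sketch with synthetic data (the sizes and jitter value are illustrative): the Gram matrix of n points in d < n dimensions is rank-deficient, and rounding can push its zero eigenvalues slightly below zero.

```python
import numpy as np
import sklearn.gaussian_process.kernels as kernels

rng = np.random.RandomState(0)
X = rng.randn(200, 5)               # 200 points in 5 dimensions
K = kernels.DotProduct()(X)         # rank at most 6, so most eigenvalues are ~0
print(np.linalg.eigvalsh(K).min())  # often a tiny negative number
np.linalg.cholesky(K + 1e-10 * np.eye(len(K)))  # diagonal jitter makes Cholesky succeed
```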
Versions
Linux-3.10.0-514.6.1.el7.x86_64-x86_64-with-centos-7.3.1611-Core
Python: 2.7.12 |Continuum Analytics, Inc.| (default, Jul 2 2016, 17:42:40) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
NumPy: 1.11.2
SciPy: 0.18.1
Scikit-Learn: 0.18