-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some rounding error issue between CPU & GPU #23
Comments
Thanks for the find @haampie ! This is very mysterious to me. I haven't played with Julia/CUDA interop before, so I'm unsure how Julia code gets compiled for GPU execution. The fact that most of the limbs are accurate is especially puzzling; if it was simply the case that CUDA's arithmetic/fma operations are improperly rounded, then you would expect all of limbs 2-8 to be garbage. But the fact that limbs 1-6 are right, while limbs 7-8 are wrong, rules out all of my easy hypotheses for what could be going wrong. I'll look into this the next time I work on MultiFloats.jl (which honestly might take a while... grad student life has me swamped these days) |
Just to add another data point. I tried exactly the same code on my GPU (1080ti), I am getting the accuracy 1e-114. Is this the expected accuracy level? 100×100 Matrix{MultiFloat{Float64, 8}}: |
Hey @kunyuan, thanks for your interest in MultiFloats.jl! No, this is not the expected accuracy, and I'm afraid to report I still don't understand what's going on with GPU |
Hi @dzhang314, do you know CAMPARY? |
Is this still an issue in v2? |
@rguerrab I'm not sure about the current status of this issue, since I don't do any Julia GPU development. I originally wanted to investigate this myself, but after a few years, I've never found the opportunity or resources. What I can tell you is that MultiFloats.jl does not contain any GPU-specific code -- it is written purely in terms of I'm going to close this issue until someone can demonstrate that this is a fault in MultiFloats.jl as opposed to, say, an overly-aggressive optimization pass in CUDA.jl. Note that the algorithms in MultiFloats.jl are all sensitive to rounding mode (must be round-to-nearest, ties-to-even) and contraction (rounding must occur exactly in the places where IEEE 754 specifies it, no more and no less). If you have an aggressive optimizing compiler that doesn't strictly preserve IEEE 754 semantics, then MultiFloats.jl will silently and catastrophically break. |
I've tried on MultiFloats.jl on the GPU, but I'm getting loss of precision compared to the CPU:
Gives me
but
eps(Float64x8)
is5.9091063153828709e-126
.What explain this? The order of iteration?
The text was updated successfully, but these errors were encountered: