Some rounding error issue between CPU & GPU #23

Closed · haampie opened this issue Apr 25, 2021 · 6 comments

haampie (Contributor) commented Apr 25, 2021

I've tried MultiFloats.jl on the GPU, but I'm getting a loss of precision compared to the CPU:

using CUDA, MultiFloats
A = rand(Float64x8, 100, 100)
B = rand(Float64x8, 100, 100)
A * B - Array(CuArray(A) * CuArray(B))

Gives me

100×100 Matrix{MultiFloat{Float64, 8}}:
 -1.23827e-98    9.35263e-99  -8.83181e-99  …  -4.70324e-99   -1.3348e-98
 -1.98421e-99    8.20389e-99   1.67043e-98      1.45499e-98    2.32225e-98
 -2.77264e-99   -3.30951e-99   1.32426e-98     -1.09181e-98    7.84157e-100
  1.92544e-98    6.35776e-99  -8.85547e-99      1.29435e-98   -4.89252e-99
 -5.52038e-99    5.35901e-99  -3.705e-98        1.53947e-99    7.38954e-99
 -2.16904e-98    1.64505e-98  -1.16536e-98  …  -3.19036e-98    7.5397e-99
  6.72487e-98    6.07349e-99  -2.87359e-98      ...

but eps(Float64x8) is 5.9091063153828709e-126.

What explains this? The order of iteration?
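
For reference, a minimal sketch of how one might quantify the discrepancy in units of eps(Float64x8); only public conversion and arithmetic operations are used, and the variable names are just illustrative:

using CUDA, MultiFloats
A = rand(Float64x8, 100, 100)
B = rand(Float64x8, 100, 100)
ref = A * B                             # CPU reference
gpu = Array(CuArray(A) * CuArray(B))    # GPU result copied back to the host
# worst-case relative error, then expressed in units of eps(Float64x8)
worst = maximum(Float64(abs(ref[i] - gpu[i]) / abs(ref[i])) for i in eachindex(ref))
worst / eps(Float64x8)                  # about 1 would mean full Float64x8 accuracy; here it is astronomically larger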

dzhang314 (Owner) commented

Thanks for the find, @haampie! This is very mysterious to me. I haven't played with Julia/CUDA interop before, so I'm unsure how Julia code gets compiled for GPU execution. The fact that most of the limbs are accurate is especially puzzling; if CUDA's arithmetic/fma operations were simply rounded improperly, then you would expect all of limbs 2-8 to be garbage. But the fact that limbs 1-6 are right, while limbs 7-8 are wrong, rules out all of my easy hypotheses for what could be going wrong.

I'll look into this the next time I work on MultiFloats.jl (which honestly might take a while... grad student life has me swamped these days).

kunyuan commented Mar 7, 2022

Just to add another data point: I tried exactly the same code on my GPU (a 1080 Ti) and I'm getting accuracy of about 1e-114. Is this the expected accuracy level?

100×100 Matrix{MultiFloat{Float64, 8}}:
-1.14468e-115 2.54748e-115 -6.94563e-115 -4.96936e-115 2.13842e-115 8.64325e-115 1.6073e-115 … 3.4182e-115 6.56281e-115 3.55884e-115 4.6424e-115 -1.1596e-115 2.23816e-115
...

dzhang314 (Owner) commented

Hey @kunyuan, thanks for your interest in MultiFloats.jl! No, this is not the expected accuracy, and I'm afraid to report I still don't understand what's going on with GPU MultiFloat calculations. I'll report back when I have time to take a look at this in detail.

orkolorko commented

Hi @dzhang314, do you know CAMPARY? It is a library whose idea is similar to your MultiFloats library, but made to work on the GPU. I did a reimplementation of it in Julia some time ago, on the CPU, so it is quite similar to your library. If you're interested, I can try to dust it off and compare it with MultiFloats...

rguerrab commented

Is this still an issue in v2?

dzhang314 (Owner) commented Sep 26, 2024

@rguerrab I'm not sure about the current status of this issue, since I don't do any Julia GPU development myself. I originally wanted to investigate this, but after a few years I still haven't found the opportunity or resources.

What I can tell you is that MultiFloats.jl does not contain any GPU-specific code -- it is written purely in terms of Float64 arithmetic operations (including FMA) that are specified by IEEE 754 to return identical results on any standards-conforming CPU or GPU. Therefore, any discrepancy between CPU and GPU indicates a fault in either the hardware or the compiler toolchain, not MultiFloats.jl itself. This was always true in MultiFloats.jl v1.0 and remains true in v2.0.
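
To make that concrete, here is a minimal sketch of the textbook error-free transformations this kind of library is built from (not the actual MultiFloats.jl source); each one recovers a rounding error exactly, but only if every operation is rounded exactly once, to nearest, as IEEE 754 specifies:

# Knuth's two-sum: s + e == a + b exactly, assuming each operation below
# is rounded once, to nearest (ties to even).
function two_sum(a::Float64, b::Float64)
    s = a + b
    t = s - b                      # the part of a that actually made it into s
    e = (a - t) + (b - (s - t))    # the rounding error of s = a + b
    return s, e
end

# Two-prod via FMA: p + e == a * b exactly, assuming fma is a true fused
# multiply-add with a single rounding at the end.
function two_prod(a::Float64, b::Float64)
    p = a * b
    e = fma(a, b, -p)
    return p, e
end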

I'm going to close this issue until someone can demonstrate that this is a fault in MultiFloats.jl as opposed to, say, an overly-aggressive optimization pass in CUDA.jl. Note that the algorithms in MultiFloats.jl are all sensitive to rounding mode (must be round-to-nearest, ties-to-even) and contraction (rounding must occur exactly in the places where IEEE 754 specifies it, no more and no less). If you have an aggressive optimizing compiler that doesn't strictly preserve IEEE 754 semantics, then MultiFloats.jl will silently and catastrophically break.
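
As an illustration of that last point (again using the textbook two_sum sketch above, not MultiFloats.jl's actual code): under strict IEEE 754 semantics the rounding error is recovered exactly, but a value-unsafe optimizer is allowed to simplify it away entirely.

s, e = two_sum(1.0, 1e-20)
# Strict IEEE 754: s == 1.0 and e == 1e-20; the part that didn't fit into s
# is recovered exactly and would be carried into a lower limb.
#
# A compiler that reassociates floating-point expressions (fast-math style) may
# treat (a - t) + (b - (s - t)) as algebraically equal to (a + b) - s, i.e. zero,
# so e silently becomes 0.0 and every limb below the first loses its meaning.
# Likewise, contracting a separate multiply and add into one fma (one rounding
# instead of two) moves a rounding step and breaks these identities the same way.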
