Some statistical data on double rounding

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Some statistical data on double rounding

KHMan
Hi all,

(I will avoid opinions in the following. Y'all are free to decide
whatever y'all wish to do. :-p)

Tests were done on generic MinGW binaries of lua-5.4.0-work1. The
32-bit binary has the double-rounding issue due to extended
precision that was discussed in the past few days. The 64-bit
binary uses the 64-bit FPU datapath and is free of double-rounding.

Tests were done on the output of math.random(), which is between 0
and 1. Any zeros are skipped. The first tests are A+B, A-B, A*B
and A/B. A and B do not change for the 4 ops. The number of
different results between the 32-bit binary and the 64-bit binary
is counted.

Operation / Op_Count / Diff_Count
=================================
A+B  1,000,000,000  0
A-B  1,000,000,000  0
A*B  1,000,000,000  243,584
A/B  1,000,000,000  244,107

For A*B and A/B, this is roughly 1 in 4000.

All different results differ by 1 ULP. Nothing more than 1 ULP was
seen. Here is a look at one result:

Operation : *
n1  = 0.18987016622488206 0x1.84daa65360288p-3
n2  = 0.98339548253950804 0x1.f77f9cd91528bp-1
32b = 0.18671746373457448 0x1.7e65b9c2a810ep-3
64b = 0.18671746373457451 0x1.7e65b9c2a810fp-3

Plugging the values into an online calculator that has binary128
[1], we can do a check with a 'better' result, assuming n1 and n2
is precise to many more digits. The comparison is as follows:

fpu8087   = 0.18671746373457448
binary128 = 0.1867174637345744950213132394217624
fpu64bit  = 0.18671746373457451

So results that differ have intermediate results that are almost
exactly in-between, this is why there is a double-rounding issue.
A casual check with a few other numbers show the same trend.

However, note that those decimal representations of the 64-bit
floats are not exactly equal to the exact value of the binary
representation -- they just have enough digits for round-tripping.
Here are the hex representations of the significands:

fpu8087   = 7e65b9c2a810e
binary128 = 7E65B9C2A810E5EAB7A33195161D
fpu64bit  = 7e65b9c2a810f

Using this comparison, it appears that the binary128 result is
nearer to fpu8087 than to fpu64bit. We can also verify this by
getting a more accurate decimal version of the 64-bit binary
representation (done by manual fiddling, so not definitive):

fpu8087   = 0.1867174637345744847571893387794262
binary128 = 0.1867174637345744950213132394217624
fpu64bit  = 0.1867174637345745125127649544083397

Thus it is confirmed that the fpu8087 result is closer to the
binary128 result, at least for this example.

This ends the first part.

In order to generate hits for A+B and A-B, the magnitude of B is
changed. B is divided by 1e3, 1e6, 1e9 or 1e12, and then the
addition is performed and the comparison made.

Operation / Op_Count / Diff_Count
=================================
A+(B/1e3)   1,000,000,000  0
A+(B/1e6)   1,000,000,000  0
A+(B/1e9)   1,000,000,000  244,158
A+(B/1e12)  1,000,000,000  243,489

Now we can see the double-rounding differences. For the last two
operations, the rate is also roughly 1 in 4000. Again, all
different results differ by 1 ULP. Nothing more than 1 ULP was seen.

This ends the second part.

[1] http://weitz.de/ieee/

--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia



Reply | Threaded
Open this post in threaded view
|

Re: Some statistical data on double rounding

Albert Chan
On Jun 8, 2018, at 5:47 AM, KHMan <[hidden email]> wrote:

Operation / Op_Count / Diff_Count
=================================
A+B  1,000,000,000  0
A-B  1,000,000,000  0
A*B  1,000,000,000  243,584
A/B  1,000,000,000  244,107

For A*B and A/B, this is roughly 1 in 4000.

All different results differ by 1 ULP. Nothing more than 1 ULP was seen.

I found another blog doing the same test, error also about 1:4000


Just some math behind the observation:

Double rounding occurs when both rounded in the same direction.
Difference between 53-bits and 64-bits are 11-bits.
--> chances of same direction rounding is about 50%
--> chances of double rounding = 50% / 2^11 = 1 / 4096

With just 1 op, of course the error is at most 1 ULP

Operation : *
n1  = 0.18987016622488206    0x1.84daa65360288p-3
n2  = 0.98339548253950804    0x1.f77f9cd91528bp-1

Here are the hex representations of the significands:

fpu8087   = 7e65b9c2a810e
binary128 = 7E65B9C2A810E5EAB7A33195161D
fpu64bit  = 7e65b9c2a810f

Using this comparison, it appears that the binary128 result is nearer to fpu8087 than to fpu64bit. We can also verify this by getting a more accurate decimal version of the 64-bit binary representation (done by manual fiddling, so not definitive):

You do realize we have to work in 53-bits for comparison.
Getting more precise result with more bits does not show anything.

My guess is you set (n1, n2) as decimal string -> __float128.
Had you use the actual hexfloats, you should get this:

n1 * n2
0x1.84daa65360288p-3 * 0x1.f77f9cd91528bp-1
= 0x1.7e65b9c2a810e 800eb4c1577ec p-3

--> value should round-up, to 0x1.7e65b9c2a810fp-3
--> 53-bits rounding is doing the best it can with 2 doubles.