Hi all,

(I will avoid opinions in the following. Y'all are free to decide

whatever y'all wish to do. :-p)

Tests were done on generic MinGW binaries of lua-5.4.0-work1. The

32-bit binary has the double-rounding issue due to extended

precision that was discussed in the past few days. The 64-bit

binary uses the 64-bit FPU datapath and is free of double-rounding.

Tests were done on the output of math.random(), which is between 0

and 1. Any zeros are skipped. The first tests are A+B, A-B, A*B

and A/B. A and B do not change for the 4 ops. The number of

different results between the 32-bit binary and the 64-bit binary

is counted.

Operation / Op_Count / Diff_Count

=================================

A+B 1,000,000,000 0

A-B 1,000,000,000 0

A*B 1,000,000,000 243,584

A/B 1,000,000,000 244,107

For A*B and A/B, this is roughly 1 in 4000.

All different results differ by 1 ULP. Nothing more than 1 ULP was

seen. Here is a look at one result:

Operation : *

n1 = 0.18987016622488206 0x1.84daa65360288p-3

n2 = 0.98339548253950804 0x1.f77f9cd91528bp-1

32b = 0.18671746373457448 0x1.7e65b9c2a810ep-3

64b = 0.18671746373457451 0x1.7e65b9c2a810fp-3

Plugging the values into an online calculator that has binary128

[1], we can do a check with a 'better' result, assuming n1 and n2

is precise to many more digits. The comparison is as follows:

fpu8087 = 0.18671746373457448

binary128 = 0.1867174637345744950213132394217624

fpu64bit = 0.18671746373457451

So results that differ have intermediate results that are almost

exactly in-between, this is why there is a double-rounding issue.

A casual check with a few other numbers show the same trend.

However, note that those decimal representations of the 64-bit

floats are not exactly equal to the exact value of the binary

representation -- they just have enough digits for round-tripping.

Here are the hex representations of the significands:

fpu8087 = 7e65b9c2a810e

binary128 = 7E65B9C2A810E5EAB7A33195161D

fpu64bit = 7e65b9c2a810f

Using this comparison, it appears that the binary128 result is

nearer to fpu8087 than to fpu64bit. We can also verify this by

getting a more accurate decimal version of the 64-bit binary

representation (done by manual fiddling, so not definitive):

fpu8087 = 0.1867174637345744847571893387794262

binary128 = 0.1867174637345744950213132394217624

fpu64bit = 0.1867174637345745125127649544083397

Thus it is confirmed that the fpu8087 result is closer to the

binary128 result, at least for this example.

This ends the first part.

In order to generate hits for A+B and A-B, the magnitude of B is

changed. B is divided by 1e3, 1e6, 1e9 or 1e12, and then the

addition is performed and the comparison made.

Operation / Op_Count / Diff_Count

=================================

A+(B/1e3) 1,000,000,000 0

A+(B/1e6) 1,000,000,000 0

A+(B/1e9) 1,000,000,000 244,158

A+(B/1e12) 1,000,000,000 243,489

Now we can see the double-rounding differences. For the last two

operations, the rate is also roughly 1 in 4000. Again, all

different results differ by 1 ULP. Nothing more than 1 ULP was seen.

This ends the second part.

[1]

http://weitz.de/ieee/--

Cheers,

Kein-Hong Man (esq.)

Selangor, Malaysia