Converting floats to strings

Converting floats to strings

Robert Virding-2
When does Lua decide to trim floats when doing a tostring? For example:

> "a"..tostring(0.30000000000000004).."b"
a0.3b
> "a"..tostring(0.3000000000000004).."b"
a0.3b
> "a"..tostring(0.300000000000004).."b"
a0.3b
> "a"..tostring(0.30000000000004).."b"
a0.30000000000004b

These are all valid floats. I added the a and b to make the string clearer.

> 0.3 ~= 0.1+0.1+0.1
true
> 0.30000000000000004 == 0.1+0.1+0.1
true

I am interested as I am implementing Lua and want it to behave the same way.

Robert

Re: Converting floats to strings

Andrew Gierth
>>>>> "Robert" == Robert Virding <[hidden email]> writes:

 Robert> When does Lua decide to trim floats when doing a tostring?

LUA_NUMBER_FMT in luaconf.h, I believe.

--
Andrew.
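
A quick way to check this from Lua itself, as a minimal sketch: it assumes a stock Lua 5.3 or later build whose luaconf.h leaves the float format at "%.14g", and compares tostring with string.format on the four values from the question.

```lua
-- Assumption: stock build where the float format in luaconf.h is "%.14g".
local values = {
  0.30000000000000004,
  0.3000000000000004,
  0.300000000000004,
  0.30000000000004,
}

for _, v in ipairs(values) do
  -- Both columns should agree if tostring really uses "%.14g".
  print(string.format("%.14g", v), tostring(v))
end
```

Only the last value has its trailing 4 within the first 14 significant digits, which is why it is the only one not printed as 0.3.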

Re: Converting floats to strings

Philippe Verdy
In reply to this post by Robert Virding-2
Interesting question, because you demonstrate that the number 0.30000000000000004 (with 17 significant decimal digits) is different from the number 0.3 but is still converted to the same string '0.3' by tostring().

And that's an inconsistency in Lua's implementation of tostring(number): it discards some significant precision, keeping only 14 significant decimal digits (excluding leading zeros) instead of the 17 required for Lua numbers based on IEEE 64-bit doubles. The 17th digit is needed so that the inverse operation rounds correctly for numbers of any exponent: integers represented exactly in those doubles can have up to 16 digits, but not every 16-digit integer is representable, so rounding takes effect; with 17 decimal digits the rounding always leads back to the expected binary64 number.

And the assumption that two different Lua numbers should have two different string representations with the tostring() function is then false in this implementation: that conversion of numbers to strings is lossy, not reversible.
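
The lossiness described here is easy to reproduce. A minimal sketch, assuming a stock Lua build with IEEE 754 doubles and the default "%.14g" float format:

```lua
local x = 0.1 + 0.1 + 0.1
local y = 0.3

-- The two numbers are distinct, yet tostring maps both to the same string.
print(x == y, tostring(x) == tostring(y))

-- 17 significant digits are enough to tell them apart and to read them back.
local sx = string.format("%.17g", x)
local sy = string.format("%.17g", y)
print(sx, sy)
print(tonumber(sx) == x, tonumber(sy) == y)
```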

So tostring() performs some (unspecified?) rounding before the actual conversion, probably because it was felt "sufficient" to avoid "overlong" strings (and then save up to 3 bytes in strings)...

With this implementation, the 14 decimal digits, a possible dot, a possible negative sign, the "e" prefix for a possible sign of the decimal exponent, and up to 3 digits for an absolute decimal exponent gives a string up to 20 bytes (instead of 23), and I don't see a rationale for limiting the string length... except to "hide" the rounding errors that unavoidably occur when using trigonometric functions (like sin or cos with some multiples of pi), or exponential functions (with ^ or log with base 10), or inexact arithmetic (like divisions by non-powers of two), which could cause problems in badly written financial/accounting applications (storing amounts directly as numbers with 2 or 3 fractional decimals, without scaling them first to integers before adding them and without proper rounding when needed).

As well, if the 17 digits were displayed and correctly rounded, the amount 0.10+0.10+0.10 would display 0.30000000000000004 with the unrounded tostring() formatter by default (and not 0.30 when using a better formatter); it would not be wrong but may not fit well in tabular columns. But even in this case tostring(0.30000000000000004) still doesn't display '0.30' but '0.3', which does not correctly line up in tabular results (where there should be a fixed amount of decimals).

So that rationale is IMHO probably broken: the default rounding to 14 decimal digits in tostring() is only a facility for lazy programmers of financial/accounting applications, or for Lua programming beginners. You'd better override that builtin function in your Lua programs/modules/libraries to replace that (stupid) default.





Re: Converting floats to strings

Francisco Olarte
Philippe...

Hmm, another hard-to-follow top quote. Anyway, you said, replying to

> On Fri, May 31, 2019 at 17:02, Robert Virding <[hidden email]> wrote:
>> When does Lua decide to trim floats when doing a tostring? For example:

On Sat, Jun 1, 2019 at 10:18 AM Philippe Verdy <[hidden email]> wrote:
> Interesting question, because you demonstrate that the number 0.30000000000000004 (with 17 significant decimal digits) is different from the number 0.3 but is still converted to the same string '0.3' by tostring().

That's obvious and correct. TFM states "tostring (v) / Receives a
value of any type and converts it to a string in a human-readable
format. (For complete control of how numbers are converted, use
string.format.)"

Converts to a string? check. Human readable? check. From the
definition it could convert all numbers to "a number".

> And that's an inconsistency of Lua's implementation of tostring(number) that discards some significant precision in numbers,

A couple of problems here. You have to be consistent with something;
as there is no other thing to compare with, I assume you mean
consistent with itself. And it is consistent with itself: it always
considers about 14 digits.

Also, "significant precision", standalone, has no meaning. You need to
state the problem to know what is the significant precision.

FYI, tostring is the basic conversion for display of a number, like
C++ ostream inserters, stdio's printf and similar things. In nearly all
languages this conversion is done to give something similar to what
the language thinks was used to initialize the number. Given 0.3
cannot be exactly represented as a double, the truncation on default
conversion just tries to avoid things like

> string.format("%.30f", 0.3)
0.299999999999999988897769753748

Which is extremely useful as a default behaviour (and given tostring
has no control args and is designed for things like debugging, it's
not that bad).

> to only 14 significant decimal digits (excluding the most significant 0's) instead of 17 as required for this representation of Lua numbers based on IEEE 64-bit doubles (the 17th digit is needed because of rounding for the inverse operation, with numbers of any exponential scale, because the range of integers represented exactly in those doubles can have up to 16 digits but not their maximum and rounding takes effect but with 17 decimal digits the rounding will always lead to the expected binary64 number).

Nah, if you are on a binary machine and you want an exact
(round-trippable) tostring, just use a binary representation:

> string.format("%a", 0.3)
0x1.3333333333333p-2
> 0x1.3333333333333p-2-0.3
0.0

Not too useful for a tostring, but compact and exact.
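
The same round trip can be checked for more than one value. A minimal sketch, assuming Lua 5.3 or later built against a C99 library (both "%a" in string.format and hexadecimal floats in tonumber depend on that); the sample values are arbitrary:

```lua
local samples = { 0.3, 0.1 + 0.1 + 0.1, 1 / 3, 1e-300, 123456.789 }

for _, x in ipairs(samples) do
  local hex = string.format("%a", x)
  -- tonumber accepts hexadecimal float syntax, so nothing is lost on the way back.
  print(hex, tonumber(hex) == x)
end
```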

> And the assumption that two different Lua numbers should have two different string representations with the tostring() function is then false in this implementation: that conversion of numbers to strings is lossy, not reversible.

This is what happens when you make incorrect assumptions. Nearly no
programming language's default conversion of floating-point numbers to
strings has that property.

> So tostring() performs some (unspecified?) rounding before the actual conversion, probably because it was felt "sufficient" to avoid "overlong" strings (and then save up to 3 bytes in strings)...

Like nearly any similar function. They normally do not do it for
"space" saving, they do it for "pleasant" reading of results in the
normal use.

> With this implementation, the 14 decimal digits, a possible dot, a possible negative sign, the "e" prefix for a possible sign of the decimal exponent, and up to 3 digits for an absolute decimal exponent gives a string up to 20 bytes (instead of 23), and I don't see a rationale for limiting the string length... except to "hide" the rounding errors that unavoidably occur when using trigonometric functions (like sin or cos with some multiples of pi), or exponential functions (with ^ or log with base 10), or inexact arithmetic (like divisions by non-powers of two), which could cause problems in badly written financial/accounting applications (storing amounts directly as numbers with 2 or 3 fractional decimals, without scaling them first to integers before adding them and without proper rounding when needed).

Again, exact => use binary output (hex, actually). Also, pi is not
exactly representable in floats. Also, as you said, "problems in badly
written": if you do not know that financial calculations cannot be done
in floats, precision loss in tostring is far, far away from being one
of your major problems.


> As well, if the 17 digits were displayed and correctly rounded, the amount 0.10+0.10+0.10 would display 0.30000000000000004 with the unrounded tostring() formatter by default (and not 0.30 when using a better formatter); it would not be wrong but may not fit well in tabular columns. But even in this case tostring(0.30000000000000004) still doesn't display '0.30' but '0.3', which does not correctly line up in tabular results (where there should be a fixed amount of decimals).

But you are mixing precisions here. 0.10 is about
0.10000000000000000555 in binary. ADDING it three times gives
0.30000000000000004441, which would puzzle someone not familiar with
how FP works: knowing that the EXACT decimal result of adding
0.10000000000000000555 three times is 0.30000000000000001665, you
would not expect a 4 there.
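
These figures can be reproduced directly. A minimal sketch, assuming IEEE 754 doubles and a C library whose printf emits exact decimal digits (glibc does; some others stop early or pad with zeros):

```lua
local tenth = 0.1
local sum = 0.1 + 0.1 + 0.1

print(string.format("%.20f", tenth))  -- the double nearest to 0.1
print(string.format("%.20f", sum))    -- the rounded result of adding it three times
print(string.format("%.20f", 0.3))    -- the double nearest to 0.3, for comparison

-- sum and 0.3 are adjacent doubles: they differ by one unit in the last
-- place, which is 2^-54 for values between 0.25 and 0.5.
print(sum - 0.3 == 2^-54)
```

The exact sum falls halfway between two doubles, and the addition rounds it up to the next one, which is where the unexpected 4 comes from.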

Also, tabular results with tostring? Any novice programmer knows
tables (aligned the old line-printer way) are not done this way;
this is what string.format is for (and the requirements are going to be
the ones which dictate precision / alignment).
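
For instance, a fixed-width column might be produced as follows. A minimal sketch; the amounts and the "%10.2f" width are illustrative choices, not anything from the thread:

```lua
local amounts = { 0.1 + 0.1 + 0.1, 12.5, 1234.567 }

for _, amount in ipairs(amounts) do
  -- Width 10, two decimals, right-aligned: the decimal points line up.
  print(string.format("%10.2f", amount))
end
```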


> So that rationale is IMHO probably broken: the default rounding to 14 decimal digits in tostring() is only a facility for lazy programmers of financial/accounting applications, or for Lua programming beginners. You'd better override that builtin function in your Lua programs/modules/libraries to replace that (stupid) default.

YOU do it. I, and I suspect many more, prefer to have a function which
does tostring(0.1 * 3) => 0.3 instead of 0.30000000000000004.

Francisco Olarte.

Re: Converting floats to strings

Philippe Verdy
On Sat, Jun 1, 2019 at 11:43, Francisco Olarte <[hidden email]> wrote:
> Also, "significant precision", standalone, has no meaning. You need to
> state the problem to know what is the significant precision.

I stated that: a "significant precision" is the precision by which two numbers can remain distinct for == or ~= in Lua.

It is that precision that the tostring() should preserve to create an equivalence between (x == y) and (tostring(x) == tostring(y)), where x and y are ANY Non-NaN Lua number values (including infinite values, or denormal values, or negative zero if they are supported).
This is notably needed to create safe (de)serializers.
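
A (de)serializer along these lines could look like the following minimal sketch. The names number_to_string and string_to_number are hypothetical, not part of Lua. The sketch assumes Lua 5.3 or later with IEEE 754 doubles, compares with == only, and does not try to preserve the sign of zero or the integer/float subtype.

```lua
-- Hypothetical helpers, not part of Lua's standard library.
local function number_to_string(x)
  if x ~= x then return "nan" end            -- NaN is the only value unequal to itself
  if x == math.huge then return "inf" end
  if x == -math.huge then return "-inf" end
  return string.format("%.17g", x)           -- 17 digits identify any finite double
end

local function string_to_number(s)
  if s == "nan" then return 0 / 0 end
  if s == "inf" then return math.huge end
  if s == "-inf" then return -math.huge end
  return tonumber(s)
end

local x = 0.1 + 0.1 + 0.1
print(number_to_string(x), string_to_number(number_to_string(x)) == x)
```

The special cases are spelled out because tonumber does not accept the "inf" or "nan" text that the "%g" family would produce for those values.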