Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8

Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8

Dirk Laurie-2
2018-07-12 13:38 GMT+02:00 Frédéric van der Plancke <[hidden email]>:

>
> On 10/07/2018 23:56, Gregg Reynolds wrote:
>
>
>
> On Tue, Jul 10, 2018, 4:44 PM Gregg Reynolds <[hidden email]> wrote:
>
>>  (e.g. numbers in ltr scripts).
>
>
> Correction: numbers in rtl scripts. Unicode says that numbers in e.g. Arabic are ltr. This is complete BS, but it is also a fact on the ground that cannot be fixed. Extra credit: estimate the cost of this very fundamental mistake.
>
>
> I'm not sure it's a mistake; it may be a well-thought-out design compromise.
>
> In Arabic, numbers are written in the same orientation as in European languages, because of a double inversion: writing from right to left, they first write the units, then the 10s, then the 100s... the end result being that in both writing systems the units go on the right and the more significant digits go on the left.

I once was an examiner in a mathematics competition in which some of
the competitors wrote in Arabic. The mathematical formulae were
written as Western mathematicians write them, and so were the numbers.
When I made a remark to this effect to my Algerian colleague, he
responded by asking me to add two ten-digit numbers that he had
written down. Naturally I started at the unit as I was taught in
primary school, and he stopped me right there. "Why do you start the
addition at the right, if you do everything else from left to right?" I
replied "Because that's how the algorithm works." He then said: "But
it's our algorithm.  An Arab mathematician invented it, and it starts
at the right because we write numbers that way."

He was right, you know!
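
For readers who want the mechanics spelled out, the right-to-left schoolbook addition described above can be sketched as follows. This is only an illustration in Python, not something posted in the thread, and the name add_decimal_strings is made up for the example.

```python
def add_decimal_strings(a: str, b: str) -> str:
    """Add two non-negative decimal numbers given as digit strings.

    Digits are consumed from the right-hand end (the units) towards the
    left, carrying into the next position, as in the schoolbook method.
    """
    if not (a.isdigit() and b.isdigit()):
        raise ValueError("both arguments must be non-empty strings of digits")

    result_lsd_first = []  # result digits, least significant first
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        da = int(a[i]) if i >= 0 else 0
        db = int(b[j]) if j >= 0 else 0
        carry, digit = divmod(da + db + carry, 10)
        result_lsd_first.append(str(digit))
        i -= 1
        j -= 1
    # The digits were produced units-first, so reverse them for display.
    return "".join(reversed(result_lsd_first))


print(add_decimal_strings("1234567890", "9876543210"))
```

Because a carry only ever moves from a lower position to a higher one, starting at the units means every digit is final as soon as it is written down.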


Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8

Javier Guerra Giraldez
On 12 July 2018 at 14:00, Dirk Laurie <[hidden email]> wrote:
> replied "Because that's how the algorithm works." He then said: "But
> it's our algorithm.  An Arab mathematician invented it, and it starts
> at the right because we write numbers that way."
>
> He was right, you know!

so written numbers were little-endian all the time!

30 years lamenting Intel's choice and now it turns out they were
right.... I'll have to get drunk tonight.


--
Javier
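
To make the little-endian comparison concrete, here is a small Python sketch (an illustration only, not from the thread; digits_lsd_first is a hypothetical helper). It lists decimal digits units-first and shows the same integer serialized in both byte orders.

```python
import sys


def digits_lsd_first(n: int, base: int = 10) -> list:
    """Return the digits of a non-negative integer, least significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while True:
        n, d = divmod(n, base)
        digits.append(d)
        if n == 0:
            return digits


value = 0x0102  # 258
little = value.to_bytes(2, byteorder="little")  # low-order byte first
big = value.to_bytes(2, byteorder="big")        # high-order byte first

print(digits_lsd_first(1234))    # units, tens, hundreds, thousands
print(little.hex(), big.hex())
print(sys.byteorder)             # byte order of the machine running this
```

Little-endian storage puts the low-order byte at the lowest address, which is the same "units first" order as the digit list.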


Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8

Viacheslav Usov
In reply to this post by Dirk Laurie-2
On Thu, Jul 12, 2018 at 3:00 PM Dirk Laurie <[hidden email]> wrote:

> An Arab mathematician invented it, and it starts at the right because we write numbers that way.

> He was right, you know!

No, he was not. Karl Menninger in Number Words and Number Symbols: A Cultural History of Numbers, p. 530:

The absurdity of this result, of course, at once shows that the numbers themselves, with their decreasing orders of magnitude from left to right must have been incorporated into the Arabic script as foreign borrowings. In the Indian writing, which is read from left to right, the orders of magnitude of the digits also consistently decrease from left to right, but in a right-to-left form of writing like the Arabic alphabet they should also decrease from right to left, as do the ranks of the alphabetical numbers written with the Arabic alphabet.

(end)

Babylonian numerals were also written with the least significant "digits" on the right. Uta Merzbach and Carl Boyer in A History of Mathematics, p. 24:

In a precisely analogous way, the Babylonians made multiple use of such a symbol as II [1]. When they wrote || || ||, clearly separating the three groups of two wedges each, they understood the right-hand group to mean two units, the next group to mean twice their base, 60, and the left-hand group to signify twice the square of their base.

(end)
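
As a worked check of the quoted passage, the three groups of two wedges can be evaluated positionally in base 60. The Python below is a minimal sketch, with positional_value as a made-up name, and it assumes the groups are listed left to right, that is, most significant first.

```python
def positional_value(groups_msd_first, base=60):
    """Evaluate a positional numeral whose groups are given most significant first."""
    value = 0
    for group in groups_msd_first:
        if not 0 <= group < base:
            raise ValueError("each group must be in the range 0..base-1")
        # Horner's rule: shift what we have one position left, then add the group.
        value = value * base + group
    return value


# || || ||  ->  2*60**2 + 2*60 + 2
print(positional_value([2, 2, 2]))
```

With base 60 this gives 2*3600 + 2*60 + 2 = 7322. The same three groups read in base 10 would give 222.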

However great the achievements of Arab mathematicians were, our current numeral system was not one of them.

Cheers,
V.

[1] I replaced the "two wedges" symbol (meaning number 2) with II; I have tried using Unicode codepoint U+12416 which I think should mean just that, but I was never able to convince Gmail to render it correctly in my browser.
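
For reference, the two code points that come up in this thread (character 160 from the subject line and U+12416 from the footnote above) encode to UTF-8 as shown by the Python sketch below. It is an illustration only; utf8_bytes is a hypothetical helper, and the sketch says nothing about whether a given mail client or font can display the character.

```python
def utf8_bytes(code_point: int) -> bytes:
    """Return the UTF-8 encoding of a single Unicode code point."""
    return chr(code_point).encode("utf-8")


for cp in (160, 0x12416):
    encoded = utf8_bytes(cp)
    # Print the code point, the encoded length, and the bytes in hex.
    print(f"U+{cp:04X}", len(encoded), encoded.hex(" "))
```

Character 160 takes two bytes (c2 a0). U+12416 lies outside the Basic Multilingual Plane, so it takes four bytes in UTF-8 (f0 92 90 96).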

Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8

Gregg Reynolds-2
In reply to this post by Dirk Laurie-2


On Thu, Jul 12, 2018 at 8:00 AM, Dirk Laurie <[hidden email]> wrote:
> 2018-07-12 13:38 GMT+02:00 Frédéric van der Plancke <[hidden email]>:
>>
>> On 10/07/2018 23:56, Gregg Reynolds wrote:
>>
>> On Tue, Jul 10, 2018, 4:44 PM Gregg Reynolds <[hidden email]> wrote:
>>
>>>  (e.g. numbers in ltr scripts).
>>
>> Correction: numbers in rtl scripts. Unicode says that numbers in e.g. Arabic are ltr. This is complete BS, but it is also a fact on the ground that cannot be fixed. Extra credit: estimate the cost of this very fundamental mistake.
>>
>> I'm not sure it's a mistake; it may be a well-thought-out design compromise.
>>
>> In Arabic, numbers are written in the same orientation as in European languages, because of a double inversion: writing from right to left, they first write the units, then the 10s, then the 100s... the end result being that in both writing systems the units go on the right and the more significant digits go on the left.
>
> I once was an examiner in a mathematics competition in which some of
> the competitors wrote in Arabic. The mathematical formulae were
> written as Western mathematicians write them, and so were the numbers.
> When I made a remark to this effect to my Algerian colleague, he
> responded by asking me to add two ten-digit numbers that he had
> written down. Naturally I started at the unit as I was taught in
> primary school, and he stopped me right there. "Why do you start the
> addition at the right, if you do everything else from left to right?" I
> replied "Because that's how the algorithm works." He then said: "But
> it's our algorithm.  An Arab mathematician invented it, and it starts
> at the right because we write numbers that way."
>
> He was right, you know!

Actually, he was wrong on several counts. Al-Khwarizmi was a Persian, not an Arab, even though he wrote in Arabic. His work introducing decimal positioning is lost; all we have is a Latin translation, so we don't really know what his original algorithm was. In any case, it's not about the algorithm, it's about the positioning; you can use whatever algorithm you want so long as you evaluate the positions correctly.

Traditionally, in Arabic, numbers were read Least Significant Digit first - RTL. This is still acceptable today - I have heard radio announcers do it.

Where Unicode went off the rails is in claiming RTL languages are somehow inherently bidirectional. This is patently false, and biased: it assumes that Most Significant Digit first is "normal". If the shoe were on the other foot, it would be the LTR languages that are bidirectional. But in fact all languages (writing systems) are unidirectional with respect to numbers.

The reason Unicode numbers are MSD-first is that the legacy encodings were, and they were that way because they were invented in the days of data processing with punch cards. Doing it that way allowed the use of the math routines that were already available for MSD-first numbers.

Gregg


Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8

Albert Chan
In reply to this post by Dirk Laurie-2
> I once was an examiner in a mathematics competition in which some of
> the competitors wrote in Arabic. The mathematical formulae were
> written as Western mathematicians write them, and so were the numbers.
> When I made a remark to this effect to my Algerian colleague, he
> responded by asking me to add two ten-digit numbers that he had
> written down. Naturally I started at the unit as I was taught in
> primary school, and he stopped me right there. "Why do you start the
> addition at the right, if you do everything else from left to right?" I
> replied "Because that's how the algorithm works." He then said: "But
> it's our algorithm.  An Arab mathematician invented it, and it starts
> at the right because we write numbers that way."
>
> -- Dirk

So, if printing a billion digits of pi, you won't know how big pi
is until the last page (where the decimal point is finally shown)?

Going "Reverse Polish" on the numbers ...
Is this a joke?




Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8

Coda Highland
On Thu, Jul 12, 2018 at 2:05 PM, Albert Chan <[hidden email]> wrote:

>> I once was an examiner in a mathematics competition in which some of
>> the competitors wrote in Arabic. The mathematical formulae were
>> written as Western mathematicians write them, and so were the numbers.
>> When I made a remark to this effect to my Algerian colleague, he
>> responded by asking me to add two ten-digit numbers that he had
>> written down. Naturally I started at the unit as I was taught in
>> primary school, and he stopped me right there. "Why do you start the
>> addition at the right, if you do everything else from left to right?" I
>> replied "Because that's how the algorithm works." He then said: "But
>> it's our algorithm.  An Arab mathematician invented it, and it starts
>> at the right because we write numbers that way."
>>
>> -- Dirk
>
> So, if printing a billion digits of pi, you won't know how big pi
> is until the last page (where the decimal point is finally shown)?
>
> Going "Reverse Polish" on the numbers ...
> Is this a joke?

The same argument could be made in the other direction -- if printing
some LARGE number from left to right, how will you know how big it is
until you reach the decimal point?

/s/ Adam
