Internationalization of Lua

48 messages
Internationalization of Lua

Mathieu Stumpf Guntz

Hello, if you are interested in Lua, internationalization, and possibly programming language localization, you might be interested in this thread.

A bit of background

You can skip this section if you are less interested in the human story and more interested in the technical aspects of Lua internationalization.

My initial motivation was to have a programming language that only uses phonetic signs.

Well, one way to do that is to assign a unique phonetic value to each sign usable in the programming language. For example, in Lua, which currently accepts only ASCII tokens, a 256-sign mapping is enough (string content apart). That is rather easy, and you can even easily have a monosyllabic map. For example, in my native language we have 16 vowels (V) and 20 consonants (C), which is more than enough to make a CV (or VC) syllable for each ASCII element. A quick and naive mapping would just assign them in some arbitrary order, and a somewhat less naive solution would try to build a mapping with some mnemonic relation to the sign usually denoted in ASCII. But even then, to my mind that would remain a rather impractical solution for anything useful.

So I looked for a spoken language which had a phonetic transcription and, as a bonus, would possibly have morpho-syntactic properties friendly to a programming language use case. In this regard Lojban might be a good choice, I guess; at least it crossed my radar. But another aspect of usefulness is the number of speakers. On that point, without forgetting the previous one, Esperanto makes a better candidate. So I began to write research projects about it, namely Fabeleblaĵo and Algoritmaro, but in my native language, as I didn't feel skilled enough in Esperanto at the time.

More recently, I transferred some courses from the International Academy of Sciences San Marino, which uses Esperanto as a common working language. Indeed, as I wanted to begin translating my still-in-progress works on Fabeleblaĵo and Algoritmaro into Esperanto, I discovered that the Esperanto version of Wikiversity is still in beta. I'm trying to change that by adding some courses, and in the process creating useful wiki templates and filing feedback and tickets. The latter are grouped under the Wikimedia Phabricator Eliso tag (Eliso stands for Esperanto kaj LIbera Scio, i.e. Esperanto and free knowledge). For now, the only wikification I have completely finished is the course on internationalization.

I also made an Esperanto localization for Babylscript. I'm not completely satisfied with this solution, as JS, at least the version implemented in the Rhino branch from which Babylscript is derived, doesn't even allow you to import other scripts, which is a huge restriction. Then, as a Wikimedia contributor, I met Lua, which is used there to create frontend-editable modules, as you may know. Hence the wish to make an Esperantist version of Lua.

Lupa and Mallupa

Those who are really only interested in Lua internationalization might skip this section and its subsections. This section mainly focuses on presenting (still in progress) projects which so far have taken an approach of direct localization to Esperanto, the problems encountered, and the solutions used or considered.

Lupa

So far Lupa aims to provide an Esperantist version of Lua. At first I just wanted to make it a pure Esperanto version of Lua. Now, thanks to Luiz Henrique de Figueiredo's advice and implementation suggestions, I have already shifted from a complete replacement of keywords to a more backward-compatible approach which only adds aliases for the various built-in tokens. The current implementation is not in a sane state; for example, a simple single : will make lupe (the counterpart of the lua interpreter) crash.

Still, it already enables writing small pieces of code like se 1 plus 1 egalas 2 tiam printu 'tio estas bona aritmetiko' hop, and it works.

As Esperanto has a very regular grammar, unlike most spoken languages out there, parsing it is a rather practicable task. Even without such support, you can already make most statements coincide with semantically sound Esperanto sentences, if you choose your tokens carefully. That's another driving criterion behind the list of lexeme translations on the project wiki. For example, one can write the statement tabelo['enigo'] = 3 as tabelo kies 'enigo' ero iĝu 3, the latter also being a plain Esperanto sentence meaning "table whose element 'entry' becomes 3". This Esperanto version is a bit longer than its grapheme-ideographic mixed counterpart, but keep in mind that the tokens are only aliased, so one can also use the former mix. Also note that plain Esperanto offers shorter ways to express the same thing, like tabelenigiĝu 3, or, in a more parser-friendly version which is still valid Esperanto, tabel-enig-iĝu 3. But of course, that kind of syntax can't be treated within the scope of mere relexicalisation.

Even sticking to the scope of "static aliases only", there are still some problems in localizing Lua to Esperanto. First, Lua doesn't support Unicode in identifiers and other built-in tokens. Esperanto does have a set of non-ASCII characters in its usual written form, namely ĉ, ĝ, ĥ, ĵ, ŝ and ŭ. But when it's not possible to use them, it's a recognized practice to append -h or -x to the letter without its diacritic. As "x" isn't part of the Esperanto alphabet, it's less problematic regarding possible lexeme collisions. So far, Lupa uses the -x notation to circumvent the script encoding limitation.

A minor problem is that, as far as Esperanto is concerned, numbers normally use a comma as the decimal separator rather than a dot, at least if you refer to the most authoritative sources. It's minor in the sense that, in practice, usage varies, and not every Esperantist takes great care over typographic "subtleties". On the technical side, it's more annoying, as 3,14 has a well-defined, completely different meaning in Lua. Babylscript, for example, proposes using a space-separated comma to resolve the ambiguity in the similar case of French. As far as I'm concerned, I would rather use a token like plie (and ... as well, and also, together with) as the list separator operator. From a broader internationalization perspective, the number recognition of the lexer would require far more thought to support more diverse numbering systems, such as १.६ for Hindi.

Future development of Lupa should somewhat reverse its approach, so as to modify the official interpreter as little as possible. Hence the Lua-i18n project presented below, which should focus on providing internationalization facilities, ideally with an approach that allows building other tools on top of it, flexible enough to support some syntactic changes. Lupa could then base its later evolutions on top of this Lua-i18n.

Mallupa

While Lupa modifies Lua directly, Mallupa just translates a localized dialect into a plain old Lua script. Currently it uses ltokenp, which itself reuses the Lua lexer, to retrieve lexemes. And it includes a Lupa dialect, which already provides more features than Lupa. As the main part of the code is in Lua, development is far easier. On the other hand, it comes with its cons: it's a source-to-source compiler, so it makes debugging harder due to the additional layer of translation.

As it relies on the Lua lexer, there are some flexions which still can't be performed. In particular, I wanted to add support for the numeral suffix "-a" on digits, which makes sense for table locations. But a string like 1a will be taken as a malformed number by the lexer, and it will never reach the dialect converter script. To avoid that, either the lexer has to be changed, or the project has to rely on another lexer.

Lua-i18n

So, Lua-i18n is focused on providing internationalization facilities while modifying the official Lua release as little as possible.

The related issues have been added and described on the project page.

  1. Internationalization of built in messages
  2. Internationalization of built in tokens
  3. Unicode support

For the last one, Luiz suggested the following:

A hack to allow unicode identifiers is to set chars over 128 be letters.
You can do this by editing lctype.c.
Ask in the mailing list about this.

He also provided me the attached file with this comment:

Here is what I had I mind for a token filter in C. This piece of C code
centralizes all needed changes. Just add <<#include "proxy.c">> just
before the definition of luaX_next in llex.c. That's the only change in
the whole Lua C code.

So, so far I can't say that I lack help or a path that needs deeper exploring, and I thank Luiz again for all this. But still, if you are interested in Lua-i18n and have any advice, comment, or question, please feel free to reply here or add it to the relevant project issue tracker.


Kind regards,
Mathieu


Attachments: luai18n.md (10K), proxy.c (1K)

Re: Internationalization of Lua

TW
Can you elaborate on the benefits of programming language
localization?  I am quite skeptical about this.  Learning a
programming language is like learning a new language and IMO it hardly
makes it more difficult to learn the English keywords. But learning a
localized programming language makes it harder to apply one's gathered
knowledge beyond private projects or other programming languages. It's
also much harder to find help.

I'm less fundamentally skeptical concerning the internationalization
of error messages, though there are problems as well - like finding
help by feeding a search engine with a localized error message.  I
usually use `LANG=C` when I want help on errors, so this is not a big
problem, at least not on *nix systems.

The bigger problem with internationalization is that in theory it is
good, but alas, in practice, translations are frequently hair-raising,
confusing and misleading.  Commercial applications might be doing
slightly better on average than open source applications, depending on
their budget.  Quality assurance is hard as maintainers don't
understand all the languages.  My impression is that in many cases,
well-meaning enthusiasts without deeper knowledge of the technical
terminology in the respective domain just do a quick translation
without too much thought.  I looked at a Babylscript translation that
is a perfect example of that.  I hope no one ever attempted to learn
or will learn JavaScript using that translation.  Translation is hard
and requires quite some thought.  In programming language design,
especially much thought should have been spent on the choice of the
original keywords.

So, to really let people benefit from translations, true experts in
both languages, the subject to translate and the respective technical
terminologies in both languages are needed.  And I'm not at all
convinced that one does people a favor by translating programming
languages. I think it becomes harder rather than easier to learn the
language because learning resources are very limited or not existent
at all.  There is a plethora of books and free material on e.g.
JavaScript in many languages, some of it excellent, but probably
hardly any material for translated versions of JavaScript/Babylscript,
if any at all.

Sorry for the negativity - I hope my reasoning has been rational and
unoffensive.  I'd encourage translations of messages with advice to
install a thorough review process before unleashing things on confused
users.

Thomas W.



2016-11-20 22:20 GMT+01:00 mathieu stumpf guntz <[hidden email]>:



Re: Internationalization of Lua

Mathieu Stumpf Guntz



On 21/11/2016 at 08:08, TW wrote:
> Can you elaborate on the benefits of programming language
> localization?  I am quite skeptical about this.  Learning a
> programming language is like learning a new language and IMO it hardly
> makes it more difficult to learn the English keywords. But learning a
> localized programming language makes it harder to apply one's gathered
> knowledge beyond private projects or other programming languages. It's
> also much harder to find help.
As said, I'm in fact more interested in an Esperanto localization; though I have no fundamental opposition to localization into various native languages, it's not my main interest. Esperanto is far easier to learn than many, if not all, other spoken languages out there. In fact, it seems that, in the same amount of time spent learning only one language such as English, French or German, learning Esperanto first and then another language makes people more skilled than studying the latter alone. I won't spend too much time advocating this topic here; you can look at Wikipedia's Esperanto article for more sources, and the Esperanto myth-busting essays of Claude Piron, which give you the point of view of someone who worked as a translator for the UN. Note that I'm open to further discussion on this topic, but if I'm not mistaken, this list is not the fittest place for such a conversation, is it?

For more (mostly academic) sources about the use of localized programming languages and compilers, here are a few links:
If you are aware of other interesting articles on the subject, I would welcome feedback.
> I'm less fundamentally skeptical concerning the internationalization
> of error messages, though there are problems as well - like finding
> help by feeding a search engine with a localized error message.  I
> usually use `LANG=C` when I want help on errors, so this is not a big
> problem, at least not on *nix systems.
I haven't used other systems in recent years, but I doubt there are many systems which don't have environment variables, are there?
> The bigger problem with internationalization is that in theory it is
> good, but alas, in practice, translations are frequently hair-raising,
> confusing and misleading.
I agree. Now, note that this formulation tacitly implies that the opposite situation is better, and with that I don't agree. This is just a situation where you have to choose between two solutions which both have their pros and cons. Sure, giving full latitude to diversity always comes with a cost, but so does the cult of standardized monoculture. Again, I'm not encouraging further debate over this on this mailing list; please answer in private or suggest a fitter public channel. This holds for anything in this answer which is not directly related to Lua-i18n, Lupa, or Mallupa, unless a wide consensus is expressed to do otherwise in some way.

> Commercial applications might be doing
> slightly better on average than open source applications, depending on
> their budget.  Quality assurance is hard as maintainers don't
> understand all the languages.  My impression is that in many cases,
> well-meaning enthusiasts without deeper knowledge of the technical
> terminology in the respective domain just do a quick translation
> without too much thought.  I looked at a Babylscript translation that
> is a perfect example of that.  I hope no one ever attempted to learn
> or will learn JavaScript using that translation.  Translation is hard
> and requires quite some thought.  In programming language design,
> especially much thought should have been spent on the choice of the
> original keywords.
Again, I do agree, and in my translations I take time to produce not only a translation that is relevant to the given context, but one which also provides a coherent lexicon, so that semantic proximity is reflected by lexical proximity. I also take the length of the lexemes into the equation, privileging shorter words where the lexeme stays meaningful, and longer lexemes where the existing shorter options are meaningless. For Babylscript's Esperanto translation, you might consult the token translation document and the error message document, which both provide explanations of the proposed choices, along with other alternatives. If you have feedback to improve these documents, please do it in the documents themselves. I'm not involved in the other Babylscript translations, but a concrete example of what you mean would be welcome.
> So, to really let people benefit from translations, true experts in
> both languages, the subject to translate and the respective technical
> terminologies in both languages are needed.  And I'm not at all
> convinced that one does people a favor by translating programming
> languages. I think it becomes harder rather than easier to learn the
> language because learning resources are very limited or not existent
> at all.  There is a plethora of books and free material on e.g.
> JavaScript in many languages, some of it excellent, but probably
> hardly any material for translated versions of JavaScript/Babylscript,
> if any at all.
To my mind, that sounds more like a chicken-and-egg problem than a fundamental problem of localized programming languages.

> Sorry for the negativity - I hope my reasoning has been rational and
> unoffensive.  I'd encourage translations of messages with advice to
> install a thorough review process before unleashing things on confused
> users.
No offense taken. Constructive criticism, comments and questions are always welcome. Thank you for having taken the time to give some feedback.


> Thomas W.



2016-11-20 22:20 GMT+01:00 mathieu stumpf guntz [hidden email]:
Hello, if you are interested in Lua, internationalization and possibly
programming languages localization, you might be interested in this thread.

A bit of background

You can skip this section if you are less interested in the human story
stuffs and more interested on technical aspects of Lua internationalization.

My initial motivation, was to have a programming languages that only use
phonetic signs.

Well, one way to do that is to assign an unique phonetic value to each sign
usable in the programming language. For example in Lua which currently can
only use ASCII tokens, a 256 sign mapping is enough (string content apart).
That rather easy, and you can even easily have a monosyllabic map. For
example in my native language we have 16 vowels (V) and and 20 consonant
(C), that's more than enough to make a CV (or VC) sign for each ASCII
element. A quick and dummy mapping would just assign them following some
arbitrary order, and a somewhat less dummy solution would try to make a
mapping with some mnemonic relation with the usual ASCII denoted sign. But
even then, to my mind that would stay a rather impractical solution for
anything useful.

So I looked for a spoken language which had a phonetic transcription and
possibly would have in bonus friendly morpho-syntaxic properties for a
programming language use case. On this regards Lojban might be a good choice
I guess, at least it passed within my radars. But an other aspect regarding
usefulness is the number of speaker. On that point, without forgetting the
previous one, Esperanto make a better candidate. So I began to write
research projects about it, namely Fabeleblaĵo and Algoritmaro, but in my
native language as I didn't feel skilled enough in Esperanto at the time.

Most recently I transferred some courses of International Academy of
Sciences San Marino, which use Esperanto as a common working language.
Indeed, as I wanted to begin to translate my still in progress works on
Fableblaĵo and Algoritmaro to Esperanto, I discovered that the Esperanto
version of Wikiversity is still in beta version. I'm trying to change that
by adding some courses and in the process so creating useful wiki templates
and make feedback and tickets. The later are grouped on Wikimedia
phabricator Eliso tag (Eliso stands for Esperanto kaj LIbera Scio, ie.
Esperanto and free knowledge). For now, I completely finished only one
wikification the course on internationalization.

I also made an Esperanto localization for Babylscript. I'm not completely
satisfied with this solution, as JS, at least the version implemented in the
Rhino branch from which Babyscript is derived, doesn't even allow you to
import other scripts, which is a huge restriction. Then, as a Wikimedia
contributor, I met Lua, which is used there to create frontend editable
modules, as you may know. So came the wish to make an Esperantist version of
Lua.

Lupa and Mallupa

For those who are really only interested on Lua internationalization, you
might skip this section and its subsection. The current section mainly focus
on presenting (still in progress) project which so far took more an approach
of direct localization to Esperanto, problems encountered and solutions used
or considered.

Lupa

So far Lupa aims to provide an Esperantist version of Lua. At first I just
wanted to make it a pure Esperanto version of Lua. Now, thanks to Luiz
Henrique de Figueiredo advises and implementation suggests, I already
shifted from a complete replacement of keywords to a more backward
compatible approach which only aliases for misc. built in tokens. The
current implementation is not in a sane state, as for example a simple
single : will make lupe (the lua interpreter counterpart) crash.

Still, it already enable to write some little peace of code like se 1 plus 1
egalas 2 tiam printu 'tio estas bona aritmetiko' hop, and it works.

As Esperanto as a very regular grammar, unlike much spoken languages out
there, parsing it is a rather practicable task. Even without such a support,
you can already make most statements coinciding with Esperanto semantically
sound sentences, if you chose your tokens carefully. That's an other driving
criteria behind the list of lexems translations on the project wiki. For
example one can write the statement tabelo['enigo'] = 3 as tabelo kies
'enigo' ero iĝu 3, the latter also being an plain Esperanto sentence meaning
"table whose element 'entry' become 3". This Esperanto version is a bit
longer than it's graphemo-ideographic mix up counterpart, but keep in mind
that the tokens are only aliased so one can also use the former mixup. Also
note that plain Esperanto also offer shorter ways to express the same thing,
like tabelenigiĝu 3, or in a more parser friendly version which is still
valid Esperanto tabel-enig-iĝu 3. But of course, that kind of syntax can't
be treated within the scope of mere relexicalisation.

Even sticking to the scope of "static aliases only", there are still some
problems in localizing Lua toward Esperanto. First, Lua doesn't provide
support for Unicode in identifiers and other built-in tokens. Esperanto does
have a set of non-ASCII characters in its usual written form, namely ĉ, ĝ,
ĥ, ĵ, ŝ and ŭ. But when it's not possible to use them, it's a recognized
practice to append -h or -x to the letter without its diacritic. As "x"
isn't part of the Esperanto alphabet, it's less problematic regarding
possible lexeme collisions. So far, Lupa uses the -x notation to circumvent
the script encoding limitation.
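
Decoding the -x notation back to the diacritic forms is mechanical. A minimal sketch in plain Lua (naive: a real pass would respect lexeme boundaries so that a borrowed word containing, say, "ux" is not mangled):

```lua
-- Decode the Esperanto x-convention: cx -> ĉ, gx -> ĝ, etc.
-- The replacements are plain UTF-8 strings, so a byte-oriented
-- gsub is enough here.
local x_map = {
  cx = "ĉ", gx = "ĝ", hx = "ĥ", jx = "ĵ", sx = "ŝ", ux = "ŭ",
  Cx = "Ĉ", Gx = "Ĝ", Hx = "Ĥ", Jx = "Ĵ", Sx = "Ŝ", Ux = "Ŭ",
}

-- Replace each letter+x pair via table lookup on the whole match.
local function dex(s)
  return (s:gsub("[cghjsuCGHJSU]x", x_map))
end

print(dex("tabel-enig-igxu 3"))  --> tabel-enig-iĝu 3
```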

A minor problem is that, as far as Esperanto is concerned, numbers normally
use a comma as the decimal separator rather than a dot, at least according
to the most authoritative sources. It's minor in the sense that, in
practice, usage varies, and not every Esperantist takes great care over
typographic "subtleties". On the technical side it's more annoying, as
`3,14` already has a well-defined and completely different meaning in Lua.
Babylscript, for example, proposes a space-separated comma to resolve the
ambiguity in the similar case of French. As far as I'm concerned, I would
rather use a token like `plie` ("and ... as well", "and also", "together
with") as the list separator operator. From a broader internationalization
perspective, the lexer's number recognition would require far more thought
to support more diverse numbering systems, such as १.६ for Hindi.
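
As an illustration of what such lexer support could do, here is a minimal normalization sketch in plain Lua (`readnumber` is a hypothetical helper, not part of any project code; a real lexer would do this on its input stream):

```lua
-- Map Devanagari digits (U+0966..U+096F) to ASCII. The keys are
-- multibyte UTF-8 strings, which gsub treats as literal sequences.
local devanagari = {
  ["०"]="0", ["१"]="1", ["२"]="2", ["३"]="3", ["४"]="4",
  ["५"]="5", ["६"]="6", ["७"]="7", ["८"]="8", ["९"]="9",
}

-- Normalize a localized numeral, then let Lua parse it.
local function readnumber(s)
  for d, a in pairs(devanagari) do
    s = s:gsub(d, a)
  end
  s = s:gsub(",", ".", 1)  -- decimal comma -> dot, first one only
  return tonumber(s)
end

print(readnumber("3,14"))  --> 3.14
print(readnumber("१.६"))   --> 1.6
```

The `3,14` case only works here because the helper sees one numeral at a time; inside real Lua source the comma would still be ambiguous with the list separator, which is exactly the problem discussed above.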

Future development of Lupa should somewhat reverse its approach, so as to
modify the official interpreter as little as possible. Hence the Lua-i18n
project presented below, which should focus on providing
internationalization facilities, ideally with an approach that allows other
tools, flexible enough to support some syntactic changes, to be built on top
of it. Lupa could then base its later evolutions on Lua-i18n.

Mallupa

While Lupa modifies Lua directly, Mallupa just translates a localized
dialect into a plain old Lua script. Currently it uses ltokenp, which itself
reuses the Lua lexer, to retrieve lexemes. And it includes a Lupa dialect
which already provides more features than Lupa itself. As the main part of
the code is in Lua, development is far easier. On the other hand, being a
source-to-source compiler has its cons: it makes debugging harder, due to
the additional layer of translation.

As it relies on the Lua lexer, there are some inflections which still can't
be performed. In particular, I wanted to add support for the ordinal suffix
"-a" on digits, which makes sense for table locations. But a string like
`1a` will be taken as a malformed number by the lexer, and it will never
reach the dialect converter script. To avoid that, either the lexer has to
be changed, or the project has to rely on another lexer.

Lua-i18n

So, Lua-i18n is focused on providing internationalization facilities while
modifying the official Lua release as little as possible.

Some related issues have been added and described on the project page:

Internationalization of built in messages
Internationalization of built in tokens
Unicode support

For the last one, Luiz suggested the following to me:

A hack to allow Unicode identifiers is to set chars over 128 to be letters.
You can do this by editing lctype.c.
Ask on the mailing list about this.

He also provided me the attached file with this comment:

Here is what I had in mind for a token filter in C. This piece of C code
centralizes all needed changes. Just add `#include "proxy.c"` just
before the definition of luaX_next in llex.c. That's the only change in
the whole Lua C code.

So, so far I can't say that I lack help or a path that needs deeper
exploration, and I thank Luiz again for all this. Still, if you are
interested in Lua-i18n, or have any advice, comment, or question, please
feel free to reply here or add it to the relevant project issue tracker.


Kind regards,
Mathieu



Re: Internationalization of Lua

Soni "They/Them" L.
In reply to this post by Mathieu Stumpf Guntz


On 20/11/16 07:20 PM, mathieu stumpf guntz wrote:
>
> [snip]
>
>
> Kind regards,
> Mathieu
>

Why localize at all?

Why not *un*localize?

Replace every keyword with symbols, that'll be the basis to getting
"universal lua" working. Then allow just about any unicode character in
identifiers. Then rename the "require" function somehow, and use
something like _ENV = require("esperanto") to switch language.

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.



Re: Internationalization of Lua

Mathieu Stumpf Guntz


Le 21/11/2016 à 12:46, Soni L. a écrit :

>
>
> On 20/11/16 07:20 PM, mathieu stumpf guntz wrote:
>>
>> [snip]
>>
>>
>> Kind regards,
>> Mathieu
>>
>
> Why localize at all?
>
> Why not *un*localize?
>
> Replace every keyword with symbols, that'll be the basis to getting
> "universal lua" working. Then allow just about any unicode character
> in identifiers. Then rename the "require" function somehow, and use
> something like _ENV = require("esperanto") to switch language.
>
Well, this description of "unlocalizing" is somewhat an internationalization
process, isn't it? Or am I missing some point?



Re: Internationalization of Lua

Soni "They/Them" L.


On 21/11/16 11:34 AM, mathieu stumpf guntz wrote:

>
>
> Le 21/11/2016 à 12:46, Soni L. a écrit :
>>
>>
>> On 20/11/16 07:20 PM, mathieu stumpf guntz wrote:
>>>
>>> [snip]
>>>
>>>
>>> Kind regards,
>>> Mathieu
>>>
>>
>> Why localize at all?
>>
>> Why not *un*localize?
>>
>> Replace every keyword with symbols, that'll be the basis to getting
>> "universal lua" working. Then allow just about any unicode character
>> in identifiers. Then rename the "require" function somehow, and use
>> something like _ENV = require("esperanto") to switch language.
>>
> Well, this description of "unlocalizing", is somewhat an
> internalization process, isn't it? Or am I missing some point?
>
>
Yes, but language-agnostic rather than language-specific. It requires no
change to the VM in order to add new languages - just create a module
for the new language. It also supports multiple languages at the same
time (in the same VM).

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.



Re: Internationalization of Lua

Mathieu Stumpf Guntz



Le 21/11/2016 à 15:03, Soni L. a écrit :
Yes, but language-agnostic rather than language-specific.

Again, I don't see the difference with internationalization, which aims to make software as independent as possible from any cultural specificity. Do you mean that the interpreter should be changed in a way that gets rid of all the English tokens in its internals (at least the ones exposed externally by default) and substitutes some relevant Unicode mathematical characters for them? If so, to my mind it would be a funny idea, but it doesn't match the goal of changing as little as possible in the current code base. Also be aware that whatever character set you elect, it will never be a culturally agnostic choice; even relying on mere integers, you won't step outside cultural practice.

It requires no change to the VM in order to add new languages - just create a module for the new language. It also supports multiple languages at the same time (in the same VM).

If you mean a solution that might work out of the box with the current Lua interpreter, I can't figure out how that would be possible.

If you mean a solution that would require "no further" change to the Lua VM for each specific language, then that falls under the definition of internationalization. So if you have more detailed feedback, suggestions, questions or code to provide on this, please go on. :)


Re: Internationalization of Lua

Soni "They/Them" L.


On 21/11/16 01:19 PM, mathieu stumpf guntz wrote:

>
>
>
> Le 21/11/2016 à 15:03, Soni L. a écrit :
>> Yes, but language-agnostic rather than language-specific.
>
> Again, I don't see the difference with internationalization, which
> aims to make software as as independant as possible from any cultural
> specificity. Do you mean that the interpreter should be change in a
> way that get rid of all English token in it's internal (at least the
> one exposed by default externally) and substitute them with some
> relevant Unicode mathematical characters
> <http://unicode.org/reports/tr25/tr25-6.html#_Toc24>? If yes, to my
> mind it would be a funny idea, but doesn't match the "change as little
> as possible in the current code base as possible". Also be aware that
> whatever character set you may elect, it will never be a culturally
> agnostic practice, even relying on mere integer you won't go out of a
> cultural practice.
>
>> It requires no change to the VM in order to add new languages - just
>> create a module for the new language. It also supports multiple
>> languages at the same time (in the same VM).
>>
> If you mean a solution that might work out of the box with the current
> Lua interpreter, I can't figure how it would be possible.
>
> If you mean a solution that would require "no further" change in the
> Lua VM for each specific language, then that fall in the definition of
> internationalization. So if you have more detailed feedback,
> suggestions, questions or code to provide on this, please go on. :)
>

I don't wanna learn Esperanto to write code. I'd be happy to write code
on my own language, however, and have it interoperate with code written
in other languages. The easiest way to do that is to define symbols for
keywords - for best cultural compatibility these symbols would either a)
be specific to the bastardized Lua you'd make, and would use the private
use areas of unicode, or b) be emoji - and then use "environment wrapper
libraries" to define the globals.

"Wrapper libraries" would be used to adapt libraries across languages,
too - the "soquete" library would simply translate the names for the
"socket" library, rather than reimplementing sockets in an incompatible
way (however, sharing objects returned by such library would require
some sort of "wrap" and "unwrap" functionality, also exposed by the
wrapper library. It'd be common practice to wrap function parameters and
unwrap function arguments).

If you wanna bastardize something, at least do it in a way that everyone
wins. Translating a programming language isn't the way to go, but
removing any and all references to a specific language and/or culture
is. Even Perl has yet to manage this. Actually the only programming
language I know that has managed this is plain old brainfuck.

So yeah.

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.



Re: Internationalization of Lua

Mathieu Stumpf Guntz



Le 21/11/2016 à 16:41, Soni L. a écrit :

I don't wanna learn Esperanto to write code. I'd be happy to write code on my own language, however, and have it interoperate with code written in other languages. The easiest way to do that is to define symbols for keywords - for best cultural compatibility these symbols would either a) be specific to the bastardized Lua you'd make, and would use the private use areas of unicode, or b) be emoji - and then use "environment wrapper libraries" to define the globals.
I am happy to learn of your interest in coding in your native language; that makes Lua-i18n all the more relevant, and the concerns you raise are exactly the reason Lua-i18n is a separate project from Lupa and Mallupa, on which it has no dependency. On the contrary, in later developments Lupa and Mallupa may rely on Lua-i18n, but you don't have to care about that if you are not interested in Esperanto.

Regarding what you are talking about, I agree that having the ability to do something like `_ENV = require("my_dialect")` would be an interesting idea.

You don't necessarily need to change the tokens used internally for that, though: they can be decorrelated from the tokens exposed by the interpreter, as is done in the solution proposed by Luiz. This solution, however, unlike what you are suggesting, is a static compile-time process. But maybe it could be used as a starting point for implementing something like what you suggest.
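
For the identifier side of this, something close is already expressible with Lua 5.2+'s `_ENV`. A minimal sketch, with a hypothetical inline environment table standing in for a require'd module (keywords, of course, remain out of reach of this mechanism):

```lua
-- Hypothetical "esperanto" environment: translated names fall
-- through to the standard globals via __index. Only identifiers
-- can be localized this way; keywords cannot.
local eo = setmetatable({
  printu = print,  -- hypothetical Esperanto aliases
  tipo   = type,
}, { __index = _G })  -- untranslated names still resolve normally

do
  local _ENV = eo   -- Lua 5.2+ lexical environment switch
  printu(tipo({}))  -- prints "table" through the aliased names
end
```

A real module would `return eo`, so that `_ENV = require("esperanto")` works as suggested.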


"Wrapper libraries" would be used to adapt libraries across languages, too - the "soquete" library would simply translate the names for the "socket" library, rather than reimplementing sockets in an incompatible way (however, sharing objects returned by such library would require some sort of "wrap" and "unwrap" functionality, also exposed by the wrapper library. It'd be common practice to wrap function parameters and unwrap function arguments).
Sure, it would also be a good idea to provide translated libraries. This doesn't really require any change to the interpreter, does it? Aliasing functions is straightforward in plain Lua, as far as I can tell, so this might be done with the current language facilities. Or is there a problem I'm missing here?
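
Indeed, such aliasing needs nothing beyond plain Lua. A minimal sketch, with a stand-in table playing the role of the real socket library and "soquete" as the hypothetical translated wrapper (the wrap/unwrap machinery for shared objects is left out):

```lua
-- Stand-in for a real socket library, just for illustration.
local socket = {
  connect = function(host, port)
    return { host = host, port = port }
  end,
}

-- Hypothetical Portuguese wrapper: it only aliases names, the
-- functions and the objects they return are shared, not
-- reimplemented.
local soquete = {
  conectar = socket.connect,
}

local c = soquete.conectar("example.org", 80)
print(c.host, c.port)  --> example.org	80
```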

Actually, the goal here is somewhat to give reserved keywords the same flexibility already granted to function identifiers.


If you wanna bastardize something, at least do it in a way that everyone wins.
Yes, that's what Lua-i18n is all about. However, I would rather call it allowing dialectal extensions than bastardizing, because "mal nommer les choses c'est ajouter au malheur du monde" ("to misname things is to add to the misery of the world"), and I prefer to add misery to the world in a far more subtle, devilish fashion. :)

Translating a programming language isn't the way to go, but removing any and all references to a specific language and/or culture is.
You can abstract things, but you can't remove any and all references to a specific language and/or culture; abstracting things is itself a cultural practice. Now, if you want to discuss ethnology or even ethology with me, please do so in a private reply.

Even Perl has yet to manage this. Actually the only programming language I know that has managed this is plain old brainfuck.
Actually, Perligata lets you code in a Latin dialect. You're welcome. ;)

Re: Internationalization of Lua

Coda Highland
On Mon, Nov 21, 2016 at 8:27 AM, mathieu stumpf guntz
<[hidden email]> wrote:
>> Even Perl has yet to manage this. Actually the only programming language I
>> know that has managed this is plain old brainfuck.
>
> Actually, Perligata let you code in a Latin dialect. Your welcome. ;)

Some old BASIC dialects got translated too. IIRC that was one of the
reasons BASIC was so successful in Russia -- it was the only
available language that could be used on a screen that was displaying
Cyrillic characters.

/s/ Adam


Re: Internationalization of Lua

Soni "They/Them" L.
In reply to this post by Mathieu Stumpf Guntz


On 21/11/16 02:27 PM, mathieu stumpf guntz wrote:

>
>
>
> Le 21/11/2016 à 16:41, Soni L. a écrit :
>>
>> I don't wanna learn Esperanto to write code. I'd be happy to write
>> code on my own language, however, and have it interoperate with code
>> written in other languages. The easiest way to do that is to define
>> symbols for keywords - for best cultural compatibility these symbols
>> would either a) be specific to the bastardized Lua you'd make, and
>> would use the private use areas of unicode, or b) be emoji - and then
>> use "environment wrapper libraries" to define the globals.
> I am happy to learn in your interest of coding in your native
> language, that make Lua-i18n all the more relevant, and concerns you
> are feedbacking are exactly the reason Lua-i18n is a separate project
> from Lupa and Mallupa, on which it have no dependency. On the
> contrary, in later development Lupa and Mallupa may rely on Lua-i18n,
> but you don't have to care about that if you are not interesting in
> Esperanto.
>
> Regarding what you are talking about, I agree that make having the
> ability to make something like `_ENV = require("my_dialect")` would be
> an interesting idea.
>
> You don't necessarily need to change tokens used internally for that
> though: they can be decorrelated from the token exposed in the
> interpreter. Like it's done with the solution proposed by Luiz. This
> solution however, unlike what you are suggesting, is a static compile
> time process. But maybe it might be used as a point of start for
> implementing something like you are suggesting.

It's definitely a great way to make syntax highlighting impossible.
Seems like it'd be less trouble to just use lisp or forth instead.

>
>>
>> "Wrapper libraries" would be used to adapt libraries across
>> languages, too - the "soquete" library would simply translate the
>> names for the "socket" library, rather than reimplementing sockets in
>> an incompatible way (however, sharing objects returned by such
>> library would require some sort of "wrap" and "unwrap" functionality,
>> also exposed by the wrapper library. It'd be common practice to wrap
>> function parameters and unwrap function arguments).
> Sure this would be a good point to also provide translated libraries.
> This doesn't really require any change to the interpreter, does it?
> Aliasing function is straight forward in plain Lua, as far as I can
> tell, so this might be done with the current language facility. Or is
> there a problem I miss here?

It does need unicode support.

>
> Actually, the goal here is somewhat providing the same flexibility
> given for function identifiers to reserved keywords.

Remove the concept of "keywords" - they shouldn't be words, try
something like "keysymbols" instead.

>
>>
>> If you wanna bastardize something, at least do it in a way that
>> everyone wins.
> Yes, that's what Lua-i18n is all about. However I would rather name it
> allow dialectal extensions than bastardize, because "mal nommer les
> choses c'est ajouter au malheur du monde", and that I prefer to add
> misery in the world in a far more subtle devilish fashion. :)
>

I love the term "bastardize". I do it all the time. e.g.
https://bitbucket.org/SoniEx2/mdxml/overview

>> Translating a programming language isn't the way to go, but removing
>> any and all references to a specific language and/or culture is.
> You can abstract things, but you can't remove any and all references
> to a specific language and/or culture. Abstracting things is obviously
> a cultural practice. Now if you want to speak ethnology or even
> ethology with me, please make that in a private answer.
>
>> Even Perl has yet to manage this. Actually the only programming
>> language I know that has managed this is plain old brainfuck.
> Actually, Perligata
> <http://www.csse.monash.edu.au/%7Edamian/papers/HTML/Perligata.html/>
> let you code in a Latin dialect. Your welcome. ;)

Are you implying latin isn't a language and/or culture? Is perligata
made of random symbols with no particular meaning arranged in a way that
makes physical sense? (e.g. brainfuck's loop, which looks like a box -
this is also why I think emoji is a good candidate for proper
internationalization, as it's more physical even tho different cultures
may associate different things to each emoji)

--
Disclaimer: these emails may be made public at any given time, with or without reason. If you don't agree with this, DO NOT REPLY.



Re: Internationalization of Lua

Viacheslav Usov
In reply to this post by Coda Highland
On Mon, Nov 21, 2016 at 5:41 PM, Coda Highland <[hidden email]> wrote:

> Some old BASIC dialects got translated too. IIRC that was one of the reasons BASIC was so successful in Russian -- it was the only available language that could be used on a screen that was displaying Cyrillic characters.

This is completely off-topic in this list, but.

What you are saying is very likely not true. I'm not aware of any "translated" implementations of BASIC. All that I have personally seen in Russia and read about in Russian literature had English keywords. Some of them could use Cyrillic identifiers, and almost all could have Cyrillic chars in strings.

There were programming languages intended for beginners that were entirely Cyrillic (keywords et al.) [1]. Despite the fact that they usually were better languages than BASIC, they never really gained traction, to a certain degree because of the reasons mentioned earlier here.

Cheers,
V.



Re: Internationalization of Lua

Javier Guerra Giraldez
On 21 November 2016 at 17:29, Viacheslav Usov <[hidden email]> wrote:
> What you are saying is very likely not true. I'm not aware of any
> "translated" implementations of BASIC.


microsoft's basic for applications was.

and after seeing one program translated to spanish, i'm eternally
grateful to never have used that abomination.

--
Javier


Re: Internationalization of Lua

Peter Hickman-3
Wouldn't localisation actually hinder learning the (programming) language via code reading? Let's say I learn Lua in Gujarati and want to learn from someone else's existing code, only to find that it is written in Chinese Lua. I wouldn't stand a chance; my ability to learn would be severely diminished by my need to learn a dozen human languages just to be able to read the code.

Maybe we should go the MUMPS way and have one-letter commands (I am seriously not serious about this) or the APL way and have characters that nobody can type (except maybe the Greeks).


Re: Internationalization of Lua

Dirk Laurie-2
2016-11-21 19:53 GMT+02:00 Peter Hickman <[hidden email]>:

> Wouldn't localisation actually hinder learning the (programming) language
> via code reading. Lets say I learn lua in Gujarati and want to learn from
> someone else's existing code only to find that it is written in the Chinese
> lua. I wouldn't stand a chance, my ability to learn would be severely
> diminished by my need to learn a dozen human languages just to be able to
> read the code.
>
> Maybe we should go the MUMPS way and have one letter commands (I am
> seriously not serious about this) or the APL way and have characters that
> nobody can type (except maybe the Greeks).

I can type them (but I'm a Linux user, so I suppose that does not count).


Re: Internationalization of Lua

Mathieu Stumpf Guntz
In reply to this post by Soni "They/Them" L.


Le 21/11/2016 à 17:44, Soni L. a écrit :
>
> It does need unicode support.
Sure, but this is not an API-aliasing-specific problem; the Unicode issue
is indeed in the issue tracker of Lua-i18n.

> Remove the concept of "keywords" - they shouldn't be words, try
> something like "keysymbols" instead.
Lexeme is the usual linguistic term you probably want here. There's also
lexical item (or lexical unit, or lexical entry) if the term may refer
either to a single word, a part of a word, or a chain of words.

>
> Are you implying latin isn't a language and/or culture? Is perligata
> made of random symbols with no particular meaning arranged in a way
> that makes physical sense? (e.g. brainfuck's loop, which looks like a
> box - this is also why I think emoji is a good candidate for proper
> internationalization, as it's more physical even tho different
> cultures may associate different things to each emoji)
All apologies, I misunderstood what was said; I thought the claim was that
no localized language was out there.

To my mind, making a programming language which "removes any and all
references to a specific language and/or culture" is as meaningful as
"performing nothingness".

Now, if you wish to have a programming language as you describe, with emoji
or whatever symbols, I certainly won't discourage you: go ahead and show
us. But this is not what I want to target within the Lua-i18n project, so
if you really want to develop something like that, please launch another
thread, if you think it's relevant to discuss it on this mailing list.

❤🌐,
mathieu


Re: Internationalization of Lua

Coda Highland
In reply to this post by Viacheslav Usov
On Mon, Nov 21, 2016 at 9:29 AM, Viacheslav Usov <[hidden email]> wrote:

> On Mon, Nov 21, 2016 at 5:41 PM, Coda Highland <[hidden email]> wrote:
>
>> Some old BASIC dialects got translated too. IIRC that was one of the
>> reasons BASIC was so successful in Russian -- it was the only available
>> language that could be used on a screen that was displaying Cyrillic
>> characters.
>
> This is completely off-topic in this list, but.
>
> What you are saying is very likely not true. I'm not aware of any
> "translated" implementations of BASIC. All that I have personally seen in
> Russia and read about in Russian literature had English keywords. Some of
> them could use Cyrillic identifiers, and almost all could have Cyrillic
> chars in strings.

It's mostly true. I just had my history confused. It wasn't BASIC I
was thinking of (I was confusing it with VBA, as Javier alluded to); it was
ALGOL-68 that was translated into Russian.

/s/ Adam


Re: Internationalization of Lua

Gregg Reynolds-2
In reply to this post by Peter Hickman-3

On Nov 21, 2016 11:54 AM, "Peter Hickman" <[hidden email]> wrote:
>
> Wouldn't localisation actually hinder learning the (programming) language via code reading. Lets say I learn lua in Gujarati and want to learn from someone else's existing code only to find that it is written in the Chinese lua. I wouldn't stand a chance, my ability to learn would be severely diminished by my need to learn a dozen human languages just to be able to read the code.
>

easy peasy: you just crank up your chinese-lua to gujarati-lua translator.  it's a formal language, all the keywords etc. will translate perfectly.

objection: names for vars, fns, etc. won't translate. but that's not a new problem, and it's already possible to use a local language for those, and programmers in east Asia do so. even in English, as often as not the names are badly chosen, you have to read the code.

having said that, it seems pretty clear that "Programmer's English" works pretty well even for programmers who don't know much english.  I'm currently following a project whose primary devs are Korean and indian.  their English documentation totally sucks, but I can read their code.

-g


Re: Internationalization of Lua

KHMan
In reply to this post by Mathieu Stumpf Guntz
On 11/22/2016 2:21 AM, mathieu stumpf guntz wrote:

> Le 21/11/2016 à 17:44, Soni L. a écrit :
>>
>> It does need unicode support.
> Sure, but this is not a API aliasing specific problem, the uncode
> issue is indeed in the issue tracker of Lua-i18n.
>
>> Remove the concept of "keywords" - they shouldn't be words, try
>> something like "keysymbols" instead.
> Lexem is the usual term in linguistic you probably want to refer
> to. There's also lexical item (or lexical unit or lexical entry)
> if the term may refer either to a single word, a part of a word,
> or a chain of words.
>
>>
>> Are you implying latin isn't a language and/or culture? Is
>> perligata made of random symbols with no particular meaning
>> arranged in a way that makes physical sense? (e.g. brainfuck's
>> loop, which looks like a box - this is also why I think emoji is
>> a good candidate for proper internationalization, as it's more
>> physical even tho different cultures may associate different
>> things to each emoji)
> All apologies, I misunderstood what was said, I thought it was
> told that there was no localized language out there.
>
> To my mind, making a programming language which "remove any and
> all references to a specific language and/or culture" is as
> meaningful as "performing the nothingness".

IMHO the whole kaboodle is:

Quite interesting _for_some_folks_ in ivory towers.

Quite useless in the real world.

Clearly it's something that can be implemented and some academic
papers written _for_a_particular_academic_niche_. But nobody who
wants to get code written in the real world is going to touch
this. It's just an ivory tower toy, one that is of some value to
linguistics, but of zero value to CS/IT.

IIRC I have seen in the past on the Internet where someone
replaced the keywords of a programming language with non-English
words, a proof of concept thing. But if this kind of thing is
actually worthwhile to CS/IT folks, we might have heard a bit more
in the past >50 years of programming language development. Instead
it's more like haute couture, if you can 'sell' it, or get it
funded, then great, everything is awesome.

Just IMHO though. Good luck.

> [snip snip]

--
Cheers,
Kein-Hong Man (esq.)
Selangor, Malaysia



Re: Internationalization of Lua

Coda Highland
In reply to this post by Gregg Reynolds-2
On Mon, Nov 21, 2016 at 1:21 PM, Gregg Reynolds <[hidden email]> wrote:

> On Nov 21, 2016 11:54 AM, "Peter Hickman" <[hidden email]>
> wrote:
>>
>> Wouldn't localisation actually hinder learning the (programming) language
>> via code reading. Lets say I learn lua in Gujarati and want to learn from
>> someone else's existing code only to find that it is written in the Chinese
>> lua. I wouldn't stand a chance, my ability to learn would be severely
>> diminished by my need to learn a dozen human languages just to be able to
>> read the code.
>>
>
> easy peasy: you just crank up your chinese-lua to gujarati-lua translator.
> it's a formal language, all the keywords etc. will translate perfectly.
>
> objection: names for vars, fns, etc. won't translate. but that's not a new
> problem, and it's already possible to use a local language for those, and
> programmers in east Asia do so. even in English, as often as not the names
> are badly chosen, you have to read the code.
>
> having said that, it seems pretty clear that "Programmer's English" works
> pretty well even for programmers who don't know much english.  I'm currently
> following a project whose primary devs are Korean and indian.  their English
> documentation totally sucks, but I can read their code.
>
> -g

Once upon a time, I did some backend work for a site that was using
Woltlab Burning Board. The code was a mishmash of English and German.
I did learn that "zugriff" means something along the lines of "access"
with regards to a database.

/s/ Adam
