String access & metamethods

RE: String access & metamethods

Jerome Vuarand-2
Luiz Henrique de Figueiredo wrote:
>> (I don't know if a token filter can distinguish between the two, but
>> that would be a nice addition if it doesn't).
> 
> It can't because, by definition, it sees strings *after* the lexer
> has turned text into tokens. All strings are the same, regardless of
> what text form was used to define them. --lhf 

Would it be computationally expensive to have the tokenizer generate
three different tokens (maybe more if you want to distinguish [[]] from
[=[]=]), and just have the parser treat them equally? In plain Lua this
would have no purpose, but in a token-filtered Lua it could be useful,
as in that translation case.


Re: String access & metamethods

Eric Tetz
In reply to this post by Brett Kugler
On Dec 13, 2007 6:35 AM, Brett Kugler <[hidden email]> wrote:
> First I have stringent memory limits I must stay within.  My concern with
> creating new variables for each string would be that my heap burden would
> grow significantly.

If that's a concern then you don't want to have every translation in
memory at the same time, as all the schemes we've been playing around
with have done.

The real KISS approach is just to load a file like:

   -- localizations.lua
   if language == "German" then
      DidNotWork = "Operasie het nie geslaag nie"
      NoAnswer = "Antwoord nie gefind nie"
   elseif language == "Martian" then
      DidNotWork = "Pqfsbtjf!ifu!ojf!hftmbbh!ojf"
      NoAnswer = "Bouxppse!ojf!hfgjoe!ojf"
   else -- default to English
      DidNotWork = "Operation did not succeed"
      NoAnswer = "No answer found"
   end

   -- yourapp.lua
   language = "German"
   dofile("localizations.lua")
   ...
   print(DidNotWork)
   print(NoAnswer)

Then you have *one* set of text and some relatively short identifier
names loaded in memory.

Cheers,
Eric

Re: String access & metamethods

Brett Kugler
On Dec 13, 2007 10:37 AM, Eric Tetz <[hidden email]> wrote:
> On Dec 13, 2007 6:35 AM, Brett Kugler <[hidden email]> wrote:
> > First I have stringent memory limits I must stay within.  My concern with
> > creating new variables for each string would be that my heap burden would
> > grow significantly.
>
> If that's a concern then you don't want to have every translation in
> memory at the same time, as all the schemes we've been playing around
> with have done.
>
> The real KISS approach is just to load a file like:
>
>   -- localizations.lua
>   if language == "German" then
>      DidNotWork = "Operasie het nie geslaag nie"
>      NoAnswer = "Antwoord nie gefind nie"
>   elseif language == "Martian" then
>      DidNotWork = "Pqfsbtjf!ifu!ojf!hftmbbh!ojf"
>      NoAnswer = "Bouxppse!ojf!hfgjoe!ojf"
>   else -- default to English
>      DidNotWork = "Operation did not succeed"
>      NoAnswer = "No answer found"
>   end
>
>   -- yourapp.lua
>   language = "German"
>   dofile("localizations.lua")
>   ...
>   print(DidNotWork)
>   print(NoAnswer)
>
> Then you have *one* set of text and some relatively short identifier
> names loaded in memory.
>
> Cheers,
> Eric

Yeah, the table was more of a simplification for the discussion than something I was actually planning to implement.

Brett

Re: String access & metamethods

Luiz Henrique de Figueiredo
In reply to this post by Jerome Vuarand-2
> Would it be computationally expensive to have the tokenizer generate
> three different tokens (maybe more if you want to distinguish [[]] from
> [=[]=]), and just have the parser treat them equally ?

This would defeat the goal of token filters, which is to use the Lua lexer
unchanged. Plus it would complicate filters as well. It is doable and not
expensive, but you'll have to patch the lexer. (Plus I find it ugly...)

Re: String access & metamethods

Rici Lake-2
In reply to this post by Eric Tetz
Eric Tetz wrote:

> The real KISS approach is just to load a file like:
>
>    -- localizations.lua
>    if language == "German" then
>       DidNotWork = "Operasie het nie geslaag nie"
>       NoAnswer = "Antwoord nie gefind nie"
>    elseif language == "Martian" then
>       DidNotWork = "Pqfsbtjf!ifu!ojf!hftmbbh!ojf"
>       NoAnswer = "Bouxppse!ojf!hfgjoe!ojf"
>    else -- default to English
>       DidNotWork = "Operation did not succeed"
>       NoAnswer = "No answer found"
>    end

A possibly simpler approach, which avoids compiling a file with all
localized strings, is:

-- l12n/german.lua
DidNotWork = "Operasie het nie geslaag nie"
NoAnswer = "Antwoord nie gefind nie"

-- l12n/martian.lua
DidNotWork = "Pqfsbtjf!ifu!ojf!hftmbbh!ojf"
NoAnswer = "Bouxppse!ojf!hfgjoe!ojf"

-- l12n/english.lua
DidNotWork = "Operation did not succeed"
NoAnswer = "No answer found"

-- yourapp.lua
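-- A bare "a or b" expression is not a valid Lua statement, so test the
-- result of pcall explicitly and fall back to English on failure.
if not pcall(require, "l12n/german") then require "l12n/english" end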

You could use a fallback loader to avoid the pcall.
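
A minimal sketch of the fallback-loader idea follows. It assumes that each language module returns a table of strings rather than setting globals, and it registers the English strings in package.preload only so that the example runs without any files on disk. The searcher list is package.loaders in Lua 5.1 and package.searchers in 5.2 and later; the rest is illustrative.

```lua
-- Assumption: English is registered in package.preload so this sketch is
-- self-contained; a real application would ship l12n/english.lua instead.
package.preload["l12n/english"] = function()
  return {
    DidNotWork = "Operation did not succeed",
    NoAnswer = "No answer found",
  }
end

-- Lua 5.1 names the list package.loaders; 5.2 and later use package.searchers.
local searchers = package.searchers or package.loaders

-- Appended last, so it is consulted only after every standard searcher
-- has failed to find the requested module.
table.insert(searchers, function(name)
  if name:match("^l12n/") and name ~= "l12n/english" then
    -- Return a loader that hands back the English strings.
    return function() return require("l12n/english") end
  end
  return nil  -- not a localization module: let require report the failure
end)

-- No l12n/german module exists in this sketch, so English is loaded.
local L = require("l12n/german")
print(L.DidNotWork)
```

With this in place the application calls require once for the preferred language and never needs pcall; a missing translation file silently resolves to the English table.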

I don't see the advantage of using globals, though. Stashing the
localizations in a table called L (with a __call metamethod) does not
use more space, and avoids namespace pollution.


Re: String access & metamethods

Rici Lake-2
In reply to this post by Luiz Henrique de Figueiredo
Luiz Henrique de Figueiredo wrote:
>> Would it be computationally expensive to have the tokenizer generate
>> three different tokens (maybe more if you want to distinguish [[]] from
>> [=[]=]), and just have the parser treat them equally ?
>
> This would defeat the goal of token filters, which is to use the Lua lexer
> unchanged. Plus it would complicate filters as well. It is doable and not
> expensive, but you'll have to patch the lexer. (Plus I find it ugly...)
>

I agree. However, there is nothing stopping a token filter from
recognizing constructs like:

   L"foo"

   utf8"foo"

etc., and translating those into other string constants (instead of
letting Lua translate them as function calls).
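
The rewrite described above can be sketched in plain Lua on a list of tokens. This is not the interface of the token filter patch: the token representation ({type, value} tables), the localize function and the translations table are all hypothetical stand-ins, used only to show how the pair L "text" collapses into one string constant.

```lua
-- Assumed compile-time translation table (strings taken from the thread).
local translations = {
  ["Operation did not succeed"] = "Operasie het nie geslaag nie",
}

-- Hypothetical token shape: { type = "name" | "string" | "other", value = ... }.
local function localize(tokens)
  local out, i = {}, 1
  while i <= #tokens do
    local t, nxt = tokens[i], tokens[i + 1]
    if t.type == "name" and t.value == "L" and nxt and nxt.type == "string" then
      -- Replace the two tokens L "text" with a single translated string,
      -- keeping the original text when no translation is known.
      out[#out + 1] = {
        type = "string",
        value = translations[nxt.value] or nxt.value,
      }
      i = i + 2
    else
      out[#out + 1] = t
      i = i + 1
    end
  end
  return out
end

-- Token stream for: print(L"Operation did not succeed")
local result = localize{
  { type = "name", value = "print" },
  { type = "other", value = "(" },
  { type = "name", value = "L" },
  { type = "string", value = "Operation did not succeed" },
  { type = "other", value = ")" },
}
for _, t in ipairs(result) do io.write(t.type, ":", t.value, " ") end
print()
```

A real filter would also have to avoid rewriting unrelated uses of the name, such as a field access x.L followed by a string, which this sketch does not check.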


Re: String access & metamethods

Javier Guerra Giraldez
In reply to this post by Rici Lake-2


On 12/13/07, Rici Lake <[hidden email]> wrote:
> I don't see the advantage of using globals, though. Stashing the
> localizations in a table called L (with a __call metamethod) does not
> use more space, and avoids namespace pollution.

Why the __call? I find

print(L.DidNotWork)

far more readable and less error-prone than

print(L"Operation did not succeed")


--
Javier

Re: String access & metamethods

Luiz Henrique de Figueiredo
In reply to this post by Rici Lake-2
> I agree. However, there is nothing stopping a token filter from
> recognizing constructs like:
> 
>    L"foo"
> 
>    utf8"foo"
> 
> etc., and translating those into other string constants (instead of
> letting Lua translate them as function calls).

No, indeed, and it'd be a nice application of token filters.

Re: String access & metamethods

Rici Lake-2
In reply to this post by Javier Guerra Giraldez
Javier Guerra wrote:
> On 12/13/07, Rici Lake <[hidden email]> wrote:
>
>> I don't see the advantage of using globals, though. Stashing the
>> localizations in a table called L (with a __call metamethod) does not
>> use more space, and avoids namespace pollution.
>>
>
> why the __call ?    i find
>
> print(L.DidNotWork)
>
> far more readable and less error-prone than
>
> print(L"Operation did not succeed")

Because tastes differ :)

The advantage of L"Operation did not succeed" is that it provides a
built-in default, which avoids the problem of incomplete translation
files. On the other hand, it makes translation files somewhat more error
prone.

That's an old argument, as can be seen from the difference between
localization libraries (compare GNU gettext with token-based localization
systems) and I wouldn't really want to get into it: the point is that Lua
can easily support both styles.
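
As a rough illustration of that last point, one table can serve both styles through its metatable. The strings table and its contents are assumptions made for this sketch (the texts come from earlier in the thread), and raising an error on a missing token is just one possible policy.

```lua
-- Assumed translations for the active language.
local strings = {
  DidNotWork = "Operasie het nie geslaag nie",      -- keyed by token
  ["No answer found"] = "Antwoord nie gefind nie",  -- keyed by English text
}

local L = setmetatable({}, {
  -- Token style: L.DidNotWork. There is no built-in default, so a
  -- missing token is reported as an error at the caller's position.
  __index = function(_, key)
    local s = strings[key]
    if s == nil then
      error("missing localization token: " .. tostring(key), 2)
    end
    return s
  end,
  -- gettext style: L"English text". The English text doubles as the
  -- default when the translation file is incomplete.
  __call = function(_, text)
    return strings[text] or text
  end,
})

print(L.DidNotWork)
print(L"No answer found")
print(L"Operation did not succeed")  -- no translation: English is returned
```

The trade-off from the post shows up directly: the __call form never fails but silently hides a missing or misspelled entry, while the __index form fails loudly.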

R.


Re: String access & metamethods

Eric Tetz
In reply to this post by Rici Lake-2
On Dec 13, 2007 9:10 AM, Rici Lake <[hidden email]> wrote:
> I don't see the advantage of using globals, though.

Yeah, of course not. I wouldn't put all those symbols into the global
namespace, either. The code was just for illustration of the
principle.

I would probably go for the approach of having all localizations in
one file, though, just because I hate applications that install 3000
files (with all the slack that entails), rather than using some sort
of packaging system (assuming l12n/english.lua, l12n/martian.lua, etc.
are not themselves stored in some kind of 'wad' file).

Cheers,
Eric

Re: String access & metamethods

Ashwin Hirschi-3
In reply to this post by Eric Tetz
>    if language == "German" then
>       DidNotWork = "Operasie het nie geslaag nie"
>       NoAnswer = "Antwoord nie gefind nie"
>    elseif language == "Martian" then
>       DidNotWork = "Pqfsbtjf!ifu!ojf!hftmbbh!ojf"
>       NoAnswer = "Bouxppse!ojf!hfgjoe!ojf"
>    else -- default to English
>       DidNotWork = "Operation did not succeed"
>       NoAnswer = "No answer found"
>    end

That Martian bit is spot on, obviously. But I'm afraid the German section still needs some work [;-)].

Ashwin.
-- 
no signature is a signature.


Re: String access & metamethods

steve donovan
Yes, someone got some Afrikaans in there somewhere! I know that a
linguist would consider that a _kind of German_, but so is English ;)

I agree with lhf, this is a classic case where token filters would
work nicely. Remember also that you don't have to _define_ new token
types, because something like '$' will pass through untranslated -
print($NoResourcesLeft), etc.

The big payoff is that your translation occurs at compile time. So you
could use luac to generate bytecode and store that, although in my
experience token filters are surprisingly fast.

steve d.


On Dec 13, 2007 10:32 PM, Ashwin Hirschi <[hidden email]> wrote:
>
> >    if language == "German" then
> >       DidNotWork = "Operasie het nie geslaag nie"
> >       NoAnswer = "Antwoord nie gefind nie"
> >    elseif language == "Martian" then
> >       DidNotWork = "Pqfsbtjf!ifu!ojf!hftmbbh!ojf"
> >       NoAnswer = "Bouxppse!ojf!hfgjoe!ojf"
> >    else -- default to English
> >       DidNotWork = "Operation did not succeed"
> >       NoAnswer = "No answer found"
> >    end
>
> That Martian bit is spot on, obviously. But I'm afraid the German section still needs some work [;-)].
>
> Ashwin.
> --
> no signature is a signature.
>
