[ANN] lglob 0.8 Extended Globals Checker for Lua


steve donovan
Recently we've been talking about globals and how the undisciplined use of them can cause problems for both Lua application developers and for casual scripters.  lglob is a practical demonstration of how static code analysis can go a long way to relieve anxiety and get more compile-time feedback.  The technique is discussed in David Manura's excellent wiki entry on detecting undeclared variables:


In particular, it's derived from his globalsplus.lua script, which uses luac output to track globals, and the _fields_ of globals, so 'math.sine' is also an error.

Jay alluded to the difference between 'formal' and 'informal' use of Lua, and I don't believe that all Lua code needs to meet formal standards. By default, though, lglob is strict:

$ cat > script.lua
function dump(x)
  print('value is '..tostring(x))
end

dump (42)

$ lglob script.lua
lglob: script.lua:1: undefined set dump
lglob: script.lua:5: undefined get dump

But the '-g' (for globals) flag accounts for globals defined in a script:

$ lglob -g script.lua
(fine)

(It does not yet check that a global is defined at the _point of usage_; this remains a work in progress.)

By default, lglob checks against the 'usual' contents of _G, but you can add whitelists with the '-w' flag, specify an exclusive whitelist with the '-wx' flag, or supply a blacklist with '-b'. (These are straightforward files containing Lua assignments.) By default, it tracks use of require() and local aliases to modules; you can use the -wl whitelist to statically define the meaning of require().

Just type 'lglob' for the full set of arguments, and read the readme.  It has a lot of options because lglob is meant to be customizable to meet your needs, and not just impose a one-size-fits-all solution on all Lua files.

Until lglob gets into the main LuaRocks repo, use:

$ sudo luarocks --from=http://rocks.moonscript.org install lglob


steve d.


Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Luiz Henrique de Figueiredo
For a simpler take on this, see
        http://lua-users.org/lists/lua-l/2012-12/msg00397.html

which does much less, of course.


Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

steve donovan
On Mon, Apr 29, 2013 at 2:38 PM, Luiz Henrique de Figueiredo <[hidden email]> wrote:
> For a simpler take on this, see
>         http://lua-users.org/lists/lua-l/2012-12/msg00397.html


Absolutely ;)  But when a script develops its own command-line parser, then it will collect as many flags as can fit on a typical terminal screen (many GNU utilities don't follow this simple constraint, alas).  It puts out warnings in a form that editors understand, so e.g. I bind lglob to F7 in SciTE instead of plain luac.

lglob is pushing luac analysis to the point where to go any further will lead to problems. I will quote my favourite Irish joke: A gentleman goes to Dublin and gets lost. He asks a pedestrian "Tell me my good man, how do I get to the National Museum?". And the man responds "Well, sir, I would not go from here if I was you"[1]

Very applicable to software projects!  In particular, lglob will never make Petite Abeille happy because it can't handle his _ENV magic ;) It does know about the _ENV={} module trick, however; it just can't handle multiple such scopes.

It is surprisingly tricky to handle this kind of aliasing:

local T = table
print(T.join{1,2})

and still get an error on using 'table.join'.  The resulting code will not earn any elegance awards.
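
To make the aliasing problem concrete, here is a toy sketch of the bookkeeping involved. It is not lglob's implementation: the tables `known` and `aliases` and the helpers `declare_alias` and `check_field` are hypothetical names invented for this illustration, and the message text merely imitates lglob's output style.

```lua
-- Toy illustration only; all names here are hypothetical.
-- Fields considered valid for each known global table.
local known = {
  table = { insert = true, concat = true, remove = true, sort = true },
}

-- Maps a local name to the global it aliases, e.g. T -> table.
local aliases = {}

local function declare_alias(localname, globalname)
  aliases[localname] = globalname
end

-- Resolve the alias first, then validate the field against the global.
local function check_field(name, field)
  local g = aliases[name] or name
  local fields = known[g]
  if fields and not fields[field] then
    return false, ("undefined get %s.%s"):format(g, field)
  end
  return true
end

declare_alias("T", "table")            -- models: local T = table
print(check_field("T", "join"))        --> false   undefined get table.join
print(check_field("T", "concat"))      --> true
```

A real checker has to recover the alias declarations from the compiled code rather than being told about them, which is where the difficulty lies.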

steve d.

[1] I'm half Irish so I can tell non-offensive Irish jokes


Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Petite Abeille

On Apr 29, 2013, at 3:16 PM, steve donovan <[hidden email]> wrote:

>  In particular, lglob will never make Petite Abeille happy because it can't handle his _ENV magic ;)

Monsieur! I object to your frivolous use of the "m" word! Please elaborate why using _ENV in 5.2 involves any dark magic!



Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Andrew Starks



On Mon, Apr 29, 2013 at 12:25 PM, Petite Abeille <[hidden email]> wrote:

> On Apr 29, 2013, at 3:16 PM, steve donovan <[hidden email]> wrote:
>
>>  In particular, lglob will never make Petite Abeille happy because it can't handle his _ENV magic ;)
>
> Monsieur! I object to your frivolous use of the "m" word! Please elaborate why using _ENV in 5.2 involves any dark magic!



Steve is an _ENV hater. I've given up on him. ;)

(Cue Steve and his tortured linking of package.seeall to _ENV in 3...2...)

-Andrew


Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Petite Abeille

On Apr 29, 2013, at 8:17 PM, Andrew Starks <[hidden email]> wrote:

> Steve is an _ENV hater. I've given up on him. ;)

Haters gonna hate.

And there is nothing obviously 'tortured' about using _ENV:

--8<--

do
  local _ENV = {}
  _NAME = 'BAZ'
  function Foo() if _NAME then return end end
  function Bar() Foo() end
end

-->8--

Neither Steve's lglob...


$ lua lglob.lua TestGlobal.lua

lglob: TestGlobal.lua:3: undefined get _NAME
lglob: TestGlobal.lua:4: undefined get Foo


… nor Luiz's globals52...


$ luac -p -l TestGlobal.lua | lua globals52.lua | sort
TestGlobal.lua 3 undef _NAME
TestGlobal.lua 4 undef Foo


… can handle the most trivial _ENV usage… sigh… oh, well…

I'm at a loss about all the hate. Give _ENV some love!


 



Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

steve donovan
In reply to this post by Petite Abeille
On Mon, Apr 29, 2013 at 7:25 PM, Petite Abeille <[hidden email]> wrote:
> Monsieur! I object to your frivolous use of the "m" word! Please elaborate why using _ENV in 5.2 involves any dark magic!

Ah, magic it may be, but not _dark_ magic. That is setfenv.

I like to look at code and know immediately how to look up any symbol.  This lesson I learned from excessive use of 'with' in Pascal.

And the tortuous business of learning C++ symbol lookup rules well enough to implement them has left permanent scars.... 


Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Coda Highland
On Mon, Apr 29, 2013 at 11:45 AM, steve donovan
<[hidden email]> wrote:

> On Mon, Apr 29, 2013 at 7:25 PM, Petite Abeille <[hidden email]>
> wrote:
>>
>> Monsieur! I object to your frivolous use of the "m" word! Please elaborate
>> why using _ENV in 5.2 involves any dark magic!
>
>
> Ah, magic it may be, but not _dark_ magic. That is setfenv.
>
> I like to look at code and know immediately how to look up any symbol.  This
> lesson I learned from excessive use of 'with' in Pascal.
>
> And the tortuous business of learning C++ symbol lookup rules well enough to
> implement them has left permanent scars....
>

Mm, I'll be doing that fairly soon myself, though I managed to get
through the preprocessor and the lexer without losing my sanity.

/s/ Adam


Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Petite Abeille
In reply to this post by steve donovan

On Apr 29, 2013, at 8:45 PM, steve donovan <[hidden email]> wrote:

> Ah, magic it may be, but not _dark_ magic. That is setfenv.

Ehe :P

> I like to look at code and know immediately how to look up any symbol.

So I guess metamethods are not your cup of tea either…?

But enough about personal preferences!

The more relevant question is: why a tool branding itself as a global checker cannot check globals in 5.2?

Isn't there enough information in 'luac -p -l' to distinguish between various access? Or?





Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Dirk Laurie-2
2013/4/29 Petite Abeille <[hidden email]>:

> Isn't there enough information in 'luac -p -l' to distinguish between various access? Or?

Also available in the latest lbci.


Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Petite Abeille

On Apr 29, 2013, at 9:14 PM, Dirk Laurie <[hidden email]> wrote:

> 2013/4/29 Petite Abeille <[hidden email]>:
>
>> Isn't there enough information in 'luac -p -l' to distinguish between various access? Or?
>
> Also available in the latest lbci.

Hmmm… test.lua?

As per http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/5.2/lbci.tar.gz ?

$ lua test.lua TestGlobal.lua


globals
        TestGlobal.lua 4 GET _NAME
        TestGlobal.lua 5 GET Foo


So I guess not.



Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Luiz Henrique de Figueiredo
In reply to this post by Petite Abeille
> Isn't there enough information in 'luac -p -l' to distinguish between various access?

There isn't enough information in 'luac -p -l' but there is in 'luac -p -l -l'.
However, the list of locals is printed after the bytecode and one would need
to read the whole listing to find SETTABLE and GETTABLE that use a local
named _ENV.
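
For the simpler case where _ENV is only ever an upvalue, the short listing can be scanned line by line. The sketch below is an illustration under stated assumptions, not the globals52 script: the `listing` text is a hand-written sample imitating what 'luac -p -l' prints for Lua 5.2, and the helper name `scan` is hypothetical.

```lua
-- Hand-written sample imitating a Lua 5.2 'luac -p -l' listing (an assumption;
-- real output also carries header lines and uses tabs between columns).
local listing = [[
  1  [1]  GETTABUP   0 0 -1  ; _ENV "print"
  2  [1]  LOADK      1 -2    ; "hello"
  3  [1]  CALL       0 2 1
  4  [2]  SETTABUP   0 -3 -4 ; _ENV "answer" 42
  5  [2]  RETURN     0 1
]]

-- Collect every GETTABUP/SETTABUP whose comment names the _ENV upvalue.
local function scan(text)
  local found = {}
  for line in text:gmatch("[^\n]+") do
    local lineno, kind, name =
      line:match('%[(%d+)%]%s+([GS])ETTABUP%s.-;%s*_ENV%s+"([^"]+)"')
    if lineno then
      found[#found + 1] = {
        line = tonumber(lineno),
        op = (kind == "G") and "get" or "set",
        name = name,
      }
    end
  end
  return found
end

for _, g in ipairs(scan(listing)) do
  print(g.line, g.op, g.name)
end
--> 1   get   print
--> 2   set   answer
```

Accesses through a local named _ENV show up as GETTABLE and SETTABLE on a register instead, which is why this approach misses them.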

This particular task is easier to program with lbci. The code at the end
shows an example. However, it does not work in all cases; it is fooled
by this code:

do
  do local x,y,z end -- _ENV will use the same slot as x
  local _ENV = {}
  _NAME = 'BAZ'
  function Foo() if _NAME then return end end
  function Bar() Foo() end
end

To fix this requires a more sophisticated look at the list of locals, taking
end of scope into consideration. All this information is available both in
luac long listings and via lbci.

Here is the lbci code for the simple case.

local inspector=require"bci"

-- Print every table access in function f that goes through a variable
-- named _ENV; if 'all' is true, recurse into nested functions too.
local function globals(f,all)
 local F=inspector.getheader(f)
 for i=1,F.instructions do
  -- a is the source line, b the opcode name; c,d,e are the operands
  local a,b,c,d,e=inspector.getinstruction(f,i)
  -- _ENV reached as an upvalue
  if b=="GETTABUP" and inspector.getupvalue(f,d+1)=="_ENV" then
   print("",F.source,a,"GET ",inspector.getconstant(f,-e))
  elseif b=="SETTABUP" and inspector.getupvalue(f,c+1)=="_ENV" then
   print("",F.source,a,"SET*",inspector.getconstant(f,-d))
  -- _ENV reached as a local variable
  elseif b=="GETTABLE" and inspector.getlocal(f,d+1)=="_ENV" then
   print("",F.source,a,"GET ",inspector.getconstant(f,-e))
  elseif b=="SETTABLE" and inspector.getlocal(f,c+1)=="_ENV" then
   print("",F.source,a,"SET*",inspector.getconstant(f,-d))
  end
 end
 if all then
  for i=1,F.functions do
   globals(inspector.getfunction(f,i),all)
  end
 end
end


Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Petite Abeille

On Apr 29, 2013, at 9:40 PM, Luiz Henrique de Figueiredo <[hidden email]> wrote:

> There isn't enough information in 'luac -p -l' but there is in 'luac -p -l -l'.
> However, the list of locals is printed after the bytecode and one would need
> to read the whole listing to find SETTABLE and GETTABLE that use a local
> named _ENV.
>
> This particular task is easier to program with lbci. The code at the end
> shows an example. However, it does not work in all cases; it is fooled
> by this code:

Thanks for the code example.

But... now I'm a bit confused by what you consider a, err,  'global'…

In the simple, original case… globals now returns all access to _ENV:

globals
        TestGlobal.lua 3 SET* _NAME
        TestGlobal.lua 4 SET* Foo
        TestGlobal.lua 5 SET* Bar
        TestGlobal.lua 4 GET _NAME
        TestGlobal.lua 5 GET Foo

But… _ENV is not the global environment... _G is.

Contrast:

do
  local _ENV = {}
  _NAME = 'BAZ'
  function Foo() if _NAME then return end end
  function Bar() Foo() end
end

Which is equivalent to:

do
  local _ENV = {}
  _ENV._NAME = 'BAZ'
  function _ENV.Foo() if _ENV._NAME then return end end
  function _ENV.Bar() _ENV.Foo() end
end


With:

do
  local _BAZ = {}
  _BAZ._NAME = 'BAZ'
  function _BAZ.Foo() if _BAZ._NAME then return end end
  function _BAZ.Bar() _BAZ.Foo() end
end

Neither variant accesses or sets any globals. And yet the _ENV flavor reports 5 global accesses, as opposed to the _BAZ flavor, which reports none. As it should.

What gives?






Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Luiz Henrique de Figueiredo
> But... now I'm a bit confused by what you consider a, err,  'global'…

Yes, I think Lua 5.2 now blurs the distinction between free variables
and fields in _ENV. Once it applies the transformation x -> _ENV.x,
there is no way by looking at the generated code to know that x was
once free.

Thus, being practical, "global" must mean a field of _ENV, not a free name.
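
The rewriting of free names can be observed at run time. A minimal sketch, assuming Lua 5.2 or later (in 5.1 the name _ENV has no special meaning); the names `env` and `x` are purely illustrative:

```lua
-- Requires Lua 5.2 or later.
local env = {}

do
  local _ENV = env   -- free names in this block now resolve through env
  x = 42             -- compiled as _ENV.x = 42, i.e. env.x = 42
end

print(env.x)  --> 42
print(x)      --> nil  (here x resolves through the chunk's own _ENV)
```

The compiled code for the assignment is an ordinary table store on whatever _ENV is in scope, which is why a bytecode-level tool cannot tell that `x` was written as a free name.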


Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Philipp Janda
In reply to this post by Petite Abeille
Am 29.04.2013 23:31 schröbte Petite Abeille:
>
> But… _ENV is not the global environment... _G is.

To figure out which environment should be considered a "global"
environment would require some convention like: only environments passed
to one of the load* functions are global environments, or any
environment that contains (most of) the Lua standard library is a global
environment, or if there's a field _ENV._G that refers back to _ENV,
then _ENV is considered a global environment.
But all this can only be checked at runtime anyway:

     if "y" == io.read() then
       _ENV = { print = print }
     end
     print( "hello" )  --> global access or not?
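
One of the conventions mentioned above, the _G self-reference, is easy to test at run time. A small sketch; the function name `looks_global` is hypothetical, and the check is only a heuristic since any table can be given a `_G` field pointing at itself:

```lua
-- Heuristic: treat env as "global" if env._G refers back to env.
local function looks_global(env)
  return type(env) == "table" and rawget(env, "_G") == env
end

print(looks_global(_G))                  --> true in a standard interpreter
print(looks_global({ print = print }))   --> false
```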


Philipp




Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Petite Abeille
In reply to this post by Luiz Henrique de Figueiredo

On Apr 30, 2013, at 12:27 AM, Luiz Henrique de Figueiredo <[hidden email]> wrote:

>> But... now I'm a bit confused by what you consider a, err,  'global'…
>
> Yes, I think Lua 5.2 now blurs the distinction between free variables
> and fields in _ENV. Once it applies the transformation x -> _ENV.x,
> there is no way by looking at the generated code to know that x was
> once free.
>
> Thus, being practical, "global" must mean a field of _ENV, not a free name.

Hmmm… yes… perhaps practical in the sense of feasible, but not practical in the sense of useful.

When referring to 'global' one colloquially understands 'global in the global environment' … global global so to say… not, err, local global in a, hmmm, local environment… which is as interesting as having keys in any old table… in other words, not very much.

Is there a way, in 5.2, to identify these 'global global' through byte code analyses? In the same way one could clearly and easily identify global access by checking for ([GS])ETGLOBAL in 5.1?

I suspect not… oh, well… so much for global checker I guess...






Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Petite Abeille
In reply to this post by Philipp Janda

On Apr 30, 2013, at 12:47 AM, Philipp Janda <[hidden email]> wrote:

> Am 29.04.2013 23:31 schröbte Petite Abeille:
>>
>> But… _ENV is not the global environment... _G is.
>
> To figure out which environment should be considered a "global" environment would require some convention like: only environments passed to one of the load* functions are global environments, or any environment that contains (most of) the Lua standard library is a global environment, or if there's a field _ENV._G that refers back to _ENV, then _ENV is considered a global environment.

Nah. _G is always at LUA_RIDX_GLOBALS. And that is that.

http://www.lua.org/manual/5.2/manual.html#2.2
http://www.lua.org/manual/5.2/manual.html#4.5

> But all this can only be checked at runtime anyway:

Most likely yes.

R.I.P. byte code analysis for globals in 5.2. Oh, well...



Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Tim Hill
In reply to this post by Petite Abeille
Ok so my 2c worth having read the discussion of the past few days.

First, is there really a problem? At the language level, I think not. I really like the overall Lua model, the elegance of globals as just another table, _ENV, and all that goes with that. But at the secondary "human" level then yes, I think there is a problem. Only a few weeks ago I was bitten by a very subtle spelling bug in a global that I stared at for TWO DAYS until I spotted it. Even though I more or less KNEW what it had to be. Argh, more gray hairs :)

Second, what to do? I don't like some of the suggestions: Changing the language makes me shudder; if every change suggested in this list were enacted Lua would be a bigger mess than JavaScript. I also get scared by tools that try to duplicate some aspects of the parser -- it's asking for trouble.

But the best parser is, well, Lua itself!! Deep inside Lua is a block of code that knows when it has a global variable and converts "x" to "_ENV.x" .. that's fundamental to the way the language works now. It seems to me that the best approach would be to modify luac and add a switch to get it to emit a line of output each time it prefixes "_ENV" onto a global/free variable. Something like the line number, column number, and variable name. With this as input, it should be easy to create analysis tools that then provide information on globals usage (that mis-spelled global would easily be found).

Of course, such a tool is mostly useful for "batch" mode on complete Lua programs, rather than the partial ones that editors have to deal with.
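
No such luac switch is shown in this thread. As a different, run-time technique, a guard in the style of the well-known strict.lua can catch the misspelled-name case while the program runs. The sketch below applies it to a private table rather than to _G; `make_strict` is a hypothetical helper name, and the error text only imitates lglob's wording.

```lua
-- Run-time guard (not static analysis): any read or write of a key that
-- was not present when the table was made strict raises an error.
local function make_strict(env)
  return setmetatable(env, {
    __index = function(_, k)
      error("undefined get " .. tostring(k), 2)
    end,
    __newindex = function(_, k)
      error("undefined set " .. tostring(k), 2)
    end,
  })
end

local env = make_strict({ counter = 0 })
env.counter = env.counter + 1            -- existing key: allowed

-- A misspelled name is reported instead of silently yielding nil.
local ok, err = pcall(function() return env.countre end)
print(ok)    --> false
print(err)   -- position-prefixed message ending in "undefined get countre"
```

Unlike a static checker, this only reports a misspelling on code paths that actually execute.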

--Tim



On Apr 29, 2013, at 4:12 PM, Petite Abeille <[hidden email]> wrote:

>
> On Apr 30, 2013, at 12:27 AM, Luiz Henrique de Figueiredo <[hidden email]> wrote:
>
>>> But... now I'm a bit confused by what you consider a, err,  'global'…
>>
>> Yes, I think Lua 5.2 now blurs the distinction between free variables
>> and fields in _ENV. Once it applies the transformation x -> _ENV.x,
>> there is no way by looking at the generated code to know that x was
>> once free.
>>
>> Thus, being practical, "global" must mean a field of _ENV, not a free name.
>
> Hmmm… yes… perhaps practical in the sense of feasible, but not practical in the sense of useful.
>
> When referring to 'global' one colloquially understands 'global in the global environment' … global global so to say… not, err, local global in a, hmmm, local environment… which is as interesting as having keys in any old table… in other words, not very much.
>
> Is there a way, in 5.2, to identify these 'global global' through byte code analyses? In the same way one could clearly and easily identify global access by checking for ([GS])ETGLOBAL in 5.1?
>
> I suspect not… oh, well… so much for global checker I guess...
>
>
>
>
>



Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Philipp Janda
In reply to this post by Petite Abeille
Am 30.04.2013 01:12 schröbte Petite Abeille:

>
> On Apr 30, 2013, at 12:27 AM, Luiz Henrique de Figueiredo <[hidden email]> wrote:
>
>>> But... now I'm a bit confused by what you consider a, err,  'global'…
>>
>> Yes, I think Lua 5.2 now blurs the distinction between free variables
>> and fields in _ENV. Once it applies the transformation x -> _ENV.x,
>> there is no way by looking at the generated code to know that x was
>> once free.
>>
>> Thus, being practical, "global" must mean a field of _ENV, not a free name.
>
> Hmmm… yes… perhaps practical in the sense of feasible, but not practical in the sense of useful.

Depends on how (often) you use _ENV.

>
> When referring to 'global' one colloquially understands 'global in the global environment' … global global so to say… not, err, local global in a, hmmm, local environment… which is as interesting as having keys in any old table… in other words, not very much.
>
> Is there a way, in 5.2, to identify these 'global global' through byte code analyses? In the same way one could clearly and easily identify global access by checking for ([GS])ETGLOBAL in 5.1?

It wasn't possible in Lua 5.1 either in the presence of setfenv (and
module).

>
> I suspect not… oh, well… so much for global checker I guess...
>

I propose the following definition of "globals" in the context of static
global checkers:

*   Any access to a chunk's _ENV upvalue (not a local variable) is a
globals access, unless the chunk itself or any function sharing the same
_ENV upvalue potentially assigns to the _ENV upvalue.
*   Any access to a function's _ENV upvalue (not a local variable) is a
globals access, if the _ENV upvalue of the chunk was the only _ENV in
scope during the function's definition *and* unless any function sharing
the same _ENV upvalue potentially assigns to the _ENV upvalue.
*   Anything else not covered above is not a globals access.

I believe that is both useful and still statically checkable (with some
effort). We won't get `module` without `package.seeall` right (or any
other use of `debug.setupvalue` to change _ENV), but that's deprecated
anyway.

That said, since I don't use _ENV tricks that often, I'm quite satisfied
with the state of the global checkers ...

Philipp





Re: [ANN] lglob 0.8 Extended Globals Checker for Lua

Luiz Henrique de Figueiredo
In reply to this post by Petite Abeille
> Is there a way, in 5.2, to identify these 'global global' through byte code analyses? In the same way one could clearly and easily identify global access by checking for ([GS])ETGLOBAL in 5.1?

No, because what used to be globals in 5.1 are now free names in 5.2 and
access to these is rewritten via _ENV.
 
> I suspect not… oh, well… so much for global checker I guess...

Perhaps, if you want to be strict, but in practice static analysis
does work for ordinary programs that do not mess explicitly with _ENV.

OTOH, even in 5.1 one could play obscure games like
        local a="var"
        _G[a]=42
and this may or may not be an access to the global var, depending on
what _G holds. (Recall that _G is not a reserved name and can hold any
value.)
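
A short sketch of that point, runnable on 5.1 and later; the local shadow of `_G` is introduced only for illustration:

```lua
local a = "var"
_G[a] = 42
print(var)            --> 42  (here _G still names the global table)

do
  local _G = {}       -- _G is an ordinary name and can be shadowed
  _G[a] = 99          -- writes to the new table, not to the global var
  print(_G.var, var)  --> 99   42
end
```

A static checker that sees `_G[a]=42` cannot tell which of the two cases it is looking at without tracking what `_G` and `a` hold.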
