Globals (more ruminations)

Globals (more ruminations)

Mark Hamburg
As someone I think pointed out, the reason the globals issue seems to keep coming up is that globals are viewed as "bad" and Lua makes them easy to create. In addition, that ease of creation can also hide errors caused by typos.

Why are globals "bad"?

The primary reason is the same reason they are "bad" in other languages: They create opportunities for unexpected coupling. Code in one script can unexpectedly interact with code in another script through shared global names. This problem can be mitigated by assigning each script its own environment table when it is loaded. These tables can use a metatable with an __index entry to access an intentionally shared namespace of values. It is also worth noting that this sort of coupling is exactly why globals are useful in interactive mode where we need to tie together a series of separately compiled chunks.
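
A minimal sketch of the per-script environment idea described above, assuming a host that loads scripts from strings. The names `load_in_env`, `new_env` and `shared` are illustrative and do not come from Lightroom or any particular loader. The helper uses `setfenv` where it exists (Lua 5.1) and the environment argument of `load` otherwise (Lua 5.2 and later).

```lua
-- Sketch only: load_in_env, new_env and shared are illustrative names.

-- Compile `source` so that its globals live in `env`.
local function load_in_env(source, name, env)
  if setfenv then                          -- Lua 5.1 / LuaJIT
    local chunk, err = loadstring(source, name)
    if not chunk then return nil, err end
    return setfenv(chunk, env)
  end
  return load(source, name, "t", env)      -- Lua 5.2 and later
end

-- The intentionally shared namespace; scripts can read it but not write to it.
local shared = { tostring = tostring }

-- Each script gets a fresh table that falls back to `shared` on reads.
local function new_env()
  return setmetatable({}, { __index = shared })
end

local env_a, env_b = new_env(), new_env()
local script_a = assert(load_in_env("counter = 1; return tostring(counter)", "a", env_a))
local script_b = assert(load_in_env("counter = 'b'; return tostring(counter)", "b", env_b))

-- Both scripts assign the "global" counter, yet neither sees the other's value.
local result_a, result_b = script_a(), script_b()
```

Writes land in the script's own table because the metatable defines only `__index`; a shared name can still be shadowed locally by a script that assigns to it.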

(Side note: Unexpected coupling is also why having the module function add the module to the global namespace is "bad".)

The secondary reason is that globals are slower than locals or upvalues. How often does this speed difference matter? Probably not all that often. Still it may be more often than one might think: Lightroom uses a chain of environments for exactly the reasons cited above and profiling showed a surprising amount of time spent dealing with global lookups until we started getting more aggressive about caching standard library functions into locals. And if you post a benchmark result asserting that "Lua is slow" and use globals rather than locals, someone from the Lua community is likely to tell you that you aren't using the language properly thereby implying that the easy path is not the proper path. So, performance may or may not be a reason to view globals as problematic. That said, I believe LuaJIT more or less eliminates the issue, so this perhaps can fade as an issue over time.
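
The caching mentioned above usually looks something like the following sketch. The function `norm` is a made-up example, not Lightroom code; the point is only that `math.sqrt` is resolved once, when the chunk is loaded, rather than on every call.

```lua
-- Resolve the global `math` and its field `sqrt` once, at load time.
local sqrt = math.sqrt

-- Hypothetical hot function: inside it, `sqrt` is an upvalue access,
-- with no lookup in the globals table (or in a chain of environments).
local function norm(xs)
  local sum = 0
  for i = 1, #xs do
    sum = sum + xs[i] * xs[i]
  end
  return sqrt(sum)
end
```

Whether this matters depends on the workload; with a chain of environments each uncached global read may walk several `__index` steps, which is where the cost tends to show up in a profile.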

Why are globals "good"?

As noted above, they provide essential coupling between chunks in interactive mode. The alternative would be some way to pre-populate chunks with a set of pre-bound upvalues. One could then harvest the upvalues from one chunk and pre-populate them into the next chunk. If you thought global environments were hard to reason about, this seems much harder (though perhaps useful for experts).

Globals also provide ways to do interesting things by using special environments. The module function switches the environment so that assignments go into the module table. Some class systems do similar things for class definition. LuaGravity redefines the global environment so that function declarations turn into reactors. That said, this is a case where "in env do ... end" made a certain amount of sense. These uses also tend to run into trouble: they are intended to rebind writes, but in so doing they also interfere with reads unless one uses a custom environment that balances the two needs.

How to balance between these two?

As a starting point, I think the _ENV approach in 5.2work3 provides an interesting opportunity by allowing one to distinguish between chunk level global accesses and using globals to access an explicitly created _ENV variable. This makes some of the special environment tricks work better though they are a bit uglier syntactically unless we also re-introduce "in env do ... end" as sugar for "do local _ENV = env; ... end" but that leads to a bunch of other issues around what one is trying to accomplish and around expectations about how environments work.

That then leaves the question of whether it needs to either be harder to create chunk level globals and/or whether it needs to be easier to create chunk level locals and/or whether we just need better performance analysis tools.

For example, the vast majority of the code I write outside of interactive mode would be simpler with the addition of a simple import statement:

        import foo

which translates into:

        local foo = require "foo"

Additional syntax could provide name overrides or member import, but I would be cautious here. Given this, I could then see banning both reading and writing of globals at the chunk level outside of interactive mode. This makes a common case easier to write, but it would also slap me and other developers on the team whenever we slipped onto the seemingly easy path.

This could probably be handled via a combination of a token filter and a byte-code analyzer, but I haven't looked closely into the former and the latter is going to take a bit more thought in 5.2work3.
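
As a rough illustration of the translation, here is a naive line-based rewriter. It is not a token filter: it works on raw source text, so it would be fooled by an `import` line inside a long string or comment. The name `expand_imports` is hypothetical.

```lua
-- Sketch only: rewrite lines of the form "import foo" into
-- 'local foo = require "foo"'. All other lines pass through unchanged.
local function expand_imports(source)
  local out = {}
  for line in (source .. "\n"):gmatch("(.-)\n") do
    -- Whole line must be: optional indent, "import", one identifier.
    local indent, name = line:match("^(%s*)import%s+([%a_][%w_]*)%s*$")
    if name then
      line = indent .. "local " .. name .. ' = require "' .. name .. '"'
    end
    out[#out + 1] = line
  end
  return table.concat(out, "\n")
end

local rewritten = expand_imports("import foo\nprint(foo)")
```

The rewritten text could then be handed to the normal loader, which keeps line numbers in error messages aligned with the original source because each line maps to exactly one line.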

Mark

P.S. With regard to typos, in addition to detecting inadvertent globals, a lint should perhaps also check the names of messages used in message sends...

Re: Globals (more ruminations)

Mark Hamburg
And I should perhaps make it clear where I stand:

I consider the current situation to be a bit warty, but I've clearly come to live with it and it doesn't bother me much. I'm chiming in and ruminating on this topic because the warts seem to bother some people a lot more (and we did go to some trouble in our Lightroom tooling to mitigate the issues which would indicate that there are issues).

Mark

Re: Globals (more ruminations)

Patrick Donnelly
In reply to this post by Mark Hamburg
Hi Mark,

On Thu, Jul 8, 2010 at 12:45 PM, Mark Hamburg <[hidden email]> wrote:
> Why are globals "bad"?
>
> The primary reason is the same reason they are "bad" in other languages: They create opportunities for unexpected coupling. Code in one script can unexpectedly interact with code in another script through shared global names. This problem can be mitigated by assigning each script its own environment table when it is loaded. These tables can use a metatable with an __index entry to access an intentionally shared namespace of values. It is also worth noting that this sort of coupling is exactly why globals are useful in interactive mode where we need to tie together a series of separately compiled chunks.
>
> (Side note: Unexpected coupling is also why having the module function add the module to the global namespace is "bad".)

Coupling is certainly one issue. A major problem Nmap's NSE (Nmap
Scripting Engine) has had is libraries used by multiple scripts
accidentally using a global instead of a local. Note that these
libraries use module so each one has its own environment. Still, all
our scripts are coroutines so they step on each other's toes when
setting/getting globals within the library! (For example, an http.get
function would have a global (when it should be local) socket variable
it would set and use. Other scripts would overwrite this variable when
also using http.get!) This has been an insidious problem for us and
I've had to write custom scripts to look for these accidental global
accesses. It would be *superb* if there were some sort of compile time
solution to this.
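
The failure mode described above can be reproduced in a few lines. This is a sketch with made-up names (`buggy_get`, `fixed_get`, `run_interleaved`), not NSE code; the `coroutine.yield()` call stands in for a script waiting on network I/O.

```lua
-- Missing `local`: every caller shares one `socket` variable.
local function buggy_get(host)
  socket = "connection to " .. host
  coroutine.yield()                 -- another script may run here
  return socket
end

-- Same function with the declaration that was intended.
local function fixed_get(host)
  local socket = "connection to " .. host
  coroutine.yield()
  return socket
end

-- Run two "scripts" as coroutines, interleaved the way a scheduler might.
local function run_interleaved(get)
  local results = {}
  local a = coroutine.wrap(function() results.a = get("a.example") end)
  local b = coroutine.wrap(function() results.b = get("b.example") end)
  a(); b()   -- both set their socket, then yield
  a(); b()   -- both resume and read it back
  return results
end

local buggy = run_interleaved(buggy_get)
local fixed = run_interleaved(fixed_get)
```

With `buggy_get`, script `a` ends up holding the connection that script `b` opened, and nothing fails at the point of the mistake.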

--
- Patrick Donnelly

Re: Globals (more ruminations)

Henk Boom-2
In reply to this post by Mark Hamburg
On 8 July 2010 12:45, Mark Hamburg <[hidden email]> wrote:
> As someone I think pointed out, the reason the globals issue seems to keep coming up is that globals are viewed as "bad" and Lua makes them easy to create. In addition, that ease of creation can also hide errors caused by typos.
>
> Why are globals "bad"?

Actually, the main reason I think globals are bad is that it's too
easy to use them by accident. Two situations I've hit are

1) I make a typing mistake and end up using an unused global instead
of the local I meant. The only symptom is that one use of the variable
seems to return nil.
2) I forget to declare a variable local, so that it ends up being
global instead. Everything works fine until the function ends up being
called re-entrantly, resulting in a tricky to find bug.

That being said, I love how useful they are for making closure-based
objects. Using setfenv I set it up so that globals are public fields,
and locals are private fields, but both using the same syntax for
accessing/setting.

    henk

Re: Globals (more ruminations)

Rena
On Thu, Jul 8, 2010 at 15:39, Henk Boom <[hidden email]> wrote:

> On 8 July 2010 12:45, Mark Hamburg <[hidden email]> wrote:
>> As someone I think pointed out, the reason the globals issue seems to keep coming up is that globals are viewed as "bad" and Lua makes them easy to create. In addition, that ease of creation can also hide errors caused by typos.
>>
>> Why are globals "bad"?
>
> Actually, the main reason I think globals are bad is that it's too
> easy to use them by accident. Two situations I've hit are
>
> 1) I make a typing mistake and end up using an unused global instead
> of the local I meant. The only symptom is that one use of the variable
> seems to return nil.
> 2) I forget to declare a variable local, so that it ends up being
> global instead. Everything works fine until the function ends up being
> called re-entrantly, resulting in a tricky to find bug.
>
> That being said, I love how useful they are for making closure-based
> objects. Using setfenv I set it up so that globals are public fields,
> and locals are private fields, but both using the same syntax for
> accessing/setting.
>
>    henk
>

There's also the case where you declare a local variable, maybe for
debugging, then later remove the declaration but fail to remove all
assignments. If this is a fairly generically-named variable and you
happen to have another with that name higher up in scope, suddenly
your variables are "randomly" changing in value.

The same of course can happen with a module that uses globals. Your
"socket" changes because one of the module functions also refers to a
"socket" in the global namespace. Or the module function simply
doesn't work because the value isn't what it expects.

Some will always reply "don't use modules that cause problems", but
that's not the solution to everything. If we didn't use software that
has bugs, I guess we'd all be potato farmers... ;-)

--
Sent from my toaster.

Re: Globals (more ruminations)

David Manura
In reply to this post by Mark Hamburg
On Thu, Jul 8, 2010 at 12:45 PM, Mark Hamburg <[hidden email]> wrote:
> Why are globals "bad"? ... Why are globals "good"? ...

Ok, so when do I think globals are appropriate?  The first and primary
case is when retrieving variables from the standard library:

  print(math.sqrt(2))

Here's why I think this is acceptable:

  (1) Typos can be statically detected here.  This is often done via
the "luac -p -l file.lua | grep ETGLOBAL" technique (e.g.
globals.lua), by  flagging all gets absent in a whitelist and flagging
all sets.  That whitelist can be defined statically, such as from the
Lua Reference Manual, or by dynamically querying _G.  _G has the
benefit of automatically handling any custom objects you add to the
standard library, but it requires that the Lua state be clean (e.g.
calls to `module` pollute _G), so it is best run from a separate Lua
process or using some table other than the current _G.  Typos to
members and signatures (e.g. "math.sqrrt(2,true)") are not detected in
the above approach.  Although we can slightly extend the approach by
localizing `math_sqrt`, a more general solution, which avoids
rewriting variables in your code, is to write a more intelligent
static analyzer, like the direction in luaanalyze.  So, it's just a
matter of how to make static checking more powerful and accessible,
making its use the norm rather than the exception.

  (2) Optimization is best left to the compiler.  Globals have the
performance impact of table indexing (sometimes multiple indexes).
For most cases this is acceptable.  For other cases, renaming
variables may increase performance, so it's tempting to do so.  For
example, if I'm creating a standard library with a "string trim"
function [1] that other people may use in unknown ways, and if I know
aggressive localizing can in certain cases have some measurable
impact, I may play it safe and localize even though it may slightly
uglify the code.  However, my preference, since this transformation
can be performed mechanically, is to keep the code clean and leave
optimization to the compiler, either assuming LuaJIT or writing some
preprocessor [2] or patch [3] that would perform this optimization
without bothering the original source.

  (3) Localizing every top level function (e.g. print) is cumbersome
to do manually.  Although I can localize with `local _ENV = require
"_G"` and then access `_ENV.print` or even `print`, this doesn't
automatically define a local for each top-level function in _G.  Lua
doesn't have a "import static foo.*" like Java [6].

  (4) It can work ok if custom environments are implemented carefully
or avoided.  If the current environment is changed, then the standard
library functions must remain exposed through the new environment.
Localizing these functions is one way to achieve this.  It can be
preferable to expose them through a fallback (e.g. __index to _G) in
the current environment so that access works normally.  This is ok on
paper.  The problem occurs if you attempt, like package.seeall, to
reuse the environment table in some place where these standard library
functions should not be exposed.  It takes some looking, but there are
ways to solve this like package.clean [4] (with some caveats about
caching--bar using a proxy table or __setindex--in the probably rare
case of redefinition in the public namespace).  Another solution,
which I usually do and which is a very simple avoidance of these
problems, is to use a separate local table (e.g. M) for the module's
public namespace.  You may still use a local environment in the latter
solution, but it becomes mostly superfluous.  BTW, the Lua 5.2 VM
makes "M.foo", "_ENV.foo" and "foo" equally efficient.

  (5) Localization is sometimes used to bind globals earlier, making
the module more immune to changes to external global variables
following module load, but I question whether this is the right
approach.  This technique is utilized in Lua 5.1.4 strict.lua [8] to
support sandboxing.  I suspect maybe this should not be the module's
responsibility but rather that of the loader (e.g. `loadin` with
custom environment table).
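
The "separate local table" approach from point (4) can be sketched as follows. The function `load_geometry` stands in for the body of a module file (a hypothetical geometry.lua), so that the example is self-contained; in a real file the body would sit at chunk level and end with `return M`.

```lua
-- Stands in for the contents of a hypothetical "geometry.lua".
local function load_geometry()
  local M = {}                 -- public namespace, what require() would return
  local sqrt = math.sqrt       -- private: not reachable through M

  -- Private helper, invisible to users of the module.
  local function square(x) return x * x end

  function M.hypot(a, b)
    return sqrt(square(a) + square(b))
  end

  return M
end

local geometry = load_geometry()
```

Nothing here touches the environment, so standard library names never leak into the module's public table and no fallback to _G is needed.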

Another case mentioned where globals are appropriate is the
interactive interpreter.  However, I don't think globals are necessarily
unavoidable here.  There are times I've wanted this to just work
(without an enclosing "do" block):

  > local x = 1
  > local y = x + 1
  > print(x)
  nil

Mark suggested some ideas to make this work [3].  The basic idea is to
bind the locals declared in previous chunks into the current chunk,
almost as if the current chunk is lexically nested inside and at the
bottom of the previous chunk.  This is likely implementable by
patching lua.c, using either source rewriting or internal
manipulations.

Globals can also be useful in DSLs to avoid inserting `local`,
`return`, and `...` throughout in the DSL (though it can still remain
a difficult fit [7]).

Now, I think globals get messier when you have custom environments, or
even multiple environments in the same file:

  _ENV = module(...)
  local tostring = tostring
  require "baz"
 .....
  local test
  function foo()
    return class('bar', function(_ENV)
      function tostring()
        print 'foo'
        return foo
      end
    end)
  end
  function test() ..... end

Here, a nested environment is *trying to* reference a `foo` in the
parent environment and a `print` in the global _G table.  Moreover,
the method name `tostring` conflicts with a local `tostring` in the
parent.  There are ways to address this like aliasing the top-level
_ENV to a local of another name to permit disambiguation, but I think
the underlying problem is that we're trying to use some hacks with
globals to mimic lexical scopes in our custom language constructs
(module and class), but Lua's global resolution rules such as "locals
override globals" and "_ENV tables are not recursively queried up the
lexical nesting levels" may complicate making this work seamlessly.
Add to this the concerns in LuaModuleFunctionCritiqued [5].
Additionally, the function "test" is forward declared, which
unfortunately makes the definition misleadingly look like a global
definition.  This mess, and numerous ways to shoot yourself in the
foot, suggests rewriting everything above without environments/globals
but rather a straightforward lexical scoping solution that "just
works".

[1] http://lua-users.org/wiki/StringTrim
[2] http://lua-users.org/lists/lua-l/2008-04/msg00082.html
[3] http://lua-users.org/lists/lua-l/2010-07/msg00169.html
[4] http://lua-users.org/wiki/ModuleDefinition
[5] http://lua-users.org/wiki/LuaModuleFunctionCritiqued
[6] http://download.oracle.com/docs/cd/E17476_01/javase/1.5.0/docs/guide/language/static-import.html
[7] http://lua-users.org/lists/lua-l/2010-06/msg00246.html
[8] http://lua-users.org/lists/lua-l/2007-09/msg00345.html

Re: Globals (more ruminations)

steve donovan
In reply to this post by Mark Hamburg
On Thu, Jul 8, 2010 at 6:45 PM, Mark Hamburg <[hidden email]> wrote:
> For example, the vast majority of the code I write outside of interactive mode would be simpler with the addition of a simple import statement:
>
>        import foo
>
> which translates into:
>
>        local foo = require "foo"

That would be useful; it's very easy to do a token-filter for this
kind of sugar. Except that 'import' should be reserved for this kind
of construct:

import setmetatable,type from _G

expanded as

local setmetatable,type = _G.setmetatable,_G.type

which is also a very common pattern in modules.

So then we could have 'requires', which works like your 'import'
except that it takes multiple modules

requires foo, bar, math, io

However, there is a problem.  require() is not guaranteed to return
the module, it may well just return 'true' if the module writer has
not followed recommended practice.  This is particularly an issue if
'requires' was actually part of the language, and pretty much
guarantees that it will not ;)
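
The `true` problem is easy to demonstrate, and a library-level helper can at least detect it. In this sketch the helper name `import` and the module names `demo.good` and `demo.bad` are made up; the modules are registered through `package.preload` so the example needs no files on disk.

```lua
-- Sketch only: a checked wrapper around require().
local function import(name)
  local mod = require(name)
  if mod == true then
    -- The loader returned nothing and did not set package.loaded[name].
    error("module '" .. name .. "' did not return a value", 2)
  end
  return mod
end

-- A module that follows the recommended practice of returning its table.
package.preload["demo.good"] = function() return { answer = 42 } end
-- A module that forgets to return anything.
package.preload["demo.bad"] = function() end

local good = import("demo.good")
local ok, err = pcall(import, "demo.bad")
```

A syntax-level `requires` could not raise this error until run time either, which is the objection made above.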

Thinking about the problem of globals, I'd say that the great majority
of globals are modules, especially if module() has been used
consistently; actual 'global' variables are accessed within the
namespace of their defining module.

It would be very useful if the development environment flagged
undeclared globals as 'spelling mistakes'.  The IntelliJ and Eclipse
Lua projects are promising;  this clearly needs static analysis as
both David and Mark have emphasized. My next project will be to bring
this sort of goodness to SciTE, which tends to be a little more nimble
and less bureaucratic than the big boys ;)

Another point about making the IDE work harder is that it can generate
necessary scaffolding as needed, an approach which has saved many a
Java Eclipse programmer from repetitive strain injury.  E.g, the
environment sees that you have just used setmetatable (a known global)
in module scope; it will then ensure that a 'local
setmetatable=setmetatable' declaration is inserted at the start of the
file.  This will make it easier for people to give up their
package.seeall addiction.

steve d.

Re: Globals (more ruminations)

David Manura
In reply to this post by Mark Hamburg
On Thu, Jul 8, 2010 at 12:45 PM, Mark Hamburg <[hidden email]> wrote:
> Why are globals "bad"?... Why are globals "good"?

I would qualify such discussions with the comment that there are
important aspects of globals and locals in Lua that are specific to
Lua and not to other languages.  For example, the scope of a Lua
global can be a proper subset of the scope of a local, which seemingly
is counter-intuitive, as it was for me [1].  The mantra "globals are
bad", as used elsewhere, therefore is using the term "global" not in
the exact same sense as used here.

So, here's a new page outlining the fundamental differences between
locals and globals as implemented in Lua:
http://lua-users.org/wiki/LocalsVsGlobals .

[1] http://lua-users.org/lists/lua-l/2006-02/msg00358.html -
"global/local a misnomer?"

Re: Globals (more ruminations)

Florian Weimer
In reply to this post by Patrick Donnelly
* Patrick Donnelly:

> This has been an insidious problem for us and I've had to write
> custom scripts to look for these accidental global accesses. It
> would be *superb* if there were some sort of compile time solution
> to this.

Do you think it would be an issue if scripts still could create and
update a variable such as http.socket?  That would be a non-global
access at the bytecode level, but it still would introduce coupling.

Should modules be read-only from a user's point of view?

Re: Globals (more ruminations)

steve donovan
On Sun, Jul 11, 2010 at 4:20 PM, Florian Weimer <[hidden email]> wrote:
> Do you think it would be an issue if scripts still could create and
> update a variable such as http.socket?  That would be a non-global
> access at the bytecode level, but it still would introduce coupling.
> Should modules be read-only from a user's point of view?

I suspect most globals in Lua programs are precisely these
module-scope variables. They are of course just as global in effect as
any other.

They can be made into pseudo-variables (essentially static properties
of the module), so that the variable socket.http is actually the
getter and setter pair socket.get_http() and socket.set_http(), and
then they can be backed by thread-local storage.

Modules are 'open' to further modification, being dynamic constructs.
It would be useful to be able to declare them as 'sealed'  (it
wouldn't be difficult to define 'package.sealed' as a modifier for
module())
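
One way to sketch the getter/setter idea, assuming "thread" means coroutine: keep the value in a weak-keyed table indexed by the running coroutine. The function names follow the ones used above; the `per_thread` table and the `current` helper are illustrative.

```lua
-- Weak keys, so an entry disappears once its coroutine is collected.
local per_thread = setmetatable({}, { __mode = "k" })

-- Lua 5.1 returns nil for the main thread, so fall back to a fixed key.
local function current()
  return coroutine.running() or "main"
end

local socket = {}

function socket.set_http(value)
  per_thread[current()] = value
end

function socket.get_http()
  return per_thread[current()]
end

-- Two interleaved coroutines each keep their own value.
local seen = {}
local function worker(name)
  return coroutine.wrap(function()
    socket.set_http(name)
    coroutine.yield()
    seen[name] = socket.get_http()
  end)
end

local a, b = worker("a"), worker("b")
a(); b(); a(); b()
```

The cost is that callers must write `socket.get_http()` instead of `socket.http`, unless the module table also gets `__index` and `__newindex` handlers that forward to these functions.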

steve d.

Re: Globals (more ruminations)

Jim Jennings
In reply to this post by Mark Hamburg
On Fri, 9 Jul 2010 01:57:06 -0400, David Manura <[hidden email]> wrote:
> Ok, so when do I think globals are appropriate?  The first and primary
> case is when retrieving variables from the standard library:
>
>  print(math.sqrt(2))
>
> Here's why I think this is acceptable:
>
>  (1) Typos can be statically detected here.

> Typos to
> members and signatures (e.g. "math.sqrrt(2,true)") are not detected in
> the above approach.

Typos in member references can be automatically detected if the
module's exports are known.  I wrote the Darwin module system in part
for this reason.  However, I've not yet implemented the static
analyzer -- I would prefer to use an existing one and simply feed it
the module exports to check against.  In Darwin, _G is just another
module, btw.

>  (2) Optimization is best left to the compiler.

I heartily agree.  Some functional language run-time environments
(like Scheme48) support a form of inlining that should be possible to
do in Lua.  The idea is this:  You declare a list of globals whose
values you will not change.  The compiler is then free to inline the
value of those globals instead of generating lookup code.

>  (3) Localizing every top level function (e.g. print) is cumbersome
> to do manually.  [...]  Lua
> doesn't have a "import static foo.*" like Java [6].

Darwin's import function ('structure.open', which can be mellifluously
triggered by Lua's 'require') is a function that creates "global"
variables (or tables thereof) through which a module's functions and
objects become accessible.  Since it's an ordinary function, it cannot
create locals.  It would be an interesting project to pre-process Lua
source that uses Darwin modules such that locals are created for the
imports.  As long as the limit on the number of locals in a chunk is
not exceeded, such a code transformation might yield performance gains
in a fully automated fashion.

>  (4) It can work ok if custom environments are implemented carefully
> or avoided.  If the current environment is changed, then the standard
> library functions must remain exposed through the new environment.

This is how Darwin works.   All of the module bindings (including _G,
which is just another module) are stashed away.  When you open a
module, you get a copy of the bindings so your module code can do
whatever it wants.  I didn't use the metatable __index approach
because I wanted Darwin to support (by trivially wrapping) as much
existing Lua code as possible, including code that installs its own
metatable for _G or other environments.
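
The difference between copying bindings and chaining through `__index` can be shown with a small sketch. This is not Darwin's code; `copy_bindings` and `stashed` are made-up names that only illustrate the copy-on-open idea.

```lua
-- Shallow copy: the new table owns its bindings and has no metatable.
local function copy_bindings(bindings)
  local copy = {}
  for name, value in pairs(bindings) do
    copy[name] = value
  end
  return copy
end

-- Stands in for the bindings that were stashed away at startup.
local stashed = { print = print, tostring = tostring }

local env_a = copy_bindings(stashed)
local env_b = copy_bindings(stashed)

-- Code running in env_a may redefine anything, or install its own
-- metatable, without affecting env_b or the stashed originals.
env_a.tostring = function() return "patched" end
```

The trade-off is that a copy is a snapshot: names added to the stash later do not appear in environments that were opened earlier.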

> Now, I think globals get messier when you have custom environments, or
> even multiple environments in the same file

And this is why Darwin modules (and the "main" program) do not share
the same _G.  I guess you could say that when I use Lua (i.e. with
Darwin loaded), globals are not actually global.   Everything in _G,
whether it was there ab initio or whether my code created it, is
visible only to my code.  My "main" code cannot create globals that
are visible to the modules I have loaded, nor can any module create a
global that is visible to other modules or to my "main" code.

In my opinion, this is a very useful (and well-appreciated in the
literature) meaning of "lexical scope" in the context of module
systems.   After all, the definition of lexical scope is that you know
the binding of every reference from only a lexical analysis -- i.e. by
reading the program.  Lua's out-of-the-box module system does not
provide this.  The dynamic environment into which a module is loaded
affects the bindings used inside module code.

Jim

Re: Globals (more ruminations)

Mark Hamburg
Maybe others had thought of this before, but it just occurred to me this morning. I've never been a fan of local-by-default because the scope was ill-defined. I hadn't thought about the fact that it also arguably makes worse one of the key complaints people have about global-by-default...

On the subject of typos, it is perhaps interesting to note that the various local-by-default proposals provide no more protection than does the current global-by-default behavior. In fact, it arguably provides less since unintended locals are pretty much impossible to detect via analysis or runtime hooks.

So, that issue should probably evolve into the question: Should access to undeclared variables be allowed? Or the similar: Should there be a standard way to disallow it? The issue that follows from this is that unless one adopts a draconian policy of only allowing a specific white list of values, then there needs to be some way to declare a global variable in addition to declaring a local variable.

(The Lightroom policy is based on bytecode scans for global accesses together with a whitelist.)

Mark

Re: Globals (more ruminations)

Roberto Ierusalimschy
> So, that issue should probably evolve into the question: Should access
> to undeclared variables be allowed? Or the similar: Should there be a
> standard way to disallow it? The issue that follows from this is that
> unless one adopts a draconian policy of only allowing a specific white
> list of values, then there needs to be some way to declare a global
> variable in addition to declaring a local variable.

For me this issue has been settled a long time ago. The problem
in "anything-by-default" is not the choice of "anything", but the
"by-default".

However, declaration of globals in the chunk level is mostly useless for
read operations. If you misspell a variable name, there is a good chance
you will misspell it in the declaration too. (Probably you will do
copy-and-paste to add the declaration.) It is like declaring variables
in C directly in the .c files instead of in a .h (plus the common good
practices around the usage of .h files). Moreover, any basic test should
detect such misspelling.

I still think the main problem is non-intended assignment to globals,
which creates hard-to-find bugs. The simple policy of requiring a global
declaration to assign to globals would force programmers to consider
whether they really want a global variable.
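
A run-time approximation of this policy can be sketched with a metatable, assuming the table is installed as a chunk's environment (with `setfenv` in Lua 5.1, or through `load` or `_ENV` in Lua 5.2). The names `strict_env` and `declare` are made up, and the example indexes the table directly so that it runs the same way on every version. A compile-time check would catch the error earlier.

```lua
-- Sketch only: an environment that rejects assignment to undeclared names.
-- Reads are not checked; they fall through to `base`.
local function strict_env(base)
  local declared = {}
  local env = {}

  -- The explicit "global declaration".
  local function declare(name, value)
    declared[name] = true
    rawset(env, name, value)
  end

  setmetatable(env, {
    __index = base,
    __newindex = function(t, name, value)
      if not declared[name] then
        error("assignment to undeclared global '" .. tostring(name) .. "'", 2)
      end
      rawset(t, name, value)
    end,
  })
  return env, declare
end

local env, declare = strict_env(_G)
declare("counter", 0)
env.counter = env.counter + 1                          -- declared: allowed
local ok, err = pcall(function() env.coutner = 5 end)  -- typo: rejected
```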

-- Roberto

Re: Globals (more ruminations)

Edgar Toernig
Roberto Ierusalimschy wrote:
>
> I still think the main problem is non-intended assignment to globals,
> which creates hard-to-find bugs. The simple policy of requiring a global
> declaration to assign to globals would force programmers to consider
> whether they really want a global variable.

You would also get that with a $-prefix for globals.  And
as it looks so ugly, people would use locals instead ;-)
At least, every global access stands out.

But however this is addressed, people will realize that accesses
to table(record/structure)-fields have *exactly* the same problem
and will start trying to "fix" that ...

ET

Re: Globals (more ruminations)

Florian Weimer
* Edgar Toernig:

> But however this is addressed, people will realize that accesses
> to table(record/structure)-fields have *exactly* the same problem
> and will start trying to "fix" that ...

That's why I was wondering about the http.socket issue.  As long as
modules behave like plain tables, it's possible to create global state
in them.  I had the impression that this is what happened in the nmap
case, so I asked about that earlier.

It seems possible to solve this, by making sure that the compiler
knows the intended export list of modules.  The downside is that you
need to add some sort of phase distinction, like EVAL-WHEN in Common
Lisp and BEGIN blocks in Perl, or you lose expressiveness in a
significant way (you cannot programmatically create globals anymore).
It turns out to be quite difficult to describe the semantics of this
language facility (looking at some semi-formal and informal
descriptions of EVAL-WHEN is quite instructive), and I fear that it
would be by far the most complex feature once it's added to Lua.

The assignment part of the problem could be dealt with by using
metatables.  But I wonder if a function for sealing tables would be
more helpful.  It would set a read-only flag on the table and return a
setter function which bypasses the read-only flag.  The read-only flag
would only need checking during table writes, and the check would be
virtually free because it never fails on properly written code.  This
separates read and write capability and might be useful for
sandboxing, too.
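
The sealing function can be approximated today with a proxy table, at the cost of an extra indirection on reads. This sketch is not the built-in read-only flag proposed above, and `rawset` on the proxy can still get around it; the name `seal` and the sample fields are made up.

```lua
-- Sketch only: returns a read-only view of `t` plus a setter that
-- bypasses the protection.
local function seal(t)
  local proxy = setmetatable({}, {
    __index = t,
    __newindex = function(_, key)
      error("attempt to write '" .. tostring(key) .. "' on a sealed table", 2)
    end,
    __metatable = "sealed",   -- blocks getmetatable/setmetatable tampering
  })
  local function set(key, value)
    t[key] = value
  end
  return proxy, set
end

-- The module keeps `set_http` private and hands out only `http`.
local http, set_http = seal({ timeout = 30 })

local ok = pcall(function() http.socket = "oops" end)  -- rejected
set_http("timeout", 60)                                 -- allowed
```

Holding the setter is what grants write access, so read and write capability can be passed around separately.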

Re: Globals (more ruminations)

Patrick Donnelly
In reply to this post by Florian Weimer
On Sun, Jul 11, 2010 at 10:20 AM, Florian Weimer <[hidden email]> wrote:

> * Patrick Donnelly:
>
>> This has been an insidious problem for us and I've had to write
>> custom scripts to look for these accidental global accesses. It
>> would be *superb* if there were some sort of compile time solution
>> to this.
>
> Do you think it would be an issue if scripts still could create and
> update a variable such as http.socket?  That would be a non-global
> access at the bytecode level, but it still would introduce coupling.

We don't look at global accesses in scripts because we give each
script its own global table. That doesn't stop them from creating an
http.socket variable as you point out. Script writers haven't done
this though and I see no reason to worry about it. Our documentation
has established places to put persistent or shared variables (a table
in the nmap module). We haven't had problems with them not using it.

> Should modules be read-only from a user's point of view?

Personally I feel scripts shouldn't be able to modify a module's
table; however, again, we haven't had problems with scripts doing this
so no reason to make a change.


Usually the problem we see is a library writes to a global (and then
later reads it) that should be local. I agree with Roberto that
unintended global writes are the problem.

--
- Patrick Donnelly

Re: Globals (more ruminations)

Roberto Ierusalimschy
In reply to this post by Edgar Toernig
> But however this is addressed, people will realize that accesses
> to table(record/structure)-fields have *exactly* the same problem
> and will start trying to "fix" that ...

This is why I clearly stated my goal: I do not want to solve the whole
"global is bad" dilemma. I only want to solve a very specific problem of
unintended use of globals. When you write "t.x" (or even "_ENV.x"), it
is quite clear what you are doing.

Lua is a dynamic language with dynamic typing, and we do not intend to
change that.

-- Roberto

Re: Globals (more ruminations)

Geoff Leyland
In reply to this post by Roberto Ierusalimschy
On 12/07/2010, at 5:14 AM, Roberto Ierusalimschy wrote:

> I still think the main problem is non-intended assignment to globals,
> which creates hard-to-find bugs. The simple policy of requiring a global
> declaration to assign to globals would force programmers to consider
> whether they really want a global variable.

One possible downside of this is that this very simple Lua program:

a = "Hello World"
print(a)

becomes:

global a = "Hello World"
print(a)

(or "local a...").  Does this make Lua harder to learn?

Geoff

Re: Globals (more ruminations)

Casey Hawthorne
Quoting Geoff Leyland <[hidden email]>:

> On 12/07/2010, at 5:14 AM, Roberto Ierusalimschy wrote:
>
>> I still think the main problem is non-intended assignment to globals,
>> which creates hard-to-find bugs. The simple policy of requiring a global
>> declaration to assign to globals would force programmers to consider
>> whether they really want a global variable.
>
> One possible downside of this is that this very simple Lua program:
>
> a = "Hello World"
> print(a)
>
> becomes:
>
> global a = "Hello World"
> print(a)
>
> (or "local a...").  Does this make Lua harder to learn?
>
> Geoff
>

Arrrrrrrgh!

I thought ONLY within a function one would have to use the "global"
prefix and not use the prefix at the top level.



Re: Globals (more ruminations)

Roberto Ierusalimschy
In reply to this post by Geoff Leyland
> One possible downside of this is that this very simple Lua program:
>
> a = "Hello World"
> print(a)
>
> becomes:
>
> global a = "Hello World"
> print(a)

There should be some toggle to control this. A simple one is the
"global" declaration itself. When a chunk uses a global declaration
it switches on the control.

(You may start your chunk with "global NO, THANKS" ;)

-- Roberto