Refactor a block of Lua code to a function


Refactor a block of Lua code to a function

Abhijit Nandy
Hi,

My Lua functions often exceed 500 lines, and I then need to refactor blocks inside if statements and loops into separate functions. I use MSVC a bit, and it has a plugin called Visual Assist which can convert a given chunk of C/C++ code into a function with the proper required arguments. It's very useful in a hurry :)

Has anyone tried something similar for Lua? Basically I would want to give it a block of code and it would need to deduce the required arguments for the function and give me back the code string as "function (args...) ... end".

I am going to try it today with regexes and maybe ltokenp, but wanted to check if someone has already tried it.

A simple converter could perhaps detect patterns like "<0 or more spaces><1 or more valid variable name chars><Lua delimiter like '(' or ',' >" and then put the variable names into the argument list if they are not present in _G. Reserved words skipped.
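A minimal sketch of that idea in plain Lua patterns, assuming the block uses only "--" line comments and short strings without escaped quotes. The names `keywords` and `candidate_args` are made up for illustration. It also reports names the block assigns to, and it does not understand function parameters or table constructor keys, so treat the result as a starting list to review by hand.

```lua
-- Reserved words (Lua 5.1-5.4) are never argument candidates.
local keywords = {}
for kw in ("and break do else elseif end false for function goto if in " ..
           "local nil not or repeat return then true until while"):gmatch("%a+") do
  keywords[kw] = true
end

-- Hypothetical helper: guess which names a block needs as arguments.
local function candidate_args(block)
  -- Crude cleanup: drop "--" line comments, then blank short strings.
  -- Long brackets and escaped quotes are NOT handled.
  local src = block:gsub("%-%-[^\n]*", "")
  src = src:gsub('"[^"\n]*"', '""'):gsub("'[^'\n]*'", "''")

  -- Names introduced by `local` or by a `for` header belong to the block.
  local declared = {}
  local function declare(list)
    for name in list:gmatch("[%a_][%w_]*") do declared[name] = true end
  end
  for list in src:gmatch("%f[%w_]local%s+([%a_][%w_, \t]*)") do declare(list) end
  for name in src:gmatch("%f[%w_]for%s+([%a_][%w_]*)%s*=") do declare(name) end
  for list in src:gmatch("%f[%w_]for%s+([%a_][%w_, \t]-)%s+in%f[^%w_]") do declare(list) end

  local args, seen = {}, {}
  for pos, name in src:gmatch("()%f[%w_]([%a_][%w_]*)") do
    -- Text run just before the name: "t." or "obj:" marks a field or method.
    local before = src:sub(1, pos - 1):match("(%S*)%s*$")
    local is_field = before == "." or before:match("[^.]%.$") or before:match(":$")
    if not (is_field or keywords[name] or declared[name] or seen[name]
            or rawget(_G, name) ~= nil) then
      seen[name] = true
      args[#args + 1] = name
    end
  end
  return args
end

local args = candidate_args([[
local total = 0
for i = 1, #items do
  total = total + items[i].price * rate -- apply rate
end
print("total", total)
]])
-- args is { "items", "rate" }
```

The wrapper would then be "function (" .. table.concat(args, ", ") .. ") ... end".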

Not sure yet which part of the lexer (llex.c) I could perhaps adapt to do this.

Thanks,
Abhi

Re: Refactor a block of Lua code to a function

Sergey Zakharchenko

Abhijit,

> Basically I would want to give it a block of code and it would need to deduce the required arguments for the function and give me back the code string as "function (args...) ... end".

The simplest starting point, implementable with stock tools, that comes to my (5.1) mind is putting that chunk of code into a separate file and running a global reference detector on it (luac -l -p <file> | grep '[GS]ETGLOBAL'; the -l is needed to get a listing at all). Now, Lua 5.2+ has no such ops...

Best regards,

--
DoubleF


Re: Refactor a block of Lua code to a function

Egor Skriptunoff-2
On Thu, Jun 27, 2019 at 8:23 AM Sergey Zakharchenko wrote:
> putting that chunk of code into a separate file and running a global reference detector on it (luac -l -p <file> | grep '[GS]ETGLOBAL'). Now, Lua 5.2+ has no such ops...


That's why one has to write one's own global reference detector


Re: Refactor a block of Lua code to a function

Sergey Zakharchenko

Egor,

> That's why one has to write one's own global reference detector

Or one greps for '[GS]ETTABUP.*; _ENV ', and saves oneself the need to maintain a(nother) 1.5k line script. Alternatively, one writes a script that does a lot more than that;)
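The same filtering can be sketched in Lua itself, assuming the listing text comes from "luac -l -p". The function name `globals_from_listing` is made up, and the two line formats in the comments are what luac prints as far as I can tell, so check them against your own luac before relying on this.

```lua
-- Sketch: pull global names out of the text printed by `luac -l -p <file>`.
-- Returns two sets: globals that are read and globals that are written.
local function globals_from_listing(listing)
  local reads, writes = {}, {}
  for line in listing:gmatch("[^\n]+") do
    -- Lua 5.1 style (assumed):  GETGLOBAL 0 -1 ; print
    local op, name = line:match("([GS])ETGLOBAL%s.-;%s*([%a_][%w_]*)")
    if not op then
      -- Lua 5.2+ style (assumed): GETTABUP 0 0 -1 ; _ENV "print"
      op, name = line:match('([GS])ETTABUP%s.-;%s*_ENV%s+"([%a_][%w_]*)"')
    end
    if op == "G" then
      reads[name] = true
    elseif op == "S" then
      writes[name] = true
    end
  end
  return reads, writes
end
```

The listing could be captured with io.popen("luac -l -p " .. path), assuming luac is on the PATH. Names that are read but are neither written nor standard library names are the likely arguments of the extracted function.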

Best regards,

--
DoubleF


Re: Refactor a block of Lua code to a function

szbnwer@gmail.com
hi there! :)

[continuing the offtopic part only]

Egor Skriptunoff:
> That's why one has to write one's own global reference detector

so true! :D however i only made mine when something related came up on
lua-l recently... its very useful! actually the 1st error ive found in
my codes with it was right in itself :D

Sergey Zakharchenko:
> saves oneself the need to maintain a(nother) 1.5k line script

lol, i also got tired when i reached the end of it, but i just skimmed
some more complicated parts of it :D


mine is 83loc with 17 empty and 13 comment lines (those are not all the
comments that it has) so 113 in total. however, only for pretty
printing, there is an inner function with 41 significant lines that i
will refactor, cuz i have a new use case for a part of it. (Abhijit
Nandy, no luck here, im doing such tasks with my bare hands. :D ) this
inner function uses an external string escaper, a `keywords` table;
and the "main" function uses a `table.getSortedKeys()` function and an
output buffer factory for the new, changed and deleted globals, but it
can be used for any other table as well :D
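A rough sketch of what such a detector boils down to, without the pretty printing. All names here (`snapshot`, `sorted_keys`, `diff`) are made up for illustration and are not taken from the script described above; the comparison is shallow, so a table that is modified in place does not count as changed.

```lua
-- Shallow copy of a table, taken before running the code under test.
local function snapshot(t)
  local copy = {}
  for k, v in pairs(t) do copy[k] = v end
  return copy
end

-- Turn a set of keys into a sorted list of strings for stable output.
local function sorted_keys(set)
  local keys = {}
  for k in pairs(set) do keys[#keys + 1] = tostring(k) end
  table.sort(keys)
  return keys
end

-- Compare two snapshots: which keys are new, changed or deleted.
local function diff(before, after)
  local added, changed, deleted = {}, {}, {}
  for k, v in pairs(after) do
    if before[k] == nil then
      added[k] = true
    elseif before[k] ~= v then
      changed[k] = true
    end
  end
  for k in pairs(before) do
    if after[k] == nil then deleted[k] = true end
  end
  return sorted_keys(added), sorted_keys(changed), sorted_keys(deleted)
end
```

Typical use would be `local before = snapshot(_G)`, then running the code, then `diff(before, _G)`.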

it also has some glue for my app for initializing it before i would
do anything, so i can see more, but thats not required; some to have
buttons for enable/disable it and for enable/disable showing the
changes for same types; and some more for making a "hole" into my main
output buffer (just a matter of inserting an empty string into a
table) run whatever, run this, and put the results into that "hole".
:D but these are kinda tiny stuffs (25loc in total :D )

(11th commandment: “Thou shall not litter.” rip TAD :( )

otherwise mine wont tell where the globals came from, it is a bit
tangled with my app, and it wont do a whole static analysis, that is
actually good to have, even for modifying it for other purposes, so
its not really a fair comparison... so no offense, Egor, its somewhat
of an apple vs pear comparison! :)

btw some side notes: its invisible if there is nothing to say. the
trim length can be set. it looks a bit weird on the screenshot around
`'\015'`, cuz it wont cut the middle of an escape sequence. the square
brackets show the previous value, but they are not there if things
changed within the same type. and finally, it was like 13loc (iirc)
when it already worked, and that was very easy, it became a "beast"
when i prettified it, and i still dont have much idea about some
trivial math around the trim, but the tests prove that its right...
X'D


(the file was too big for lua-l)
https://www.dropbox.com/s/acpk3fqub9wn5af/global%20changes.png


bests to all! :)


Re: Refactor a block of Lua code to a function

Philippe Verdy
In reply to this post by Abhijit Nandy
Beware of patterns if you have not yet canonicalized or pretty-printed your Lua code.

Also beware of comments (including multi-line comments that start with long brackets), whose content you should not alter (you may still pretty-print their opening and closing tags).
Finally, beware of multi-line strings, which also start with long brackets and which should be converted to concatenated strings (with '..'), with an explicit '\n' escape at each newline.

The alternative is to first prefix every line inside these multi-line comments or multi-line strings with a fixed character that cannot start a pretty-printed line of Lua source code. For example, you may use a '!': it won't complicate your search patterns, which can simply skip any line matching the '^!' pattern.
After your edits, remove every "^!" from the modified pretty-printed code.
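The marking step could be sketched like this, assuming that every "[[" or "[=[" in the source really opens a long string or long comment (an opener inside a short string or after a "--" line comment would fool it). The names `protect` and `unprotect` are made up.

```lua
-- Prefix every continuation line inside long brackets with "!".
local function protect(src)
  local out, pos = {}, 1
  while true do
    -- Find the next opener: "[", any number of "=", "[".
    local s, e, eqs = src:find("%[(=*)%[", pos)
    if not s then break end
    -- The matching closer has the same number of "=" signs.
    local cs, ce = src:find("]" .. eqs .. "]", e + 1, true)
    if not cs then break end                  -- unterminated: leave the rest alone
    out[#out + 1] = src:sub(pos, e)           -- text up to and including the opener
    out[#out + 1] = (src:sub(e + 1, ce):gsub("\n", "\n!"))
    pos = ce + 1
  end
  out[#out + 1] = src:sub(pos)
  return table.concat(out)
end

-- Undo the marking: drop one "!" at the start of each marked line.
local function unprotect(src)
  return (src:gsub("\n!", "\n"))
end
```

Line-based patterns can then ignore every line that starts with "!", and `unprotect` restores the original text afterwards.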

You may also want to canonicalize the semicolons at the end of every statement in the pretty-printed code, after parsing it to know where each statement really ends. This helps resolve the ambiguous newlines that arise when a function call whose result is not assigned sits on the line after an assignment or a return statement and starts with an opening parenthesis: the two lines are actually one statement and there is no implied ';'. The pretty-printed code should merge these two lines and instead split after the opening parenthesis of the function call.
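A small demonstration of that newline ambiguity, compiled from a string so that both outcomes can be observed. As far as I know, Lua 5.1 rejects such code with an "ambiguous syntax" error, while 5.2 and later silently read the two lines as one statement; the variable names are only for illustration.

```lua
local src = [[
local function f(x)
  return function() return "called with " .. x end
end
local a = f
("hello")()
return a
]]

local compile = loadstring or load   -- loadstring in 5.1, load in 5.2+
local chunk, err = compile(src)

local result
if chunk then
  -- 5.2+: parsed as  local a = f("hello")()
  result = chunk()
end
-- Either `result` holds the value of the merged call,
-- or `err` holds the compile error reported by 5.1.
```

Writing `local a = f;` with an explicit semicolon removes the ambiguity in every version.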

Once you've done this pretty-printing (and the temporary modification of multi-line strings or comments), you can apply your patterns to perform code detection and substitution.

But your main difficulty is to determine the scope of identifiers: are they local? Are they parameters? Were parameters hidden in the same scope by a local redeclaration of the same name for a distinct variable? For all that you need a syntax analysis that can determine the start and end of each scope; then you can build a true mapping of the variables used by functions, including variables that belong to an outer closure (upvalues), accessible to that function but not really "global".
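A short illustration of why the name alone does not tell you which variable is meant (the identifiers are invented for the example):

```lua
local counter = 0                  -- chunk-level local: an upvalue below, not a global

local function bump(step)          -- `step` starts out as a parameter
  local before = step              -- still the parameter
  local step = (step or 1) * 10    -- a new local that hides the parameter from here on
  counter = counter + step         -- resolves to the outer local `counter`
  return before, step
end

local before, after = bump(2)
-- before is 2, after is 20, counter is 20, and _G.counter was never created
```

A pattern-based tool sees `step` three times and `counter` twice, but only scope analysis shows that the `step` occurrences name two different variables and that `counter` would have to be passed in (or returned) if `bump` were moved elsewhere.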

You may also want to canonicalize the function calls written without parentheses by adding explicit parentheses around their argument. Lua allows this shorthand only when the single argument is a string literal or a table constructor (f "x", f [[x]], f {x}), and such calls chain: f "a" "b" means f("a")("b"), not f("a"); f("b"), even if there are newlines or comments in between. Anything else needs the parentheses: "f -x" is a binary subtraction between f and x, not the function call "f(-x)".
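For reference, a small sketch of the call forms Lua accepts without parentheses (a string literal or a table constructor as the only argument) and of how they chain across lines. `f` and `log` are invented names.

```lua
local log = {}
local function f(arg)
  log[#log + 1] = type(arg)   -- record what each call received
  return f                    -- return f so that calls can be chained
end

f "a"                         -- same as f("a")
f { 1, 2 }                    -- same as f({ 1, 2 })

f "b"
"c"                           -- still one statement: f("b")("c")

local n, x = 10, 3
local r = n -x                -- binary minus: 10 - 3, never a call n(-x)
-- log is { "string", "table", "string", "string" } and r is 7
```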

So some syntactic sugar in Lua (notably calls without parentheses) can be a nightmare for detection by patterns.

If you find no parameter and no local variable of that name in your file, then the occurrence of the name is
- "global" (i.e. looked up in the environment of whatever code loads that file, something that should not be relied on),
- or, more exactly, a field of the environment that the file-level chunk runs in (each file loaded by "require" is compiled as its own function). This includes the "_G" name itself: it is only a field of that environment, so it may not refer to the same table in every context where your file is "require()"d, and it could have been reassigned by the code.

You cannot really determine the scope of named members (object.member or object['member']) of any parent object, including the _G parent, without running the code: _G or the parent object may have been reassigned elsewhere (by some function called inside your module file but defined externally), so you cannot assume that they refer to the same object or value.
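A sketch of how a "global" write and a lookup through _G can diverge once a chunk runs in a different environment. The names `env`, `probe` and `via_G` are invented; the branch on `setfenv` is only there to cover both Lua 5.1 and 5.2+.

```lua
-- Environment that falls back to the real globals for reads.
local env = setmetatable({}, { __index = _G })
local src = "probe = 1; return _G.probe"

local chunk
if setfenv then                          -- Lua 5.1 / LuaJIT
  chunk = setfenv(loadstring(src), env)
else                                     -- Lua 5.2 and later
  chunk = load(src, "demo", "t", env)
end

local via_G = chunk()
-- The assignment landed in env, so env.probe is 1,
-- while _G.probe (reached through __index) is still nil.
```

Static analysis of the chunk text alone cannot tell these two cases apart, which is the point made above.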

So the best you can do is to use the regular Lua syntactic parser to get a feed of tokens, where the tokens for identifiers carry properties recording their scope (start/end file positions, the set of variables in parent closures, and the ordered list of parent closures).

External variables that are not defined in any locally defined scope are in some parent scope (but not necessarily the same scope as the "_G" defined in that parent scope).

Lexical scopes are not just created by function definitions: they also exist in "for" loop statements, and each new declaration of a local creates a derived scope, distinct from the first scope of the function definition or from the one at the start of the whole module file (it is also legal to redeclare the same name to create a new variable that hides the previous one).
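Both effects can be seen in a few lines (identifiers invented for the example):

```lua
local getters = {}
for i = 1, 3 do
  getters[i] = function() return i end   -- each iteration has its own `i`
end

local x = "first"
local function get_first() return x end  -- captures the first `x`
local x = "second"                       -- a new variable; the first one still exists

-- getters[1]() is 1, getters[3]() is 3,
-- get_first() is "first" while x is "second"
```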

Finally, there's the difficulty of functions defined with the syntax "function name() ... end" rather than "name = function() ... end". For a global name the two are equivalent, but for locals they differ: "local function name() ... end" puts the name in scope inside the function's own body (so it can call itself), whereas in "local name = function() ... end" the name only becomes visible after the statement, so a use of it inside the body refers to an outer variable or a global. Neither form hoists the definition: the name has its new meaning only from the definition onward. The lexical scope of a local ends at the "end" of the enclosing block or function definition, or where another local declaration with the same name hides it.
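A sketch of the difference between the two local forms, assuming no global named `fact2` exists (both function names are invented):

```lua
-- `local function` form: `fact` is in scope inside its own body.
local function fact(n)
  return n <= 1 and 1 or n * fact(n - 1)
end

-- Assignment form: inside the body, `fact2` is not the new local yet,
-- so the recursive call looks up a global `fact2`, which is nil.
local ok = pcall(function()
  local fact2 = function(n)
    return n <= 1 and 1 or n * fact2(n - 1)
  end
  return fact2(3)
end)
-- fact(5) is 120; ok is false because the inner call hits a nil value
```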

For this reason, using patterns to transform code is very unsafe.





Re: Refactor a block of Lua code to a function

Tim Hill
In reply to this post by Abhijit Nandy


> On Jun 26, 2019, at 9:44 PM, Abhijit Nandy <[hidden email]> wrote:
>
> Hi,
>
> My Lua functions often exceed 500 lines, and I then need to refactor blocks inside if statements and loops into separate functions. I use MSVC a bit, and it has a plugin called Visual Assist which can convert a given chunk of C/C++ code into a function with the proper required arguments. It's very useful in a hurry :)

I’m always worried when people say “my function is X lines long and I need to refactor it…”. Yes, many times a long function MAY be better when broken up, but the “function police” who claim such a thing is ALWAYS true are talking nonsense imho. Some functions are complex and inherently must be long.

To my mind, refactoring is about CLARITY, and breaking up large functions is only one way to do this (and not the best, either, since in languages that lack lexical nesting it may expose internal helper functions to abuse). Other ways to clarify long functions are to carefully control function-wide state vs local “working” state, and to refactor deeply nested loops into single-layer loops with state machines.

Knowing when to break up a function falls into the “art” side of good software development; doing so automatically based on line count has always seemed dubious at best to me.

—Tim



Re: Refactor a block of Lua code to a function

Jim-2
28.06.2019, 02:44, "Tim Hill" <[hidden email]>:
> Knowing when to break up a function falls into the “art”
> side of good software development

indeed. i am also not very good at it, but i guess better
refactoring into functions/procedures comes with more experience.
stack based languages "enforce" such an approach in order
to keep "words" short and easier to follow and understand.

but having large functions/procedures at the beginning seems
quite normal when following the procedural approach of structured
programming.

i wonder how these outdated procedural debates fit in the world
of OO ? isn't such an outdated paradigm more the approach
only old grumpy luddites would choose voluntarily ?



Re: Refactor a block of Lua code to a function

Jim-2
In reply to this post by Tim Hill
28.06.2019, 02:44, "Tim Hill" <[hidden email]>:
> doing so automatically based on line count has always
> seemed dubious at best to me.

it is not solely based on line count i guess.
i think that M$ compiler plugin follows the flow in the function/
procedure and factors only certain code out, eg
alternatives in if/then/else, loops etc.

sounds like a good idea for C code to me.



Re: Refactor a block of Lua code to a function

szbnwer@gmail.com
In reply to this post by Tim Hill
Tim Hill:

such pleasant words!!! :D


tl;dr: read the next paragraph, check out Wirth's law, and basically
follow "yagni" and "dry" instead of "clean code". :D


my rule-of-thumb is like:
dont use an extra variable or function for anything that wont be used
for more than once (if possible) except if it really makes sense to
have a freestanding setting/tool, that is future-wise, or there is any
strong-enough reason to do so.


everyone knows Moore's law, they say, that its fine to pull some trash
here and some more there, cuz the customer wants it for yesterday, and
business says that this is the right way to go, and if someone else
will need to use and maintain it, then its fine for them to give trash
and say that it "works"... yuck. did u guys hear about Wirth's law? so
many ppl around the industry didnt, or just dont care about it... :/
the industry forces the right opposite of the golden rule (“Whatever
you wish that others would do to you, do also to them, for this is the
Law and the Prophets.”) we are using tools on a daily basis that were
made with just that mentality, therefore programmers can earn this
much, cuz its a real madness! :D and thats my main reason for using
lua!!! :D

ppl use variables just for giving sense to some magic, like a search
box that is triggered by hitting <enter>, so that is checking for the
charcode if it is 13, so they put it in a variable called 'enter' to
make it human-readable, but i prefer a comment on the end of the line,
that its an <enter>... commenting is the right thing to make things
easy-to-understand for humans. (but for sure, thats suitable for
compiled languages where that wont become overhead, but not a good
practice for scripting.) otoh, if i would need more charcodes then i
would most likely make a hash table for them.

ppl use everywhere set-/getters just because they are afraid of
refactorization, so they make everything bloated, and to be
blackboxes... tracking down all these extra abstraction layers is a
total madness, it can easily give a mental overflow, and when i go for
the essence of such things (like a google+ login where they give a
bloated lib instead of an api specification) then finally ive got a
small bunch of neat lines, instead of megs of bloated mess... they do
this cuz they think its for ease of coding, but they forget about that
they have to understand it as well, so it also becomes error prone,
that is a total madness... too much abstraction is wrong for our
health, mmmkay? :D and also, such common ways generally makes every
little essential tool self-contained in every bloat, so if u have a
nodejs app, for example, then most likely u have a s#!tload
of tools included, that are basically the same, but they are
everywhere... :D

also, its fine to use jquery everywhere only for its `XMLHttpRequest`,
that can be written like in 20loc to be a freestanding function with
all the goodies included; and its fine to use bootstrap only for its
responsivity that is a matter of using `float: left;` and `@media`...


all the bests! :)


Re: Refactor a block of Lua code to a function

Luiz Henrique de Figueiredo
In reply to this post by Sergey Zakharchenko
> The simplest starting point, implementable with stock tools, that comes to my (5.1) mind is putting that chunk of code into a separate file and running a global reference detector on it (luac -l -p <file> | grep '[GS]ETGLOBAL'). Now, Lua 5.2+ has no such ops...

For a tool, see http://lua-users.org/lists/lua-l/2012-12/msg00397.html
and the thread starting at
http://lua-users.org/lists/lua-l/2013-01/msg00470.html