Incorrect optimization of function isinstack

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

Incorrect optimization of function isinstack

Yongheng Chen

Hi,

In https://github.com/lua/lua/blob/4e47f81188d37e29027158b76271d02a781242e2/ldebug.c#L645, the check for isinstack is

`return (0 <= i && i < (ci->top - base) && s2v(base + i) == o);`

However, I found that the default build (not modify anything in the makefile but only add `-g` to CFLAGS) of lua optimized away the final check (
s2v(base + i) == o).

The following is the asm of function varinfo (isinstack gets inlined in this function) in the binary built by GCC 8.3.0:

.text:000000000000AEE5                 add     rdi, 10h
.text:000000000000AEE9                 mov     rdx, o
.text:000000000000AEEC                 sub     rdx, rdi
.text:000000000000AEEF                 js      short loc_AF30
.text:000000000000AEF1                 cmp     o, [ci+8]
.text:000000000000AEF5                 jnb     short loc_AF30
.text:000000000000AEF7                 add     rdi, rdx
.text:000000000000AEFA                 cmp     o, rdi
.text:000000000000AEFD                 jnz     short loc_AF30
.text:000000000000AEFF                 mov     rdi, [r8+18h]   ; p
.text:000000000000AF03                 mov     o, [ci+20h]
.text:000000000000AF07                 sar     rdx, 4          ; reg
.text:000000000000AF0B                 sub     rsi, [rdi+40h]
.text:000000000000AF0F                 sar     rsi, 2
.text:000000000000AF13                 lea     rcx, [rsp+18h+name] ; name
.text:000000000000AF18                 dec     esi             ; lastpc
.text:000000000000AF1A                 call    getobjname

The instruction at 0xAEEF corresponds to i>=0, the instruction at 0x
AEF5 refers to i < (ci->top - base). However, the third check is always true:


.text:000000000000AEE9                 mov     rdx, o
.text:000000000000AEEC                 sub     rdx, rdi

.text:000000000000AEF7                 add     rdi, rdx
.text:000000000000AEFA                 cmp     o, rdi

Which means s2v(base + i) == o is constant-folded into a True. I also tried compiling with clang-7.0.1 and clang-9.0.1, which brought me the same result.

I am not sure whether this is intended or will cause any problem.

Best,

Yongheng

Reply | Threaded
Open this post in threaded view
|

Re: Incorrect optimization of function isinstack

Joseph C. Sible
When would that condition ever be false? s2v turns a pointer to a union into a pointer to its member, so isn't it basically just a cast? And i is just (o - base) modulo a cast, so that condition looks a lot like "base + o - base == o" to me.

Joseph C. Sible


On Sun, Feb 21, 2021 at 9:19 PM Yongheng Chen <[hidden email]> wrote:

>
> Hi,
>
> In https://github.com/lua/lua/blob/4e47f81188d37e29027158b76271d02a781242e2/ldebug.c#L645, the check for isinstack is
>
> `return (0 <= i && i < (ci->top - base) && s2v(base + i) == o);`
>
> However, I found that the default build (not modify anything in the makefile but only add `-g` to CFLAGS) of lua optimized away the final check (
> s2v(base + i) == o).
>
> The following is the asm of function varinfo (isinstack gets inlined in this function) in the binary built by GCC 8.3.0:
>
> .text:000000000000AEE5                 add     rdi, 10h
> .text:000000000000AEE9                 mov     rdx, o
> .text:000000000000AEEC                 sub     rdx, rdi
> .text:000000000000AEEF                 js      short loc_AF30
> .text:000000000000AEF1                 cmp     o, [ci+8]
> .text:000000000000AEF5                 jnb     short loc_AF30
> .text:000000000000AEF7                 add     rdi, rdx
> .text:000000000000AEFA                 cmp     o, rdi
> .text:000000000000AEFD                 jnz     short loc_AF30
> .text:000000000000AEFF                 mov     rdi, [r8+18h]   ; p
> .text:000000000000AF03                 mov     o, [ci+20h]
> .text:000000000000AF07                 sar     rdx, 4          ; reg
> .text:000000000000AF0B                 sub     rsi, [rdi+40h]
> .text:000000000000AF0F                 sar     rsi, 2
> .text:000000000000AF13                 lea     rcx, [rsp+18h+name] ; name
> .text:000000000000AF18                 dec     esi             ; lastpc
> .text:000000000000AF1A                 call    getobjname
>
> The instruction at 0xAEEF corresponds to i>=0, the instruction at 0x
> AEF5 refers to i < (ci->top - base). However, the third check is always true:
>
>
> .text:000000000000AEE9                 mov     rdx, o
> .text:000000000000AEEC                 sub     rdx, rdi
> .text:000000000000AEF7                 add     rdi, rdx
> .text:000000000000AEFA                 cmp     o, rdi
>
> Which means s2v(base + i) == o is constant-folded into a True. I also tried compiling with clang-7.0.1 and clang-9.0.1, which brought me the same result.
>
> I am not sure whether this is intended or will cause any problem.
>
> Best,
>
> Yongheng
Reply | Threaded
Open this post in threaded view
|

Re: Incorrect optimization of function isinstack

Yongheng Chen

> that condition looks a lot like "base + o - base == o" to me.

I think you are correct. Then this check seems useless in normal situation?

Yongheng


On 2/21/2021 11:21 PM, Joseph C. Sible wrote:
When would that condition ever be false? s2v turns a pointer to a union into a pointer to its member, so isn't it basically just a cast? And i is just (o - base) modulo a cast, so that condition looks a lot like "base + o - base == o" to me.

Joseph C. Sible


On Sun, Feb 21, 2021 at 9:19 PM Yongheng Chen <[hidden email]> wrote:
>
> Hi,
>
> In https://github.com/lua/lua/blob/4e47f81188d37e29027158b76271d02a781242e2/ldebug.c#L645, the check for isinstack is
>
> `return (0 <= i && i < (ci->top - base) && s2v(base + i) == o);`
>
> However, I found that the default build (not modify anything in the makefile but only add `-g` to CFLAGS) of lua optimized away the final check (
> s2v(base + i) == o).
>
> The following is the asm of function varinfo (isinstack gets inlined in this function) in the binary built by GCC 8.3.0:
>
> .text:000000000000AEE5                 add     rdi, 10h
> .text:000000000000AEE9                 mov     rdx, o
> .text:000000000000AEEC                 sub     rdx, rdi
> .text:000000000000AEEF                 js      short loc_AF30
> .text:000000000000AEF1                 cmp     o, [ci+8]
> .text:000000000000AEF5                 jnb     short loc_AF30
> .text:000000000000AEF7                 add     rdi, rdx
> .text:000000000000AEFA                 cmp     o, rdi
> .text:000000000000AEFD                 jnz     short loc_AF30
> .text:000000000000AEFF                 mov     rdi, [r8+18h]   ; p
> .text:000000000000AF03                 mov     o, [ci+20h]
> .text:000000000000AF07                 sar     rdx, 4          ; reg
> .text:000000000000AF0B                 sub     rsi, [rdi+40h]
> .text:000000000000AF0F                 sar     rsi, 2
> .text:000000000000AF13                 lea     rcx, [rsp+18h+name] ; name
> .text:000000000000AF18                 dec     esi             ; lastpc
> .text:000000000000AF1A                 call    getobjname
>
> The instruction at 0xAEEF corresponds to i>=0, the instruction at 0x
> AEF5 refers to i < (ci->top - base). However, the third check is always true:
>
>
> .text:000000000000AEE9                 mov     rdx, o
> .text:000000000000AEEC                 sub     rdx, rdi
> .text:000000000000AEF7                 add     rdi, rdx
> .text:000000000000AEFA                 cmp     o, rdi
>
> Which means s2v(base + i) == o is constant-folded into a True. I also tried compiling with clang-7.0.1 and clang-9.0.1, which brought me the same result.
>
> I am not sure whether this is intended or will cause any problem.
>
> Best,
>
> Yongheng
Reply | Threaded
Open this post in threaded view
|

Re: Incorrect optimization of function isinstack

Andrew Gierth
>>>>> "Yongheng" == Yongheng Chen <[hidden email]> writes:

 >> that condition looks a lot like "base + o - base == o" to me.

 Yongheng> I think you are correct. Then this check seems useless in
 Yongheng> normal situation?

The check seems like a misguided attempt to make up for the potential
undefined behavior of the prior pointer subtraction. i.e. it supposes
that on some hypothetical architecture (perhaps a segmented one), the
subtraction of two unrelated pointers might produce a value that when
added back to "base" does not recreate the original value.

(In fact, just performing the subtraction in cases like that is
undefined behavior, so there is no guarantee at all about what happens.)

--
Andrew.
Reply | Threaded
Open this post in threaded view
|

Re: Incorrect optimization of function isinstack

Joseph C. Sible
To further emphasize your point: since subtracting unrelated pointers
is Undefined Behavior, even on such a hypothetical architecture,
compilers would still be allowed to perform this optimization. The
comment that "the subsequent checks are ISO C and ensure a correct
result" is wrong. If isinstack is ever called where o and base are
unrelated, the entire execution is undefined.

Joseph C. Sible


On Mon, Feb 22, 2021 at 12:45 AM Andrew Gierth
<[hidden email]> wrote:

>
> >>>>> "Yongheng" == Yongheng Chen <[hidden email]> writes:
>
>  >> that condition looks a lot like "base + o - base == o" to me.
>
>  Yongheng> I think you are correct. Then this check seems useless in
>  Yongheng> normal situation?
>
> The check seems like a misguided attempt to make up for the potential
> undefined behavior of the prior pointer subtraction. i.e. it supposes
> that on some hypothetical architecture (perhaps a segmented one), the
> subtraction of two unrelated pointers might produce a value that when
> added back to "base" does not recreate the original value.
>
> (In fact, just performing the subtraction in cases like that is
> undefined behavior, so there is no guarantee at all about what happens.)
>
> --
> Andrew.
Reply | Threaded
Open this post in threaded view
|

Re: Incorrect optimization of function isinstack

Roberto Ierusalimschy
> To further emphasize your point: since subtracting unrelated pointers
> is Undefined Behavior, even on such a hypothetical architecture,
> compilers would still be allowed to perform this optimization. The
> comment that "the subsequent checks are ISO C and ensure a correct
> result" is wrong. If isinstack is ever called where o and base are
> unrelated, the entire execution is undefined.

You are right :-(

One option to fix that would be to traverse the whole segment looking
for a pointer equality: Slow but correct.  The size would be the number
of registers in a Lua function. This is limited to 255, but usually is
much smaller.

-- Roberto