Pooling of strings is good


Re: Pooling of strings is good

Coda Highland
On Mon, Aug 25, 2014 at 3:30 AM, Coroutines <[hidden email]> wrote:

> On Sun, Aug 24, 2014 at 7:50 PM, Javier Guerra Giraldez
> <[hidden email]> wrote:
>> On Sun, Aug 24, 2014 at 9:25 PM, Sean Conner <[hidden email]> wrote:
>>>   Reusing a buffer was the cause of the recent Heartbleed bug that affected
>>> OpenSSL and just about everything that relied upon it.
>>
>>
>> I think it was more like failing to sanitize an input (http://xkcd.com/1354/)
>
> On a related note, I think because of Heartbleed we should never reuse
> buffers ever again.  Seems logical, right?  Can't trust programmers
> anyway, we're all just simple beings that can't figure out better...
>
> (I'm not mad at you, I think Sean is getting ridiculous -- Appeal to
> Fear and all that...)

The actual lesson in Heartbleed is "never assume anything is safe when
you're writing crypto." (The lesson in Apple's SSL bug was "don't
copy-paste code," which is much more broadly relevant.)

/s/ Adam


Re: Pooling of strings is good

Axel Kittenberger
In reply to this post by Coroutines
> On a related note, I think because of Heartbleed we should never reuse
> buffers ever again.  Seems logical, right?  Can't trust programmers
> anyway, we're all just simple beings that can't figure out better...

There are a few lessons one could learn from it. For example, if you already have a custom allocator in place in security-critical software, use memset() to zero out any newly allocated memory, which may have been left holding security-relevant information. This really costs only a few cycles. Or memset() to zero security-relevant memory before you free() it. Or -- and this plays into Lua's minimalism principle -- there is little reason to activate by default, for everybody, an obscure source tree that is used by only a few users and maintained and looked at by few developers. Or -- and this has also been argued by a few -- reconsider the decision to write security-critical stuff in C. With most high-level languages this bug could not have happened this way. Or -- there should be more code review in security-critical stuff. Or -- etc. etc.

Yes, we improve as coders by generalized learning. There are a few options for which way to go and develop. Yes, after ages of arguing about coding there is still not one "true" way, but denying every possible generalization gained from experience, as you did, is surely one wrong way.
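
The two memset() suggestions above can be sketched in plain C roughly as follows. The helper names secure_zero, secure_alloc and secure_free are made up for illustration; they are not taken from OpenSSL or Lua. One caveat: a plain memset() placed right before free() may be removed by an optimizing compiler as a dead store, so this sketch uses the common workaround of writing through a volatile pointer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Overwrite n bytes with zeros through a volatile pointer, so the
   stores are not treated as dead code ahead of a free(). */
static void secure_zero(void *p, size_t n)
{
    volatile unsigned char *vp = (volatile unsigned char *)p;
    while (n--)
        *vp++ = 0;
}

/* Hypothetical helper: hand out n bytes that are already zeroed,
   so nothing left over from a previous owner is visible. */
static void *secure_alloc(size_t n)
{
    return calloc(1, n);
}

/* Hypothetical helper: wipe n bytes, then release the block. */
static void secure_free(void *p, size_t n)
{
    if (p != NULL) {
        secure_zero(p, n);
        free(p);
    }
}
```

The caller has to pass the block size to secure_free() because standard C offers no portable way to ask the allocator for it; a custom allocator that tracks sizes could drop that parameter.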







Re: Pooling of strings is good

Coroutines
In reply to this post by Coda Highland
On Mon, Aug 25, 2014 at 5:15 AM, Coda Highland <[hidden email]> wrote:

> The actual lesson in Heartbleed is "never assume anything is safe when
> you're writing crypto." (The lesson in Apple's SSL bug was "don't
> copy-paste code," which is much more broadly relevant.)

Well I'm not writing crypto, guess I can pretend everything is safe.
Interface pl0x?


Re: Pooling of strings is good

Coroutines
In reply to this post by Axel Kittenberger
On Mon, Aug 25, 2014 at 6:38 AM, Axel Kittenberger <[hidden email]> wrote:

> There are a few lessons one could learn from it. For example, if you already
> have a custom allocator in place in security-critical software, use
> memset() to zero out any newly allocated memory, which may have been left
> holding security-relevant information.

I was doing this, now I just allocate a new buffer every time like an
idiot.  I use Lua because it allows me to prototype quickly -- if the
facility is there I love avoiding work in C.  Using userdata as
mutable buffers in place of immutable strings is not currently
possible/accessible.  I'm arguing for an interface to this, not their
default.

> Yes, we improve as coders by generalized learning. There are a few options
> for which way to go and develop. Yes, after ages of arguing about coding
> there is still not one "true" way, but denying every possible generalization
> gained from experience, as you did, is surely one wrong way.

I've been dealing with generalizations for 2-3 days now, in both this
thread and a few others.  There is a point where generalizing is just
preventative and exclusionary, not enlightening.  I find too much of
that on this list, and I'm sure you just jumped in here to offer a
tidbit of wisdom without reading the whole chain.  Honestly, most of
these replies sound like "never use goto!"  Someone good at their job
would recognize the time and place.


Re: Pooling of strings is good

Coda Highland
On Mon, Aug 25, 2014 at 7:28 AM, Coroutines <[hidden email]> wrote:
> Honestly, most of these replies sound like "never use goto!"

Ironic that you'd use this example considering this is exactly the
thing that caused Apple's SSL bug.

> Well I'm not writing crypto, guess I can pretend everything is safe.

C'mon, you should know that the inverse ("a => b" to "!a => !b") of a
true statement isn't logically assured to be true; it's the
contrapositive ("a => b" to "!b => !a"; in this case, "if it's not
safe, don't use it when writing crypto") that's logically guaranteed.

/s/ Adam
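
The inverse/contrapositive point in the message above can be checked mechanically with a four-row truth table. This is only a small sketch; the function names are invented for the illustration.

```c
#include <stdbool.h>

/* Material implication: "a => b" is false only when a is true and b is false. */
static bool implies(bool a, bool b)
{
    return !a || b;
}

/* True if the contrapositive (!b => !a) has the same truth value
   as (a => b) for every combination of a and b. */
static bool contrapositive_always_agrees(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            if (implies(a, b) != implies(!b, !a))
                return false;
    return true;
}

/* True if the inverse (!a => !b) has the same truth value
   as (a => b) for every combination of a and b. */
static bool inverse_always_agrees(void)
{
    for (int a = 0; a <= 1; a++)
        for (int b = 0; b <= 1; b++)
            if (implies(a, b) != implies(!a, !b))
                return false;
    return true;
}
```

The inverse fails on the row a = false, b = true: there "a => b" holds but "!a => !b" does not, which is the case "not writing crypto, yet still unsafe".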


Re: Pooling of strings is good

Coroutines
On Mon, Aug 25, 2014 at 7:52 AM, Coda Highland <[hidden email]> wrote:
> On Mon, Aug 25, 2014 at 7:28 AM, Coroutines <[hidden email]> wrote:
>> Honestly, most of these replies sound like "never use goto!"
>
> Ironic that you'd use this example considering this is exactly the
> thing that caused Apple's SSL bug.

Well I'm getting nothing out of this discussion now... there's
generalization and then there's common sense.  Do what the situation
requires :\

I don't think you're saying "never use goto" but if you are we've come
full circle.  I'm outtie.

>> Well I'm not writing crypto, guess I can pretend everything is safe.
>
> C'mon, you should know that the inverse ("a => b" to "!a => !b") of a
> true statement isn't logically assured to be true; it's the
> contrapositive ("a => b" to "!b => !a"; in this case, "if it's not
> safe, don't use it when writing crypto") that's logically guaranteed.

I do know this.


Re: Pooling of strings is good

Dirk Laurie-2
In reply to this post by Coroutines
2014-08-25 12:37 GMT+02:00 Coroutines <[hidden email]>:

> You can "rename" modules at runtime but not userdata internal
> typenames -- you would have to recompile whatever defines those types.

You can rename them.

The string called "tname" that is an argument to some functions in
the auxiliary library is merely a key in the registry associated with
the metatable of the userdata. You have access to the registry via

> R=debug.getregistry()

For example, you can shallowcopy R["FILE*"] to a new table, store
that as R.CLONE, change its __name field to "CLONE", and
debug.setmetatable(io.stdout,R.CLONE).

This is not a particularly useful thing to do. If you print io.type(io.stdout)
you get nil because it is not "FILE*". Calls to the io library do not work,
e.g.

> io.stdout:write"yyy\n"
... calling 'write' on bad self (FILE* expected, got CLONE)

What you can't change is C code that calls those functions from
the auxiliary library. They take const char* arguments. So everything
in the io library still creates and expects tname "FILE*". That's why
writing to stdout failed: "FILE*" is explicitly checked for.

But your spyware module that inspects the innards of someone
else's userdata will not use luaL_checkudata. It will simply use
lua_touserdata. No names, no pack drill.
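
The difference between a checked and an unchecked access, as described above, can be illustrated with a toy model in plain C. None of this is Lua's real code or data layout: the structs and the functions registry_get, check_udata and to_udata are stand-ins invented here, and the model assumes the check compares the userdata's metatable against whatever the registry holds under the given tname.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins, NOT Lua's real data structures. */
struct metatable { const char *name; };   /* plays the role of __name */
struct udata     { const struct metatable *mt; int payload; };
struct reg_entry { const char *tname; const struct metatable *mt; };

/* Look a type name up in the toy registry; NULL if it is absent. */
static const struct metatable *
registry_get(const struct reg_entry *reg, size_t n, const char *tname)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(reg[i].tname, tname) == 0)
            return reg[i].mt;
    return NULL;
}

/* Checked access: succeeds only if u's metatable is the very object
   registered under tname. The tname is hard-wired by the C caller. */
static int *check_udata(const struct reg_entry *reg, size_t n,
                        struct udata *u, const char *tname)
{
    const struct metatable *want = registry_get(reg, n, tname);
    return (want != NULL && u->mt == want) ? &u->payload : NULL;
}

/* Unchecked access: hands out the payload whatever the metatable is. */
static int *to_udata(struct udata *u)
{
    return &u->payload;
}
```

Swapping the metatable of a value from the "FILE*" entry to a "CLONE" entry makes check_udata(..., "FILE*") fail, while to_udata() keeps returning the same payload, which mirrors the io.stdout experiment.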


Re: Pooling of strings is good

Coda Highland
In reply to this post by Coroutines
On Mon, Aug 25, 2014 at 7:58 AM, Coroutines <[hidden email]> wrote:

> On Mon, Aug 25, 2014 at 7:52 AM, Coda Highland <[hidden email]> wrote:
>> On Mon, Aug 25, 2014 at 7:28 AM, Coroutines <[hidden email]> wrote:
>>> Honestly, most of these replies sound like "never use goto!"
>>
>> Ironic that you'd use this example considering this is exactly the
>> thing that caused Apple's SSL bug.
>
> Well I'm getting nothing out of this discussion now... there's
> generalization and then there's common sense.  Do what the situation
> requires :\
>
> I don't think you're saying "never use goto" but if you are we've come
> full circle.  I'm outtie.

I'm not saying it. I was just pointing out the irony. My intent was
humor, not derision. Apologies if it came across offensively.

/s/ Adam


Re: Pooling of strings is good

Philipp Janda
In reply to this post by Coroutines
On 25.08.2014 at 12:08, Coroutines wrote:

> On Sun, Aug 24, 2014 at 7:25 PM, Sean Conner <[hidden email]> wrote:
>
>>    The issue I have with this (and I can't speak for others) is that it
>> doesn't really *buy* you anything but pointless complexity.
>
> As I said in another post, looking at the contents of the io.stdout
> userdata was just an example.  Ideally I'd want to present a buffer I
> recv() into from my socket library to Lua to be used like a string --
> without becoming a string -- so I can avoid reallocating for that
> buffer over and over.

You will probably have to write your own buffer userdata for that. I
don't think such a userdata is useful enough for most people that it
should be included in Lua's standard library (compared to a `FILE*`
which is the only userdata in Lua right now). If you really need pattern
matching etc. in your socket buffers, I suggest you start with a
modified copy of `lstrlib.c` (it's MIT after all). But even if you
implement all string functions for your buffer userdata, you won't be
able to replace the immutable string type in Lua because (immutable)
strings are hashed by contents and (mutable) userdata are hashed by
identity/address. It would get awkward fast if you modified the contents
of a table key ...


Philipp
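
The table-key problem mentioned above can be made concrete with two toy hash functions in C. This is a sketch only: FNV-1a stands in for "hashed by contents" and an address-based hash stands in for "hashed by identity"; neither is the hash Lua actually uses.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a over the bytes: a stand-in for hashing by contents. */
static uint32_t hash_contents(const char *s, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Hash of the address only: a stand-in for hashing by identity. */
static uint32_t hash_identity(const void *p)
{
    return (uint32_t)((uintptr_t)p >> 3);
}
```

If a mutable buffer were used as a content-hashed key, changing one byte after insertion changes hash_contents(), so the slot computed at insertion time no longer matches later lookups. hash_identity() is unaffected by the edit, but then two buffers with equal bytes are different keys.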




Re: Pooling of strings is good

Coroutines
On Mon, Aug 25, 2014 at 8:47 AM, Philipp Janda <[hidden email]> wrote:

> You will probably have to write your own buffer userdata for that. I don't
> think such a userdata is useful enough for most people that it should be
> included in Lua's standard library (compared to a `FILE*` which is the only
> userdata in Lua right now). If you really need pattern matching etc. in your
> socket buffers, I suggest you start with a modified copy of `lstrlib.c`
> (it's MIT after all). But even if you implement all string functions for
> your buffer userdata, you won't be able to replace the immutable string type
> in Lua because (immutable) strings are hashed by contents and (mutable)
> userdata are hashed by identity/address. It would get awkward fast if you
> modified the contents of a table key ...

I can make my own version of the string library that would largely be
duplicated code, as I'm just trying to get those functions to operate
on userdata.  I was hoping an interface/function-or-3 would be made
upstream to allow one to trick functions that take strings to accept
userdata.  I know of no simple way to do this.  (tricking
lua_isstring() and friends)

Repeating what I have said earlier: I realize this would be a security
risk, and I hope that if something is possible it could be controlled
through a function or boolean switch within the debug library.


Re: Pooling of strings is good

Philipp Janda
On 25.08.2014 at 17:52, Coroutines wrote:

> On Mon, Aug 25, 2014 at 8:47 AM, Philipp Janda <[hidden email]> wrote:
>
>> You will probably have to write your own buffer userdata for that. I don't
>> think such a userdata is useful enough for most people that it should be
>> included in Lua's standard library (compared to a `FILE*` which is the only
>> userdata in Lua right now). If you really need pattern matching etc. in your
>> socket buffers, I suggest you start with a modified copy of `lstrlib.c`
>> (it's MIT after all). But even if you implement all string functions for
>> your buffer userdata, you won't be able to replace the immutable string type
>> in Lua because (immutable) strings are hashed by contents and (mutable)
>> userdata are hashed by identity/address. It would get awkward fast if you
>> modified the contents of a table key ...
>
> I can make my own version of the string library that would largely be
> duplicated code, as I'm just trying to get those functions to operate
> on userdata.  I was hoping an interface/function-or-3 would be made
> upstream to allow one to trick functions that take strings to accept
> userdata.

The string library doesn't make sense for generic userdata, it only
makes sense for character arrays a.k.a buffers. That's why I suggested
you write a buffer userdata type. And please don't try to replace the
string library with your own version. Just take `lstrlib.c`, replace
`luaL_checkstring` with `luaL_checkudata`, and register those functions
as methods of your buffer userdata. Yes, you duplicate a lot of code,
but you stay backwards compatible with current Lua, and you stay type-safe.
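
To show why string functions only make sense for character arrays, here is a plain C sketch of a buffer that is refilled in place and searched by pointer and length. It is not the `lstrlib.c` adaptation itself, which would need the Lua C API and a metatable for the buffer type; struct buffer, buffer_fill and buffer_find are hypothetical names used only here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity byte buffer that is refilled in place. */
struct buffer {
    char   data[256];
    size_t len;        /* number of valid bytes in data */
};

/* Overwrite the buffer with n bytes from src (truncated to capacity)
   and return the number of bytes stored. Stands in for a recv(). */
static size_t buffer_fill(struct buffer *b, const void *src, size_t n)
{
    if (n > sizeof b->data)
        n = sizeof b->data;
    memcpy(b->data, src, n);
    b->len = n;
    return n;
}

/* Length-aware substring search. Returns the 0-based offset of the
   first match, or -1 if there is none. Embedded zero bytes are fine
   because nothing here relies on a terminating '\0'. */
static long buffer_find(const struct buffer *b, const char *pat, size_t plen)
{
    if (plen == 0)
        return 0;
    if (plen > b->len)
        return -1;
    for (size_t i = 0; i + plen <= b->len; i++)
        if (memcmp(b->data + i, pat, plen) == 0)
            return (long)i;
    return -1;
}
```

The same storage is reused for every fill, so no allocation happens per message; the cost is that any result referring to the old contents is invalid after the next fill.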

>
> Repeating what I have said earlier: I realize this would be a security
> risk, and I hope that if something is possible it could be controlled
> through a function or boolean switch within the debug library.

But it would be a shame if your socket library only worked in debug
mode, wouldn't it?

Philipp





Re: Pooling of strings is good

Coroutines
On Mon, Aug 25, 2014 at 9:22 AM, Philipp Janda <[hidden email]> wrote:

> The string library doesn't make sense for generic userdata, it only makes
> sense for character arrays a.k.a buffers. That's why I suggested you write a
> buffer userdata type. And please don't try to replace the string library
> with your own version. Just take `lstrlib.c`, replace `luaL_checkstring`
> with `luaL_checkudata`, and register those functions as methods of your
> buffer userdata. Yes, you duplicate a lot of code, but you stay backwards
> compatible with current Lua, and you stay type-safe.

Yes, this is what's expected -- I just disagree with the code
duplication a lot :>

> But it would be a shame if your socket library only worked in debug mode,
> wouldn't it?

If I care this much about performance and not stressing allocations, I
think I'd be okay with requiring the debug library be present -- that
said, I haven't worked in a lua env where it hasn't been available.
I've made sandboxes where I've personally removed it, though.  I don't
feel like it would be a very distressing dependency.. but you have a
point.


Re: Pooling of strings is good

Roberto Ierusalimschy
In reply to this post by Coda Highland
> On Mon, Aug 25, 2014 at 7:28 AM, Coroutines <[hidden email]> wrote:
> > Honestly, most of these replies sound like "never use goto!"
>
> Ironic that you'd use this example considering this is exactly the
> thing that caused Apple's SSL bug.

Come on, be fair. It was not the use of goto that *caused* the bug. They
could produce exactly the same kind of dirty code with break or return.

"Do not look for other causes to explain things that can be explained by
pure incompetence" :)

-- Roberto


Re: Pooling of strings is good

Coda Highland
On Mon, Aug 25, 2014 at 9:52 AM, Roberto Ierusalimschy
<[hidden email]> wrote:

>> On Mon, Aug 25, 2014 at 7:28 AM, Coroutines <[hidden email]> wrote:
>> > Honestly, most of these replies sound like "never use goto!"
>>
>> Ironic that you'd use this example considering this is exactly the
>> thing that caused Apple's SSL bug.
>
> Come on, be fair. It was not the use of goto that *caused* the bug. They
> could produce exactly the same kind of dirty code with break or return.
>
> "Do not look for other causes to explain things that can be explained by
> pure incompetence" :)
>
> -- Roberto
>

Oh, granted, granted, as I had mentioned in a previous post, the
actual failing was copy-paste coding. But yes, you're right :)

(Which is of course still relevant to the suggestion in this thread to
duplicate the string library...)

/s/ Adam


Re: Pooling of strings is good

Axel Kittenberger
In reply to this post by Coroutines
> Yes, this is what's expected -- I just disagree with the code
> duplication a lot :>

So code duplication is bad? Never look at code duplication? Aren't you making one of those generalisations here that you hate so much?





Re: Pooling of strings is good

Coroutines
On Mon, Aug 25, 2014 at 11:22 AM, Axel Kittenberger <[hidden email]> wrote:

> So code duplication is bad? Never look at code duplication? Aren't you
> making one of those generalisations here that you hate so much?

Earlier in this long and unproductive email chain I was purporting
that ideas/projects talked about on this list are killed by dangerous
generalizations.

I've come to the better conclusion that some people just want to
argue.  Indefinitely...


Re: Pooling of strings is good

Dirk Laurie-2
2014-08-25 20:26 GMT+02:00 Coroutines <[hidden email]>:
> I've come to the better conclusion that some people just want to
> argue.  Indefinitely...

You've been on this list how long, counting previous incarnations,
and only reached that conclusion now??


Re: Pooling of strings is good

Coroutines
On Mon, Aug 25, 2014 at 11:33 AM, Dirk Laurie <[hidden email]> wrote:

> You've been on this list how long, counting previous incarnations,
> and only reached that conclusion now??

It was mostly sarcasm :p  I just thought we'd try to accomplish
something once in a while :3


Re: Pooling of strings is good

Roberto Ierusalimschy
In reply to this post by Coroutines
> Earlier in this long and unproductive email chain [...]

On which you created more than 40% of the messages...

-- Roberto


Re: Pooling of strings is good

Coroutines
On Mon, Aug 25, 2014 at 1:14 PM, Roberto Ierusalimschy
<[hidden email]> wrote:
>> Earlier in this long and unproductive email chain [...]
>
> On which you created more than 40% of the messages...

Good, so you understand my disappointment -- having repeatedly
clarified what I'm looking to do for those who didn't read.  Now that
everybody is caught up we can start being productive ^__^
