Persistent http connection


Srinivas Murthy
Using http.request() like so (completed here with the usual ltn12 source/sink so it runs; request_body stands for the POST payload),

        local http = require("socket.http")
        local ltn12 = require("ltn12")

        local response_body = { }
        local res, code, response_headers, status = http.request{
          url = myurl,
          method = "POST",
          source = ltn12.source.string(request_body),
          sink = ltn12.sink.table(response_body),
        }

is resulting in a new TCP connection each time I invoke this call. Is there a way to use a persistent TCP connection to avoid this connection setup / teardown overhead?


Re: Persistent http connection

Dirk Laurie
You need to provide a file for storing cookies.

I can do it with the module 'curl' installed by 'luarocks install lua-curl'.

  local curl = require 'curl'

  local session = curl.easy()
  local cookies
  if curl.OPT_COOKIEFILE and curl.OPT_COOKIEJAR then
    cookies = os.tmpname()
    session:setopt(curl.OPT_COOKIEFILE, cookies)  -- read cookies from this file
    session:setopt(curl.OPT_COOKIEJAR, cookies)   -- write cookies back to it
    logins[session] = cookies  -- 'logins' is a table in the surrounding application
  else
    message("Your 'curl' does not support cookies. You will be anonymous.")
  end

The module 'http' should offer a similar service.

Re: Persistent http connection

Coda Highland
On Wed, Nov 7, 2018 at 1:56 PM Dirk Laurie <[hidden email]> wrote:
> You need to provide a file for storing cookies.
> ...
> The module 'http' should offer a similar service.

That's curious. A cookie jar is required in order to use HTTP
keepalive? Is that documented? It seems like a dubious dependency.

/s/ Adam


Re: Persistent http connection

Srinivas Murthy
This is an important issue for me. Can anyone please add to this? There's got to be a way to avoid connection setup/teardown each time we issue an http.request().

Thanks


Re: Persistent http connection

Dirk Laurie
On Thu., 8 Nov. 2018 at 02:47, Srinivas Murthy <[hidden email]> wrote:
> ...
>> That's curious. A cookie jar is required in order to use HTTP
>> keepalive? Is that documented? It seems like a dubious dependency.

All I can attest to is that curl with a cookie jar works. It might
well be overkill if you need only keepalive, but since I use curl only
for sites that require me to log in, I would not know.


Re: Persistent http connection

Philippe Verdy
A cookie jar to store cookies is for something else: it does not create "keepalive" HTTP sessions, but allows restarting sessions that have already been closed by presenting the stored cookies again.

No, there's NO need at all for ANY cookie jar storage in HTTP to use "keepalive" sessions.

So this just looks like a limitation of the current "http.*" package you use, which terminates every session immediately after each request instead of maintaining it. (Note: keeping a session alive is not guaranteed: the server may close it at any time if your session is idle for too long, or for various other reasons. But for HTTP sessions you want to use, for example, to perform streamed requests in a queue, reusing the session will always be faster than creating a new outgoing session for each request in your queue, including the cookie exchange, which also adds to the data volume transmitted.)

Without "keepalive", if you want to perform a lot of small queries in a long series of requests, or want to do streaming, your client would rapidly exhaust its number of available TCP ports (because each new HTTP session allocates a new port number, and even if it is closed and the server has acknowledge that closure, that port number cannot be reused immediately before a delay which is generally about 30 seconds (this varies if you have proxies, or depending on your ISP or router, or depending on security thresholds: for security and privacy of TCP, this delay should never be reduced too much: there are security routers that force outgoing TCP port numbers to remain in FIN_WAIT state for more than 3 minutes, and if your Internet connection is shared by multiple users or other services, you'll fall to a case where you no longer have any usable local TCP ports to allow more connection, and your repeated HTTP queries will rapidly then fail to connect after a few thousands queries performed rapidly).

The "keepalive" feature was made a *standard* part of HTTP/1.1 (instead of being optional and not working very well in the now very old HTTP/1.0 protocol) exactly to preserve network resources, and get much better performance and lower bandwidth usage. Without it, most modern websites would be extremely slow.

Using "curl" to create a cookie jar does not mean that you are using keepalive, but at least it allows all requests you send using the same cookie jar file to reopen new sessions using the same resident cookies that were exchanged in the previous requests (so this "simulates" sessions, but still you'll see many outgoing TCP sockets that are "half-terminated" in FIN_WAIT state and that still block the unique port numbers that were assigned to that socket).

So I suggest instead using a virtual "local HTTP proxy" package that will maintain an unterminated HTTP session over which you'll queue one or multiple HTTP requests, as sketched below. Such a thing is used in (de)muxing subprotocols (of multimedia streaming protocols, or in VPN protocol managers), but it is even part of classic web browsers, which queue the many background queries performed when visiting a website (generally a browser can create up to 4 HTTP sessions working in parallel, each one having its own queue of queries). All of these use the keepalive feature of HTTP/1.1.
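
A minimal sketch of what "keepalive" means at the socket level, using LuaSocket directly: one TCP connection, two HTTP/1.1 requests, no cookies involved. It assumes a plain-HTTP (no TLS) server that honors keepalive and sends a Content-Length header; chunked responses, redirects and error recovery are deliberately ignored, and example.com is just a placeholder host.

  local socket = require("socket")

  local host = "example.com"                   -- placeholder host
  local conn = assert(socket.connect(host, 80))

  local function get(path)
    assert(conn:send("GET " .. path .. " HTTP/1.1\r\n"
                  .. "Host: " .. host .. "\r\n"
                  .. "Connection: keep-alive\r\n\r\n"))
    local status = assert(conn:receive("*l"))  -- status line
    local length = 0
    while true do                              -- scan the response headers
      local line = assert(conn:receive("*l"))
      if line == "" then break end
      local n = line:match("^[Cc]ontent%-[Ll]ength:%s*(%d+)")
      if n then length = tonumber(n) end
    end
    -- read exactly the body, leaving the connection open for reuse
    local body = length > 0 and assert(conn:receive(length)) or ""
    return status, body
  end

  print(get("/"))  -- first request: pays the TCP handshake
  print(get("/"))  -- second request: same connection, no new handshake
  conn:close()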



Re: Persistent http connection

Dirk Laurie
On Thu., 8 Nov. 2018 at 14:46, Philippe Verdy <[hidden email]> wrote:
> A cookie jar to store cookies is for something else: it does not create "keepalive" HTTP sessions, but allows restarting sessions that have already been closed by presenting the stored cookies again.
> ...

Thanks a lot for this explanation.

-- Dirk


Re: Persistent http connection

Srinivas Murthy
In reply to this post by Philippe Verdy
"So I suggest instead using a virtual "local HTTP proxy" package that will maintain a session opened on an unterminated HTTP session, overwhich you'll queue one or multiple http requests. " 
Is anyone aware of a lua http lib that supports keepalive? Using a "local HTTP proxy" will still be a significant overhead if the client still has to setup a new conn for each req. These are streaming events and could be very frequent.




RE: Persistent http connection

Tim McCracken

> All I can attest to is that curl with a cookie jar works. It might
> well be overkill if you need only keepalive, but since I use curl only
> for sites that require me to log in, I would not know.

It sounds like you are confusing session state (keeping a session alive using cookies) with HTTP keep-alive, which are two very different things. HTTP keep-alive has nothing to do with being logged in or not. It only affects requests that occur over a very short time span, as a previous poster said. For example, if you open a web page with 100 photos on it, each of those 100 photos is a different HTTP request. Keep-alive enables all of them to be downloaded over a single TCP session. But a few seconds after everything is downloaded, that TCP session goes away. The presence or absence of cookies has nothing to do with this. And in fact, session state will work just fine without HTTP keep-alive, and can span days, months or years depending on how the cookies are set to expire.

For the original poster: the cURL C library (libcurl) supports keep-alive; I think they simply refer to it as TCP keep-alive (and libcurl also reuses connections on the same handle by default, as shown below). I would be very surprised if the thin Lua wrapper libraries didn't support it as well. You may just need to use a cURL-based library if that will otherwise support your needs.
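
A minimal sketch of that with the lua-curl binding used earlier in this thread (untested; the option names follow Dirk's snippet, and example.com is a placeholder URL). Reusing one easy handle is enough, because libcurl pools the finished connection and reuses it for the next request to the same host:

  local curl = require 'curl'  -- same binding as in Dirk's snippet

  -- one easy handle, several requests: only the first request pays
  -- the TCP setup cost, the others reuse the pooled connection
  local session = curl.easy()
  for _ = 1, 3 do
    session:setopt(curl.OPT_URL, "http://example.com/")  -- placeholder URL
    session:perform()  -- response body goes to stdout by default
  end
  session:close()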

Re: Persistent http connection

Philippe Verdy
In reply to this post by Dirk Laurie
What would be needed is for the "http" package to include in the created instance an optional boolean "keepalive" property which (once set to true) would cause it to NOT close the session automatically once the result has arrived, but to keep the session open; and a "close()" method that your application can call when it no longer needs the session. That package should also include an "isalive()" method to check whether the session is still open (in "idle" state, or in "running" state while waiting for replies).

That package could also handle a "request queue", to allow sending multiple queries in order. Each query could be associated with some user data, so that when you get the results, you can identify which query the results belong to. Ideally, the userdata would be the "query" itself, which has its own state: possibly the resource name or query string, the verb like "GET"/"POST", and other user data you can use, for example, to hold a persistent cookie jar or other persistent data needed by your application. When you receive a reply (success or failure), you also get the reference to that userdata and the state of the HTTP session. A hypothetical sketch of such an interface follows.
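
A purely hypothetical sketch of that proposed interface (none of these names exist in any current Lua "http" package; this only illustrates the shape of the API described above):

  -- hypothetical API: newsession, queue, onreply, isalive and close
  -- are the proposed names, not functions of any existing package
  local session = http.newsession{ keepalive = true }

  -- queue two requests; each carries a userdata tag so the reply
  -- can be matched back to the query that produced it
  session:queue({ url = myurl, method = "POST" }, "query-1")
  session:queue({ url = myurl, method = "GET"  }, "query-2")

  session:onreply(function(userdata, response)
    print(userdata, response.status)  -- identify which query was answered
  end)

  if session:isalive() then
    session:close()  -- explicit close once the session is no longer needed
  end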

Note: an HTTP session is not necessarily bound to a TCP session (it could be any bidirectional I/O channel): this binding is only done when you connect it, by opening a socket to the target indicated by the host and port number, i.e. the first part of the URL, everything before the first "/", "?" or "#" of the full URL (and excluding everything after the first "#", which is an anchor). The HTTP protocol itself does not "understand" the domain name, host address, port number or anchor; it only needs the query.

Note as well that an HTTP query is not limited to sending only a single verb (like "GET", "POST", "PUT", "HEAD"...) and a resource name (usually starting with "/"): it also sends a set of MIME headers, and you can also "stream" an output attachment which may be arbitrarily long. If you want the query to be handled asynchronously, the async event handler needs to be able to check the status of the session: idle, sending headers, sending an attachment, waiting for a reply from the remote server, notification that a reply is starting to arrive (i.e. you've received at least an HTTP response with a status), and then whether the response received is complete (including the full content body) rather than just some of the MIME headers.

Normally it's up to the "http" package to handle, itself and internally, some responses like intermediate statuses (while the server confirms it has received a query and has started running it but cannot yet give a definitive status), or a redirect reply (it's up to the client to accept a redirect like "moved to" and, if it accepts it, re-execute the query against the new target).

The package should also include internal support for the "chunked" format of partial responses (encoding and decoding it for you); it should itself support the negotiation of options like data compression and encryption/decryption; and it should either handle the cookie jar itself or provide an API so that your client is notified when the server sends a cookie, which your client can then store or discard as it wants.

If the "http" package works only in synchronous mode, then all queries are blocking, but then you cannot handle a queue of requests (so you don't need at all any private data: the query will be terminated, but this is very limitating because it does not allow sending large queries or receive large responses (either in MIME headers, or in the content body). Running asynchronously allows much better management, but doe not necessarily implies multithreading, and the Lua coroutines (with yield and resume) can be used to handle the state of a session in a "semi-blocking" way, just like I/O on files. Effectively the HTTP protocol is just using the generic "producer/consumer" concept over a single bidirectional I/O channel (it's not up to HTTP itself to open that channel and negociate the options, not even the HTTPS securisation, and it is agnostic about the transport protocol used, which may be TCP, or a TCP-like VPN over UDP, or a serial link, HTTP as well will not resolve itself DNS hostnames, needed before you can open an outgoing socket; and the protocol itself does not need that you initiated yourself the session before sending a query: that bidirectional I/O channel may have be initiated by the remote agent: HTTP queries are asymetric with a server and a client, but the asymetry is not necessarily the same for the bidirectional I/O channel on which it is established, so the same channel, once it's established and both agents are waiting for a query to execcute, may be used by one or the other agent to start a query in which case it will be a HTTP client and the other agent will be a HTTP server replying to the query, but the roles can be swapped later and it's up to each agent to decide when he wants to terminate/close the I/O channel otself, or to indicate to the other agent that he should terminate/close the session once it has processed the query or the response, and it is the role of the "keepalive" option in MIME headers; as well HTTP allows any of the two agent to terminate the session: this is indicated by an I/O error or close event deteted by the agent that was waiting the other party. HTTP has well does not provide itself the facility to close the I/O channel itself: the HTTP protocol is by default illimited in time).



Re: Persistent http connection

Daurnimator
In reply to this post by Srinivas Murthy
On Fri, 9 Nov 2018 at 04:36, Srinivas Murthy <[hidden email]> wrote:
> Is anyone aware of a Lua HTTP lib that supports keepalive? Using a "local HTTP proxy" will still be significant overhead if the client still has to set up a new connection for each request. These are streaming events and could be very frequent.

lua-http is gaining support for it soon:
https://github.com/daurnimator/lua-http/pull/121
It's a much trickier problem than it may seem on the surface,
especially once SSL is involved (and in fact I have found bugs in
nginx's and curl's implementations while doing research for
lua-http's).
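
For reference, lua-http already lets you hold one connection yourself and run several requests over it; the pull request above is about doing that automatically behind the one-shot request interface. A minimal sketch, assuming lua-http is installed and a plain-HTTP server at the placeholder host example.com:

  local http_client  = require "http.client"
  local http_headers = require "http.headers"

  -- open one connection and reuse it for several requests
  local conn = assert(http_client.connect{
    host = "example.com";  -- placeholder host
    port = 80;
    version = 1.1;         -- HTTP/1.1: keepalive by default
  })

  local function fetch(path)
    local headers = http_headers.new()
    headers:append(":method", "GET")
    headers:append(":scheme", "http")
    headers:append(":authority", "example.com")
    headers:append(":path", path)

    local stream = assert(conn:new_stream())
    assert(stream:write_headers(headers, true))  -- true: no request body
    local res = assert(stream:get_headers())
    local body = assert(stream:get_body_as_string())
    stream:shutdown()
    return res:get(":status"), body
  end

  print(fetch("/"))  -- first request: TCP handshake happens here
  print(fetch("/"))  -- second request: the same connection is reused
  conn:close()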


Re: Persistent http connection

Srinivas Murthy
Appreciate all the discussion. I have experience with this from before, and yes, it's not simple to do properly.
For now though, I'm on a tight time frame and need a simple solution that works for a non-HTTPS setup. The closest I see is the curl wrapper that was mentioned. Any other ideas?


Re: Persistent http connection

Philippe Verdy
You may want to look at LuaSocket to implement your own wrapper for persistent TCP sessions, on top of which you'll reimplement the HTTP protocol.
The bad thing is that it's not easily integrable on basic Lua servers without the native C support library. The biggest difficulty is that, to fully implement persistent sessions and streaming, the simple producer/consumer pattern for I/O used by Lua (which is synchronous and based on coroutines whose execution is controlled by blocking yield/resume calls) will not easily let you react to the asynchronous send/receive events that a streaming protocol usually requires (as well, you'd need true multithreading and not just cooperative threading).

Given these limits, the "http" package offers no real "resume" facility, and as the socket it creates is temporary and can be garbage-collected as soon as it terminates a request (the socket then sits in TIME_WAIT for a long time, forbidding reuse of the dynamically assigned TCP port number), it is not easy to avoid its closure. So each query opens its own new separate socket, and you'll "burn" a lot of outgoing local TCP ports if you intend to use it for streaming many small HTTP requests.

A solution/emulation is however possible (just like with old versions of Winsock on old cooperative-only 16-bit Windows, also based on yield/resume with a message loop, which can easily be adapted to pure-Lua coroutines, already used by the basic I/O library of Lua), provided that your Lua application is cooperative and provides enough "yield" calls to serve both the "send" and "receive" events and manage the two message queues on that socket: one queue for outgoing HTTP requests, the other for the incoming responses. A smart implementation of HTTP would use not just a single pair of queues (one TCP session), but could create up to 4 pairs of queues per destination (i.e. host and port number, where a host is either a domain name or an IPv4 or IPv6 address). With that you would emulate what web browsers already do to load pages and their multiple dependent scripts and images, without abusing remote server resources.
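
For illustration, a minimal sketch of that per-destination pool idea (make_connection is a hypothetical stand-in for the keepalive connection code sketched earlier in the thread; the limit of 4 mirrors classic browsers):

  local POOL_SIZE = 4
  local pools = {}  -- pools["host:port"] = list of idle connections

  local function checkout(host, port)
    local pool = pools[host .. ":" .. port]
    if pool and #pool > 0 then
      return table.remove(pool)         -- reuse an idle kept-alive connection
    end
    return make_connection(host, port)  -- hypothetical: open a fresh connection
  end

  local function checkin(host, port, conn)
    local key = host .. ":" .. port
    pools[key] = pools[key] or {}
    if #pools[key] < POOL_SIZE then
      pools[key][#pools[key] + 1] = conn  -- keep it idle for the next request
    else
      conn:close()                        -- pool is full: really close this one
    end
  end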

Note that the effect of the absence of persistent TCP sessions does not concern only the local host (for the local outgoing TCP port numbers allocated to each new socket created by the client), but also the remote server (for the local incoming TCP port numbers allocated to each accepted incoming connection): the exhaustion of port numbers on the server may be even more critical, as servers also need to keep the TIME_WAIT delays if they want to secure their communications and avoid sending private data to other new incoming clients, or avoid data arriving too late from a previously connected client polluting the incoming data of new connections!

All HTTP clients and servers today need to support "keepalive" as described in HTTP/1.1. The old behavior without it (in HTTP/1.0) is no longer acceptable and causes severe security problems (notably, it exposes servers to DoS attacks if they permit the same remote client to open an arbitrary number of new temporary incoming connections).



Re: Persistent http connection

Philippe Verdy
Note that to avoid DoS attacks on servers, or exhaustion of outgoing port numbers on clients, an OS may perform an optimization: if a closed TCP socket in TIME_WAIT state is asked to reconnect to the same remote host/port pair, the OS can reallocate the same port number directly, instead of allocating a separate new port number, and put it back in CONNECTED state (it may also skip the TCP segment-size negotiation at the start of the session and reuse an existing large "TCP window" size to allow a "fast start").


Re: Persistent http connection

Sam Chang
I have a small non-blocking project based on Lua coroutines which supports this case


by using the “fan.http.core” module, which was implemented on top of cURL (connections are reused by default).

Also, this is a small project, and I’m trying to replace the cURL implementation with “fan.tcpd”, but “fan.http.core” may solve your problem.

Regards,
Sam
