need help to handle fail to use lpeg to parse http upload file with multi-part

zhiguo zhao
Hi all,

The attachment is the POST data from a file upload. When I parse it with the code below, it succeeds only for plain-text files and fails for any binary file.
My question is: does LPeg only support plain text? If it supports binary data, even partly, how should I update my LPeg code so that it can parse the attachment?

Thank you for any opinions or help.

```lua
local lpeg = require'lpeg'

local P,R,S,V = lpeg.P,lpeg.R,lpeg.S,lpeg.V
local    C,     Cb,     Cf,    Cg,      Ct,     Cmt =
    lpeg.C,lpeg.Cb,lpeg.Cf,lpeg.Cg,lpeg.Ct,lpeg.Cmt
local alpha  = R('az')+R('AZ')

local line       = P"--"
local crlf       = P"\r"^-1 * P"\n"
local name_value = Cg(C(alpha^1) * '=' * P'"'^-1 * C( (1-P'"'-crlf)^0 ) * P'"'^-1) - crlf
local name_      = alpha^1 * (P'-'^0    * alpha)^1
local subtype    = alpha^1 * (S('/-')^0 * alpha)^1

local header     = Ct(C(name_) * ': ' * C(subtype) * Cf(Ct"" * (S(',;') * ' ' * name_value)^0,rawset) )

local headers    = Ct( header * (crlf*header)^0 )

local build = function(boundary)
  local _, _, bound = string.find(boundary,'boundary=(.+)')
  if (bound) then
    boundary = P(bound)
  else
    boundary = P(boundary)
  end

  local node        = Ct(line*boundary*crlf*headers*crlf*crlf*C( P(1)^0- crlf*line*boundary ))
  local nodes       = Ct(node * (crlf*node)^0 * crlf * line*boundary*line)
  return nodes
end

local basic_parse = function(multipart, boundary)
  local nodes = build(boundary)
  p(nodes)
  p(multipart)
  p(#multipart)
  return lpeg.match(nodes,multipart)
end


local echo = function(s) print(s) end

local parse
parse=function(body, boundary)
 -- local io = require'enhance.io'
 -- io.savedata('toparse.bin',body)
  local basic = basic_parse(body,boundary)
  if not basic then
    print(boundary)
    print(body)
    print(#body)
    assert(nil,'fail to parse')
  end
  local ret = {}
  for i=1,#basic do
    local part = basic[i]
    local headers,value = part[1],part[2]
    if(string.lower(headers[1][1])=='content-disposition'
      and headers[1][2]=='form-data') then
      --form-data
      if (headers[1][3].filename) then
          --handle fileupload
          local file = {filename=headers[1][3].filename}
          file[1] = value
          for j=2,#headers do
            file[headers[j][1]] = headers[j][2]
          end
          ret[headers[1][3].name] = file
      else
        if #headers==1 then
          --simple key-value pair
          ret[headers[1][3].name] = value
        else
          local t = {}
          ret[headers[1][3].name] = t
          if(string.lower(headers[2][1])=='content-type'
            and string.match(headers[2][2],'^multipart')) then
            --multipart sub
            ret[headers[1][3].name] = assert(parse(value,headers[2][3].boundary))
          else
            ret[headers[1][3].name] = value
          end
        end
      end
    else
    end
  end
  return ret
end
--]]

return parse
```

Attachment: toparse.bin (1K)

Re: need help to handle fail to use lpeg to parse http upload file with multi-part

Sean Conner
It was thus said that the Great zhiguo zhao once stated:
> Hi all,
>
> attachment is post data(to upload file), when I use below code to parse
> only success on plain text file, failed with any binary file.
> My quest is  lpeg only support plain text? and if support binary even
> partly, how to update my lpeg code to parse attachment if support binary.
>
> thank you for you option or help.
>

  I use LPeg to parse CGI data and while my code isn't in a form to be
published quite yet [1], I do know that I can handle binary data just fine.
The main part of the code that will probably help you the most is:

  local boundary = lpeg.P("--" .. separator)
  local hdrs     = core.parse_headers(mime._HEADERS,contentdisp._HEADERS)
  local body     = lpeg.C((lpeg.P(1) - boundary)^0)
  local section  = boundary
                 * core.CRLF
                 * lpeg.Ct(lpeg.Cg(hdrs,"headers") * lpeg.Cg(body,"body"))
  local sections = lpeg.Ct(section^1) * boundary * lpeg.P"--" * core.CRLF
 
  local tmp = sections:match(data)

hdrs just parses the MIME headers for each section, body is what parses the
actual body of each section and is text/binary agnostic.  
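
A minimal, self-contained sketch of the same idea, to show that a pattern of the form `(lpeg.P(1) - boundary)^0` captures arbitrary bytes, including NUL. The separator and payload below are made up for illustration, header parsing is left out entirely, and the body stops at the CRLF before the next boundary, which is an assumption of this sketch rather than something taken from the snippet above:

```lua
local lpeg = require "lpeg"
local P, C, Ct = lpeg.P, lpeg.C, lpeg.Ct

-- Hypothetical separator and payload, chosen only for this illustration.
local separator = "XyZ123"
local payload   = "\0\1\2\255 binary\r\nwith --dashes and NUL \0 bytes"

local CRLF     = P"\r\n"
local boundary = P("--" .. separator)

-- Body: every byte up to, but not including, the CRLF that precedes
-- the next boundary. P(1) matches any single byte, so this is binary safe.
local body     = C((P(1) - CRLF * boundary)^0)

-- No headers in this sketch: a section is just boundary, CRLF, body, CRLF.
local section  = boundary * CRLF * body * CRLF
local sections = Ct(section^1) * boundary * P"--" * CRLF^-1 * -P(1)

local data = "--" .. separator .. "\r\n" .. payload .. "\r\n"
          .. "--" .. separator .. "--\r\n"

local parts = sections:match(data)
assert(parts and parts[1] == payload, "payload should round-trip unchanged")
```

If a match like this succeeds on binary input, a failure elsewhere is more likely to come from the header patterns than from the body capture.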

  -spc

[1] It's 741 lines of Lua/LPeg code just to parse multipart/form-data
        and what I have works for my use case; I can't guarantee it will work
        for all websites.

Re: need help to handle fail to use lpeg to parse http upload file with multi-part

zhiguo zhao
It passes after changing the definition of alpha to include digits: local alpha  = R('az')+R('AZ')+R('09')
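
A small sketch of why that change can matter. The thread does not say which part of the upload contained digits, so the token below is a hypothetical example; the `subtype` helper copies the shape of the original `subtype` pattern and is anchored so that it must consume the whole token:

```lua
local lpeg = require "lpeg"
local P, R, S, C = lpeg.P, lpeg.R, lpeg.S, lpeg.C

-- Original definition: letters only.
local alpha_old = R('az') + R('AZ')
-- Revised definition: letters and digits.
local alpha_new = R('az') + R('AZ') + R('09')

-- Same shape as the `subtype` pattern in the first post, parameterised
-- on `alpha`; -P(1) requires that no input is left after the match.
local function subtype(alpha)
  return C(alpha^1 * (S('/-')^0 * alpha)^1) * -P(1)
end

-- Hypothetical header value containing a digit.
local token = "video/mp4"

local old_result = subtype(alpha_old):match(token)  -- stops at "4", so no match
local new_result = subtype(alpha_new):match(token)  -- consumes the whole token

assert(old_result == nil)
assert(new_result == token)
```

With the letters-only definition, any header name or value containing a digit makes the header pattern fail, and with it the whole multipart match, regardless of whether the body is text or binary.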

2015-09-22 16:19 GMT+08:00 Sean Conner <[hidden email]>:
> hdrs just parses the MIME headers for each section, body is what parses the
> actual body of each section and is text/binary agnostic.