Convert bytes array to string


SATO Seichi
Hi;

How can I convert a byte array to a string efficiently in Lua?
Or must I write an extension in C?

Here is a sample I wrote:

function bytestostring (bytes)
  local s = ""  -- local, so the accumulator does not leak into a global
  for i=1,getn(bytes) do
    s = s..strchar(bytes[i])  -- '..' creates a new string each time. Expensive!!
  end
  return s
end


Thanks.

-- 
SATO Seichi

Re: Convert bytes array to string

Eric Tetz-2
--- SATO Seichi <[hidden email]> wrote:
> How can I convert bytes array to string efficiently in Lua?
> or must I extend with C?
> 
> a sample I wrote:
> 
> function bytestostring (bytes)
>   s = ""
>   for i=1,getn(bytes) do
>     s = s..strchar(bytes[i])  -- '..' create new string. Expensive!!
>   end
>   return s
> end

You can do something like this:

  function bytestostring (bytes)
    local buf = strrep(' ', getn(bytes))
    local i = 0
    return (gsub (buf, '(.)', 
        function(c)
          i = i + 1
          return strchar(bytes[i])
        end))
  end
 
Not exactly pretty, but it's a lot faster than string concatenation.

Cheers,
Eric


Re: Convert bytes array to string

SATO Seichi
Eric, thank you for your reply.
But your gsub approach is slower than concatenation.

This is what I wrote:

  function bytetostring (bytes)
    local buf = strrep(' ', getn(bytes))
    local index = {n = 0}
    local rep = function (c)
                  %index.n = %index.n + 1
                  return strchar(%bytes[%index.n])
                end
    return gsub(buf, '(.)', rep)
  end


On Sat, Mar 16, 2002 at 09:47:21AM -0800, Eric Tetz wrote:
> You can do something like this:
> 
>   function bytestostring (bytes)
>     local buf = strrep(' ', getn(bytes))
>     local i = 0
>     return (gsub (buf, '(.)', 
>         function(c)
>           i = i + 1
>           return strchar(bytes[i])
>         end))
>   end
>  
> Not exactly pretty, but it's a lot faster than string concatenation.

-- 
SATO Seichi

Re: Convert bytes array to string

Eric Tetz-2
--- SATO Seichi <[hidden email]> wrote:
> Eric, thank you for your reply.
> But your gsub approach is slower than concatenation.
> 
> This is what I wrote:
> 
>   function bytetostring (bytes)
>     local buf = strrep(' ', getn(bytes))
>     local index = {n = 0}
>     local rep = function (c)
>                   %index.n = %index.n + 1
>                   return strchar(%bytes[%index.n])
>                 end
>     return gsub(buf, '(.)', rep)
>   end

Well, in Lua 4.1w4 it's nearly 10 times faster than concatenation.

I'm surprised it's so much slower in Lua 4.

Cheers,
Eric

Re: Convert bytes array to string

SATO Seichi
I ran this test on:

1) Lua 4.0 on Darwin 5.3
2) Lua 4.0 on Debian GNU/Linux 2.2
3) Lua 4.1w4 on Darwin 5.3
4) Lua 4.1w4 on Debian GNU/Linux 2.2

And in each case the concatenation approach is faster.

Putting this together with the disagreement with Eric's result,
I think Lua 4.1w4 makes some platform-dependent optimization
around the gsub function.

Does anyone know about this?

--
SATO Seichi

Re: Convert bytes array to string

Roberto Ierusalimschy
> I ran this test on
> 
> [...]
>
> And in each case the concatenation approach is faster.
> 
> [...]
>
> Does anyone know about this?

Have you both agreed about how to test each solution? (I haven't seen
that on the messages...) On my machine, the concat solution is
faster than gsub for small arrays, but gsub gets better as the size
grows. This is expected, as the concat solution is ~O(n^2), see
http://www.lua.org/notes/ltn009.html.

With Lua 4.1w4, you can try the new "concat" function:

  function bytestostring2 (bytes)
    local w = {}
    for i=1,getn(bytes) do w[i] = strchar(bytes[i]) end
    return concat(w)
  end

-- Roberto
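
For readers on a current interpreter, the same idea can be sketched as follows. This assumes Lua 5.1 or later, where strchar, getn(t) and concat are spelled string.char, #t and table.concat; it is an illustration of the approach above, not code from the thread.

```lua
-- Sketch for Lua 5.1+ (an assumption; the thread itself targets Lua 4.x).
-- Collect one single-character string per byte, then join them in one pass.
function bytestostring2(bytes)
  local w = {}
  for i = 1, #bytes do
    w[i] = string.char(bytes[i])
  end
  return table.concat(w)
end

print(bytestostring2({ 76, 117, 97 }))  --> Lua
```

The pieces are joined once at the end, so no intermediate string is rebuilt on every iteration.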

RE: Convert bytes array to string

Luiz Carlos de Castro Silveira Filho
In reply to this post by SATO Seichi
The Lua version probably has nothing to do with that.

What happens is that your algorithm is faster for small byte sequences (let's say 10 or 100 bytes) and Eric's is faster for larger
sequences.

This is because Eric's code has a larger startup cost than yours (calling strrep, defining a function inside the function and,
possibly, whatever goes on inside gsub itself).

If you're worried about performance, you may want to use the 2 algorithms. You'll need to write some tests to "calibrate" for best
performance (determining the region -- number of bytes -- in which one code is faster than the other) and write a function that
chooses the best alternative. This region may be platform dependent and lua version dependent.
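
A minimal sketch of such a chooser, written with Lua 5.1+ spellings (string.char, #t) rather than the Lua 4 names used in this thread. The names b2s_small, b2s_large and bytestostring_auto are made up for illustration, and the threshold of 100 is a placeholder, not a measured value; it would have to come from calibration on the target platform.

```lua
-- THRESHOLD is a hypothetical cut-over point; measure it, do not trust it.
local THRESHOLD = 100

-- Plain concatenation: low startup cost, but quadratic as the input grows.
function b2s_small(bytes)
  local s = ""
  for i = 1, #bytes do
    s = s .. string.char(bytes[i])
  end
  return s
end

-- gsub over a placeholder buffer: more setup, but linear in the input size.
function b2s_large(bytes)
  local i = 0
  -- The outer parentheses drop gsub's second result (the match count).
  return (string.gsub(string.rep(" ", #bytes), ".", function()
    i = i + 1
    return string.char(bytes[i])
  end))
end

-- Pick an implementation from the input length.
function bytestostring_auto(bytes)
  if #bytes < THRESHOLD then
    return b2s_small(bytes)
  end
  return b2s_large(bytes)
end

print(bytestostring_auto({ 72, 105 }))  --> Hi
```

Both branches must return identical strings for the same input, which is worth asserting during calibration.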

If you want to be sure that this part of your code is really responsible for the slowness, you can use the LuaProfiler (it's on the
wiki).

Luiz.

> -----Original Message-----
> From: [hidden email]
> [[hidden email]] On Behalf Of SATO Seichi
> Sent: Sunday, March 17, 2002 12:08 AM
> To: Multiple recipients of list
> Subject: Re: Convert bytes array to string
>
>
> I ran this test on:
>
> 1) Lua 4.0 on Darwin 5.3
> 2) Lua 4.0 on Debian GNU/Linux 2.2
> 3) Lua 4.1w4 on Darwin 5.3
> 4) Lua 4.1w4 on Debian GNU/Linux 2.2
>
> And in each case the concatenation approach is faster.
>
> Putting this together with the disagreement with Eric's result,
> I think Lua 4.1w4 makes some platform-dependent optimization
> around the gsub function.
>
> Does anyone know about this?
>
> --
> SATO Seichi
>


Re: Convert bytes array to string

Philippe Lhoste
In reply to this post by SATO Seichi
> > I ran this test on
> > 
> > [...]
> >
> > And in each case the concatenation approach is faster.
> > 
> > [...]
> >
> > Does anyone know about this?
> 
> Have you both agreed about how to test each solution? (I haven't seen
> that on the messages...) On my machine, the concat solution is
> faster than gsub for small arrays, but gsub gets better as the size
> grows. This is expected, as the concat solution is ~O(n^2), see
> http://www.lua.org/notes/ltn009.html.
> 
> With Lua 4.1w4, you can try the new "concat" function:
> 
> function bytestostring2 (bytes)
> local w = {}
> for i=1,getn(bytes) do w[i] = strchar(bytes[i]) end
> return concat(w)
> end
> 
> -- Roberto

Argh, I was about to say the same thing! :-)
Well, I may as well publish my results on Win32. Since the figures were too
small for clock(), I just finished adding to my Win32 library (which has a lot of
dummy functions!) a pair of functions that use the Pentium performance counter to
get a lot more precision. That's why I lost so much time :-)

Reusing the given functions:

function bytestostring(bytes)
  local s = ""  -- local, so the accumulator does not leak into a global
  for i = 1, getn(bytes) do
    s = s .. strchar(bytes[i]) -- '..' creates a new string each time. Expensive!!
  end
  return s
end

function bytestostring40(bytes)
  local buf = strrep(' ', getn(bytes))
  local index = {n = 0}
  local rep = function (c)
    %index.n = %index.n + 1
    return strchar(%bytes[%index.n])
  end
  return gsub(buf, '(.)', rep)
end

function bytestostring41(bytes)
  local buf = strrep(' ', getn(bytes))
  local i = 0
  return (gsub (buf, '(.)',
    function(c)
      i = i + 1
      return strchar(bytes[i])
    end))
end

function FastBytesToString(bytes)
  local s = {}  -- local table of one-character strings
  for i = 1, getn(bytes) do
    s[i] = strchar(bytes[i])
  end
  return concat(s)
end

str = {}
for i = 32, 127 do tinsert(str, i) end
StartFineClock() -- return current raw counter value
print(bytestostring(str))
-- return time in seconds, in microseconds, in raw counter value
-- if given false argument, don't reset the counter
_, t = GetFineClockTime()
print(t)
print(bytestostring40(str))
_, t = GetFineClockTime()
print(t)
print(bytestostring41(str))
_, t = GetFineClockTime()
print(t)
print(FastBytesToString(str))
_, t = GetFineClockTime()
print(t)

lua -c -f "TestConcat.lua"

!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
bytestostring: 342.7809001476724

!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~	96
bytestostring40: 378.8189898942492

!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
bytestostring41: 313.4475712841796

!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
FastBytesToString: 232.1523455767855

The figures vary a bit from one run to the other, but are consistent.

BTW, I wonder why bytestostring40 returns two values (because of gsub) and
bytestostring41 does not?
Oh, I can answer that myself: it is because of the seemingly useless
parentheses around the return value in bytestostring41. Actually, they are not
useless! With the new syntax, they reduce the number of returned values to
the first one.
I note this because it is a good way for me (and perhaps others) to better
memorize this feature...
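
The effect of the parentheses can be seen in isolation with a short sketch. It assumes Lua 5.1 or later (string.gsub and select are the modern spellings), and the function names both and first_only are made up for illustration.

```lua
-- gsub returns the new string AND the number of substitutions made.
function both()
  return string.gsub("abc", "b", "X")      -- "aXc", 1
end

-- Wrapping the call in parentheses keeps only the first result.
function first_only()
  return (string.gsub("abc", "b", "X"))    -- "aXc"
end

print(select("#", both()))        --> 2
print(select("#", first_only()))  --> 1
```

This is why printing the result of the unparenthesized version also prints the substitution count.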

Regards.

-- 
--=#=--=#=--=#=--=#=--=#=--=#=--=#=--=#=--=#=--
Philippe Lhoste (Paris -- France)
Professional programmer and amateur artist
http://jove.prohosting.com/~philho/
--=#=--=#=--=#=--=#=--=#=--=#=--=#=--=#=--=#=--

Re: Convert bytes array to string

Eric Tetz-2
In reply to this post by Roberto Ierusalimschy
--- Roberto Ierusalimschy <[hidden email]> wrote:
> Have you both agreed about how to test each solution? (I haven't seen
> that on the messages...) On my machine, the concat solution is
> faster than gsub for small arrays, but gsub gets better as the size
> grows. This is expected, as the concat solution is ~O(n^2), see
> http://www.lua.org/notes/ltn009.html.

I included the test code I used below, modified to use your 'concat' example and to try each
algorithm on small and large input. The results (clock() times in seconds for 1000 calls):

  Function         Small Input   Large Input
   concatenation    0.08          11.706
   'gsub'           0.121          4.456
   'concat'         0.1            3.165

Doubling the input size makes it easy to see that 'gsub' and 'concat' are linear:

  Function         Small Input   Large Input
   concatenation    0.16          53.467
   'gsub'           0.23           8.823
   'concat'         0.17           6.359
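
The linearity claim can be checked with a little arithmetic on the large-input figures from the two tables: if the time is linear in the input size, doubling the input should roughly double the time (ratio near 2), while a quadratic algorithm should take about four times as long. A small sketch in Lua 5.1+ syntax; the helper name growth is made up for illustration.

```lua
-- Ratio of the time after doubling the input to the time before.
function growth(before, after)
  return after / before
end

-- Large-input timings from the two tables: { name, original, doubled }.
local results = {
  { "concatenation", 11.706, 53.467 },  -- ratio about 4.57
  { "gsub",           4.456,  8.823 },  -- ratio about 1.98
  { "concat",         3.165,  6.359 },  -- ratio about 2.01
}

for _, r in ipairs(results) do
  print(string.format("%-14s x%.2f", r[1], growth(r[2], r[3])))
end
```

The ratio for plain concatenation is a little above 4, which fits the ~O(n^2) estimate quoted above.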

I used the gsub trick to speed up life.lua significantly (its output routine uses concatenation).
I suppose it should be switched to 'concat' for Lua 4.1.

Cheers,
Eric

small = {
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0
}

large = {
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
  1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,
}

function b2s_concatenation (bytes)
   local s = ""
   for i=1,getn(bytes) do s = s..strchar(bytes[i]) end
   return s
end

function b2s_gsub (bytes)
  local buf = strrep(' ', getn(bytes))
  local i = 0
  return (gsub (buf, '(.)', 
    function(c)
      i = i + 1
      return strchar(bytes[i])
    end))
end

function b2s_libconcat (bytes)
  local w = {}
  for i=1,getn(bytes) do w[i] = strchar(bytes[i]) end
  return concat(w)
end

function timethis (f, data)
  local start = clock()
  for i=1,1000 do f(data) end
  return clock() - start
end

function test(f, fname)
  print(
    format("%s\n\tsmall input: %g\n\tlarge input: %g\n\n",
      fname,
      timethis(f,small),
      timethis(f,large)
      ))
end

test (b2s_concatenation, "b2s_concatenation")
test (b2s_gsub,          "b2s_gsub")
test (b2s_libconcat,     "b2s_libconcat")




