getF in lauxlib.c--why call feof?

getF in lauxlib.c--why call feof?

John D. Ramsdell-2
I recently realized some C code I wrote does something very similar to
what is done by luaL_loadfile in src/lauxlib.c.  I studied the code
and found, with no surprise, some really good ideas I hadn't thought
of.  For example, my code just printed an error message and exited
when a read error occurs, but luaL_loadfile adjusts the Lua stack
so that these errors are accessed as others are.

While almost all of the code was clear to me, I am unable to explain
one line of code in getF.  Why is it necessary to call feof before the
fread?  Won't fread return 0 if feof returns true?  It seems that
fread correctly handles the case of EOF.

static const char *getF (lua_State *L, void *ud, size_t *size) {
  LoadF *lf = (LoadF *)ud;
  (void)L;
  if (lf->extraline) {
    lf->extraline = 0;
    *size = 1;
    return "\n";
  }
  if (feof(lf->f)) return NULL;      /* Why? */
  *size = fread(lf->buff, 1, sizeof(lf->buff), lf->f);
  return (*size > 0) ? lf->buff : NULL;
}

John

Re: getF in lauxlib.c--why call feof?

Luiz Henrique de Figueiredo
> Why is it necessary to call feof before the fread? Won't fread return
> 0 if feof returns true? It seems that fread correctly handles the case
> of EOF.

The man page says:

  fread does not distinguish between end-of-file and error, and callers
  must use feof(3) and ferror(3) to determine which occurred.
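
A minimal sketch of the usage the man page describes, classifying a short read after the fact. The enum and read_chunk are hypothetical names for illustration; they do not appear in lauxlib.c.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical result codes for this sketch. */
enum read_result { READ_OK, READ_EOF, READ_ERROR };

/* Read up to cap bytes into buf, store the byte count in *n, and
   classify a short read by asking the stream afterwards. */
enum read_result read_chunk(FILE *f, char *buf, size_t cap, size_t *n) {
  assert(f != NULL && buf != NULL && n != NULL);
  *n = fread(buf, 1, cap, f);
  if (*n == cap) return READ_OK;     /* full read, nothing to classify */
  if (ferror(f)) return READ_ERROR;  /* short because of an I/O error */
  if (feof(f)) return READ_EOF;      /* short because the input ended */
  return READ_OK;                    /* neither flag set, e.g. cap == 0 */
}
```

Note that READ_EOF can come back together with *n > 0: the bytes in buf are still valid, and the stream has simply ended.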


Re: getF in lauxlib.c--why call feof?

Thomas Harning Jr.
On Tue, 25 Mar 2008 14:38:47 -0300
Luiz Henrique de Figueiredo <[hidden email]> wrote:

> > Why is it necessary to call feof before the fread? Won't fread
> > return 0 if feof returns true? It seems that fread correctly
> > handles the case of EOF.
> 
> The man page says:
> 
>   fread does not distinguish between end-of-file and error, and
> callers must use feof(3) and ferror(3) to determine which occurred.
> 
Wouldn't that mean that you should call feof/ferror to determine the
error after the fact?

Re: getF in lauxlib.c--why call feof?

Daniel Stephens
In reply to this post by Luiz Henrique de Figueiredo
Luiz Henrique de Figueiredo wrote:
> > Why is it necessary to call feof before the fread? Won't fread return
> > 0 if feof returns true? It seems that fread correctly handles the case
> > of EOF.
>
> The man page says:
>
>   fread does not distinguish between end-of-file and error, and callers
>   must use feof(3) and ferror(3) to determine which occurred.



Lua doesn't appear to be using it that way. In that usage the call to feof is made AFTER fread returns 0, in order to figure out why. Here feof is called before, and the end result is identical to what you get if feof returns false and fread then returns 0.

/Usually/ feof is called after fread fails, because you don't always know you're at the end until you try reading past it.

(Certainly lua isn't doing anything /wrong/ here, but the call does seem unnecessary)

Daniel.

Re: getF in lauxlib.c--why call feof?

Roberto Ierusalimschy
In reply to this post by John D. Ramsdell-2
> While almost all of the code was clear to me, I am unable to explain
> one line of code in getF.  Why is it necessary to call feof before the
> fread?  Won't fread return 0 if feof returns true?  It seems that
> fread correctly handles the case of EOF.

This was added to the code in June 2002, with no proper documentation.
I am afraid we will have to figure out again why...

-- Roberto

Re: getF in lauxlib.c--why call feof?

Edgar Toernig
In reply to this post by John D. Ramsdell-2
John D. Ramsdell wrote:
>
> While almost all of the code was clear to me, I am unable to explain
> one line of code in getF.  Why is it necessary to call feof before the
> fread?  Won't fread return 0 if feof returns true?  It seems that
> fread correctly handles the case of EOF.
> 
>   if (feof(lf->f)) return NULL;      /* Why? */
>   *size = fread(lf->buff, 1, sizeof(lf->buff), lf->f);
>   return (*size > 0) ? lf->buff : NULL;

It's a precaution to not require excessive "CTRL-D"s when
reading from a terminal/stdin. (It may also help on not 100%
correct stdio implementations.)

Contrary to files, an EOF is not a permanent condition on
terminal devices: one fread/getc/etc may signal EOF (because
the user pressed CTRL-D) but the next fread will happily try
to read more data (and block).

Try "lua -" and count how many "CTRL-D"s are required with
and without that "if (feof..." line.

Ciao, ET.

Re: getF in lauxlib.c--why call feof?

Daniel Stephens
On OSX my experience doesn't match your explanation, for two reasons...

1) feof doesn't actually do anything that'd cause the EOF flag to get set, it just checks the existing flag
2) While the 'condition' isn't permanent, the FLAG is, until you call clearerr

In this code, feof is completely unnecessary.

A more thorough implementation might be something along the lines of:

*size = fread(lf->buff, 1, sizeof(lf->buff), lf->f);
if (*size > 0) return lf->buff;
if (feof(lf->f)) {
 /* Do something indicating EOF occurred */
 return NULL;
}
if (ferror(lf->f)) {
 /* Do something indicating error occurred */
}
/* Do something indicating immediate return from read (no error -- was lf->buff empty?) */


Daniel.

Edgar Toernig wrote:
> John D. Ramsdell wrote:
> > While almost all of the code was clear to me, I am unable to explain
> > one line of code in getF.  Why is it necessary to call feof before the
> > fread?  Won't fread return 0 if feof returns true?  It seems that
> > fread correctly handles the case of EOF.
> >
> >   if (feof(lf->f)) return NULL;      /* Why? */
> >   *size = fread(lf->buff, 1, sizeof(lf->buff), lf->f);
> >   return (*size > 0) ? lf->buff : NULL;
>
> It's a precaution to not require excessive "CTRL-D"s when
> reading from a terminal/stdin. (It may also help on not 100%
> correct stdio implementations.)
>
> Contrary to files, an EOF is not a permanent condition on
> terminal devices: one fread/getc/etc may signal EOF (because
> the user pressed CTRL-D) but the next fread will happily try
> to read more data (and block).
>
> Try "lua -" and count how many "CTRL-D"s are required with
> and without that "if (feof..." line.
>
> Ciao, ET.



Re: getF in lauxlib.c--why call feof?

Edgar Toernig
Daniel Stephens wrote:
>
> On OSX my experience doesn't match your explanation, for two reasons...

Did you try the "lua -" + CTRL-D test with and without the line?

> 1) feof doesn't actually do anything that'd cause the EOF flag to get 
> set, it just checks the existing flag

Right, the feof flag is set somewhere deep within stdio.

> 2) While the 'condition' isn't permanent, the FLAG is, until you call 
> clearerr

Exactly.  And that's the point: it's a notification that stdio has
seen EOF on the FILE*.
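
Both quoted points can be checked on a regular file. A minimal sketch, assuming a stream that has not been read yet; eof_flag_history is a hypothetical helper written for this illustration, not something from Lua.

```c
#include <assert.h>
#include <stdio.h>

/* Records feof() at three moments: before any read, after reading to
   the end, and after clearerr(). Each entry is normalized to 0 or 1. */
void eof_flag_history(FILE *f, int out[3]) {
  assert(f != NULL && out != NULL);
  out[0] = (feof(f) != 0);    /* feof only reports; it reads nothing */
  while (getc(f) != EOF) { }  /* the failed read is what sets the flag */
  out[1] = (feof(f) != 0);    /* flag stays set from now on */
  clearerr(f);                /* resets both the EOF and error flags */
  out[2] = (feof(f) != 0);
}
```

Even on an empty file the first entry is 0: the flag only appears once a read has actually run into the end.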

> In this code, feof is completely unnecessary.

Only if you never call fread/fgetc/etc after you've seen an EOF.
But if you call one of these routines again with the expectation
that they will report EOF again you *may* lose.

> A more thorough implementation might be something along the lines of:
> 
> *size = fread(lf->buff, 1, sizeof(lf->buff), lf->f);
> if (*size > 0) return lf->buff;
> if (feof(lf->f)) {
>   /* Do something indicating EOF occurred */
>   return NULL;
> }

This feof-test is at the wrong place.  First, you may already
be at EOF (happens in Lua) but more important, the fread may
return > 0 but may have also set feof.  If you call fread again
and lf->f is a terminal it will wait for the next line or CTRL-D.
To circumvent this, the feof-test has to be before the fread.
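
The "return > 0 but also set feof" case can be reproduced without a terminal. A minimal sketch, assuming a regular file shorter than the buffer; read_once is a hypothetical helper, not part of lauxlib.c.

```c
#include <assert.h>
#include <stdio.h>

/* Performs a single fread and reports, through *eof_after, whether the
   stream's EOF indicator is already set when that call returns. */
size_t read_once(FILE *f, char *buf, size_t cap, int *eof_after) {
  assert(f != NULL && buf != NULL && eof_after != NULL);
  size_t n = fread(buf, 1, cap, f);
  *eof_after = (feof(f) != 0);  /* can be 1 even though n > 0 */
  return n;
}
```

On a file, a second fread would simply return 0. On a terminal it may instead go back to the device and wait for more input, which is the situation the test before the fread avoids.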

Ciao, ET.

Re: getF in lauxlib.c--why call feof?

Luiz Henrique de Figueiredo
> more important, the fread may return > 0 but may have also set feof.
> If you call fread again and lf->f is a terminal it will wait for the
> next line or CTRL-D.

That's exactly the point. Thanks, ET.

Re: getF in lauxlib.c--why call feof?

Daniel Stephens
In reply to this post by Edgar Toernig
Edgar Toernig wrote:
> Daniel Stephens wrote:
> > On OSX my experience doesn't match your explanation, for two reasons...
>
> Did you try the "lua -" + CTRL-D test with and without the line?

Yes, it works just fine on OSX, but either way, if that's failing it's much more likely the fault of the call to ungetc earlier that doesn't bother to check if c == EOF.

> > In this code, feof is completely unnecessary.
>
> Only if you never call fread/fgetc/etc after you've seen an EOF.
> But if you call one of these routines again with the expectation
> that they will report EOF again you *may* lose.

I disagree, it's unnecessary as long as PREVIOUS reads bothered to check feof (which is what they're SUPPOSED to do, but the calling module (luaL_loadfile in this case) isn't very consistent there).

> This feof-test is at the wrong place.  First, you may already
> be at EOF (happens in Lua) but more important, the fread may
> return > 0 but may have also set feof.  If you call fread again
> and lf->f is a terminal it will wait for the next line or CTRL-D.
> To circumvent this, the feof-test has to be before the fread.

I'm not the only person who interprets the documentation to mean 'call feof after you've had a read fail'. My experience has been that subsequent reads will fail immediately until you clear the flag, but I'll admit that I didn't do exhaustive testing on the latter.




Re: getF in lauxlib.c--why call feof?

Daniel Stephens
Daniel Stephens wrote:
> but either way, if that's failing it's much more likely the fault of the call to ungetc earlier that doesn't bother to check if c == EOF.

I'll retract that part of what I said, as I checked the documentation and ungetc has a very specific (and useful) outcome in that case.

(lua - still passes the Ctrl-D test with the feof test in getF removed tho)

Daniel.

Re: getF in lauxlib.c--why call feof?

Luiz Henrique de Figueiredo
> (lua - still passes the Ctrl-D test with the feof test in getF removed tho)

It does in Mac OS X, but in Linux you have to press Ctrl-D twice
if you enter some text and three times if you press it immediately!

% lua -
^D^D^D
% lua -
a=1
^D^D
%

So, like ET said, it's a precaution against faulty handling of EOF in terminals.

Python seems to have a similar problem:
	http://www.thescripts.com/forum/thread507634.html

I've tested it and that program needs two ^D to end the loop, even in Python 2.5.
--lhf

Re: getF in lauxlib.c--why call feof?

John D. Ramsdell-2
In reply to this post by John D. Ramsdell-2
> more important, the fread may return > 0 but may have also set feof.
> If you call fread again and lf->f is a terminal it will wait for the
> next line or CTRL-D.

Thank you for this explanation.  Please add a comment to the code that
makes it easy for readers to understand this line of code.  You could
simply modify ET's quote if you're stuck for words.

/* EOF flag checked here because when lf->f is a terminal, the fread
   may return > 0 even when it is set.  If you call fread again and
   lf->f is a terminal it will wait for the next line or CTRL-D. */

John

Re: getF in lauxlib.c--why call feof?

Luiz Henrique de Figueiredo
> /* EOF flag checked here because when lf->f is a terminal, the fread
>    may return > 0 even when it is set.  If you call fread again and
>    lf->f is a terminal it will wait for the next line or CTRL-D. */

Actually I think the exact problem is a little different: fread can
return > 0  *and* set the EOF flag. The next time getF is called, if
you call fread, then the terminal will wait for user input. By calling
feof before fread, you avoid this wait.

Re: getF in lauxlib.c--why call feof?

Edgar Toernig
Luiz Henrique de Figueiredo wrote:
>
> > /* EOF flag checked here because when lf->f is a terminal, the fread
> >    may return > 0 even when it is set.  If you call fread again and
> >    lf->f is a terminal it will wait for the next line or CTRL-D. */
> 
> Actually I think the exact problem is a little different: fread can
> return > 0  *and* set the EOF flag. The next time getF is called, if
> you call fread, then the terminal will wait for user input. By calling
> feof before fread, you avoid this wait.

How about

	/*
	 * Trying to read past EOF or after an error exposes implemen-
	 * tation and device (file/tty) differences.  Better not ...
	 */
	if (feof(lf->f) || ferror(lf->f))
		return NULL;
?
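
A minimal sketch of what that guard buys, assuming a regular file shorter than the buffer. Reader is a hypothetical stand-in for Lua's LoadF, with a counter added purely so the number of fread calls can be observed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for LoadF; freads counts calls to fread. */
typedef struct {
  FILE *f;
  int freads;
  char buff[64];
} Reader;

/* Returns the next chunk, or NULL once the stream has reported EOF or
   an error. The guard runs before fread, so a stream whose EOF flag was
   set by the previous, successful read is never read again. */
const char *next_chunk(Reader *r, size_t *size) {
  assert(r != NULL && r->f != NULL && size != NULL);
  *size = 0;
  if (feof(r->f) || ferror(r->f)) return NULL;
  r->freads++;
  *size = fread(r->buff, 1, sizeof(r->buff), r->f);
  return (*size > 0) ? r->buff : NULL;
}

/* Drains the stream and returns the total number of bytes delivered. */
size_t drain(Reader *r) {
  size_t total = 0, n = 0;
  while (next_chunk(r, &n) != NULL) total += n;
  return total;
}
```

For a short input, drain ends after a single fread. Without the guard there would be a second fread, and that second call is the one that can wait for input on a terminal.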

Ciao, ET.

Re: getF in lauxlib.c--why call feof?

David Jones-2

On 26 Mar 2008, at 20:51, Edgar Toernig wrote:
> Luiz Henrique de Figueiredo wrote:
> > > /* EOF flag checked here because when lf->f is a terminal, the fread
> > >    may return > 0 even when it is set.  If you call fread again and
> > >    lf->f is a terminal it will wait for the next line or CTRL-D. */
> >
> > Actually I think the exact problem is a little different: fread can
> > return > 0  *and* set the EOF flag. The next time getF is called, if
> > you call fread, then the terminal will wait for user input. By calling
> > feof before fread, you avoid this wait.
>
> How about
>
> 	/*
> 	 * Trying to read past EOF or after an error exposes implemen-
> 	 * tation and device (file/tty) differences.  Better not ...
> 	 */
> 	if (feof(lf->f) || ferror(lf->f))
> 		return NULL;

Perhaps with an additional note that this is only necessary for implementations of fread that violate the C standard:

ISO 9899:1999 Section 7.19.7.1 (fgetc):

"If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF."

(and fread is documented as working as if it calls fgetc many times).

This violation (in general, successive reads from a terminal return EOF only once) is traditional Unix behaviour, and there may be many older Unix programs that rely on it, which is perhaps why Linux implements it the way it does.

drj