[Rd] (PR#7899) seek(con, 0, "end", rw="r") does not always work

ligges at statistik.uni-dortmund.de
Sat May 28 18:11:31 CEST 2005


Tony Plate wrote:
> ligges at statistik.uni-dortmund.de wrote:
> 
>> tplate at blackmesacapital.com wrote:
>>
>>
>>> I've noticed that seek(con, 0, "end", rw="r") on a file connection 
>>> does not always work correctly after a write (R 2.1.0 on Windows).
>>>
>>> [Is a call to fflush() needed inside file_seek() in main/connections.c?]
>>
>>
>>
>>
>> If you have an idea where to fflush() precisely and your patch works, 
>> please tell it! I'll happily run some test cases where seeking matters.
>>
> 
> I couldn't see why the current code was returning a bad value under some 
> conditions, which is why I didn't offer anything more than a suggestion. 
> My suggestion to use fflush() was a guess (hence the question mark), but 
> evidence that the guess is correct is that calling flush() at the R 
> command line made the whole thing work correctly.  To be safe, I would 
> put an fflush() right at the beginning of file_seek(), before the 
> call to f_tell().  I tried this, and with the modification the test case 
> I gave produced correct output.  Here's how the beginning of my modified 
> file_seek() function (in main/connections.c) looks:
>
> static double file_seek(Rconnection con, double where, int origin, int rw)
> {
>     Rfileconn this = con->private;
>     FILE *fp = this->fp;
> #if defined(HAVE_OFF_T) && defined(__USE_LARGEFILE)
>     off_t pos;
> #else
> #ifdef Win32
>     off64_t pos;
> #else
>     long pos;
> #endif
> #endif
>     int whence = SEEK_SET;
>     fflush(fp);
>     pos = f_tell(fp);
> 
>     /* make sure both positions are set */
> 


Works for your example, but I found another one where it introduces a 
worse bug when using origin="current". Hence it's not that easy.

After reviewing this issue more closely, I think that using writeLines() on 
a binary connection might be the real problem, and a misuse in this case. 
See the last paragraph in the Details section of ?writeLines. Hence, 
this might also be related to the text-mode connection problem 
on Windows.

Using simple writeChar() and readChar() calls works as expected for me 
(at least, I was not able to produce anything unexpected). I'm no longer 
convinced that this is a bug in R.




>> Note that ?seek currently tells us "The value returned by 
>> seek(where=NA) appears to be unreliable on Windows systems, at least 
>> for text files."
>> It would be nice if this comment could be removed, of course ....
> 
> 
> Maybe the explanation could be given that this happens with text files 
> because Windows inserts extra characters at end-of-lines when reading 
> "text" mode files (but with binary files, things should be fine.)  This 
> particular issue is documented in Microsoft Windows documentation (e.g., 
> at http://msdn2.microsoft.com/library/75yw9bf3(en-us,vs.80).aspx, found 
> by searching on Google using the terms "fseek windows documentation"). 
> Are there any known issues using seek with binary files under Windows? 
> If there are not, then the caveat could be made specific to text files 
> and all vagueness removed.

Hmm, all I find (including your link) is Windows CE related ...

Uwe Ligges



> 
> -- Tony Plate
> 
>>
>> Uwe Ligges
>>
>>
>>
>>
>>> Example (see the lines with the "***WRONG***" comment)
>>>
>>> > # seek(, rw="r") on a file does not always work correctly after a write
>>> > f <- file("tmp3.txt", "w+b")
>>> > # Write something earlier in the file
>>> > seek(f, 10, rw="w")
>>> [1] 0
>>> > writeLines(c("ghi", "jkl"), f)
>>> > seek(f, 20, rw="w")
>>> [1] 18
>>> > writeLines(c("abc"), f)
>>> > seek(f, 0, "end", rw="w")
>>> [1] 24
>>> > # Try to read at the end of the file
>>> > seek(f, 0, "end", rw="r")
>>> [1] 0
>>> > readLines(f, -1)
>>> character(0)
>>> > seek(f, 0, "end", rw="w")
>>> [1] 18
>>> > # write something at the end of the file
>>> > writeLines(c("def"), f)
>>> > # Try to read at the end of the file
>>> > # flush(f) # flushing here makes the seek work correctly
>>> > seek(f, 0, "end", rw="r")
>>> [1] 24
>>> > seek(f, NA, rw="r") # ***WRONG*** (should return 28)
>>> [1] 24
>>> > readLines(f, -1) # ***WRONG*** (should return character(0))
>>> [1] "def"
>>> > seek(f, 20, rw="r")
>>> [1] 28
>>> > readLines(f, -1)
>>> [1] "abc" "def"
>>> > seek(f, 0, "end", rw="r") # now it works correctly
>>> [1] 28
>>> > seek(f, NA, rw="r")
>>> [1] 28
>>> > readLines(f, -1)
>>> character(0)
>>> > close(f)
>>> >
>>> > version
>>>          _
>>> platform i386-pc-mingw32
>>> arch     i386
>>> os       mingw32
>>> system   i386, mingw32
>>> status
>>> major    2
>>> minor    1.0
>>> year     2005
>>> month    04
>>> day      18
>>> language R
>>> >
>>>
>>> -- Tony Plate
>>>
>>> ______________________________________________
>>> R-devel at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>>
>>


