[R] Argh! Trouble using string data read from a file

Ted Byers r.ted.byers at gmail.com
Thu Oct 16 06:35:41 CEST 2008


Thank you Prof. Ripley.

I appreciate this.

Have a good day.

Ted

On Thu, Oct 16, 2008 at 12:20 AM, Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:
> On Wed, 15 Oct 2008, Ted Byers wrote:
>
>> Thanks Jim,
>>
>> I hadn't seen the distinction between the commandline in RGui and what
>> happens within my code.
>>
>> I have, however, seen other differences I don't understand.  For
>> example, looking at the documentation for Rscript, I see:
>>
>> Rscript [options] [-e expression] file [args]
>>
>> And the example:
>>
>> Rscript -e 'date()' -e 'format(Sys.time(), "%a %b %d %X %Y")'
>>
>>
>> So I tried it (Windows XP; R 2.7.2), and this is what I got by copying
>> directly from the documentation and pasting into the Windows
>> commandline window:
>
> Your problem is the shell quoting: the Windows shell requires double quotes ("). E.g.
>
> C:\> d:/R/R-2.7.2/bin/Rscript -e "date()" -e "format(Sys.time(), \"%a %b %d %X %Y\")"
> [1] "Thu Oct 16 05:16:46 2008"
> [1] "Thu Oct 16 05:16:46 2008"
>
> Other shells (e.g. bash, tcsh) do allow '', and indeed that is the preferred
> form there.  See ?shQuote .
>
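The pointer to ?shQuote can be tried directly from R. The sketch below is not from the thread: it only asks shQuote() to quote the documentation's expression once for a POSIX shell and once for the Windows command interpreter, using base R alone. The variable names are made up for illustration.

```r
# The expression from the Rscript documentation example.
expr <- 'format(Sys.time(), "%a %b %d %X %Y")'

# POSIX shells such as bash: wrap the expression in single quotes.
sh_form <- shQuote(expr, type = "sh")

# Windows cmd.exe: wrap in double quotes and escape the inner ones with \.
cmd_form <- shQuote(expr, type = "cmd")

# Show the two command lines one would type in each shell.
cat("Rscript -e", sh_form, "\n")
cat("Rscript -e", cmd_form, "\n")
```

The second line printed has the same shape as the command that works on Windows above: double quotes outside, backslash-escaped double quotes inside.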
>>
>> C:\>Rscript -e 'date()' -e 'format(Sys.time(), "%a %b %d %X %Y")'
>> [1] "date()"
>>
>> C:\>Rscript -e 'format(Sys.time(), "%a %b %d %X %Y")'
>>
>> C:\>
>>
>> But within RGui, I get:
>>
>> > date();format(Sys.time(), "%a %b %d %X %Y")
>> [1] "Wed Oct 15 20:36:57 2008"
>> [1] "Wed Oct 15 8:36:57 PM 2008"
>> >
>>
>> Thanks again
>>
>> Ted
>>
>> On Wed, Oct 15, 2008 at 8:09 PM, jim holtman <jholtman at gmail.com> wrote:
>>>
>>> You have to explicitly 'print' the value of x in the loop:    print(x)
>>>
>>> 'x' by itself is just its value.  At the command line, typing an
>>> object's name is equivalent to printing that object, but it only
>>> happens at the command line.  If you want a value printed, then 'print'
>>> it.  That also works at the command line if you want to use it there.
>>>
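The behaviour described here is easy to reproduce without any files. The following is a minimal sketch with toy numbers standing in for the data that read.csv() would return. It also shows that a for loop does not create a new scope, so x is still there after the loop and holds whatever the last iteration assigned.

```r
# Inside a loop, a bare 'x' is evaluated but never auto-printed.
for (i in 1:3) {
  x <- i^2
  x
}

# An explicit print() shows each value as the loop runs.
for (i in 1:3) {
  x <- i^2
  print(x)
}

# x survives the loop and holds the value from the last iteration;
# at top level, typing its name auto-prints it.
x
```

Applied to the loop in the thread, this suggests that the one-row x seen after the loop is simply the contents of the last file read, though the thread itself does not confirm that.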
>>> On Wed, Oct 15, 2008 at 5:36 PM, Ted Byers <r.ted.byers at gmail.com> wrote:
>>>>
>>>> Actually, I'd tried single brackets first.  Here is what I got:
>>>>
>>>> > for (i in 1:length(V4) ) { x = read.csv(V4[i], header = FALSE, na.strings="");x }
>>>> Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
>>>>   'file' must be a character string or connection
>>>> >
>>>>
>>>>
>>>> The advice to use as.character worked, in that progress has been made.
>>>>
>>>> Can you guys explain the following output, though?
>>>>
>>>> > setwd("K:\\MerchantData\\RiskModel\\AutomatedRiskModel")
>>>> > for (i in 1:length(V4) ) { x = read.csv(as.character(V4[[i]]), header = FALSE, na.strings="");x }
>>>> > x
>>>>   V1
>>>> 1  0
>>>> >
>>>> > x = read.csv(as.character(V4[[1]]), header = FALSE, na.strings="");x
>>>>    V1
>>>> 1     0
>>>> 2     0
>>>> 3    21
>>>> 4     0
>>>> 5     1
>>>> 6     7
>>>> 7    51
>>>> 8    20
>>>> 9     3
>>>> 10    5
>>>> 11    6
>>>> 12    8
>>>> 13    2
>>>> 14    0
>>>> 15    2
>>>> 16    4
>>>> 17   23
>>>>
>>>> Clearly, if I hand write a line to read the data, getting the file
>>>> name from V4 (in this case V4[[1]]), I get the data into 'x', which I
>>>> can then display.  I only displayed the first few as some of these
>>>> files will have thousands of values.
>>>>
>>>> But what puzzles me is that I saw virtually no output from my loop.  I
>>>> thought what would happen (with the x after the ';') is that the
>>>> contents of each file would be displayed after it is read and before
>>>> the next is read.  And after the loop finishes, there is nothing in
>>>> x.  I don't see why the contents of x would disappear after the loop,
>>>> unless R has scoping restrictions as stringent as, say, C++ (e.g. a
>>>> variable declared inside a loop is not visible outside the loop).  But
>>>> that would beg the question as to how to declare a variable before it
>>>> is first used.
>>>>
>>>> This doesn't bode well for me, or perhaps my ability to learn a new
>>>> trick at my age, when such a simple loop should give me such trouble.
>>>> :-(
>>>>
>>>> Getting more grey hair by the minute.  :-(
>>>>
>>>> Thanks
>>>>
>>>> ted
>>>>
>>>> On Wed, Oct 15, 2008 at 5:12 PM, Rolf Turner <r.turner at auckland.ac.nz>
>>>> wrote:
>>>>>
>>>>> On 16/10/2008, at 10:03 AM, jim holtman wrote:
>>>>>
>>>>>> try putting as.character in the call:
>>>>>>
>>>>>> x = read.csv(as.character(V4[[i]]), header = FALSE)
>>>>>
>>>>> No.  This won't help.  V4 is a column of the data frame optdata,
>>>>> and hence is a vector.  Not a list!  Use single brackets --- V4[i] ---
>>>>> and all will be well.
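For what it is worth, the "60 Levels" output further down the thread shows that V4 is a factor, and a factor element stays a factor under either kind of bracket. The sketch below uses a made-up two-row data frame in place of soptions.dat (the file names are hypothetical) to show that as.character() is what produces a usable file name. It also shows as.is = TRUE, a read.csv() argument that keeps strings as character in the first place.

```r
# Hypothetical stand-in for the index file; V4 is deliberately a factor.
optdata <- data.frame(V4 = factor(c("b.dat", "a.dat")))
V4 <- optdata$V4

is.character(V4[1])      # FALSE: single brackets still give a factor
is.character(V4[[1]])    # FALSE: double brackets still give a factor
as.integer(V4[[1]])      # the underlying level code, not the file name
fname <- as.character(V4[[1]])   # a plain character string

# Reading the index with as.is = TRUE keeps strings as character.
tmp <- tempfile(fileext = ".csv")
writeLines(c("251,2008,18,b.dat", "251,2008,19,a.dat"), tmp)
idx <- read.csv(tmp, header = FALSE, as.is = TRUE)
is.character(idx$V4)     # TRUE
unlink(tmp)
```

With a character column, idx$V4[i] can be passed to read.csv() directly, with no conversion needed.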
>>>>>
>>>>>       cheers,
>>>>>
>>>>>               Rolf
>>>>>>
>>>>>> On Wed, Oct 15, 2008 at 4:46 PM, Ted Byers <r.ted.byers at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Here is what I tried:
>>>>>>>
>>>>>>> optdata = read.csv("K:\\MerchantData\\RiskModel\\AutomatedRiskModel\\soptions.dat", header = FALSE, na.strings="")
>>>>>>> optdata
>>>>>>> attach(optdata)
>>>>>>> for (i in 1:length(V4) ) { x = read.csv(V4[[i]], header = FALSE, na.strings="");x }
>>>>>>>
>>>>>>> And here  is the outcome (just a few of the 60 records successfully
>>>>>>> read):
>>>>>>> > optdata = read.csv("K:\\MerchantData\\RiskModel\\AutomatedRiskModel\\soptions.dat", header = FALSE, na.strings="")
>>>>>>> > optdata
>>>>>>>  V1   V2 V3                        V4
>>>>>>> 1  251 2008 18 Plus_Shipping.2008.18.dat
>>>>>>> 2  251 2008 19 Plus_Shipping.2008.19.dat
>>>>>>> 3  251 2008 20 Plus_Shipping.2008.20.dat
>>>>>>> 4  251 2008 22 Plus_Shipping.2008.22.dat
>>>>>>> 5  251 2008 23 Plus_Shipping.2008.23.dat
>>>>>>> 6  251 2008 24 Plus_Shipping.2008.24.dat
>>>>>>>
>>>>>>> I can see the data has been correctly read.  But for some reason that isn't
>>>>>>> clear, read.csv doesn't like the data in the last column.
>>>>>>>
>>>>>>> > attach(optdata)
>>>>>>> > for (i in 1:length(V4) ) { x = read.csv(V4[[i]], header = FALSE, na.strings="");x }
>>>>>>> Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
>>>>>>>   'file' must be a character string or connection
>>>>>>> > V4[[1]]
>>>>>>> [1] Plus_Shipping.2008.18.dat
>>>>>>> 60 Levels: Easyway.2008.17.dat Easyway.2008.18.dat Easyway.2008.19.dat Easyway.2008.20.dat ... Secured_Pay.2008.31.dat
>>>>>>> >
>>>>>>>
>>>>>>> The last column consists of valid Windows filenames (with no whitespace,
>>>>>>> so as not to confuse things).
>>>>>>>
>>>>>>> I see in the documentation "`[[...]]' is the operator used to select a
>>>>>>> single element, whereas `[...]' is a general subscripting operator.", so I
>>>>>>> assume V4[[i]] is the correct way to get the ith value from V4.  So why does
>>>>>>> read.csv complain that "'file' must be a character string or connection"?
>>>>>>> It seems obvious that the value in V4[[i]] is a string.  V4[[1]] does give
>>>>>>> me the right value, although that is followed by output I didn't ask for.
>>>>>>>
>>>>>>> In the loop above, I was going to replace the output obtained by 'x' with
>>>>>>> output from fitdistr(x,"exponential"), but I can't proceed with that until I
>>>>>>> can get the data in these files read.
>>>>>>>
>>>>>>> What have I missed?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Ted
>>>>>>> --
>>>>>>> View this message in context:
>>>>>>>
>>>>>>> http://www.nabble.com/Argh%21--Trouble-using-string-data-read-from-a-file-tp20002064p20002064.html
>>>>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help at r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jim Holtman
>>>>>> Cincinnati, OH
>>>>>> +1 513 646 9390
>>>>>>
>>>>>> What is the problem that you are trying to solve?
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>


