[R] Reading word by word in a dataset

Spencer Graves spencer.graves at pdf.com
Mon Nov 1 23:53:18 CET 2004


Dear Andy & Tony: 

      That's great.  Unfortunately, I still spend most of my life in the 
S-Plus world, and read.table in S-Plus 6.2 does not have the "fill" 
argument.  However, both Tony's solution and my ugly hack work in 
S-Plus 6.2 as well as in R 2.0.0. 

      Thanks again. 
      Spencer Graves
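
For reference, here is a minimal sketch of the "fill" route mentioned above. It applies to R only, not S-Plus 6.2, and assumes an R version recent enough that read.table() accepts a text= argument. The variable names are illustrative, and the ragged three-line example from this thread is typed in directly rather than read from the clipboard:

```r
# Ragged example from the thread: line 2 has only one field.
lines <- c("i1-apple        10$   New_York",
           "i2-banana",
           "i3-strawberry   7$    Japan")

# fill = TRUE pads short rows with empty fields instead of stopping
# with "line 2 did not have 3 elements".
dat <- read.table(text = lines, fill = TRUE, stringsAsFactors = FALSE)

# Keep the first column and drop everything up to the first "-".
first_words <- sub("^[^-]*-", "", dat[[1]])
print(first_words)
```

Reading from the clipboard instead would mean replacing text = lines with "clipboard", as in the quoted messages below.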

Tony Plate wrote:

> Trying to make it work when not all rows have the same number of 
> fields seems like a good place to use the "flush" argument to scan() 
> (to skip everything after the first field on the line):
>
> With the following copied to the clipboard:
>
> i1-apple        10$   New_York
> i2-banana
> i3-strawberry   7$    Japan
>
> do:
>
> > scan("clipboard", "", flush=T)
> Read 3 items
> [1] "i1-apple"      "i2-banana"     "i3-strawberry"
> > sub("^[A-Za-z0-9]*-", "", scan("clipboard", "", flush=T))
> Read 3 items
> [1] "apple"      "banana"     "strawberry"
> >
>
> -- Tony Plate
>
> At Monday 01:59 PM 11/1/2004, Spencer Graves wrote:
>
>>      Uwe's and Andy's solutions are great for many applications but 
>> won't work if not all rows have the same number of fields.  Consider 
>> for example the following modification of Lee's example:
>> i1-apple        10$   New_York
>> i2-banana
>> i3-strawberry   7$    Japan
>>
>>      If I copy this to "clipboard" and run Andy's code, I get the 
>> following:
>> > read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
>> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  :
>>    line 2 did not have 3 elements
>>
>>      We can get around this using "scan", then splitting things apart 
>> similar to the way Uwe described:
>> > dat <-
>> + scan("clipboard", character(0), sep="\n")
>> Read 3 items
>> > dash <- regexpr("-", dat)
>> > dat2 <- substring(dat, pmax(0, dash)+1)
>> >
>> > blank <- regexpr(" ", dat2)
>> > if(any(blank<0))
>> +   blank[blank<0] <- nchar(dat2[blank<0]) + 1
>> > substring(dat2, 1, blank - 1)
>> [1] "apple"      "banana"     "strawberry"
>>
>>      hope this helps.  spencer graves
>>
>> Uwe Ligges wrote:
>>
>>> Liaw, Andy wrote:
>>>
>>>> Using R-2.0.0 on WinXPPro, cut-and-pasting the data you have:
>>>>
>>>> > read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
>>>>              V1
>>>> 1      i1-apple
>>>> 2     i2-banana
>>>> 3 i3-strawberry
>>>
>>>
>>>
>>>
>>> ... and if only the words after "-" are of interest, the statement 
>>> can be followed by
>>>
>>>  sapply(strsplit(...., "-"), "[", 2)
>>>
>>>
>>> Uwe Ligges
>>>
>>>
>>>
>>>> HTH,
>>>> Andy
>>>>
>>>>
>>>>> From: j lee
>>>>>
>>>>> Hello All,
>>>>>
>>>>> I'd like to read first words in lines into a new file.
>>>>> If I have a data file the following, how can I get the
>>>>> first words: apple, banana, strawberry?
>>>>>
>>>>> i1-apple        10$   New_York
>>>>> i2-banana       5$    London
>>>>> i3-strawberry   7$    Japan
>>>>>
>>>>> Is there any similar question already posted to the
>>>>> list? I am a bit new to R, having a few months of
>>>>> experience now.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> John
>>>>>
>>>>> ______________________________________________
>>>>> R-help at stat.math.ethz.ch mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide! 
>>>>> http://www.R-project.org/posting-guide.html
>>>>>
>>
>>
>>
>> -- 
>> Spencer Graves, PhD, Senior Development Engineer
>> O:  (408)938-4420;  mobile:  (408)655-4567


-- 
Spencer Graves, PhD, Senior Development Engineer
O:  (408)938-4420;  mobile:  (408)655-4567
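
For readers without a Windows clipboard, the two approaches quoted above can be tried on an in-memory character vector. This is only a sketch: it assumes an R version in which scan() accepts a text= argument, the variable names are made up for illustration, and Uwe's strsplit() idiom is applied here to the output of scan() rather than to whatever his elided "...." stood for:

```r
# Ragged example from the thread: line 2 has only one field.
lines <- c("i1-apple        10$   New_York",
           "i2-banana",
           "i3-strawberry   7$    Japan")

# Tony's approach: flush = TRUE discards everything after the
# first field on each line, so ragged rows are not a problem.
first_fields <- scan(text = lines, what = "", flush = TRUE, quiet = TRUE)

# Strip the leading "i1-", "i2-", ... prefix with sub().
via_sub <- sub("^[A-Za-z0-9]*-", "", first_fields)

# Uwe's idiom on the same vector: split on "-" and take the
# second piece of each element.
via_split <- sapply(strsplit(first_fields, "-"), "[", 2)

print(via_sub)
print(via_split)
```

The two results agree on this data. They would differ for a word that itself contains a hyphen: sub() removes only the leading prefix, whereas taking the second piece after strsplit() cuts the word at its next "-".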