[R] Reading word by word in a dataset
John
cyracules at yahoo.co.uk
Thu Nov 4 14:00:30 CET 2004
Thanks, Tony.
Your reply gave me a very good idea of how to use "flush" in scan(),
and I got my little job done.
My next question: suppose I want to extract only the list of prices,
i.e. the 2nd column in my example.
I did it the following way. Is this the right way to do it?
Or is there a smarter or more efficient way?
> system("more mtx.ex.1")
i1-apple 10$ New_York
i2-banana 5$ London
i3-strawberry 7$ Japan
>
> scan(file="mtx.ex.1", what=list(NULL,""), flush=T)[[2]]
Read 3 records
[1] "10$" "5$" "7$"
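For anyone who wants to try this without my file, here is a minimal, self-contained sketch of the same call. It assumes a temporary file standing in for mtx.ex.1. The names `path`, `prices` and `price_values` are made up for illustration. The last step, which strips the "$" to get numbers, is an extra that goes beyond what I asked above.

```r
# Hypothetical stand-in for mtx.ex.1, written to a temporary file
path <- tempfile(fileext = ".txt")
writeLines(c("i1-apple 10$ New_York",
             "i2-banana 5$ London",
             "i3-strawberry 7$ Japan"), path)

# NULL skips field 1, "" reads field 2 as character, and flush = TRUE
# discards everything after the last requested field on each line
prices <- scan(file = path, what = list(NULL, ""),
               flush = TRUE, quiet = TRUE)[[2]]

# Assumed extra step: drop the trailing "$" and convert to numeric
price_values <- as.numeric(sub("\\$$", "", prices))

unlink(path)
```

The `what` list needs one entry per field up to the last one wanted, so a third column could be picked out the same way with `what=list(NULL, NULL, "")`.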
Cheers,
John
--- Tony Plate <tplate at acm.org> wrote:
> Trying to make it work when not all rows have the same number of
> fields seems like a good place to use the "flush" argument to scan()
> (to skip everything after the first field on the line):
>
> With the following copied to the clipboard:
>
> i1-apple 10$ New_York
> i2-banana
> i3-strawberry 7$ Japan
>
> do:
>
> > scan("clipboard", "", flush=T)
> Read 3 items
> [1] "i1-apple" "i2-banana" "i3-strawberry"
> > sub("^[A-Za-z0-9]*-", "", scan("clipboard", "", flush=T))
> Read 3 items
> [1] "apple" "banana" "strawberry"
> >
>
> -- Tony Plate
>
> At Monday 01:59 PM 11/1/2004, Spencer Graves wrote:
> > Uwe and Andy's solutions are great for many applications but won't
> > work if not all rows have the same number of fields. Consider for
> > example the following modification of Lee's example:
> >i1-apple 10$ New_York
> >i2-banana
> >i3-strawberry 7$ Japan
> >
> > If I copy this to "clipboard" and run Andy's code, I get the following:
> > > read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
> >Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
> > line 2 did not have 3 elements
> >
> > We can get around this using "scan", then splitting things apart
> > similar to the way Uwe described:
> > > dat <-
> >+ scan("clipboard", character(0), sep="\n")
> >Read 3 items
> > > dash <- regexpr("-", dat)
> > > dat2 <- substring(dat, pmax(0, dash)+1)
> > >
> > > blank <- regexpr(" ", dat2)
> > > if(any(blank<0))
> >+ blank[blank<0] <- nchar(dat2[blank<0])
> > > substring(dat2, 1, blank)
> >[1] "apple " "banana" "strawberry "
> >
> > hope this helps. spencer graves
> >
> >Uwe Ligges wrote:
> >
> >>Liaw, Andy wrote:
> >>
> >>>Using R-2.0.0 on WinXPPro, cut-and-pasting the data you have:
> >>>
> >>>
> >>>>read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
> >>>
> >>>
> >>> V1
> >>>1 i1-apple
> >>>2 i2-banana
> >>>3 i3-strawberry
> >>
> >>
> >>
> >>... and if only the words after "-" are of interest, the statement can be
> >>followed by
> >>
> >> sapply(strsplit(...., "-"), "[", 2)
> >>
> >>
> >>Uwe Ligges
> >>
> >>
> >>
> >>>HTH,
> >>>Andy
> >>>
> >>>
> >>>>From: j lee
> >>>>
> >>>>Hello All,
> >>>>
> >>>>I'd like to read the first words in lines into a new file.
> >>>>If I have a data file like the following, how can I get the
> >>>>first words: apple, banana, strawberry?
> >>>>
> >>>>i1-apple 10$ New_York
> >>>>i2-banana 5$ London
> >>>>i3-strawberry 7$ Japan
> >>>>
> >>>>Is there any similar question already posted to the
> >>>>list? I am a bit new to R, having a few months of
> >>>>experience now.
> >>>>
> >>>>Cheers,
> >>>>
> >>>>John
> >>>>
> >>>>______________________________________________
> >>>>R-help at stat.math.ethz.ch mailing list
> >>>>https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>PLEASE do read the posting guide!
> >>>>http://www.R-project.org/posting-guide.html
> >>>>
> >
> >
> >--
> >Spencer Graves, PhD, Senior Development Engineer
> >O: (408)938-4420; mobile: (408)655-4567
> >
>
>