[R] Reading word by word in a dataset
Prof Brian Ripley
ripley at stats.ox.ac.uk
Thu Nov 4 14:49:52 CET 2004
On Thu, 4 Nov 2004, John wrote:
> Dear Andy,
> Why does my 'read.table()' NOT work in this example?
> I have the error, "subscript out of bounds", as you
> see below. My R version is 1.9.0.
                             ^^^^^
That is your problem. It works in the current version of R, 2.0.0. Using
"NULL" in colClasses to skip a column was not documented in 1.9.0, and was
not intended to work.
What does the posting guide say about this?
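
The difference between the two calls quoted below can be sketched as follows. This is a minimal illustration, assuming R 2.0.0 or later; the temporary file `tmp` is a hypothetical stand-in for mtx.ex.1. Unquoted NULLs are dropped by c(), so that call reads every column as character, whereas the quoted string "NULL" asks read.table() to skip a column.

```r
# Write the sample data from the thread to a temporary file
# (a stand-in for "mtx.ex.1"; the name `tmp` is only for illustration).
tmp <- tempfile()
writeLines(c("i1-apple 10$ New_York",
             "i2-banana 5$ London",
             "i3-strawberry 7$ Japan"), tmp)

# Unquoted NULLs vanish inside c(), so this is just "character",
# which read.table() recycles across all three columns.
same <- identical(c("character", NULL, NULL), "character")

# The quoted string "NULL" tells read.table() to skip that column.
first <- read.table(tmp, colClasses = c("character", "NULL", "NULL"))

# Keep only the part after the dash, as Uwe Ligges suggests in the thread.
words <- sapply(strsplit(first$V1, "-"), "[", 2)
print(words)
```
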
> > system("more mtx.ex.1")
> i1-apple 10$ New_York
> i2-banana 5$ London
> i3-strawberry 7$ Japan
> >
> > read.table("mtx.ex.1", colClasses=c("character","NULL","NULL"), fill=T)
> Error in data[[i]] : subscript out of bounds
> >
> > read.table("mtx.ex.1", colClasses=c("character", NULL, NULL), fill=T)
> V1 V2 V3
> 1 i1-apple 10$ New_York
> 2 i2-banana 5$ London
> 3 i3-strawberry 7$ Japan
> >
> > read.table("mtx.ex.1", colClasses=c("character", NULL, NULL), fill=T)[,1]
> [1] "i1-apple" "i2-banana" "i3-strawberry"
> >
>
> Cheers,
>
> John
>
>
> --- "Liaw, Andy" <andy_liaw at merck.com> wrote:
> > Don't give up on read.table() just yet:
> >
> > > read.table("clipboard", colClasses=c("character", "NULL", "NULL"), fill=TRUE)
> > V1
> > 1 i1-apple
> > 2 i2-banana
> > 3 i3-strawberry
> >
> > Andy
> >
> > > From: Spencer Graves
> > >
> > > Uwe and Andy's solutions are great for many applications but won't
> > > work if not all rows have the same number of fields. Consider, for
> > > example, the following modification of Lee's example:
> > >
> > > i1-apple 10$ New_York
> > > i2-banana
> > > i3-strawberry 7$ Japan
> > >
> > > If I copy this to "clipboard" and run Andy's code, I get the
> > > following:
> > >
> > > > read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
> > > Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
> > > line 2 did not have 3 elements
> > >
> > > We can get around this using "scan", then splitting things apart
> > > similar to the way Uwe described:
> > >
> > > > dat <- scan("clipboard", character(0), sep="\n")
> > > Read 3 items
> > > > dash <- regexpr("-", dat)
> > > > dat2 <- substring(dat, pmax(0, dash)+1)
> > > >
> > > > blank <- regexpr(" ", dat2)
> > > > if(any(blank<0))
> > > + blank[blank<0] <- nchar(dat2[blank<0])
> > > > substring(dat2, 1, blank)
> > > [1] "apple " "banana" "strawberry "
> > >
> > > hope this helps. spencer graves
> > >
> > > Uwe Ligges wrote:
> > >
> > > > Liaw, Andy wrote:
> > > >
> > > >> Using R-2.0.0 on WinXPPro, cut-and-pasting the data you have:
> > > >>
> > > >>
> > > >>> read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
> > > >>
> > > >>
> > > >> V1
> > > >> 1 i1-apple
> > > >> 2 i2-banana
> > > >> 3 i3-strawberry
> > > >
> > > >
> > > >
> > > > ... and if only the words after "-" are of interest, the statement can
> > > > be followed by
> > > >
> > > > sapply(strsplit(...., "-"), "[", 2)
> > > >
> > > >
> > > > Uwe Ligges
> > > >
> > > >
> > > >
> > > >> HTH,
> > > >> Andy
> > > >>
> > > >>
> > > >>> From: j lee
> > > >>>
> > > >>> Hello All,
> > > >>>
> > > >>> I'd like to read the first words in lines into a new file.
> > > >>> If I have a data file like the following, how can I get the
> > > >>> first words: apple, banana, strawberry?
> > > >>>
> > > >>> i1-apple 10$ New_York
> > > >>> i2-banana 5$ London
> > > >>> i3-strawberry 7$ Japan
> > > >>>
> > > >>> Is there any similar question already posted to the
> > > >>> list? I am a bit new to R, having a few months of
> > > >>> experience now.
> > > >>>
> > > >>> Cheers,
> > > >>>
> > > >>> John
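
For the ragged case raised in the quoted thread (rows with fewer than three fields), one possible alternative is to treat each line as a string and use regular expressions. This is only a sketch: the vector `lines` is built inline here as an assumption, and in practice it might come from readLines() on the data file.

```r
# Sample input with a short second row, as in Spencer Graves's example.
# In practice `lines` could come from readLines("mtx.ex.1").
lines <- c("i1-apple 10$ New_York",
           "i2-banana",
           "i3-strawberry 7$ Japan")

# Keep the first whitespace-delimited field of each line;
# lines without whitespace are left unchanged.
first_field <- sub("[[:space:]].*$", "", lines)

# Drop everything up to and including the first dash.
words <- sub("^[^-]*-", "", first_field)
print(words)
```

Because each line is handled independently, a missing field does not trigger the "did not have 3 elements" error, and no trailing blank is left on the extracted word.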
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595