[R] Way to handle variable length and numbers of columns using read.table(...)

Gabor Grothendieck ggrothendieck at gmail.com
Tue May 5 05:04:11 CEST 2009


It's not clear exactly what the rules are for this, but if we assume
that every number ends in a decimal point plus two digits, then
using strapply from the gsubfn package:

> Lines <- "Time Loc1 Loc2
+ 1 22.33 44.55
+ 2 66.77 88.99
+ 3 222.33344.55
+ 4 66.77 88.99"
>
> library(gsubfn)
> L <- readLines(textConnection(Lines))
> strapply(L[-1], "[0-9]*[.][0-9][0-9]", as.numeric, simplify = rbind)
       [,1]   [,2]
[1,]  22.33  44.55
[2,]  66.77  88.99
[3,] 222.33 344.55
[4,]  66.77  88.99
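
If you would rather avoid the extra package, roughly the same extraction can be sketched in base R with gregexpr and regmatches. This is only a sketch under the same assumption as above (every value ends in a decimal point plus two digits); regmatches is not available in older versions of R, and the names pattern, vals and DF are just illustrative.

```r
# Sample input from this thread; the third data row has two values run together.
Lines <- "Time Loc1 Loc2
1 22.33 44.55
2 66.77 88.99
3 222.33344.55
4 66.77 88.99"

con <- textConnection(Lines)
L <- readLines(con)   # reads however many rows there happen to be
close(con)

# Assumption: each value is digits, a decimal point, then exactly two digits.
pattern <- "[0-9]*[.][0-9][0-9]"

# Pull every match out of each data line (the header line is dropped).
matches <- regmatches(L[-1], gregexpr(pattern, L[-1]))

# One row per input line, one column per extracted value.
vals <- do.call(rbind, lapply(matches, as.numeric))

# The leading integer on each line is taken to be the Time column.
time <- as.integer(sub("^[[:space:]]*([0-9]+).*$", "\\1", L[-1]))

DF <- data.frame(Time = time, Loc1 = vals[, 1], Loc2 = vals[, 2])
```

Because readLines takes whatever number of lines is present, the variable row count needs no special handling here. Note that rbind only gives a clean matrix when every line yields the same number of matches.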

See http://gsubfn.googlecode.com for the package, and ?regex for regular expressions.
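
If the goal is to stay with read.table, another possibility is to repair the lines with a regular expression first and only then parse them. The following is a rough sketch, again assuming that every value ends in exactly two decimal digits, so that any digit directly following those two decimals must belong to the next value. The text= argument of read.table is assumed to be available (it is not in older R versions; there textConnection(fixed) can be passed instead), and fixed and DF are illustrative names.

```r
# Sample input from this thread; the third data row has two values run together.
Lines <- "Time Loc1 Loc2
1 22.33 44.55
2 66.77 88.99
3 222.33344.55
4 66.77 88.99"

con <- textConnection(Lines)
L <- readLines(con)
close(con)

# Wherever ".dd" is immediately followed by another digit, put a space
# between them so the two fused values become separate fields.
fixed <- gsub("([.][0-9][0-9])([0-9])", "\\1 \\2", L)

# The repaired lines are now ordinary whitespace-delimited input.
DF <- read.table(text = fixed, header = TRUE)
```

With the sample data the third line becomes "3 222.33 344.55" before it is parsed, so read.table sees three fields on every row.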

On Mon, May 4, 2009 at 10:20 PM, Jason Rupert <jasonkrupert at yahoo.com> wrote:
>
> I've got read.table to successfully read in my table of three columns.  Most of the time I will have a set number of rows, but sometimes that will be variable, and sometimes there will be only two variables in one row, e.g.
>
> Time Loc1 Loc2
> 1 22.33 44.55
> 2 66.77 88.99
> 3 222.33344.55
> 4 66.77 88.99
>
> Is there any way to have read.table handle (1) a variable number of rows, and (2) rows where there are sometimes only two variables, as shown in Time = 3 above?
>
> Just curious about how to handle this, and whether read.table is the right way to go about it or if I should read in all the data and then try to parse it out as best I can.
>
> Thanks again.
>
>> R.version
>               _
> platform       i386-apple-darwin8.11.1
> arch           i386
> os             darwin8.11.1
> system         i386, darwin8.11.1
> status
> major          2
> minor          8.0
> year           2008
> month          10
> day            20
> svn rev        46754
> language       R
> version.string R version 2.8.0 (2008-10-20)



