[R] reading in table with different number of elements in each row

David Winsemius dwinsemius at comcast.net
Wed May 26 03:05:47 CEST 2010


On May 25, 2010, at 8:05 PM, Johan Jackson wrote:

> Hi all,
>
> This is probably simple, but I haven't been able to locate the answer either
> in the Import Manual or from searching the listserv.
>
> I have tab-delimited data with different numbers of elements in each row. I
> want to read it into R, such that R fills in "NA" in elements that have no
> data. How do I accomplish this?

Look at the fill argument to read.table.

  read.table(textConnection("     1 -0.068191       -0.050729       -0.113982       -0.044363\n
  -0.072445       -0.044516       -0.048597       -0.051866\n
  -0.051563       -0.041576\n
       2 -0.032645       -0.062389       -0.054491       -0.058061\n
  -0.034690       -0.038044       -0.045332       -0.043785\n
  -0.050639       -0.049617"), header=FALSE, fill=TRUE,
             colClasses=rep("numeric", 5))


          V1        V2        V3        V4        V5
1  1.000000 -0.068191 -0.050729 -0.113982 -0.044363
2 -0.072445 -0.044516 -0.048597 -0.051866        NA
3 -0.051563 -0.041576        NA        NA        NA
4  2.000000 -0.032645 -0.062389 -0.054491 -0.058061
5 -0.034690 -0.038044 -0.045332 -0.043785        NA
6 -0.050639 -0.049617        NA        NA        NA

In your case you may want to add sep="\t", since your file is tab-delimited.
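
On the extra row in the attempt quoted below: read.table() decides how many
columns to create by looking only at the first five lines of input (unless
col.names is supplied and is longer), so with fill=TRUE a later line that has
more fields is most likely being wrapped onto an additional row. A minimal
sketch of one workaround follows, assuming a small made-up tab-delimited file.
The sample values, the temporary path, and the names n_col, wrapped and x are
illustrative only and do not come from the original data. The idea is to count
the fields on every line with count.fields() and pass that many col.names.

```r
# Hypothetical sample standing in for the tab-delimited file on disk:
# six ragged rows, where only row 6 is longer than any of the first five.
lines <- c("1\t-0.07\t-0.05",
           "2\t-0.03",
           "3\t-0.07\t-0.04",
           "4\t-0.07\t-0.06",
           "5\t-0.07",
           "6\t-0.04\t-0.07\t-0.04\t-0.04\t-0.02")
path <- tempfile(fileext = ".txt")   # temporary stand-in for "DATA"
writeLines(lines, path)

# Default behaviour: the column count (3) is taken from the first five lines,
# so the six fields of the last line spill over onto an extra row.
wrapped <- read.table(path, header = FALSE, sep = "\t", fill = TRUE)

# Count the fields on every line, then declare that many columns up front.
n_col <- max(count.fields(path, sep = "\t"))
x <- read.table(path, header = FALSE, sep = "\t", fill = TRUE,
                col.names = paste0("V", seq_len(n_col)),
                colClasses = "numeric")
```

For the real file, replacing path with "DATA" should give one row per line of
the file, with short lines padded by NA.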

-- 
David.
>
>
>
> Example:
>
>
> DATA on disk:
>      1 -0.068191       -0.050729       -0.113982       -0.044363
> -0.072445       -0.044516       -0.048597       -0.051866
> -0.051563       -0.041576
>      2 -0.032645       -0.062389       -0.054491       -0.058061
> -0.034690       -0.038044       -0.045332       -0.043785
> -0.050639       -0.049617
>      3 -0.068191       -0.044207       -0.058061       -0.050729
> -0.034991       -0.045360       -0.051563       -0.060290
> -0.043785       -0.048757
>      4 -0.068191       -0.062389       -0.050729       -0.058579
> -0.056481       -0.044363       -0.042347       -0.060290
> -0.051563       -0.037216       -0.041576       -0.056476
>      5 -0.068191       -0.047649       -0.062389       -0.058061
> -0.034227       -0.185829       -0.071855       -0.064096
> -0.195645
>      6 -0.040208       -0.068191       -0.036475       -0.041268
> -0.044207       -0.044363       -0.034991       -0.059810
> -0.051619       -0.051563       -0.037216       -0.041576
> -0.019762
>      7 -0.068191       -0.034227       -0.044363       -0.051563
> -0.041576       -0.053823       -0.057023       -0.046083
> -0.089374       -0.057436
>      8 -0.068191       -0.050731       -0.044207       -0.169714
> -0.060025       -0.048597       -0.037827       -0.053823
> -0.055154
>      9 -0.062389       -0.044207       -0.050729       -0.044363
> -0.043785
>     10 -0.040208       -0.036716       -0.068191       -0.051466
> -0.050731       -0.050729       -0.048095       -0.044363
> -0.044817       -0.059810       -0.051563       -0.037827
> -0.053985       -0.059573       -0.052893
>     11 -0.068191       -0.034227       -0.048597       -0.051563
> -0.041576       -0.056512
>     12 -0.040208       -0.050731       -0.044207       -0.048095
> -0.044363       -0.044817       -0.037827       -0.053985        
> -0.059573
>
> My attempts:
> x <- read.table("DATA",fill=TRUE,sep="\t",colClasses="numeric")
>> x
>            V1        V2        V3        V4        V5        V6        V7        V8        V9       V10       V11       V12       V13
> 1  -0.068191 -0.050729 -0.113982 -0.044363 -0.072445 -0.044516 -0.048597 -0.051866 -0.051563 -0.041576        NA        NA        NA
> 2  -0.032645 -0.062389 -0.054491 -0.058061 -0.034690 -0.038044 -0.045332 -0.043785 -0.050639 -0.049617        NA        NA        NA
> 3  -0.068191 -0.044207 -0.058061 -0.050729 -0.034991 -0.045360 -0.051563 -0.060290 -0.043785 -0.048757        NA        NA        NA
> 4  -0.068191 -0.062389 -0.050729 -0.058579 -0.056481 -0.044363 -0.042347 -0.060290 -0.051563 -0.037216 -0.041576 -0.056476        NA
> 5  -0.068191 -0.047649 -0.062389 -0.058061 -0.034227 -0.185829 -0.071855 -0.064096 -0.195645        NA        NA        NA        NA
> 6  -0.040208 -0.068191 -0.036475 -0.041268 -0.044207 -0.044363 -0.034991 -0.059810 -0.051619 -0.051563 -0.037216 -0.041576 -0.019762
> 7  -0.068191 -0.034227 -0.044363 -0.051563 -0.041576 -0.053823 -0.057023 -0.046083 -0.089374 -0.057436        NA        NA        NA
> 8  -0.068191 -0.050731 -0.044207 -0.169714 -0.060025 -0.048597 -0.037827 -0.053823 -0.055154        NA        NA        NA        NA
> 9  -0.062389 -0.044207 -0.050729 -0.044363 -0.043785        NA        NA        NA        NA        NA        NA        NA        NA
> 10 -0.040208 -0.036716 -0.068191 -0.051466 -0.050731 -0.050729 -0.048095 -0.044363 -0.044817 -0.059810 -0.051563 -0.037827 -0.053985
> 11 -0.059573 -0.052893        NA        NA        NA        NA        NA        NA        NA        NA        NA        NA        NA
> 12 -0.068191 -0.034227 -0.048597 -0.051563 -0.041576 -0.056512        NA        NA        NA        NA        NA        NA        NA
> 13 -0.040208 -0.050731 -0.044207 -0.048095 -0.044363 -0.044817 -0.037827 -0.053985 -0.059573        NA        NA        NA        NA
>
> The above is almost right, but x has 13 rows instead of 12! WHY? Row 10
> (which has 15 elements) was cut off at 13, and then the last two elements
> were put in a new row. WHY?
> I have tried messing with colClasses to no avail. Any help would be ...
> umm... helpful!
>
> JJ
>

David Winsemius, MD
West Hartford, CT


