[R] reading in table with different number of elements in each row

jim holtman jholtman at gmail.com
Wed May 26 02:59:53 CEST 2010


This is in the Details section of the help page for read.table:

The number of data columns is determined by looking at the first five
lines of input (or the whole file if it has less than five lines), or
from the length of col.names if it is specified and is longer. This
could conceivably be wrong if fill or blank.lines.skip are true, so
specify col.names if necessary.
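
To see what that paragraph means in practice, here is a minimal sketch. The
file contents are made up for illustration (they are not the poster's data):
the first five lines have at most 3 fields, and only line 6 has 5.

```r
# Hypothetical tab-delimited file: lines 1-5 have at most 3 fields,
# line 6 has 5 fields.
tmp <- tempfile()
writeLines(c("1\t2\t3",
             "1\t2",
             "1\t2\t3",
             "1",
             "1\t2",
             "1\t2\t3\t4\t5"), tmp)

# read.table() sizes the data frame from the first five lines only,
# so it settles on 3 columns. With fill = TRUE the two extra fields
# on line 6 are wrapped onto an additional row.
x <- read.table(tmp, sep = "\t", fill = TRUE)
dim(x)    # 7 rows, 3 columns, although the file has 6 lines
x[7, ]    # the wrapped remainder of line 6: 4, 5, NA
```

This is the same effect as the extra 13th row in the output quoted below.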


try:  read.table(..., col.names=1:30)

This will assume there are 30 columns of data (you only said a max of
15, but let's double it).
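
If you would rather not guess the number of columns, one option is to count
the fields on every line first with count.fields() and build col.names from
the maximum. A minimal sketch, again on a made-up file; the "V1", "V2", ...
names are just my choice to mirror the default column names:

```r
# Hypothetical tab-delimited file with a ragged number of fields per line.
tmp <- tempfile()
writeLines(c("1\t2\t3",
             "1\t2",
             "1\t2\t3",
             "1",
             "1\t2",
             "1\t2\t3\t4\t5"), tmp)

# Widest line in the whole file, not just in the first five lines.
n_col <- max(count.fields(tmp, sep = "\t"))

# Supplying col.names of that length makes read.table() allocate
# enough columns; fill = TRUE pads the shorter lines with NA.
x <- read.table(tmp, sep = "\t", fill = TRUE,
                col.names = paste0("V", seq_len(n_col)))
dim(x)    # 6 rows, 5 columns: one row per line of the file
```

The trade-off is one extra pass over the file, which should not matter for
data of this size.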

On Tue, May 25, 2010 at 8:05 PM, Johan Jackson
<johan.h.jackson at gmail.com> wrote:
> Hi all,
>
> This is probably simple, but I haven't been able to locate the answer either
> in the Import Manual or from searching the listserve.
>
> I have tab-delimited data with different numbers of elements in each row. I
> want to read it into R, such that R fills in "NA" in elements that have no
> data. How do I accomplish this?
>
>
>
> Example:
>
>
> DATA on disk:
>      1 -0.068191       -0.050729       -0.113982       -0.044363
> -0.072445       -0.044516       -0.048597       -0.051866
> -0.051563       -0.041576
>      2 -0.032645       -0.062389       -0.054491       -0.058061
> -0.034690       -0.038044       -0.045332       -0.043785
> -0.050639       -0.049617
>      3 -0.068191       -0.044207       -0.058061       -0.050729
> -0.034991       -0.045360       -0.051563       -0.060290
> -0.043785       -0.048757
>      4 -0.068191       -0.062389       -0.050729       -0.058579
> -0.056481       -0.044363       -0.042347       -0.060290
> -0.051563       -0.037216       -0.041576       -0.056476
>      5 -0.068191       -0.047649       -0.062389       -0.058061
> -0.034227       -0.185829       -0.071855       -0.064096
> -0.195645
>      6 -0.040208       -0.068191       -0.036475       -0.041268
> -0.044207       -0.044363       -0.034991       -0.059810
> -0.051619       -0.051563       -0.037216       -0.041576
> -0.019762
>      7 -0.068191       -0.034227       -0.044363       -0.051563
> -0.041576       -0.053823       -0.057023       -0.046083
> -0.089374       -0.057436
>      8 -0.068191       -0.050731       -0.044207       -0.169714
> -0.060025       -0.048597       -0.037827       -0.053823
> -0.055154
>      9 -0.062389       -0.044207       -0.050729       -0.044363
> -0.043785
>     10 -0.040208       -0.036716       -0.068191       -0.051466
> -0.050731       -0.050729       -0.048095       -0.044363
> -0.044817       -0.059810       -0.051563       -0.037827
> -0.053985       -0.059573       -0.052893
>     11 -0.068191       -0.034227       -0.048597       -0.051563
> -0.041576       -0.056512
>     12 -0.040208       -0.050731       -0.044207       -0.048095
> -0.044363       -0.044817       -0.037827       -0.053985       -0.059573
>
> My attempts:
> x <- read.table("DATA",fill=TRUE,sep="\t",colClasses="numeric")
>> x
>          V1        V2        V3        V4        V5        V6
> V7        V8        V9       V10       V11       V12       V13
> 1  -0.068191 -0.050729 -0.113982 -0.044363 -0.072445 -0.044516 -0.048597
> -0.051866 -0.051563 -0.041576        NA        NA        NA
> 2  -0.032645 -0.062389 -0.054491 -0.058061 -0.034690 -0.038044 -0.045332
> -0.043785 -0.050639 -0.049617        NA        NA        NA
> 3  -0.068191 -0.044207 -0.058061 -0.050729 -0.034991 -0.045360 -0.051563
> -0.060290 -0.043785 -0.048757        NA        NA        NA
> 4  -0.068191 -0.062389 -0.050729 -0.058579 -0.056481 -0.044363 -0.042347
> -0.060290 -0.051563 -0.037216 -0.041576 -0.056476        NA
> 5  -0.068191 -0.047649 -0.062389 -0.058061 -0.034227 -0.185829 -0.071855
> -0.064096 -0.195645        NA        NA        NA        NA
> 6  -0.040208 -0.068191 -0.036475 -0.041268 -0.044207 -0.044363 -0.034991
> -0.059810 -0.051619 -0.051563 -0.037216 -0.041576 -0.019762
> 7  -0.068191 -0.034227 -0.044363 -0.051563 -0.041576 -0.053823 -0.057023
> -0.046083 -0.089374 -0.057436        NA        NA        NA
> 8  -0.068191 -0.050731 -0.044207 -0.169714 -0.060025 -0.048597 -0.037827
> -0.053823 -0.055154        NA        NA        NA        NA
> 9  -0.062389 -0.044207 -0.050729 -0.044363 -0.043785        NA
> NA        NA        NA        NA        NA        NA        NA
> 10 -0.040208 -0.036716 -0.068191 -0.051466 -0.050731 -0.050729 -0.048095
> -0.044363 -0.044817 -0.059810 -0.051563 -0.037827 -0.053985
> 11 -0.059573 -0.052893        NA        NA        NA        NA
> NA        NA        NA        NA        NA        NA        NA
> 12 -0.068191 -0.034227 -0.048597 -0.051563 -0.041576 -0.056512
> NA        NA        NA        NA        NA        NA        NA
> 13 -0.040208 -0.050731 -0.044207 -0.048095 -0.044363 -0.044817 -0.037827
> -0.053985 -0.059573        NA        NA        NA        NA
>
> The above is almost right, but x has 13 rows instead of 12! WHY? Row 10
> (which has 15 elements) was cut off at 13, and then the last two elements
> were put in a new row. WHY?
> I have tried messing with colClasses to no avail. Any help would be ...
> umm... helpful!
>
> JJ



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


