[R] reading in table with different number of elements in each row
David Winsemius
dwinsemius at comcast.net
Wed May 26 03:05:47 CEST 2010
On May 25, 2010, at 8:05 PM, Johan Jackson wrote:
> Hi all,
>
> This is probably simple, but I haven't been able to locate the answer
> either in the Import Manual or by searching the listserv.
>
> I have tab-delimited data with a different number of elements in each
> row. I want to read it into R such that R fills in "NA" for elements
> that have no data. How do I accomplish this?
Look at the fill argument to read.table.
read.table(textConnection(" 1 -0.068191 -0.050729 -0.113982 -0.044363
 -0.072445 -0.044516 -0.048597 -0.051866
 -0.051563 -0.041576
 2 -0.032645 -0.062389 -0.054491 -0.058061
 -0.034690 -0.038044 -0.045332 -0.043785
 -0.050639 -0.049617"), header=FALSE, fill=TRUE,
 colClasses="numeric")
         V1        V2        V3        V4        V5
1  1.000000 -0.068191 -0.050729 -0.113982 -0.044363
2 -0.072445 -0.044516 -0.048597 -0.051866        NA
3 -0.051563 -0.041576        NA        NA        NA
4  2.000000 -0.032645 -0.062389 -0.054491 -0.058061
5 -0.034690 -0.038044 -0.045332 -0.043785        NA
6 -0.050639 -0.049617        NA        NA        NA
In your case you may want to add sep="\t", since your file is tab-delimited.
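One caveat, offered as a sketch rather than something tested against your
file: unless col.names is supplied, read.table works out the number of
columns from the first five lines of input only. With fill=TRUE, a later
line that has more fields than that is wrapped onto an extra row, which
looks like what happened to your row 10. Counting the fields on every line
with count.fields() and passing that many col.names should avoid it. The
file contents and the names tmp, x_bad and n_col below are made up for
illustration.

```r
# Hypothetical ragged, tab-delimited input. The widest line (5 fields) is
# line 6, i.e. beyond the first five lines that read.table inspects.
lines <- c("1\t2\t3", "4\t5", "6", "7\t8\t9", "10\t11",
           "12\t13\t14\t15\t16", "17\t18")
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Column count guessed from the first five lines (3 here), so the
# 5-field line spills over into an additional row.
x_bad <- read.table(tmp, sep = "\t", fill = TRUE, header = FALSE)
nrow(x_bad)   # more rows than there are lines in the file

# Count the fields on every line, then supply that many column names.
n_col <- max(count.fields(tmp, sep = "\t"))
x <- read.table(tmp, sep = "\t", fill = TRUE, header = FALSE,
                col.names = paste0("V", seq_len(n_col)),
                colClasses = "numeric")
dim(x)        # one row per line, one column per field of the widest line
```

For your data that would be count.fields("DATA", sep="\t") in place of the
temporary file.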
--
David.
>
>
>
> Example:
>
>
> DATA on disk:
> 1 -0.068191 -0.050729 -0.113982 -0.044363 -0.072445 -0.044516 -0.048597 -0.051866 -0.051563 -0.041576
> 2 -0.032645 -0.062389 -0.054491 -0.058061 -0.034690 -0.038044 -0.045332 -0.043785 -0.050639 -0.049617
> 3 -0.068191 -0.044207 -0.058061 -0.050729 -0.034991 -0.045360 -0.051563 -0.060290 -0.043785 -0.048757
> 4 -0.068191 -0.062389 -0.050729 -0.058579 -0.056481 -0.044363 -0.042347 -0.060290 -0.051563 -0.037216 -0.041576 -0.056476
> 5 -0.068191 -0.047649 -0.062389 -0.058061 -0.034227 -0.185829 -0.071855 -0.064096 -0.195645
> 6 -0.040208 -0.068191 -0.036475 -0.041268 -0.044207 -0.044363 -0.034991 -0.059810 -0.051619 -0.051563 -0.037216 -0.041576 -0.019762
> 7 -0.068191 -0.034227 -0.044363 -0.051563 -0.041576 -0.053823 -0.057023 -0.046083 -0.089374 -0.057436
> 8 -0.068191 -0.050731 -0.044207 -0.169714 -0.060025 -0.048597 -0.037827 -0.053823 -0.055154
> 9 -0.062389 -0.044207 -0.050729 -0.044363 -0.043785
> 10 -0.040208 -0.036716 -0.068191 -0.051466 -0.050731 -0.050729 -0.048095 -0.044363 -0.044817 -0.059810 -0.051563 -0.037827 -0.053985 -0.059573 -0.052893
> 11 -0.068191 -0.034227 -0.048597 -0.051563 -0.041576 -0.056512
> 12 -0.040208 -0.050731 -0.044207 -0.048095 -0.044363 -0.044817 -0.037827 -0.053985 -0.059573
>
> My attempts:
> x <- read.table("DATA",fill=TRUE,sep="\t",colClasses="numeric")
>> x
>           V1        V2        V3        V4        V5        V6        V7        V8        V9       V10       V11       V12       V13
>  1 -0.068191 -0.050729 -0.113982 -0.044363 -0.072445 -0.044516 -0.048597 -0.051866 -0.051563 -0.041576        NA        NA        NA
>  2 -0.032645 -0.062389 -0.054491 -0.058061 -0.034690 -0.038044 -0.045332 -0.043785 -0.050639 -0.049617        NA        NA        NA
>  3 -0.068191 -0.044207 -0.058061 -0.050729 -0.034991 -0.045360 -0.051563 -0.060290 -0.043785 -0.048757        NA        NA        NA
>  4 -0.068191 -0.062389 -0.050729 -0.058579 -0.056481 -0.044363 -0.042347 -0.060290 -0.051563 -0.037216 -0.041576 -0.056476        NA
>  5 -0.068191 -0.047649 -0.062389 -0.058061 -0.034227 -0.185829 -0.071855 -0.064096 -0.195645        NA        NA        NA        NA
>  6 -0.040208 -0.068191 -0.036475 -0.041268 -0.044207 -0.044363 -0.034991 -0.059810 -0.051619 -0.051563 -0.037216 -0.041576 -0.019762
>  7 -0.068191 -0.034227 -0.044363 -0.051563 -0.041576 -0.053823 -0.057023 -0.046083 -0.089374 -0.057436        NA        NA        NA
>  8 -0.068191 -0.050731 -0.044207 -0.169714 -0.060025 -0.048597 -0.037827 -0.053823 -0.055154        NA        NA        NA        NA
>  9 -0.062389 -0.044207 -0.050729 -0.044363 -0.043785        NA        NA        NA        NA        NA        NA        NA        NA
> 10 -0.040208 -0.036716 -0.068191 -0.051466 -0.050731 -0.050729 -0.048095 -0.044363 -0.044817 -0.059810 -0.051563 -0.037827 -0.053985
> 11 -0.059573 -0.052893        NA        NA        NA        NA        NA        NA        NA        NA        NA        NA        NA
> 12 -0.068191 -0.034227 -0.048597 -0.051563 -0.041576 -0.056512        NA        NA        NA        NA        NA        NA        NA
> 13 -0.040208 -0.050731 -0.044207 -0.048095 -0.044363 -0.044817 -0.037827 -0.053985 -0.059573        NA        NA        NA        NA
>
> The above is almost right, but x has 13 rows instead of 12! WHY? Row 10
> (which has 15 elements) was cut off at 13, and then the last two
> elements were put in a new row. WHY?
> I have tried messing with colClasses to no avail. Any help would be ...
> umm... helpful!
>
> JJ
>
David Winsemius, MD
West Hartford, CT