[R] Read.table problems
Steve Murray
smurray444 at hotmail.com
Mon May 18 18:24:33 CEST 2009
Dear all,
I have a file which I've converted from NetCDF (.nc) to text (.txt) using ncdump in Unix (as I had problems using the ncdf package to do this). The first few rows (as copied and pasted from the Unix console) of the file appear as follows:
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
As you can see, there are a lot of NA values before the actual numeric values start further down the dataset. My problem is that I'm having trouble reading this file into R. I think the problem lies with the sep= argument, although I may be wrong. I tried the following command at first, as the data appear to be comma separated:
> read.table("test86.txt", skip=43, na.strings="-", header=FALSE, sep=",") -> test86 # skip=43 because the initial rows hold metadata
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
line 29 did not have 25 elements
I then tried sep=" ", followed by sep="", but received a similar error message (although line 29 doesn't appear to be especially different from the rest).
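A quick way to check whether some lines really do differ is to count the separator-delimited fields on each line with count.fields(). The sketch below is only an illustration: it uses a small made-up temporary file, not test86.txt, and the row contents are assumptions. For the real file the call would take the actual file name and skip=43.

```r
# Hypothetical stand-in for the ncdump output (not the real test86.txt):
# the third line deliberately holds one value fewer than the others.
tmp <- tempfile(fileext = ".txt")
writeLines(c("_, _, _, _,",
             "_, _, _, _,",
             "_, _, 1.5,",
             "_, _, _, _,"), tmp)

# Number of comma-separated fields found on each line
n_fields <- count.fields(tmp, sep = ",")

# How many lines have each field count
print(table(n_fields))

# Line numbers whose field count differs from that of the first line
odd_lines <- which(n_fields != n_fields[1])
print(odd_lines)
```

Any line reported in odd_lines is one that read.table() cannot fit into a fixed number of columns.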
I subsequently tried sep="\t" and then sep="\n". Both read the data in without an error message, but each line ends up as a single field, so the data are formatted as follows:
> head(test86)
V1
1 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
2 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
3 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
4 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
5 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
6 _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
> dim(test86)
[1] 179899 1
Instead of one column, I'd expect there to be 720.
I think I'm getting something wrong relating to the sep= argument (or possibly mis-using na.strings?). If anyone has any solutions to this then I'd be very grateful to hear them.
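One possible approach, sketched here under stated assumptions rather than as a confirmed fix, is to ignore the line structure altogether: read the file as plain text, split on commas, and reshape the flat vector of values into a matrix with the expected number of columns. The sketch assumes that the missing-value marker is "_" (as in the rows shown above, whereas the read.table() call passes na.strings="-"), that the data block may end with a semicolon, and that all metadata lines have already been removed. The temporary file and its contents are made up for illustration, and ncols would be 720 for the real grid.

```r
# Hypothetical stand-in for the ncdump data block: values wrapped unevenly
# across lines, trailing commas, "_" for missing, ";" after the last value.
tmp <- tempfile(fileext = ".txt")
writeLines(c("_, _, 1.5, _,",
             "2, _, _,",
             "3.25, _, _, 4, _;"), tmp)

lines  <- readLines(tmp)
lines  <- sub(";\\s*$", "", lines)              # drop a terminating semicolon, if present
tokens <- trimws(unlist(strsplit(lines, ",")))  # split on commas, strip white space
tokens <- tokens[nzchar(tokens)]                # discard any empty fields
tokens[tokens == "_"] <- NA                     # "_" marks a missing value
values <- as.numeric(tokens)

ncols <- 4                                      # assumed grid width; 720 for the real data
stopifnot(length(values) %% ncols == 0)         # values must fill whole rows
grid <- matrix(values, ncol = ncols, byrow = TRUE)
print(grid)
```

Because the values are reshaped only after being flattened, it no longer matters how many values ncdump happened to place on each line.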
Many thanks for any advice,
Steve