[R] read.table: skipping trailing delimiters
Marc Schwartz
marc_schwartz at me.com
Tue May 4 18:27:30 CEST 2010
On May 4, 2010, at 11:11 AM, Marshall Feldman wrote:
> Hi,
>
> I am trying to read a tab-delimited file that has trailing tab delimiters. It's a simple file with two legitimate fields. I'm using the first as row.names, and the second should be the only column in the resulting data frame.
>
> Initially, R was filling the last column with NA's, but I was able to stop that by setting colClasses=c("character","character",NULL). Still, the data frame is coming in with an extra column, only now its values are set to "".
>
> Is there any way to skip the trailing delimited field entirely? I've searched for an answer without luck.
>
> Thanks.
> Marsh Feldman
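The easiest way to remove a single final column is to post-process the data frame that you imported. So if your imported data frame is called 'DF':
DF.New <- DF[, -ncol(DF), drop = FALSE]
The drop = FALSE matters in your case: once the first field is used for row.names, only one real column is left, and without it the default indexing would return a plain vector rather than a one-column data frame.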
See ?ncol and ?Extract
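As a minimal sketch of the above, assuming a made-up three-line input with a trailing tab on every line (the 'txt' string is hypothetical and only stands in for the real file):

```r
# Hypothetical tab-delimited data: two real fields plus a trailing tab.
txt <- "a\tx\t\nb\ty\t\nc\tz\t\n"

# Read it the way the question describes: first field becomes row.names.
DF <- read.table(text = txt, sep = "\t", row.names = 1,
                 colClasses = "character")

# The trailing delimiter produces an extra, empty last column.
ncol(DF)

# Drop the last column; drop = FALSE keeps the result a data frame
# even though only a single column remains.
DF.New <- DF[, -ncol(DF), drop = FALSE]
str(DF.New)
```

Without drop = FALSE, the same expression on this two-column data frame would hand back a character vector and the row names would be lost.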
You could also do more complex subsetting using subset() (see ?subset), or consider pre-processing the file before import with command line tools such as cut or awk.
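Another option, not part of the suggestion above but documented in ?read.table, is to skip the column at import time by giving it the class "NULL" as a quoted string. The attempt in the question used an unquoted NULL, which c() silently drops, so colClasses ended up with only two elements and was presumably recycled over the third column. A sketch, again assuming a hypothetical 'txt' input in place of the real file:

```r
# Hypothetical tab-delimited data: two real fields plus a trailing tab.
txt <- "a\tx\t\nb\ty\t\nc\tz\t\n"

# An unquoted NULL vanishes inside c(), leaving a length-2 vector.
length(c("character", "character", NULL))

# The quoted string "NULL" tells read.table to skip that column entirely.
DF <- read.table(text = txt, sep = "\t", row.names = 1,
                 colClasses = c("character", "character", "NULL"))
str(DF)
```

This avoids the post-processing step, at the cost of having to know the number of fields in advance.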
For example, using the 'iris' data set:
> str(iris)
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

> str(iris[, -ncol(iris)])
'data.frame':   150 obs. of  4 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
HTH,
Marc Schwartz