[R] Dividing one column of form xx-yy into two columns, xx and yy
Bill.Venables at csiro.au
Tue Feb 9 03:07:53 CET 2010
Here is one way.
> dat
       V1
1  43-156
2   43-43
3 1267-18
> dat <- within(dat, {
+ m <- do.call("rbind", strsplit(as.character(V1), "-"))
+ XX <- as.numeric(m[,1])
+ YY <- as.numeric(m[,2])
+ rm(m)
+ })
> dat
       V1  YY   XX
1  43-156 156   43
2   43-43  43   43
3 1267-18  18 1267
>
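With a few hundred thousand rows, a possible alternative is to skip the intermediate list that strsplit() builds and strip each half with sub(), which is vectorised. This is only a minimal sketch, assuming every entry contains exactly one "-" and that both parts are plain non-negative numbers; the toy data frame below is made up to mirror the example and is not the poster's real file.

```r
# Toy data mirroring the example above (an assumption, for illustration only)
dat <- data.frame(V1 = c("43-156", "43-43", "1267-18"),
                  stringsAsFactors = FALSE)

# as.character() guards against V1 having been read in as a factor
v <- as.character(dat$V1)

# Drop everything from the first "-" onwards to get the left part
dat$XX <- as.numeric(sub("-.*$", "", v))

# Drop everything up to and including the first "-" to get the right part
dat$YY <- as.numeric(sub("^[^-]*-", "", v))

dat
```

Here the new columns come out in the order XX, YY because they are assigned one at a time, whereas within() appends them in reverse order of creation. Whether this is actually faster than the strsplit() version on the real data would need timing with system.time().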
Bill Venables
CSIRO/CMIS Cleveland Laboratories
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of ZeMajik
Sent: Tuesday, 9 February 2010 9:32 AM
To: R mailing list
Subject: [R] Dividing one column of form xx-yy into two columns, xx and yy
I have a data set where one column consists of two numerical factors,
separated by a "-".
So my data looks something like this:
43-156
43-43
1267-18
.
.
.
There are additional columns consisting of single factors as well, so
reading the csv file (where the data is stored) with the sep="-" addition
won't work since the rest of the factors are separated by commas.
So first of all, is there any way to import a file which is separated by ","
OR "-"?
If this is not possible, does anyone have any ideas on how I could go about
separating these? I could use a text editor to replace the - with , and
import, but I would prefer to do this inside R so that the steps could be
scripted and reused in the future.
Just to clarify, I would like the above to turn out as two separate columns
(or vectors) where the first in this would be (43,43,1267,....) and the
second (156,43,18,.....)
The dataset is rather large, with a few hundred thousand lines, so it would
be preferable to keep resource intensive methods to a minimum if possible.
Thanks in advance!
Mike