[R] Separating a Complicated String Vector
Ista Zahn
istazahn at gmail.com
Sun Jan 4 04:22:26 CET 2015
I'm not sure what's so complicated about that (am I missing
something?). You can pick out the rows you want with grepl, and strip the digits with gsub, so
tmpDF <- read.table(text="V1 V2
A 5
a1 1
a2 1
a3 1
a4 1
a5 1
B 4
b1 1
b2 1
b3 1
b4 1",
header=TRUE)
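# Keep only the rows whose V1 contains a digit (drops the "A" and "B" header rows)
tmpDF <- tmpDF[grepl("[0-9]", tmpDF$V1), ]
# Strip the digits from V1 and upper-case what is left to get the group label
data.frame(tmpDF, V3 = toupper(gsub("[0-9]", "", tmpDF$V1)))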
Seems to do the trick.
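One caveat: the toupper/gsub step relies on each group label being the upper-cased prefix of its members, which holds for the toy data but may not hold for real state names. A minimal sketch of an alternative, assuming every block starts with its header row and that header rows are exactly the rows with no digit in V1, carries the header label forward with cumsum (the names is_header, group, and result are only illustrative):

```r
# Same toy data as above; stringsAsFactors = FALSE keeps V1 as character.
tmpDF <- read.table(text = "V1 V2
A 5
a1 1
a2 1
a3 1
a4 1
a5 1
B 4
b1 1
b2 1
b3 1
b4 1",
header = TRUE, stringsAsFactors = FALSE)

# Assumption: a header row is any row whose V1 contains no digit,
# and the first row of the data is a header row.
is_header <- !grepl("[0-9]", tmpDF$V1)

# cumsum() numbers each block of rows by the header that opens it,
# so indexing the header labels with it carries each label forward.
group <- tmpDF$V1[is_header][cumsum(is_header)]

# Attach the label, then drop the header rows themselves.
result <- data.frame(tmpDF, V3 = group)[!is_header, ]
rownames(result) <- NULL
result
```

This does not depend on any naming relationship between a header and its members, only on row order, so it should carry over to labels that cannot be derived from the member names.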
Best,
Ista
On Sat, Jan 3, 2015 at 9:41 PM, npretnar <npretnar at gmail.com> wrote:
> I have a string variable (V1) in a data frame structured as follows:
>
> V1 V2
> A 5
> a1 1
> a2 1
> a3 1
> a4 1
> a5 1
> B 4
> b1 1
> b2 1
> b3 1
> b4 1
>
> I want the following:
>
> V1 V2 V3
> a1 1 A
> a2 1 A
> a3 1 A
> a4 1 A
> a5 1 A
> b1 1 B
> b2 1 B
> b3 1 B
> b4 1 B
>
> I am not sure how to go about making this transformation besides writing a long vector that contains each of the categorical string names (these are state names, so it would be a really long vector). Any help would be greatly appreciated.
>
> Thanks,
>
> Nicholas Pretnar
> Mizzou Economics Grad Assistant
> npretnar at gmail.com