[R] Separating a Complicated String Vector
npretnar
npretnar at gmail.com
Sun Jan 4 06:20:47 CET 2015
Sorry. Bad example on my part. Try this. V1 is ...
V1
alabama
bates
tuscaloosa
smith
arkansas
fayette
little rock
alaska
juneau
nome
And I want:
V1 V2
alabama bates
alabama tuscaloosa
alabama smith
arkansas fayette
arkansas little rock
alaska juneau
alaska nome
This is more representative of the problem, extended to all 50 states.
- Nick
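
One possible approach, offered as a minimal sketch rather than a definitive answer: R's built-in state.name vector (from the datasets package) already holds the 50 state names, so no hand-typed vector is needed. The sketch assumes that V1 is a character vector, that its first element is a state, and that no non-state entry is spelled exactly like a state name (a county called "washington" or "nevada" would be misread as a state header). The names is_state, state, and result are illustrative only.

```r
# Example data from the message above; the real vector would cover all 50 states.
V1 <- c("alabama", "bates", "tuscaloosa", "smith",
        "arkansas", "fayette", "little rock",
        "alaska", "juneau", "nome")

# Flag the rows that are state names. state.name ships with R (datasets package).
is_state <- V1 %in% tolower(state.name)

# cumsum() over the logical flags yields a block id that increases by one at
# every state row. Indexing the state rows by that id carries each state name
# down to the rows beneath it. Assumes the first element of V1 is a state.
state <- V1[is_state][cumsum(is_state)]

# Keep only the non-state rows, paired with the state they fall under.
result <- data.frame(V1 = state[!is_state],
                     V2 = V1[!is_state],
                     stringsAsFactors = FALSE)
result
```

If the data do contain entries that collide with state names, the flagging step would need extra information (for example, another column that distinguishes header rows), which is not available in the example as posted.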
On Jan 3, 2015, at 9:22 PM, Ista Zahn wrote:
> I'm not sure what's so complicated about that (am I missing
> something?). You can search using grep, and replace using gsub, so
>
> tmpDF <- read.table(text="V1 V2
> A 5
> a1 1
> a2 1
> a3 1
> a4 1
> a5 1
> B 4
> b1 1
> b2 1
> b3 1
> b4 1",
> header=TRUE)
> tmpDF <- tmpDF[grepl("[0-9]", tmpDF$V1), ]
> data.frame(tmpDF, V3 = toupper(gsub("[0-9]", "", tmpDF$V1)))
>
> Seems to do the trick.
>
> Best,
> Ista
>
> On Sat, Jan 3, 2015 at 9:41 PM, npretnar <npretnar at gmail.com> wrote:
>> I have a string variable (V1) in a data frame structured as follows:
>>
>> V1 V2
>> A 5
>> a1 1
>> a2 1
>> a3 1
>> a4 1
>> a5 1
>> B 4
>> b1 1
>> b2 1
>> b3 1
>> b4 1
>>
>> I want the following:
>>
>> V1 V2 V3
>> a1 1 A
>> a2 1 A
>> a3 1 A
>> a4 1 A
>> a5 1 A
>> b1 1 B
>> b2 1 B
>> b3 1 B
>> b4 1 B
>>
>> I am not sure how to go about making this transformation besides writing a long vector that contains each of the categorical string names (these are state names, so it would be a really long vector). Any help would be greatly appreciated.
>>
>> Thanks,
>>
>> Nicholas Pretnar
>> Mizzou Economics Grad Assistant
>> npretnar at gmail.com
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.