[R] String Manipulation- Extract numerical and alphanumerical segment

jim holtman jholtman at gmail.com
Fri Feb 5 17:58:13 CET 2010


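The '[[' is just the index operator applied to an object; sapply calls it on
each element of the list, and the number after it is the position to extract.
Type: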

?'[['

to see the help page.

Actually I should have used '[' in this case:



> sapply(y, '[', 1)
[1] "1234567" "1234567" "1234567"

is equivalent to:

> sapply(y, function(a) a[1])
[1] "1234567" "1234567" "1234567"
>
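One practical difference between the two operators, which the message above
does not spell out (so treat this as an illustration rather than the stated
reason for the correction): '[' returns NA for a position that does not exist,
while '[[' stops with an error. The input strings below are made up for the
sketch.

```r
# Hypothetical input: the second string has no second field
y <- strsplit(c("1234567.z3", "7654321"), "[.]")

# '[' pads a missing position with NA
sapply(y, "[", 2)

# '[[' fails with 'subscript out of bounds' on the short element;
# tryCatch turns that error into a marker string
res <- tryCatch(sapply(y, "[[", 2), error = function(e) "error")
res
```

When every split string is known to have the same number of fields, as in the
example from this thread, the two forms give the same result.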


So to set a value based on the first character, just extract the first
character (e.g., with substring) and then index into a named vector that
holds the key values:

> key <- c(z=1, a=2, b=3)  # mapping values
> data <- c('a','c','b','d','z','a','b')  # data to be mapped
> key[data]
   a <NA>    b <NA>    z    a    b
   2   NA    3   NA    1    2    3
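Putting the two steps together for strings like the "z3" field, here is a
minimal sketch. The vector y.2 below is made up for illustration ("a7", "b12"
and "q5" are not from the thread), and unname() is only there to drop the
letter labels from the result.

```r
# Hypothetical second-column values; only "z3" appears in the thread
y.2 <- c("z3", "a7", "b12", "q5")

key <- c(z = 1, a = 2, b = 3)     # mapping from first letter to code

first <- substring(y.2, 1, 1)     # first character of each string
codes <- unname(key[first])       # NA where the letter is not in the key
codes
```

Letters that are not in the key come back as NA, so no loop or explicit
comparison against 'z', 'a' and 'b' is needed.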



On Fri, Feb 5, 2010 at 10:41 AM, Su C. <sushikyc at gmail.com> wrote:
>
> Yes, that was perfect! Thank you so much!
>
> Just to clarify, since I'm kind of new to string manipulation-- is that '[['
> in the sapply function what is designating splits/elements within the
> string? So that's the part that says "I want this particular element" and
> the "1" or "2" or "number" is what designates location?
>
> And, if while looking at the second column, I want to verify if the
> alphabetical character is say, a 'z' or an 'a' or a 'b', what would be an
> elegant way to do that besides splitting the second column into alphabetical
> and numerical values, and then testing against z,a,b, using a for loop and a
> boolean statement? I want to assign a 1 for z's, a 2 for a's, and a 3 for
> b's.
>
>
> On Fri, Feb 5, 2010 at 10:30 AM, jholtman [via R]
> <ml-node+1470341-841877914 at n4.nabble.com> wrote:
>
>> Does this help:
>>
>> > x <- c("1234567.z3.abcdef-gh.12","1234567.z3.abcdef-gh.12","1234567.z3.abcdef-gh.12")
>>
>> > y <- strsplit(x, '[.]')
>> >
>> > y
>> [[1]]
>> [1] "1234567"   "z3"        "abcdef-gh" "12"
>>
>> [[2]]
>> [1] "1234567"   "z3"        "abcdef-gh" "12"
>>
>> [[3]]
>> [1] "1234567"   "z3"        "abcdef-gh" "12"
>>
>> > y.1 <- sapply(y, '[[', 1)
>> > y.1
>> [1] "1234567" "1234567" "1234567"
>> > y.2 <- sapply(y, '[[', 2)
>> > y.2
>> [1] "z3" "z3" "z3"
>> >
>>
>>
>> On Fri, Feb 5, 2010 at 10:11 AM, Su C. <[hidden email]> wrote:
>>
>> >
>> > I am currently attempting to split a long list of strings (let's call it
>> > "string.list") that is of the format:
>> >
>> > "1234567.z3.abcdef-gh.12"
>> >
>> > I have gotten it to:
>> > "1234567"  "z3"  "abcdef-gh"  "12"
>> > by use of the strsplit function.
>> >
>> > This leaves me with each element of "string.list" having a split string of
>> > the above format. What I'd like to do now is extract the first two strings
>> > of each element in "string.list" -- the "1234567" and the "z3" -- and place
>> > them into two separate lists, say, "firstsplit.numeric.list" and
>> > "secondsplit.alphanumeric.list"
>> >
>> > I'm having some trouble figuring out how to do this. Any help would be
>> > greatly appreciated!
>> > --
>> > View this message in context:
>> http://n4.nabble.com/String-Manipulation-Extract-numerical-and-alphanumerical-segment-tp1470301p1470301.html
>> > Sent from the R help mailing list archive at Nabble.com.
>> >
>> > ______________________________________________
>> > [hidden email] mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem that you are trying to solve?
>>
>
>
> --
> Su H. Chu
> Carnegie Mellon University
> Economics and Statistics '09
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


