[R] Counting the occurrences of a character within a string

Florent D. flodel at gmail.com
Fri Dec 2 04:44:48 CET 2011


Resending my code, not sure why the linebreaks got eaten:

> x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)
> count.slashes <- function(string)sum(unlist(strsplit(string, NULL)) == "/")
> within(x, Col2 <- vapply(Col1, count.slashes, 1))
         Col1 Col2
1     abc/def    1
2 ghi/jkl/mno    2
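
For comparison, here is a minimal sketch of two vectorized alternatives that avoid calling a function once per row, assuming the same data frame x as above. The column name cnt follows Doug's request; cnt2 and the helper count_char are hypothetical names used only for illustration.

```r
# Same example data as above.
x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)

# Alternative 1: delete every character that is not "/" and count what is left
# (the gsub/nchar idea from Bert's reply, applied to the data frame column).
x$cnt <- nchar(gsub("[^/]", "", x$Col1))

# Alternative 2: hypothetical helper that counts a literal substring per element.
# gregexpr() returns -1 when there is no match, so only positive positions count.
count_char <- function(strings, ch) {
  matches <- gregexpr(ch, strings, fixed = TRUE)
  vapply(matches, function(m) sum(m > 0L), integer(1))
}
x$cnt2 <- count_char(x$Col1, "/")

print(x)
```

Both columns should agree with Col2 above. The fixed = TRUE argument treats the search string literally, which matters if the character being counted is a regex metacharacter such as "." or "|".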


On Thu, Dec 1, 2011 at 10:32 PM, Florent D. <flodel at gmail.com> wrote:
> I used within and vapply:
>
> x <- data.frame(Col1 = c("abc/def", "ghi/jkl/mno"), stringsAsFactors = FALSE)
> count.slashes <- function(string)sum(unlist(strsplit(string, NULL)) == "/")
> within(x, Col2 <- vapply(Col1, count.slashes, 1))
>          Col1 Col2
> 1     abc/def    1
> 2 ghi/jkl/mno    2
>
> On Thu, Dec 1, 2011 at 1:05 PM, Bert Gunter <gunter.berton at gene.com> wrote:
>> ## It's not a data frame -- it's just a vector.
>>
>>> x
>> [1] "abc/def"     "ghi/jkl/mno"
>>> gsub("[^/]","",x)
>> [1] "/"  "//"
>>> nchar(gsub("[^/]","",x))
>> [1] 1 2
>>>
>>
>> ?gsub
>> ?nchar
>>
>> -- Bert
>>
>> On Thu, Dec 1, 2011 at 8:32 AM, Douglas Esneault
>> <Douglas.Esneault at mecglobal.com> wrote:
>>> I am new to R but am an experienced SAS user, and I was hoping to get some help counting the occurrences of a character within a string at the row level.
>>>
>>> My data frame, x, is structured as below:
>>>
>>> Col1
>>> abc/def
>>> ghi/jkl/mno
>>>
>>> I found this code on the board but it counts all occurrences of "/" in the dataframe.
>>>
>>> chr.pos <- which(unlist(strsplit(x,NULL))=='/')
>>> chr.count <- length(chr.pos)
>>> chr.count
>>> [1] 3
>>>
>>> I'd like to append a column, say cnt, that has the count of "/" for each row.
>>>
>>> Can anyone point me in the right direction or offer some code to do this?
>>>
>>> Thanks in advance for the help.
>>>
>>> Doug Esneault
>>>
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>> --
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>>
>> Internal Contact Info:
>> Phone: 467-7374
>> Website:
>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


