[R] Regular Expression help

Wacek Kusnierczyk Waclaw.Marcin.Kusnierczyk at idi.ntnu.no
Sat Nov 8 00:43:12 CET 2008


Wacek Kusnierczyk wrote:
> Peter Dalgaard wrote:
>   
>> Rajasekaramya wrote:
>>   
>>     
>>> hi there
>>>
>>> I have a vector with a set of data. I just want to separate them based on the
>>> first p or q value mentioned within the data.
>>>
>>> [1] chr10p15.3 /// chr3q29 /// chr4q35 /// chr9q34.3
>>> [2] chr1q22-q24                                     
>>> [3] chr1q22-q24                                     
>>> [4] chr1pter-q24                                    
>>> [5] chr1pter-q24                                    
>>> [6] chr1pter-q24  
>>>
>>> I used the regular expression [+q*] to match the values, but it matches a q
>>> found anywhere. I know I have written it like that, but I just want it to match
>>> the first p or q value.
>>>
>>> My result should be, for q:
>>> [2] chr1q22-q24                                      
>>> [3] chr1q22-q24  
>>>
>>> for p
>>> [1] chr10p15.3 /// chr3q29 /// chr4q35 /// chr9q34.3
>>> [4] chr1pter-q24                                    
>>> [5] chr1pter-q24                                    
>>> [6] chr1pter-q24 
>>>
>>>     
>>>       
>> Something like
>>
>> sub("[^pq]*([pq]).*","\\1",x)
>>
>> should get you the first p or q
>>
>>   
>>     
>
> and the following will do the whole job (assuming x is your vector):
>
> result = lapply(
>    list(p='p', q='q'),
>    function(letter)
>       grep(paste("^[^pq]*[", "]", sep=letter), x, value=TRUE))
>
>   

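and this one might be slightly faster, depending on your data (x is
again your vector; anything whose first p/q is not a p ends up in q):

result = local({
   # logical index: TRUE where the first p or q in the element is a p
   p = grepl("^[^pq]*p", x)
   # x[!p] stays correct even when nothing matches, unlike x[-grep(...)]
   list(p=x[p], q=x[!p])
})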
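
For the sample data in the question, the first-letter idea can be checked
with a short sketch. It assumes x holds the six strings from the original
post; the names first and result are only illustrative. It combines Peter's
sub() pattern with split(), which is a variation and not what either reply
used verbatim.

```r
# sample data copied from the original question
x <- c("chr10p15.3 /// chr3q29 /// chr4q35 /// chr9q34.3",
       "chr1q22-q24", "chr1q22-q24",
       "chr1pter-q24", "chr1pter-q24", "chr1pter-q24")

# first p or q in each element: skip all non-p/q characters,
# capture the next character, and drop the rest of the string
first <- sub("[^pq]*([pq]).*", "\\1", x)

# group the original strings by that letter
result <- split(x, first)
```

Here result$p should hold elements 1, 4, 5 and 6, and result$q elements 2
and 3. Note that an element containing neither p nor q is returned
unchanged by sub(), so it would end up in a group of its own.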

vQ


