[R] Must be obvious but not to me : problem with regular expression

Duncan Murdoch murdoch at stats.uwo.ca
Mon Dec 17 15:46:17 CET 2007


On 12/17/2007 9:34 AM, Ptit_Bleu wrote:
> Hi,
> 
> I have a vector called nfichiers containing 138 file names whose extension
> is .P0, .P1, ... up to .P8.
> The script differs depending on whether the extension is .P0 or one of
> .P1 to .P8.
> 
> Examples of file names :
> [128] "Output0.P0"       
> [129] "Output0.P1"       
> [130] "Output0.P2"       
> [131] "Output01102007.P0"
> [132] "Output01102007.P1"
> [133] "Output01102007.P2"
> [134] "Output01102007.P3"
> [135] "Output01102007.P4"
> 
> 
> To extract the file names with the .P0 extension I wrote:
> nfichiers[grep(".P0", nfichiers)]
> For the other extensions:
> nfichiers[grep(".P[^0]", nfichiers)]
> 
> But for the latter I get a result of length 138, which is the length of the
> initial vector, although I have 130 files with the .P0 extension.

One problem above is that "." is special in regular expressions: it 
matches any single character, so it has to be escaped as "\\." to match 
a literal dot.  I'd also suggest adding $ at the end, to anchor the match 
to the end of the string.  That is, write

grep("\\.P0$", nfichiers)

and

grep("\\.P[^0]$", nfichiers)

I don't know what false matches you were seeing, but this should 
eliminate some.
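
As a rough illustration of the difference, here is a minimal sketch using 
the small vector s from the question.  The extra name "xP1y.P0" is 
hypothetical; it is added only to produce a false match.  The last pattern, 
with [1-8], is an assumption based on the extensions being limited to P1 
through P8, since [^0] accepts any character other than 0.

```r
# Test vector from the question, plus one hypothetical name ("xP1y.P0")
# added only to demonstrate a false match.
s <- c("aa.P0", "bb.P0", "cc.P1", "dd.P2", "xP1y.P0")

# Unescaped "." matches any character, and without "$" the pattern can
# match anywhere in the string ("xP1" inside "xP1y.P0" qualifies).
loose <- s[grep(".P[^0]", s)]

# Escaped dot, anchored to the end of the string.
p0    <- s[grep("\\.P0$", s)]
other <- s[grep("\\.P[^0]$", s)]

# Stricter variant (assumption: extensions are only P1 to P8).
p1to8 <- s[grep("\\.P[1-8]$", s)]

print(loose)
print(p0)
print(other)
print(p1to8)
```

With this vector the unanchored pattern also picks up "xP1y.P0", while the 
escaped and anchored patterns sort it with the .P0 files.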

Duncan Murdoch

> 
> So I tried "manually" with a small vector :
>> s
> [1] "aa.P0" "bb.P0" "cc.P1" "dd.P2"
>> s[grep(".P[^0]", s)]
> [1] "cc.P1" "dd.P2"
> 
> It works !!!
> 
> Does anyone have an idea how to solve this small problem?
> Thanks in advance,
> Ptit Bleu.
> 
>


