[R] regular expression

Laurent Rhelp laurentRhelp at free.fr
Sat Apr 7 17:25:34 CEST 2007


Uwe Ligges wrote:

>Laurent Rhelp wrote:
>  
>
>>Uwe Ligges wrote:
>>
>>    
>>
>>>Laurent Rhelp wrote:
>>>
>>>      
>>>
>>>>Dear R-List,
>>>>
>>>>     I have a great many files in a directory, and in every file I 
>>>>would like to replace the character " with the character '. At the 
>>>>same time, I have to change ' to '' (i.e. the character ' twice, not 
>>>>the single character ") wherever ' occurs inside "....."
>>>>  So, "....." becomes '.....' and ".....'......" becomes '.....''......'
>>>>A regular expression could certainly help me, but I do not know how to use one.
>>>>
>>>>How can I do that with R ?
>>>>        
>>>>
>>>
>>>In fact, you do not need to know anything about regular expressions in 
>>>this case, since you are simply going to replace certain characters by 
>>>others without any fuzzy restrictions:
>>>
>>>x <- "\".....'......\""
>>>cat(x, "\n")
>>>xn <- gsub('"', "'", gsub("'", "''", x))
>>>cat(xn, "\n")
>>>
>>>
>>>Uwe Ligges
>>>
>>>
>>>      
>>>
>>>>Thank you very much
>>>>
>>>>______________________________________________
>>>>R-help at stat.math.ethz.ch mailing list
>>>>https://stat.ethz.ch/mailman/listinfo/r-help
>>>>PLEASE do read the posting guide 
>>>>http://www.R-project.org/posting-guide.html
>>>>and provide commented, minimal, self-contained, reproducible code.
>>>>        
>>>>
>>>
>>>      
>>>
>>Yes, you are right. So I wrote the code below (which I find a little 
>>awkward, but it works).
>>
>>##-----
>>
>>dirdata <- getwd()
>>fichnames <- list.files(path=paste(dirdata,"\\initial\\",sep=""))
>>    
>>
>
>see ?file.path to improve the above.
>
>
>  
>
>>for( i in 1:length(fichnames)){
>>    
>>
>
>see ?seq to improve the above: seq(along = fichnames)
>Or even better, just work on the names (see below).
>
>  
>
>>     filein <- paste(dirdata,"\\initial\\",fichnames[i],sep="")
>>    
>>
>
>again, file.path() is your friend
>
>  
>
>>     conin <- file(filein)
>>     open(conin)        
>>    
>>
>>     nbrows <- length( readLines(conin,n=-1) )
>  
>
>>     close(conin)
>>    
>>
>
>You can simply call readLines() with the filename; it opens the 
>connection to the file itself. And I do not see why you want to read the 
>file here. Since your code is becoming really complicated now, let me 
>suggest the following procedure (untested!):
>
>dirdata <- getwd()
>fichnames <- list.files(file.path(dirdata, "initial"))
>for(i in fichnames){
>     temp <- readLines(file.path(dirdata, "initial", i))
>     temp <- gsub('"', "'", gsub("'", "''", temp))
>     writeLines(temp, con = file.path(dirdata, "result", i))
>}
>
>Uwe Ligges
>
>
>
>
>
>  
>
>>     fileout <- paste(dirdata,"\\result\\",fichnames[i],sep="")
>>     conout <- file(fileout,"w")
>>
>>     conin <- file(filein)
>>     open(conin)
>>
>>
>>     for( l in 1:nbrows )
>>     {
>>       text <- gsub('"',"'",gsub("'","''",readLines(conin,n=1)))
>>       writeLines(con=conout,text=text)
>>     }
>>
>>     close(conin)
>>     close(conout)
>> }
>>
>>##------
>>    
>>
>
>
>
>  
>
I had to modify the line below so that it opens a connection:

    temp <- readLines(file(file.path(dirdata, "initial", i)))

I had not realized that readLines() reads the whole file in one go; I 
thought it read only one line at a time. So I looped over the lines of 
every file, which is not necessary.
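
A minimal sketch of that behaviour, assuming a small sample file written to a temporary path (the file contents and the names path and result are made up for illustration, not taken from the real data):

```r
# Hypothetical sample file, created in a temporary location for illustration
path <- tempfile(fileext = ".txt")
writeLines(c('"abc"', '"it\'s"', 'plain'), path)

# readLines() given a file name returns every line as one character vector
temp <- readLines(path)
length(temp)

# gsub() is vectorised over the lines: double each ' first,
# then turn each " into '
result <- gsub('"', "'", gsub("'", "''", temp))
writeLines(result)
```

The inner gsub() has to run first: if " were turned into ' before the doubling, the new outer quotes would be doubled as well.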

Thank you very much.


