[R] How to convert "c:\a\b" to "c:/a/b"?
Spencer Graves
spencer.graves at pdf.com
Mon Jun 27 20:32:30 CEST 2005
Hi, Henrik:
Several functions, e.g., "grep", "sub", "gsub", and "regexpr", have
an argument "perl", FALSE by default. Moreover, "?regexp" has a section
on "Perl Regular Expressions". If you can do it in perl, might that
transfer to "gsub(..., perl=TRUE)"?
Thanks,
spencer graves
p.s. I skimmed the discussion of "Perl Regular Expressions", and
experimented with "gsub(..., perl=TRUE)" without success. However,
there may be a way to do it, and I just don't know perl and regexp well
enough to have figured it out in the time available.
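
For reference, here is a minimal sketch of the "perl=TRUE" form. It assumes the string already holds real single backslashes in memory, which is exactly what pasting a raw path into a script does not give you. The names "path" and "converted" are only illustrative.

```r
# Typed with escaped backslashes, so the string in memory contains
# single '\' characters (28 characters in total).
path <- "D:\\spencerg\\statmtds\\R\\Rnews"

# The regular expression \\ (typed as "\\\\" in R source) matches one
# literal backslash; perl = TRUE selects the Perl-compatible engine.
converted <- gsub("\\\\", "/", path, perl = TRUE)

cat(converted, "\n")
```

If this runs as expected, the regular expression engine is probably not the obstacle; the difficulty would lie in getting the pasted text past the R parser in the first place.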
Henrik Bengtsson wrote:
> Spencer Graves wrote:
>
>> Thanks, Dirk, Gabor, Eric:
>>
>> You all provided appropriate solutions for the stated problem.
>> Sadly, I oversimplified the problem I was trying to solve: I copy a
>> character string giving a DOS path from MS Windows Explorer into an R
>> script file, and I get something like the following:
>>
>> D:\spencerg\statmtds\R\Rnews
>>
>> I want to be able to use this in R with its non-R meaning, e.g.,
>> in readLines, count.fields, read.table, etc., after appending a file
>> name. Your three solutions all work for my oversimplified toy example
>> but are inadequate for the problem I really want to solve.
>
>
> Hmmm. It should work as long as you do not source() the file (see
> below). There are two things to watch out for here.
>
> First, you have to be careful with backslashes, that is, a backslash is
> a single character ('\') in memory, but to be typed at the R prompt, you
> have to escape it (with a backslash), which is why we type "\\", cf.
> nchar("\\") == 1. Consider the file foo.txt containing the 28
> characters (==28 bytes in plain ASCII format)
>
> D:\spencerg\statmtds\R\Rnews
>
> You can create such a file in R by
>
> > cat(file="foo.txt", "D:\\spencerg\\statmtds\\R\\Rnews")
> > str(file.info("foo.txt"))
> `data.frame': 1 obs. of 6 variables:
> $ size : num 28
> $ isdir: logi FALSE
> $ mode :Class 'octmode' int 438
> $ mtime:'POSIXct', format: chr "2005-06-27 19:14:20"
> $ ctime:'POSIXct', format: chr "2005-06-27 19:14:20"
> $ atime:'POSIXct', format: chr "2005-06-27 19:14:20"
>
> Re-read it into R:
> > bfr <- readLines("foo.txt")
> Warning message:
> incomplete final line found by readLines on 'foo.txt'
> > bfr
> [1] "D:\\spencerg\\statmtds\\R\\Rnews"
> > cat("bfr='", bfr, "'\n", sep="")
> bfr='D:\spencerg\statmtds\R\Rnews'
>
> Now, convert backslashes to "forwardslashes":
> > bfr2 <- gsub("\\\\", "/", bfr)
> > bfr2
> [1] "D:/spencerg/statmtds/R/Rnews"
> > cat("bfr2='", bfr2, "'\n", sep="")
> bfr2='D:/spencerg/statmtds/R/Rnews'
>
> Second, regular expression patterns have their own escaping rules. This
> is why the following happens:
>
> > bfr3 <- gsub("\\", "/", bfr)
> Error in gsub(pattern, replacement, x, ignore.case, extended, fixed) :
> invalid regular expression '\'
>
> The pattern "\\", which is a single '\' in memory, is passed to gsub().
> Then gsub() tries to interpret this single backslash as a pattern, but
> it is invalid. gsub() uses backslashes to escape some characters in
> patterns. So, when you think about what gsub() needs, think about the
> characters (bytes) that are really stored in memory, not what you see.
>
> A side comment: Wouldn't it be nice if the R parser had an alternative
> way to quote strings such that, say, Perl strings could be used? Example:
>
> bfr3 <- gsub("\\\\", "/", bfr)
> bfr3 <- gsub('\\', "/", bfr)
>
> would be equal (if single quotes weren't already reserved).
>
> Back to your problem: You must not paste the 28 characters into an R
> script that you source()! If you want to include your pathname (copied
> from the command prompt), you have to escape each '\' with a '\' to get
> '\\'. Thus, if you use Emacs or another text editor, you pretty much
> should see '\\' if you want R(!) to interpret this as the single
> character '\'. Note the difference between using source() and, say,
> readLines().
>
> Hope this helps
>
> Henrik
>
>
>> Thanks,
>> spencer graves
>>
>> Gabor Grothendieck wrote:
>>
>>
>>> On 6/27/05, Dirk Eddelbuettel <edd at debian.org> wrote:
>>>
>>>
>>>> On 26 June 2005 at 20:30, Spencer Graves wrote:
>>>> | How can one convert back slashes to forward slashes, e.g.,
>>>> | changing "c:\a\b" to "c:/a/b"? I tried the following:
>>>> |
>>>> | > gsub("\\\\", "/", "c:\a\b")
>>>> | [1] "c:\a\b"
>>>>
>>>> This does work, provided you remember that single backslashes "don't
>>>> exist", as e.g. \a is a character in itself. So use doubles and you
>>>> should be fine:
>>>>
>>>>
>>>>
>>>>> gsub("\\\\", "/", "c:\\a\\b")
>>>>
>>>>
>>>> [1] "c:/a/b"
>>>>
>>>
>>>
>>> Also, if one finds four backslashes confusing one can avoid the use
>>> of four via any of these:
>>>
>>> gsub("[\\]", "/", "c:\\a\\b")
>>> gsub("\\", "/", "c:\\a\\b", fixed = TRUE)
>>> chartr("\\", "/", "c:\\a\\b")
>>>
>>> ______________________________________________
>>> R-help at stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide!
>>> http://www.R-project.org/posting-guide.html
>>
>>
>>
>
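
Following Henrik's point about readLines() versus source(), one way to sidestep the parser might be to keep the pasted path in a plain text file and read it from there. The sketch below simulates that file with tempfile() and writeLines(); the file name "example.txt" appended at the end is hypothetical, and "raw_path" and "unix_path" are illustrative names only.

```r
# Simulate a text file holding the path exactly as pasted from
# Windows Explorer (single backslashes on disk).
tmp <- tempfile(fileext = ".txt")
writeLines("D:\\spencerg\\statmtds\\R\\Rnews", tmp)

# readLines() does not interpret escapes, so each '\' survives as-is.
raw_path <- readLines(tmp)
nchar(raw_path)   # 28 characters, one per backslash

# chartr() translates character for character, with no regex escaping.
unix_path <- chartr("\\", "/", raw_path)

# Append a (hypothetical) file name for use in read.table() and friends.
full_name <- file.path(unix_path, "example.txt")
cat(full_name, "\n")

unlink(tmp)
```

On Windows, readClipboard() could presumably play the same role as readLines() here, though that is not tested in this sketch.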
--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
spencer.graves at pdf.com
www.pdf.com <http://www.pdf.com>
Tel: 408-938-4420
Fax: 408-280-7915