[Rd] subRaw?
Spencer Graves
spencer.graves at structuremonitoring.com
Fri Jul 20 18:22:20 CEST 2012
Hi, Hervé:
On 7/19/2012 10:19 PM, Hervé Pagès wrote:
> Hi Spencer,
>
> On 07/19/2012 08:29 PM, Spencer Graves wrote:
>> Hello, All:
>>
>>
>> Do you know of any capability to substitute more than one byte in
>> an object of class "raw"?
>>
>>
>> Consider the following:
>>
>>
>> > let4 <- paste(letters[1:4], collapse='')
>> > (let4Raw <- charToRaw(let4))
>> [1] 61 62 63 64
>> > (let. <- sub('bc', '--', let4Raw))
>> [1] "61" "62" "63" "64"
>> > # no substitution
>> > (bc <- charToRaw('bc'))
>> [1] 62 63
>> > (ef <- charToRaw('ef'))
>> [1] 65 66
>> > (let. <- sub(bc, ef, let4Raw))
>> [1] "61" "65" "63" "64"
>> Warning messages:
>> 1: In sub(bc, ef, let4Raw) :
>> argument 'pattern' has length > 1 and only the first element will be
>> used
>> 2: In sub(bc, ef, let4Raw) :
>> argument 'replacement' has length > 1 and only the first element will
>> be used
>
> It makes no sense to use sub(), grep(), and family (i.e. all the stuff
> based on the regex code) *directly* on a raw vector because all these
> functions will start by coercing their 'x', 'text', 'pattern',
> 'replacement' args to character with as.character (if they are not
> already character).
>
> But the way as.character() operates on a raw vector won't give good
> results in that context. You'd rather do the coercion yourself first
> with rawToChar(), and coerce back the result with charToRaw():
>
> > charToRaw(sub("bc", "--", rawToChar(let4Raw)))
> [1] 61 2d 2d 64
>
> IMO it would make much more sense for sub(), grep(), and family to
> raise an error rather than blindly try to coerce to character, but
> these functions (like many functions in R) are too polite to tell
> the user s/he's doing something wrong.
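
A minimal sketch of the coercion described above, reusing the let4Raw example from the original message. It only illustrates why the earlier sub() calls behaved as they did: as.character() turns each byte into its two-digit hex string, whereas rawToChar() builds a single string from the bytes.

```r
let4Raw <- charToRaw("abcd")

# as.character() on a raw vector gives one hex string per byte.
hex <- as.character(let4Raw)
hex
# [1] "61" "62" "63" "64"

# rawToChar() gives the single string that the bytes encode.
rawToChar(let4Raw)
# [1] "abcd"

# sub(bc, ef, let4Raw) kept only the first element of 'pattern' and
# 'replacement' ("62" and "65"), so it was effectively this call:
sub("62", "65", hex)
# [1] "61" "65" "63" "64"

# sub('bc', '--', let4Raw) found nothing because no hex string contains "bc".
sub("bc", "--", hex)
# [1] "61" "62" "63" "64"
```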
Thanks for the reply.

It sounds like you agree that a function "subRaw" to facilitate
this would be useful. In my testing, charToRaw(sub(pattern,
replacement, rawToChar(x))) did NOT preserve binary codes that did not
match legitimate characters. I tried several things before finding one
that seemed to work.
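
For illustration, a byte-level substitution can be sketched without going through character strings at all. The function below is a hypothetical stand-in, named subRawSketch to make clear that it is not the subRaw mentioned here (that code is not shown in this thread). It assumes fixed, non-regex matching on raw vectors and replaces only the first match, as sub() does.

```r
# Hypothetical helper, for illustration only; not the subRaw() from this thread.
# Replaces the first occurrence of the byte sequence 'pattern' in 'x'.
subRawSketch <- function(pattern, replacement, x) {
  stopifnot(is.raw(pattern), is.raw(replacement), is.raw(x))
  n <- length(x)
  m <- length(pattern)
  if (m == 0L || m > n) return(x)

  # Candidate start positions: every index where a full-length window fits.
  starts <- seq_len(n - m + 1L)
  # Narrow the candidates one pattern byte at a time.
  for (k in seq_len(m)) {
    starts <- starts[x[starts + k - 1L] == pattern[k]]
    if (length(starts) == 0L) return(x)  # no match: return x unchanged
  }

  i <- starts[1L]  # first match only
  # Bytes before the match, the replacement, then bytes after the match.
  c(x[seq_len(i - 1L)], replacement, x[-seq_len(i + m - 1L)])
}

# Bytes that are not legitimate characters (here 00 and ff) pass through.
x <- as.raw(c(0x00, 0x61, 0x62, 0x63, 0xff))
subRawSketch(charToRaw("bc"), charToRaw("ef"), x)
# [1] 00 61 65 66 ff
```

One way the rawToChar() round trip can fail on input like this is the embedded nul byte, which rawToChar() rejects with an error. Working on the raw vector directly avoids that step. Base R's grepRaw() may also be worth a look for locating the match position.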
Best Wishes,
Spencer
>
> Cheers,
> H.
>
>>
>>
>> In this example, "b" was replaced by "e", but "bc" was not
>> replaced by "ef". Do you know of any function to do this?
>>
>>
>> I ask, because I need it. I've written such a function, subRaw
>> for my own use. If I don't hear that another exists, I plan to add the
>> one I've written to the oro.dicom package.
>>
>>
>> Thanks,
>> Spencer
>>
>>
>> > sessionInfo()
>> R version 2.15.1 (2012-06-22)
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252
>> [2] LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252
>> [4] LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>