[R] remove Punctuation characters
Filipe Almeida
milheiros at gmail.com
Wed May 10 10:42:09 CEST 2006
Thanks a lot!!
Filipe Almeida
Marc Schwartz (via MN) wrote:
> On Tue, 2006-05-09 at 16:50 +0100, Filipe Almeida wrote:
>
>> Hi,
>>
>> I want to remove all punctuation characters in a string. I was trying to use
>> a regular expression but it doesn't work.
>> Here is a sample of what I want:
>>
>> str <- 'ABD - remove de punct, and dot characters.'
>> str <- gsub('[:punct:]','',str)
>> str
>> "ABD remove de punct and dot characters"
>>
>> Is there any function that does this kind of thing?
>>
>> Thanks to all.
>>
>> Filipe Almeida
>>
>
> You almost have it. Just need to double the brackets:
>
>
> > str
> [1] "ABD - remove de punct, and dot characters."
>
> > gsub("[[:punct:]]", "", str)
> [1] "ABD  remove de punct and dot characters"
>
>
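As a rough illustration of why the single-bracket pattern fails, the sketch below compares the two patterns on the sample string from this thread. It assumes R's default regex engine (not `perl = TRUE`). The variable names `s`, `wrong`, `right` and `tidy` are made up for this example, and the final whitespace-collapsing step is an addition that was not part of the original answer.

```r
# Sample string from the thread (called s here so it does not mask utils::str)
s <- "ABD - remove de punct, and dot characters."

# Single brackets: an ordinary bracket list holding the literal
# characters ':', 'p', 'u', 'n', 'c' and 't', so letters get deleted
# while the comma and the period are left alone
wrong <- gsub("[:punct:]", "", s)

# Double brackets: the POSIX class [:punct:] used inside a bracket list
right <- gsub("[[:punct:]]", "", s)

# Assumption: the double space left behind by " - " is unwanted,
# so runs of whitespace are collapsed into a single space
tidy <- gsub("[[:space:]]+", " ", right)

print(wrong)
print(right)
print(tidy)
```

With the default engine, `wrong` should still contain the comma and the period, while `right` has no punctuation but keeps two spaces where " - " used to be.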
> Note the following in ?regex:
>
> For example, [[:alnum:]] means [0-9A-Za-z], except the latter depends
> upon the locale and the character encoding, whereas the former is
> independent of locale and character set. (Note that the brackets in
> these class names are part of the symbolic names, and must be included
> in addition to the brackets delimiting the bracket list.) Most
> metacharacters lose their special meaning inside lists. To include a
> literal ], place it first in the list. Similarly, to include a literal
> ^, place it anywhere but first. Finally, to include a literal -, place
> it first or last. (Only these and \ remain special inside character
> classes.)
>
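The placement rules quoted from ?regex can be tried out with a small sketch like the one below. The test string `x` and the result names are hypothetical, chosen only to show where a literal `]`, `^` and `-` may sit inside a bracket list. It again assumes R's default regex engine.

```r
# Hypothetical test string containing all three special characters
x <- "a]b^c-d"

# A literal "]" must come first in the list
no_bracket <- gsub("[]]", "", x)

# A literal "^" may go anywhere except first (first would negate the list)
no_caret <- gsub("[a^]", "", x)

# A literal "-" goes first or last so it is not read as a range
no_dash <- gsub("[a-]", "", x)

# All three together: "]" first, "^" in the middle, "-" last
no_specials <- gsub("[]^-]", "", x)

print(no_bracket)
print(no_caret)
print(no_dash)
print(no_specials)
```

The last pattern is a convenient way to strip only these three characters without removing other punctuation.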
> HTH,
>
> Marc Schwartz
>
>
>
>