[R] Matching backslash in a table's column using R language

Bert Gunter bgunter@4567 @end|ng |rom gm@||@com
Tue Aug 25 20:26:08 CEST 2020


1. I am far from an expert on such matters
2. It is unclear to me what your input is -- I assume a file.

The problem, as you indicate, is that R's parser sees "\B" as an incorrect
escape character, so, for example:
> cat("\B")
Error: '\B' is an unrecognized escape in character string starting ""\B"

In any case, I think you should look at ?scan. Here is an example where I
scan from the keyboard first and then remove the "\". You may have to scan
from a file to do this.

> z <-scan(file = "", what = "character")
1: A\BCDEFG
2:     #CR terminates input
Read 1 item

> cat(z)
A\BCDEFG

> nchar(z)
[1] 8  ## scan read in the "\" as a single character from the console.

> sub("\\\\","",z)  ## Yes, 4 backslashes
[1] "ABCDEFG"

There may be better ways to do this, but as I said, I'm no expert.

BTW, in posting here, please post in *plain text,* as the server can mangle
html.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Tue, Aug 25, 2020 at 9:02 AM Peter Bishop <bishop_peterj using hotmail.com>
wrote:

> In SQL, I'm using R as a way to filter data based on:
>     - 20 characters in the range <space> to <tilde>
>     - excluding <quote>, <apostrophe>, <comma>, <question mark>,
> <backslash>, <backtick>
>
> Given a SQL column containing the data:
>
>     code
>     ----
>     A\BCDEFG
>
> and the T-SQL script:
>
>     EXEC [sys].[sp_execute_external_script]
>             @language=N'R',
>             @script=N'
>     pattern1 = "^[\x20-\x7e]{1,20}$"
>     pattern2 = "[\x22\x27\x2c\x3f\x5c\x60]"
>
>     outData <- subset(inData, grepl(pattern1, code, perl=TRUE) &
> !grepl(pattern2, code, perl=TRUE))',
>             @input_data_1 = N'SELECT [code] FROM [dbo].[products]',
>             @input_data_1_name = N'inData',
>             @output_data_1_name = N'outData'
>     WITH
>             RESULT SETS (AS OBJECT [dbo].[products]);
>     GO
>
> why does the row detailed above get returned? I know that backslash is a
> special character but not in the SQL table. Consequently, the T-SQL code:
>
>     SELECT ASCII(SUBSTRING([value], 2, 1)) FROM [table]
>
> returns 92 (the ASCII code for <backslash>) which shows that this is being
> recognised as a backslash character and not as an escape indicator for the
> following "B".
>
> Can anyone advise how I can filter out the <backslash> in the way that the
> other identified characters are being successfully filtered? As the data is
> being retrieved from a table, I can’t ask the data provider to use “\\”
> instead of “\” as that will be invalid for other uses.
>
> Thanks.
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

	[[alternative HTML version deleted]]



More information about the R-help mailing list