[Rd] do_gsub (PR#10540)
Andrew Runnalls
A.R.Runnalls at kent.ac.uk
Fri Jan 25 13:11:11 CET 2008
Hin-Tak,
Apologies for the delay in responding.
On 2008/01/11 Hin-Tak Leung wrote:
> I believe STRING_ELT() works even when pat is R_NilValue, so
> the first line "does the right thing" even if the first element does not exist. I think this is intentional and not a bug.
>
> Other more knowledgeable people please correct me if I am wrong.
>
> A.R.Runnalls at kent.ac.uk wrote:
>> In character.c within R 2.6.1, the function do_gsub contains the following code:
>>
>> 1199 if (STRING_ELT(pat, 0) == NA_STRING) {
>> 1200 PROTECT(ans = allocVector(STRSXP, n));
>> 1201 for(i = 0; i < n; i++) SET_STRING_ELT(ans, i, NA_STRING);
>> 1202 UNPROTECT(1);
>> 1203 return ans;
>> 1204 }
>> 1205
>> 1206 if (length(pat) < 1 || length(rep) < 1) error(R_MSG_IA);
>>
>> Line 1206 checks that pat contains at least one element. However,
>> line 1199 has already attempted to access the first element. The check
>> should surely come first.
Generally speaking, in the interests of software reliability and
maintainability, it is a good idea for part of a program to 'do the
right thing' itself - in the present case, to check that an array is
non-empty before assuming that it is - rather than relying on what is in
fact rather unusual behaviour of another part of the program. The case
I am concerned about is not in fact where 'pat' is R_NilValue, but when
it is a vector of length zero. When asked to create such a vector,
memory.c will in fact allocate some element space in the vector, which
is why the above problem doesn't manifest itself as a crash (in the
no-segfault.R test, for example, which explores this case); however, it
seems unwise to rely on this. And what would happen if that
(redundantly) allocated space just happened already to contain NA_STRING?
Best regards,
Andrew
More information about the R-devel
mailing list