[Rd] I() in merge (was: Re: xftrm is more than 100x slower for AsIs than for character vectors)

Kurt Hornik Kurt@Horn|k @end|ng |rom wu@@c@@t
Thu Jul 18 23:14:29 CEST 2024


>>>>> Hilmar Berger via R-devel writes:

Thanks.  I just removed the I() as suggested.

Best
-k

> Dear all,
> actually, it is not clear to me why there is still a protection of the
> added Row.names column in merge using I(). This seems to stem from a
> time when R would automatically convert character vectors to factor in
> data.frame on insert. However, I can't reproduce this behaviour even in
> data.frames generated with stringsAsFactors = TRUE in current versions of
> R. Maybe the I() inserted in r39026 can be removed altogether?

> Best regards

> Hilmar
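
A minimal sketch of the check described above, assuming R >= 4.0.0 (where data.frame() defaults to stringsAsFactors = FALSE). The data frame `df` and its column `g` are made up for illustration; the point is only to see whether row names added as a plain character vector, without I(), stay character:

```r
# Toy data frame (hypothetical); its own character column becomes a factor.
df <- data.frame(g = c("a", "b"), stringsAsFactors = TRUE)

# Prepend the row names as a plain character vector, without I().
with_cbind <- cbind(Row.names = row.names(df), df)

# Assign the row names as a new column, again without I().
with_assign <- df
with_assign$Row.names <- row.names(df)

class(df$g)                    # the original column, built with stringsAsFactors = TRUE
class(with_cbind$Row.names)    # column added via cbind()
class(with_assign$Row.names)   # column added via `$<-`
```

On older R versions the cbind() result may differ, which is presumably the case the I() was originally guarding against.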

> On 14.07.24 19:09, HB via R-devel wrote:
>> Dear Ivan,
>> 
>> thanks for the confirmation and the proposed patch.
>> 
>> I just wanted to add some notes on why this matters: base::merge with by.x=0 or by.y=0 (i.e. matching on row.names) automatically adds a column Row.names, set to I(row.names(x)), to the corresponding input table (I() has been used since revision 39026 to avoid conversion of character to factor). When this column is used for sorting (sort=TRUE is the default in merge; this should happen at least if all.x=TRUE or all.y=TRUE), execution becomes slower.
>> 
>> xtfrm.AsIs is unchanged since its addition in r50992 (likely unrelated to the former).
>> 
>> So I guess that this just went unnoticed since it will not cause problems on small data frames.
>> 
>> Best regards
>> 
>> Hilmar
>> 
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
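
For readers who want to reproduce the effect discussed in this thread, here is a rough sketch. The inputs `x`, `y` and `ids` are hypothetical, and the timings depend on the R version and machine, so no particular numbers are claimed. It merges two small data frames on row names and then times xtfrm() on the same character vector with and without I():

```r
# Hypothetical inputs that share some row names.
x <- data.frame(a = 1:3, row.names = c("r1", "r2", "r3"))
y <- data.frame(b = 4:6, row.names = c("r2", "r3", "r4"))

# by = 0 matches on row names; merge() adds a Row.names column.
m <- merge(x, y, by = 0, all = TRUE)
names(m)
class(m$Row.names)  # whether this includes "AsIs" depends on the R version

# Compare xtfrm() on a character vector with and without I().
set.seed(1)
ids <- sprintf("id%05d", sample(5000))
t_chr  <- system.time(r_chr  <- xtfrm(ids))
t_asis <- system.time(r_asis <- xtfrm(I(ids)))

# Elapsed seconds for each variant.
c(character = t_chr[["elapsed"]], AsIs = t_asis[["elapsed"]])
```

Both calls should yield the same ranks; only the time taken is expected to differ, and increasing the length of `ids` makes any gap easier to see.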



