[Rd] arithmetic with zero-column data.frames
Martin Maechler
maechler at stat.math.ethz.ch
Mon Aug 14 14:44:14 CEST 2017
>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>> on Wed, 9 Aug 2017 12:39:26 +0200 writes:
> So, as often, there is more to it than you first think.
> Let's consider this an RFC (for experienced long-time R users):
>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>> on Wed, 9 Aug 2017 10:45:56 +0200 writes:
>>>>> William Dunlap via R-devel <r-devel at r-project.org>
>>>>> on Tue, 8 Aug 2017 11:59:45 -0700 writes:
>>> Should arithmetic operations work on zero-column data.frames (returning a
>>> zero-column data.frame with the same number of rows as the data.frame
>>> argument(s))? Currently we get:
>>>> 1 + data.frame(row.names=c("A","B"))
>>> Error in data.frame(value, row.names = rn, check.names = FALSE, check.rows
>>> = FALSE) :
>>> row names supplied are of the wrong length
>>>> data.frame(row.names=c("A","B")) * 2
>>> Error in data.frame(value, row.names = rn, check.names = FALSE, check.rows
>>> = FALSE) :
>>> row names supplied are of the wrong length
>>>> data.frame(row.names=c("A","B")) / data.frame(row.names=c("A","B"))
>>> Error in data.frame(value, row.names = rn, check.names = FALSE, check.rows
>>> = FALSE) :
>>> row names supplied are of the wrong length
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>> Thank you, Bill.
>> Yes, indeed, as we have the Ops.data.frame and
>> Math.data.frame group methods (about which I have not always
>> been so happy, but they are an inheritance from S),
>> and as the Math methods work too, we should get this boundary
>> case working as well for the Ops.
> Hmm.. This time, I'd be glad for comments, notably from you, Bill:
> In looking at this, I notice that "^" is treated
> exceptionally, possibly by accident. E.g.,
> USArrests ^ 2 returns a matrix, whereas all other arithmetic
> Ops give a data frame.
> All non-arithmetic Ops do give a matrix [also undocumented, AFAICS],
> and currently "^" is treated like them.
> Note that Math.data.frame always returns a data frame (when it
> does return), so we currently have this ugly inconsistency:
>> str(USArrests ^ 0.5)
> num [1:50, 1:4] 3.63 3.16 2.85 2.97 3 ...
> - attr(*, "dimnames")=List of 2
> ..$ : chr [1:50] "Alabama" "Alaska" "Arizona" "Arkansas" ...
> ..$ : chr [1:4] "Murder" "Assault" "UrbanPop" "Rape"
>> str(sqrt(USArrests))
> 'data.frame': 50 obs. of 4 variables:
> $ Murder : num 3.63 3.16 2.85 2.97 3 ...
> $ Assault : num 15.4 16.2 17.1 13.8 16.6 ...
> $ UrbanPop: num 7.62 6.93 8.94 7.07 9.54 ...
> $ Rape : num 4.6 6.67 5.57 4.42 6.37 ...
>>
> I propose to add "^" to the other arithmetic ops which return a
> data frame. So in the above, '^ 0.5' would give the same [up to
> lowest-bit rounding error] as sqrt().
> - -- - -- - --
> A further inconsistency is that the Math methods directly refuse
> to work on a data frame with non-numeric variables, whereas the
> Ops methods just go along and give warnings and NA's:
>> sqrt(CO2)
> Error in Math.data.frame(CO2) :
> non-numeric variable in data frame: PlantTypeTreatment
>> str( CO2 ^ 0.5 )
> num [1:84, 1:5] NA NA NA NA NA NA NA NA NA NA ...
> - attr(*, "dimnames")=List of 2
> ..$ : chr [1:84] "1" "2" "3" "4" ...
> ..$ : chr [1:5] "Plant" "Type" "Treatment" "conc" ...
> Warning messages:
> 1: In Ops.ordered(left, right) : '^' is not meaningful for ordered factors
> 2: In Ops.factor(left, right) : ‘^’ not meaningful for factors
> 3: In Ops.factor(left, right) : ‘^’ not meaningful for factors
>>
> One "clean" radical solution here would be for the Ops method
> to also signal an error directly, as the Math one does.
> But that may be undesirable.
> Assume people have data frame variables of classes for which an Ops
> method is defined. Then the corresponding "op" is applied
> everywhere and the result may be useful and as desired.
> So, I'm much less sure what's desirable here.
> Should we just document the behavior of this latter inconsistency?
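
The Math/Ops inconsistency described in the quoted text can be reproduced without the CO2 data set. The following is only a minimal sketch: the data frame `df` and its column names `x` and `g` are made up for illustration, and the result is indexed with `[, "name"]` so that the same code works whether `^` returns a matrix or a data frame.

```r
# Hypothetical small data frame: one numeric column, one factor column.
df <- data.frame(x = c(1, 4, 9), g = factor(c("a", "b", "c")))

# Math group generic: expected to stop with an error because of 'g'.
math_res <- tryCatch(sqrt(df), error = function(e) e)

# Ops group generic: works column by column; the factor column
# triggers a warning and yields NA. Warnings are suppressed here
# only to keep the output short.
ops_res <- suppressWarnings(df ^ 0.5)

inherits(math_res, "error")   # did sqrt() refuse to work?
ops_res[, "x"]                # the numeric column is still computed
all(is.na(ops_res[, "g"]))    # the factor column is NA throughout
```

So the Math method fails as a whole, while the Ops method returns a partially useful result together with warnings.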
As there was no feedback from anyone,
I have now committed -- to R-devel only, svn 73093 -- what I had proposed
above:
- arithmetic for 0-column data frames now works
- "Arith"metic now gives data frames, also for '^'
- the other "Ops", i.e., "Compare" and "Logic" continue
to return a logical matrix, and this is now documented.
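
The three points above could be checked roughly as follows. This is a sketch that assumes an R version containing the committed change; the object names `d0` and `r0` are chosen here for illustration and do not come from the commit.

```r
# Zero-column data frame with two rows, as in Bill Dunlap's report.
d0 <- data.frame(row.names = c("A", "B"))

# Arithmetic should now give a zero-column data frame with the same rows
# instead of the "row names supplied are of the wrong length" error.
r0 <- 1 + d0
is.data.frame(r0)
dim(r0)

# '^' should now return a data frame, like the other arithmetic Ops,
# and agree with sqrt() up to rounding error.
p <- USArrests ^ 0.5
is.data.frame(p)
all.equal(p, sqrt(USArrests))

# "Compare" (and "Logic") Ops continue to return a logical matrix.
cmp <- USArrests > 10
is.matrix(cmp) && is.logical(cmp)
```

On an R version from before the change, the first arithmetic call is expected to fail with the error quoted at the top of the thread.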
Martin Maechler
ETH Zurich and R Core