[R] Why does a custom function called is.numeric.factor break lattice?

David Winsemius dwinsemius at comcast.net
Mon Nov 16 18:59:24 CET 2015


> On Nov 16, 2015, at 9:35 AM, sbihorel <Sebastien.Bihorel at cognigencorp.com> wrote:
> 
> Hi,
> 
> Thanks everyone for all your insights...
> 
> I feel that the discussion is getting way deeper and more technical than it needs to be from the point of view of what I was trying to achieve with my little "is.numeric.factor" function (i.e., checking if an object is a factor and if all levels of this factor can be coerced to numeric values).

You seem to be asking for a compound test: first with is.factor, then to see whether all the levels could be coerced to numeric properly. I would think that you would need something like:

 if( is.factor(varname) ) { !anyNA(suppressWarnings(as.numeric(levels(varname)))) } else { FALSE }
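
As a rough sketch only, the same compound test could be wrapped in a function. The name is_numeric_factor is hypothetical (chosen without dots so it cannot be mistaken for an S3 method), and I am assuming you want to test the levels rather than the observed values, so that missing observations do not count against the factor:

```r
# Hypothetical helper; the name deliberately contains no dots.
is_numeric_factor <- function(x) {
  # Anything that is not a factor fails the first half of the test.
  if (!is.factor(x)) return(FALSE)
  # Coerce the levels (not the values); suppress the coercion warning
  # that as.numeric() emits for non-numeric strings.
  lev <- suppressWarnings(as.numeric(levels(x)))
  # TRUE only if every level survived the coercion.
  !anyNA(lev)
}

is_numeric_factor(factor(c("1", "2.5", NA)))  # TRUE: both levels are numeric
is_numeric_factor(factor(c("1", "a")))        # FALSE: "a" cannot be coerced
is_numeric_factor(1:3)                        # FALSE: not a factor
```

Whether a factor with no levels at all should count as "numeric" is a design choice; this sketch returns TRUE for it.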

-- 
David.


> 
> I guess that, as Duncan pointed out, using dots in function names is bad practice for functions starting with "is". I'll rename my function, that's it.
> 
> 
> On 11/16/2015 11:43 AM, Martin Maechler wrote:
>>>>>>> Bert Gunter <bgunter.4567 at gmail.com>
>>>>>>>     on Mon, 16 Nov 2015 08:21:09 -0800 writes:
>>     > Thanks Duncan. You are right; I missed this.
>> 
>>     > Namespaces and full qualification seems the only reliable solution to
>>     > the general issue though -- right?
>> 
>> Not in this case;  full qualification is very very rarely needed
>> in package code (even though some "schools" use and propagate it
>> much more than I would recommend), and we are talking about the
>> lattice code, i.e., package code, not user code, here.
>> 
>> I.e., using  base::is.numeric()  would not help at all: It
>> will still find the bogus  is.numeric.factor because that is
>> taken before the internal default method.
>> 
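For anyone who wants to see this at the console, here is a small sketch run from the top level (the global environment). The bogus method body is made up for illustration; it is not the original poster's function:

```r
f <- factor(c("1", "2", "3"))

# Without any user-defined method, the internal default says FALSE for factors.
before <- is.numeric(f)
print(before)            # FALSE

# A deliberately bogus "factor" method for the internal generic is.numeric().
is.numeric.factor <- function(x) TRUE

# Both the plain and the fully qualified call dispatch on the class of 'f'
# and pick up the method defined above.
after_plain     <- is.numeric(f)
after_qualified <- base::is.numeric(f)
print(after_plain)       # TRUE
print(after_qualified)   # TRUE

# Clean up so the rest of the session is not affected.
rm(is.numeric.factor)
restored <- is.numeric(f)
```

The qualification only selects which generic is called; the method lookup for the object's class happens afterwards, which is why it does not protect package code here.
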
>> Also, I'm almost sure S4 dispatch would suffer from the same
>> feature of S (and hence R) here:  You are allowed to define
>> methods for your new classes and they are used "dynamically".
>> (I also don't think that the problem is related to the fact that this
>>  a.b.c() case is S3-ambiguous:  a() method for "b.c" or a.b() method for "c".)
>> 
>> Unfortunately, this can be misused to define methods for
>> existing ("base") classes in case they are handled by the default method.
>> OTOH, if base/stats/... already *had* a 'factor' method for
>> is.numeric(), be it S3 or S4, no harm would have been done by
>> the bad user defined is.numeric.factor definition, thanks to the
>> namespace technology.
>> 
>> To get full protection here, we would have to
>> store "the dispatch table for all base classes" (a pretty vague notion)
>> with the package at package build time or install time ("load time" is too late:
>> the bad  is.numeric.factor() could already be present at package load time).
>> 
>> I'm not sure this would be easily feasible.... but it may be
>> something to envisage for R 4.0.0 ..
>> 
>> Martin
>> 
>>     > Cheers,
>>     > Bert
>> 
>>     > Bert Gunter
>> 
>>     > "Data is not information. Information is not knowledge. And knowledge
>>     > is certainly not wisdom."
>>     > -- Clifford Stoll
>> 
>> 
>>     > On Mon, Nov 16, 2015 at 7:42 AM, Duncan Murdoch
>>     > <murdoch.duncan at gmail.com> wrote:
>>     >> On 16/11/2015 10:22 AM, Bert Gunter wrote:
>>     >>>
>>     >>> There is no multiple dispatch; just multiple misunderstanding.
>>     >>>
>>     >>> The generic function is "is.numeric" . Your method for factors is
>>     >>> "is.numeric.factor".
>>     >>>
>>     >>> You need to re-study.
>>     >>
>>     >>
>>     >>
>>     >> I think the problem is with S3.  "is.numeric.factor" could be a
>>     >> "numeric.factor" method for the "is" generic, or a "factor" method for the
>>     >> "is.numeric" generic.  Using names with dots is a bad idea. This would
>>     >> all be simpler and less ambiguous if the class had been named
>>     >> "numeric_factor" or "numericFactor" or anything without a dot.
>>     >>
>>     >> Duncan Murdoch
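The naming ambiguity Duncan describes can be sketched with made-up generics. Everything named summarize* below is hypothetical and exists only for this illustration:

```r
# Two S3 generics whose names differ only by a dotted suffix.
summarize         <- function(x, ...) UseMethod("summarize")
summarize.numeric <- function(x, ...) UseMethod("summarize.numeric")

# One function whose name can be parsed in two ways:
#   generic "summarize"         + class "numeric.factor"
#   generic "summarize.numeric" + class "factor"
summarize.numeric.factor <- function(x, ...) "called"

# An object of the (hypothetical) class "numeric.factor" ...
obj <- structure(list(), class = "numeric.factor")
res_a <- summarize(obj)

# ... and an ordinary factor both end up in the same function.
res_b <- summarize.numeric(factor("a"))

print(res_a)  # "called"
print(res_b)  # "called"
```

With a class name such as "numeric_factor" the first reading would need a function called summarize.numeric_factor, and the two could no longer collide.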
> 
> -- 
> Sebastien Bihorel
> Cognigen Corporation
> (t) +1 716 633 3463 ext 323
> Cognigen Corporation, a wholly owned subsidiary of Simulations Plus, Inc.

David Winsemius
Alameda, CA, USA


