[R] User defined function within a formula
William Dunlap
wdunlap at tibco.com
Fri Jul 17 01:39:52 CEST 2015
OPoly<-function(x,degree=1,weight=1){
weight=round(weight,0)# weight need to be integer
if(length(weight)!=length(x))weight=rep(1,length(x))
p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree)
Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-
seq(2)]))[,degree])
class(Z)<-"OPoly";Z
}
You need to make OPoly to have optional argument(s) that give
the original-regressor-dependent information to OPoly and then
have it return, as attributes, the value of those arguments.
makepredictcall
will take the attributes and attach them to the call in predvars so
predict uses values derived from the original regressors, not value derived
from the data to be predicted from.
Take a look at a pair like makepredictcall.scale() and scale() for an
example:
scale has optional arguments 'center' and 'scale' that it returns as
attributes
and makepredictcall.scale adds those to the call to scale that it is given.
Thus when you predict, the scale and center arguments come from the
original data, not from the data you are predicting from.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Jul 16, 2015 at 3:43 PM, Kunshan Yin <yinkunshan at gmail.com> wrote:
> Thanks Bill for your quick reply.
>
> I tried your solution and it did work for the simple user defined function
> xploly. But when I try with other function, it gave me error again:
>
> OPoly<-function(x,degree=1,weight=1){
> weight=round(weight,0)# weight need to be integer
> if(length(weight)!=length(x))weight=rep(1,length(x))
> p=poly(4*(rep(x,weight)-mean(range(x)))/diff(range(x)),degree)
>
> Z<-(t(t(p[cumsum(weight),])*sqrt(attr(p,"coefs")$norm2[-seq(2)]))[,degree])
> class(Z)<-"OPoly";Z
> }
>
> ##this OPoly is an FORTRAN orthogonal polynomial routine, it first maps
> the x to range[-2,2] then do QR, then return the results with sqrt(norm2).
> Comparing with poly, this transformation will make the model coefficients
> within a similar range as other variables, the R poly routine will usually
> give you a very large coefficients. I did not find such routine in R, so I
> have to define this as user defined function.
> #######
>
> I also have following function as you suggested:
>
> makepredictcall.OPoly<-function(var,call)
> {
> if (is.call(call)) {
> if (identical(call[[1]], quote(OPoly))) {
> if (!is.null(tmp <- attr(var, "coefs"))) {
> call$coefs <- tmp
> }
> }
> }
> call
> }
>
>
> But I still got error for following:
>
> > g3=glm(lot1 ~ log(u) + OPoly(u,1), data = clotting, family = Gamma)
>
> > predict(g3,dc)Error in poly(4 * (rep(x, weight) - mean(range(x)))/diff(range(x)), degree) :
> missing values are not allowed in 'poly'
>
> I thought it might be due to the /diff(range(x) in the function. But even
> I remove that part, it will still give me error. Any idea?
>
> Many thanks in advance.
>
> Alex
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Thu, Jul 16, 2015 at 2:09 PM, William Dunlap <wdunlap at tibco.com> wrote:
>
>> Read about the 'makepredictcall' generic function. There is a method,
>> makepredictcall.poly(), for poly() that attaches the polynomial
>> coefficients
>> used during the fitting procedure to the call to poly() that predict()
>> makes.
>> You ought to supply a similar method for your xpoly(), and xpoly() needs
>> to return an object of a a new class that will cause that method to be used.
>>
>> E.g.,
>>
>> xpoly <- function(x,degree=1,...){ ret <- poly(x,degree=degree,...);
>> class(ret) <- "xpoly" ; ret }
>> makepredictcall.xpoly <- function (var, call)
>> {
>> if (is.call(call)) {
>> if (identical(call[[1]], quote(xpoly))) {
>> if (!is.null(tmp <- attr(var, "coefs"))) {
>> call$coefs <- tmp
>> }
>> }
>> }
>> call
>> }
>>
>> g2 <- glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma)
>> predict(g2,dc)
>> # 1 2 3 4
>> 5
>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608
>> #-0.01398928608
>> # 6 7 8 9
>> #-0.01398928608 -0.01398928608 -0.01398928608 -0.01398928608
>>
>> You can see the effects of makepredictcall() in the 'terms' component
>> of glm's output. The 'variables' attribute of it gives the original
>> function
>> calls and the 'predvars' attribute gives the calls to be used for
>> prediction:
>> > attr(g2$terms, "variables")
>> list(lot1, log(u), xpoly(u, 1))
>> > attr(g2$terms, "predvars")
>> list(lot1, log(u), xpoly(u, 1, coefs = list(alpha = 40, norm2 = c(1,
>> 9, 8850))))
>>
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Thu, Jul 16, 2015 at 12:35 PM, Kunshan Yin <yinkunshan at gmail.com>
>> wrote:
>>
>>> Hello, I have a question about the formula and the user defined function:
>>>
>>> I can do following:
>>> ###Case 1:
>>> > clotting <- data.frame(
>>> + u = c(5,10,15,20,30,40,60,80,100),
>>> + lot1 = c(118,58,42,35,27,25,21,19,18),
>>> + lot2 = c(69,35,26,21,18,16,13,12,12))
>>> > g1=glm(lot1 ~ log(u) + poly(u,1), data = clotting, family = Gamma)
>>> > dc=clotting
>>> > dc$u=1
>>> > predict(g1,dc)
>>> 1 2 3 4 5
>>> 6 7 8 9
>>> -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929 -0.01398929
>>> -0.01398929 -0.01398929 -0.01398929
>>>
>>> However, if I just simply wrap the poly as a user defined function ( in
>>> reality I would have my own more complex function) then I will get
>>> error:
>>> ###Case 2:
>>> > xpoly<-function(x,degree=1){poly(x,degree)}
>>> > g2=glm(lot1 ~ log(u) + xpoly(u,1), data = clotting, family = Gamma)
>>> > predict(g2,dc)
>>> Error in poly(x, degree) :
>>> 'degree' must be less than number of unique points
>>>
>>> It seems that the predict always treat the user defined function in the
>>> formula with I(). My question is how can I get the results for Case2
>>> same
>>> as case1?
>>>
>>> Anyone can have any idea about this?
>>>
>>> Thank you very much.
>>>
>>> Alex
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>
[[alternative HTML version deleted]]
More information about the R-help
mailing list