[Rd] proposed simulate.glm method
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sat Feb 14 22:43:38 CET 2009
On Sat, 14 Feb 2009, Nicholas Lewin-Koh wrote:
> Well, my question wasn't that clear :-), but yes you mostly answered
> it. I guess the one case I would be concerned is in Heather's code,
> where the distribution to simulate from is chosen, that seemed to be
> hard coded.
Rather, all known glm families in R that correspond to actual
probability distributions are listed.
> So if I built a family object, say for a model that assumes errors
> from a zipf distribution,
Hmm, plese explain how you get that into the GLM framework -- it is
pretty restrictive.
> and I did have a predict method (which is a fair assumption) would that
> fail because the rzipf function would not be accessed?
glm has a predict method, so why do you need one? Families do not
create additional classes of fit objects.
We could extend the definition of a family to have a 'simulate'
element, but then existing user-contributed families (principally the
negative binomial) would not have one and so this would not solve the
problem .....
If you know of an actual R implementation of another glm family that
looks generally useful we'll consider adding it (but it seems that the
end user could also do so rather easily).
> Nicholas
> On Sat, 14 Feb 2009 20:10:26 +0100, "Martin Maechler"
> <maechler at stat.math.ethz.ch> said:
>>>>>>> "NicLK" == Nicholas Lewin-Koh <nikko at hailmail.net>
>>>>>>> on Sat, 14 Feb 2009 08:34:45 -0800 writes:
>>
>> NicLK> Hi, For extended glms such as gams, gnm or other
>> NicLK> distributions such as negative binomial, would there
>> NicLK> need to be a separate simulate method?
>>
>> Not necessarily, as I said, the "glm"s are now also dealt with
>> in simulate.lm() and Heather more or less confirmed that this
>> gives correct results for "gnm" objects.
>>
>> For gam(), I'd strongly expect the same to apply, but there
>> maybe sophisticated gam() models where the result is currently
>> not correct. That's, BTW, also true for
>> simulate(lm(...., weights), ...)
>>
>> NicLK> Or, could the current framework, rather than
>> NicLK> stopping with an error look for the appropriate model
>> NicLK> matrix, coefficients, distribution function and
>> NicLK> family object to simulate from?
>>
>> What do you mean?
>> A situation where there's no supported 'family'
>> or a situation where predict(<obj>) does not work as it's
>> supposed in the current framework,
>> or ????
>>
>> If there are such cases, we'd have to consider them together
>> with the corresponding package author. It may often make sense
>> fthen that the author changes his methods {predict(), ..} such
>> that the (now) extended simulate.lm() will work automatically;
>> Alternatively, the author can provide simulate.<myclass>().
>>
>> But I'm not sure I'm answering the question you've asked..
>> Martin
>>
>> NicLK> Nicholas
>>
>>
>> >> Message: 9 Date: Fri, 13 Feb 2009 21:27:57 +0100 From:
>> >> Martin Maechler <maechler at stat.math.ethz.ch> Subject: Re:
>> >> [Rd] proposed simulate.glm method To: Heather Turner
>> >> <Heather.Turner at warwick.ac.uk> Cc: r-devel at r-project.org,
>> >> Martin Maechler <maechler at stat.math.ethz.ch> Message-ID:
>> >> <18837.55245.15158.29378 at cmath-5.math.ethz.ch>
>> >> Content-Type: text/plain; charset=us-ascii
>> >>
>> >> Thank you, Heather and Ben,
>> >>
>> >>>>>>> "HT" == Heather Turner
>> >> <Heather.Turner at warwick.ac.uk> >>>>> on Fri, 13 Feb 2009
>> >> 15:52:37 +0000 writes:
>> >>
>> HT> Yes, thanks to Ben for getting the ball rolling. His
>> HT> code was more streamlined than mine, pointing to further
>> HT> simplifications which I've included in the extended
>> HT> version below.
>> >>
>> HT> The code for the additional families uses functions from
>> HT> MASS and SuppDists - I wasn't sure about the best way to
>> HT> do this, so have just used :: for now.
>> >>
>> HT> It appears to be working happily for both glm and gnm
>> HT> objects (no gnm-specific code used).
>> >>
>> HT> Best wishes,
>> >>
>> HT> Heather
>> >>
>> >> [....]
>> >>
>> >> I have now followed Brian Ripley's suggetion to just
>> >> extend simulate.lm() to also deal with "glm" objects, but
>> >> using Heather's suggestions for the different families;
>> >> I've just commited src/library/stats/R/lm.R with the new
>> >> code. (get it from svn.r-project.org/R/trunk/ or this
>> >> night's R-devel tarball).
>> >>
>> >> One difference to your propsal: Instead of just
>> >> object$fitted , the code is using fitted(object)
>> >> ... something which should properly to the na.action
>> >> used.
>> >>
>> >> For the (MASS and) SuppDists package requirement, I'm
>> >> using the following
>> >>
>> >> if(is.null(tryCatch(loadNamespace("SuppDists"), error =
>> >> function(e) NULL))) stop("Need CRAN package 'SuppDists'
>> >> for 'inverse.gaussian' family")
>> >>
>> >>
>> >> I've not yet updated the help page for simulate(), and
>> >> have only tested relatively few cases for binomial,
>> >> poisson and Gamma. I've wanted to expose this to you, so
>> >> you can provide more feedback and possibly even a patch
>> >> to
>> >> svn.r-project.org/R/trunk/src/library/stats/man/simulate.Rd
>> >>
>> >> Martin
>> >>
>> >>
>> >>
>>
>> ______________________________________________
>> NicLK> R-devel at r-project.org mailing list
>> NicLK> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-devel
mailing list