[R-sig-ME] lme capable of running with missing data?

Kenneth Frost kfrost at wisc.edu
Tue Feb 7 01:02:49 CET 2012


Doug-

Thanks for the explanation. I think I can understand what is happening in the column-pivoted QR decomposition you are describing.  I'm not sure if I understand how the sweep operator works (although I'm not too worried about it at the moment).

Ken
 

On 02/06/12, Douglas Bates   wrote:
> On Fri, Feb 3, 2012 at 8:20 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
> > On 04/02/12 14:45, Kenneth Frost wrote:
> >>
> >> On 02/03/12, Charles Determan Jr   wrote:
> >>>
> >>> Kevin,
> >>>
> >>> I understand that but then how is SAS accomplishing the interactions?
> >>
> >>
> >> I have been following this conversation a little bit and this seems to be
> >> the right question to ask. I would also like to know the answer. However,
> >> this could be the wrong venue to get an answer to this question.
> >
> > <SNIP>
> >
> > It may be the case that fortune(203) is relevant here! :-)
> 
> Mathematical impossibility, no (fortune(203) refers to obtaining
> negative estimates of variance components, IIRC).  The problem here is
> determining a full-rank model matrix for a model with interactions and
> missing cells.  Because SAS uses the sweep operator in solving least
> squares problems it does not encounter problems with rank deficiency.
> (I am sorely tempted to make remarks about "sweeping them under the
> carpet".)  In fact, SAS expects to handle rank deficiencies because it
> generates a redundant set of indicators for each factor variable then
> prunes them on the fly.
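
A minimal sketch of the sweep operator mentioned above, for readers who have not met it. It is written in plain Python with exact fractions so that it is self-contained; it is not SAS's code. The function names, the toy data, and the exact zero-pivot test (a real implementation would need a tolerance) are all assumptions made for illustration. Sweeping the cross-product matrix of [X | y] on each predictor leaves the coefficients in the last column and the residual sum of squares in the bottom-right corner. A redundant column shows up as a zero pivot and is simply skipped.

```python
from fractions import Fraction

def sweep(a, k):
    """Sweep the square matrix `a` (a list of lists) on pivot k, in place."""
    d = a[k][k]
    a[k] = [x / d for x in a[k]]
    for i in range(len(a)):
        if i != k:
            b = a[i][k]
            a[i] = [x - b * y for x, y in zip(a[i], a[k])]
            a[i][k] = -b / d
    a[k][k] = 1 / d

def least_squares_by_sweeping(X, y):
    """Return (coefficients, residual sum of squares, skipped column indices)."""
    p = len(X[0])
    Z = [[Fraction(v) for v in row] + [Fraction(t)] for row, t in zip(X, y)]
    # Cross products of [X | y]: holds X'X, X'y and y'y in one (p+1) x (p+1) array.
    a = [[sum(r[i] * r[j] for r in Z) for j in range(p + 1)] for i in range(p + 1)]
    skipped = []
    for k in range(p):
        if a[k][k] == 0:
            # Zero pivot: column k is a linear combination of columns already swept.
            skipped.append(k)
            continue
        sweep(a, k)
    beta = [Fraction(0) if k in skipped else a[k][p] for k in range(p)]
    return beta, a[p][p], skipped

# Made-up data: the third column is twice the second, so it is redundant.
X = [[1, 0, 0], [1, 1, 2], [1, 2, 4], [1, 3, 6]]
y = [1, 3, 5, 7]
beta, rss, skipped = least_squares_by_sweeping(X, y)
print(beta, rss, skipped)
```

The redundant column gets a coefficient of zero and the fit proceeds with the remaining columns, which is the sense in which rank deficiency never stops the calculation.
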
> 
> The approach in R is to generate a model matrix that should be of
> full-rank except in circumstances like this and to check for rank
> deficiency.  There is special code in the version of the QR
> decomposition used with R to detect rank deficiency and pivot the
> offending columns out but keep the others in their original order.
> 
> Dirk Eddelbuettel and I explored several approaches to handling such
> rank deficiency in the vignette accompanying the RcppEigen package
> (http://cran.us.r-project.org/web/packages/RcppEigen/vignettes/RcppEigen-intro-nojss.pdf).
>  The development version of lme4 (called lme4Eigen on the R-forge
> project site) detects rank deficiency earlier in the calculation but
> does not yet repair the rank deficiency.  Using the column-pivoted QR
> decomposition is probably the best approach but even then it would be
> necessary to find the columns that are linearly dependent on columns to
> their left then drop only those columns.  It is not impossible by any
> means, it just requires some work and is not high on the priority list
> right now.
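
A rough sketch of the "find the columns that depend on columns to their left" step described above. It uses a hypothetical 2 x 3 design with one empty cell. It is plain Python with exact fractions, not the pivoted QR code in R or lme4; the helper names and the treatment-contrast coding are assumptions made for illustration.

```python
from fractions import Fraction

def model_matrix(cells, a_levels, b_levels):
    """Treatment-contrast columns for A * B, one row per observed cell."""
    names = (["(Intercept)"]
             + [f"A{a}" for a in a_levels[1:]]
             + [f"B{b}" for b in b_levels[1:]]
             + [f"A{a}:B{b}" for b in b_levels[1:] for a in a_levels[1:]])
    rows = []
    for a, b in cells:
        main_a = [int(a == lev) for lev in a_levels[1:]]
        main_b = [int(b == lev) for lev in b_levels[1:]]
        inter = [x * y for y in main_b for x in main_a]
        rows.append([1] + main_a + main_b + inter)
    return names, rows

def dependent_columns(rows):
    """Indices of columns that are linear combinations of columns to their left."""
    basis = []    # (pivot position, reduced column) for each column kept so far
    dropped = []
    for j in range(len(rows[0])):
        v = [Fraction(r[j]) for r in rows]
        for p, b in basis:
            if v[p] != 0:
                f = v[p] / b[p]
                v = [x - f * y for x, y in zip(v, b)]
        pivot = next((i for i, x in enumerate(v) if x != 0), None)
        if pivot is None:
            dropped.append(j)    # nothing left after removing the earlier columns
        else:
            basis.append((pivot, v))
    return dropped

a_levels = ["a1", "a2"]
b_levels = ["b1", "b2", "b3"]
# The (a2, b3) cell has no observations.
cells = [(a, b) for b in b_levels for a in a_levels if (a, b) != ("a2", "b3")]
names, rows = model_matrix(cells, a_levels, b_levels)
aliased = [names[j] for j in dependent_columns(rows)]
print(aliased)
```

Only the interaction column for the empty cell is flagged, and the remaining columns keep their original order, which is the behaviour described for the pivoting scheme.
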
> 
> Regarding type III tests, I have forgotten which ones they are.  Are
> they the sequential sums of squares or the ones where you drop the
> main effect but keep the interactions thereby rendering your null
> model nonsensical in most cases?
> All the silliness about Types I, II, III and IV sums of squares and
> tests was formulated when fitting any model was difficult (see
> fortune("JCL")).  So doing a hypothesis test by fitting the null model
> and fitting the alternative model and comparing the results would take
> much much longer than doing a lot of linear algebra gymnastics on the
> fit of the full or alternative model.  That is no longer the case.  If
> you really want to perform a hypothesis test then formulate it in
> terms of models, fit them and compare them.  It's not difficult and
> has the undeniable advantage of forcing you to think about the model
> and whether it makes sense.  Read Bill Venables' famous unpublished
> paper "Exegeses on Linear Models" (just put the name in a search
> engine).  (By the way, Bill is going to be at the useR conference in
> Nashville in July so maybe if a bunch of us ganged up on him he could
> be convinced to submit a version of that paper for publication.)
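
To make "formulate it in terms of models, fit them and compare them" concrete, here is a small sketch under stated assumptions. The data are invented, the function names are hypothetical, and the fit is done with exact Gram-Schmidt in plain Python so that the block is self-contained. In R one would fit the two models and compare them with anova(). The null model is additive (A + B), the alternative adds the interaction (A * B), and the comparison is the usual F ratio of the drop in residual sum of squares to the residual mean square of the larger model.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit(X, y):
    """Least-squares fit via Gram-Schmidt; returns (residual SS, rank of X)."""
    basis = []
    for j in range(len(X[0])):
        v = [Fraction(row[j]) for row in X]
        for q in basis:
            c = dot(v, q) / dot(q, q)
            v = [a - c * b for a, b in zip(v, q)]
        if any(v):    # keep only columns that add a new direction
            basis.append(v)
    r = [Fraction(t) for t in y]
    for q in basis:
        c = dot(r, q) / dot(q, q)
        r = [a - c * b for a, b in zip(r, q)]
    return dot(r, r), len(basis)

# Invented balanced 2 x 2 data, two observations per cell: (A, B, response).
data = [(0, 0, 10), (0, 0, 12), (1, 0, 14), (1, 0, 16),
        (0, 1, 11), (0, 1, 13), (1, 1, 21), (1, 1, 23)]
y = [t for _, _, t in data]
X_null = [[1, a, b] for a, b, _ in data]          # additive model: A + B
X_alt = [[1, a, b, a * b] for a, b, _ in data]    # with interaction: A * B

rss_null, rank_null = fit(X_null, y)
rss_alt, rank_alt = fit(X_alt, y)
df_null, df_alt = len(y) - rank_null, len(y) - rank_alt
f_stat = ((rss_null - rss_alt) / (df_null - df_alt)) / (rss_alt / df_alt)
print(rss_null, rss_alt, f_stat)
```

Turning the F ratio into a p-value needs the F distribution, which is left out here. The point is only that the test is defined by the two models being compared, both of which have to make sense on their own.
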


