[R-sig-ME] Combination of glmnet-like covariate selection with mixed modeling

Hunsicker, Lawrence lawrence-hunsicker at uiowa.edu
Sat Jun 18 00:40:24 CEST 2016

Greetings, listserv members:

I am involved in an analysis of factors that predict whether a person who dies in a hospital becomes a transplant organ donor.  To do this analysis, with the help of the NCHS we have linked the list of all organ donors over a seven-year period with information on all US deaths over this period obtained from death certificates.  As you might imagine, this is a rather "big data" analysis, with nearly 40,000 donors among about 2,500,000 deaths.

There are also a very large number of ICD-9 codes (and other information) listed in the death certificates.  We anticipate that we will need to reduce the dimensionality of the problem for it to become practical, let alone intelligible, and we are planning to use the grpreg package in R to do a two-level selection of the most relevant covariates.  But our data also have a nested structure in terms of the US geographical areas of interest -- US counties within the designated service areas of the organ procurement organizations (OPOs).  I am not aware of a package that deals simultaneously with covariate selection (a la glmnet or similar packages) and mixed modeling.  I am addressing this e-mail to you all as experts in mixed models.

I have read that in fitting a mixed model, one first fits the fixed effects, and then looks for additional explanatory structure among the random effects.  This has suggested to me that one could approach the above problem in two steps: first reducing the dimensionality of the problem and deriving coefficients from the glmnet-type analysis, and then doing a mixed model analysis on the residuals from that fit.
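
To make the two-step idea concrete, here is a minimal sketch on toy data, written in plain Python so that it is self-contained. It is an illustration under simplifying assumptions, not the actual analysis: the response is continuous rather than binary, there is a single grouping level rather than counties nested within OPOs, and the variance components come from crude moment estimates rather than from a proper mixed-model fit. The function names (lasso_cd, shrunken_group_intercepts), the penalty value, and the data are all hypothetical.

```python
# Toy sketch of the two-step idea (all names and data are hypothetical):
#   step 1: lasso selection of fixed effects, ignoring the grouping
#   step 2: shrunken group intercepts estimated from the step-1 residuals

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            zj = 0.0   # sum of squares of feature j
            for i in range(n):
                pred_wo_j = b0 + sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_wo_j)
                zj += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (zj / n) if zj > 0 else 0.0
        # unpenalized intercept: mean of the current residuals
        b0 = sum(y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)) / n
    return b0, beta

def shrunken_group_intercepts(resid, groups):
    """Group mean residuals shrunk toward the grand mean (moment estimates)."""
    by_group = {}
    for r, g in zip(resid, groups):
        by_group.setdefault(g, []).append(r)
    n, n_groups = len(resid), len(by_group)
    grand = sum(resid) / n
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    # pooled within-group variance (residual noise)
    sigma2 = sum((r - means[g]) ** 2 for r, g in zip(resid, groups)) / (n - n_groups)
    # variance of the group means, less the part attributable to noise
    between = sum((m - grand) ** 2 for m in means.values()) / (n_groups - 1)
    tau2 = max(0.0, between - sigma2 / (n / n_groups))
    out = {}
    for g, v in by_group.items():
        denom = tau2 + sigma2 / len(v)
        shrink = tau2 / denom if denom > 0 else 0.0
        out[g] = shrink * (means[g] - grand)
    return out

# Toy data: x1 matters, x2 is pure noise, and three groups shift the response.
x1_pattern = [-1.5, -0.5, 0.5, 1.5]
x2_pattern = [1.0, -1.0, -1.0, 1.0]
true_effect = {"A": -1.0, "B": 0.0, "C": 1.0}
X, y, groups = [], [], []
for g, u in true_effect.items():
    for a, b in zip(x1_pattern, x2_pattern):
        X.append([a, b])
        y.append(2.0 * a + u)
        groups.append(g)

b0, beta = lasso_cd(X, y, lam=0.25)                      # step 1
resid = [y[i] - b0 - sum(X[i][k] * beta[k] for k in range(2)) for i in range(len(y))]
u_hat = shrunken_group_intercepts(resid, groups)         # step 2
print("coefficients:", beta)
print("group intercepts:", u_hat)
```

In this toy case the noise covariate is dropped and the group intercepts are recovered with some shrinkage. The coefficient on x1 is itself pulled below its true value of 2 by the penalty, and that bias carries into the residuals used in the second step, which is part of what I would like to understand better. With a binary outcome such as donor status, the residuals of the first step are also less straightforward to define than they are here.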

So the basic question is whether something along the above lines makes sense.  I would deeply appreciate any suggestions or pointers to relevant literature that I could use to understand all this better.

Many thanks in advance for your help.

Larry Hunsicker
L. G. Hunsicker, M.D., Professor (Emeritus) of Internal Medicine
U. Iowa College of Medicine
319-621-3576 (Voice)
lawrence-hunsicker at uiowa.edu

