[R-sig-ME] PROC MIXED RANDOM equivalence in R nlme
Dennis F. Kahlbaum
kbomb at umich.edu
Fri Aug 11 23:03:11 CEST 2017
I am trying to reproduce some old SAS PROC MIXED code using R and nlme.
As background, the data consist of 300+ vehicle emission test results
(grams/mile of THC over a given driving cycle) versus the gasoline fuel
properties, which include: Read Vapor Pressure (rv), Temperature for 50%
Evaporation (t5), Temperature for 90% Evaporation (t9), Aromatics (ar),
Olefins, (ol), Sulfur (su), Oxygenate (ox), Benzene (bz), and a code
identifying the study and vehicle (study_vehicle). All variables are
real numbers except "study_vehicle", which is character. Unfortunately,
since the data are confidential, I'm unable to provide an example.
The goal is to determine the relationship between the emissions and the
fuel properties.
The original SAS v6.12 code is provided below:
------------------------------------------------------------------
/* READ DATA */
DATA emiss;
INFILE 'data.tab' LRECL=8000 FIRSTOBS=2 DLM='09'X MISSOVER DSD;
INPUT study_vehicle $ thc rv t5 t9 ar ol ox su bz;
/* CREATE NEW VARIABLE */
ln_thc = log (thc);
/* PERFORM ANALYSIS RELATING LOG EMISSIONS TO FUEL PROPERTIES */
PROC MIXED DATA=emiss MAXITER=1000 CONVH=1E-8 METHOD=REML NOCLPRINT
NOITPRINT;
CLASS study_vehicle;
MODEL ln_thc = rv t5 t9 ar ol ox su bz
/S DDFM=RES;
RANDOM int rv t5 t9 ar ol ox su bz
/SUB=study_vehicle;
RUN;
------------------------------------------------------------------
The R code I've devised for reading/processing the data and to replicate
the PROC MIXED statement is shown below:
------------------------------------------------------------------
# LOAD REQUIRED LIBRARY
library (“lmne”)
# FORCE R TO USE THE SAME "CONTRAST" METHOD AS SAS
options(contrasts = c(factor = "contr.SAS", ordered = "contr.poly"))
# READ DATA (study_vehicle, thc, rv, t5, t9, ar, ol, ox, su, bz)
emiss <- read.table("data.csv", header=TRUE, sep=",") # NA for missing
data (default)
# CREATE NEW VARIABLE
ln_thc <- log(thc)
# PERFORM ANALYSIS RELATING LOG EMISSIONS TO FUEL PROPERTIES
FitTHC <- lme(ln_thc ~ rv + t5 + t9 + ar + ol + ox + su + bz,
data = emiss,
random = ??????? | study_vehicle )
# OUTPUT THE RESULTS
summary(FitTHC)
------------------------------------------------------------------
A review of the dataframe "emiss" shows that all variables, except
"study_vehicle", are of type "num". As expected, the character variable
"study_vehicle" is of type "factor". These properties match those from SAS.
As indicated by the questions marks, the problem I'm having is in
constructing the equivalent R random code for the SAS RANDOM and any
remaining settings (e.g. "/S DDFM=RES"). I've tried
random = ~1 + rv + t5 + t9 + ar + ol + ox + su + bz | study_vehicle)
but R gets caught in a processing loop that produces no errors or
warnings, and no results, even after 15+ minutes of execution time. As I
recall, the SAS code took less than 15 seconds back in 1999.
I have also tried
random = ~1 | study_vehicle)
and R immediately produces output. However, although the regression
coefficients are close to those from SAS, the degrees of freedom from
SAS are higher than those from R.
Therefore, what is the equivalent R random code for the original SAS RANDOM?
Thanks!
More information about the R-sig-mixed-models
mailing list