[R] Using VGAM's vglm function for ordinal logistic regression
Inman, Brant A. M.D.
Inman.Brant at mayo.edu
Sat Jan 6 23:56:43 CET 2007
R-Experts:
I am using the vglm function of the VGAM library to perform proportional
odds ordinal logistic regression. The issue that I would like help with
concerns the format in which the response variable must be provided for
this function to work correctly. Consider the following example:
------
library(VGAM)
library(MASS)
attach(pneumo)
pneumo # Inspect the format of the original dataset
# Restructure the pneumo dataset into a different format
pneumo2 <- data.frame(matrix(ncol=3, nrow=24))
colnames(pneumo2) <- c('exposure.time', 'severity', 'freq')
pneumo2[,1] <- rep(pneumo[,1],3)
pneumo2[,2] <-
as.ordered(c(rep('normal',8),rep('mild',8),rep('severe',8)))
pneumo2[1:8,3] <- pneumo[,2]
pneumo2[9:16,3] <- pneumo[,3]
pneumo2[17:24,3] <- pneumo[,4]
pneumo2 # Inspect the format of the new modified dataset
------
The problem occurs when I try to analyze these two datasets, which are
identical in content, with the proportional odds model using vglm:
------
# Analyze the original dataset with vglm, get one set of results
> vglm(vglm(cbind(normal, mild, severe) ~ log(exposure.time),
data=pneumo,
+ family=cumulative(parallel=T))
Coefficients:
(Intercept):1 (Intercept):2 log(exposure.time)
9.676093 10.581725 -2.596807
Degrees of Freedom: 16 Total; 13 Residual
Residual Deviance: 5.026826
Log-likelihood: -204.2742
# Analyzing the modified dataset with vglm gives another set of results
> vglm(severity ~ log(exposure.time), weights=freq, data=pneumo2,
+ family=cumulative(parallel=T))
Coefficients:
(Intercept):1 (Intercept):2 log(exposure.time)
-1.6306499 2.5630170 -0.1937956
Degrees of Freedom: 48 Total; 45 Residual
Residual Deviance: 503.7251
Log-likelihood: -251.8626
Warning messages:
1: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control$wzepsilon)
2: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control$wzepsilon)
3: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control$wzepsilon)
4: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control$wzepsilon)
5: 4 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace =
trace, wzeps = control$wzepsilon)
# Analyzing the modified dataset with polr reproduces these second
results
> polr(severity ~ log(exposure.time), weights=freq, data=pneumo2)
Coefficients:
log(exposure.time)
0.1938154
Intercepts:
mild|normal normal|severe
-1.630612 2.563055
Residual Deviance: 503.7251
AIC: 509.7251
------
Can someone explain what I am doing wrong when using vglm and polr with
the modified dataset? I do not understand why these two formulations
should give different results.
Brant Inman
More information about the R-help
mailing list