[R] Summary.Formula: prmsd and test statistic

David Winsemius dwinsemius at comcast.net
Sun May 15 16:03:29 CEST 2011


On May 14, 2011, at 11:23 AM, Eli Kamara wrote:

> Hello,
>
> I'm new to R, so apologies if this is a basic question, but
> after scouring the web for information on summary.formula, I am
> still searching for an answer.
>
> I made a function to analyze my data - I have a categorical variable  
> and three continuous variables. I am analyzing my continuous  
> variables on the basis of my categorical variables.
>
> radioanal <- function(a)
> {
>
> #Educational status first - pulling variables from my database.
> #categorical is 13 = Edu. numerical is 48=Kyph, 50=Vert, 53=HL.
> a1= a[,c(13,48,50,53)]
>
> #make sure they are in numeric form
> a2= transform(a1, Kyph=as.numeric(as.character(Kyph)),  
> Vert=as.numeric(as.character(Vert)), HL=as.numeric(as.character(HL)))
>
> #see boxplots of the individual variables
> boxplot(a2$Kyph~a2$Edu, main="Education vs Kyphosis angle",
>   xlab="Education", ylab="Kyphosis angle")
> boxplot(a2$Vert~a2$Edu, main="Education vs # of vertebrae affected",
>   xlab="Education", ylab="#of vertebrae affected")
> boxplot(a2$HL~a2$Edu, main="Education vs %HL",
>   xlab="Education", ylab="%HL")
>
> #see distribution of data
> d=summary.formula(a2$Edu~a2$Kyph+a2$HL+a2$Vert, method="reverse",  
> overall=T, continuous=5, add=TRUE, test=T)
>
I noticed that you were addressing the columns individually. That
rather defeats the strategy of passing a data argument to a function
and using only the column names in the formula. It often causes
strange errors in model calls, and I wouldn't be surprised if you got
better results with something like:

d=summary.formula( Edu~ Kyph+ HL+ Vert, data=a2, method="reverse",  
overall=T, continuous=5, add=TRUE, test=T)
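
To see what changes between the two styles, here is a small base-R sketch.
The data frame 'toy' and its values are made up for illustration; only the
column names are borrowed from your post. It compares the term labels and
model-frame names that each formula style produces.

```r
# Hypothetical toy data; the values are invented for illustration only.
toy <- data.frame(
  Edu  = factor(c("a", "a", "b", "b", "c", "c")),
  Kyph = c(30, 35, 40, 42, 50, 55),
  HL   = c(10, 12, 20, 22, 30, 31)
)

# Style 1: columns addressed individually with `$`.
f_dollar <- toy$Edu ~ toy$Kyph + toy$HL

# Style 2: bare column names, with the data frame passed separately.
f_data <- Edu ~ Kyph + HL

# The term labels carry the "toy$" prefix in the first style.
labels_dollar <- attr(terms(f_dollar), "term.labels")
labels_data   <- attr(terms(f_data), "term.labels")

# The model frames end up with different column names as well.
mf_dollar <- model.frame(f_dollar)
mf_data   <- model.frame(f_data, data = toy)

print(labels_dollar)
print(labels_data)
print(names(mf_dollar))
print(names(mf_data))
```

Any code that later looks variables up by name (labels in a printed table,
for instance) sees "toy$Kyph" rather than "Kyph" in the first style, which
is one plausible source of odd behaviour.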

-- 
David
> #perform MANOVA
> a3=manova(cbind(Kyph, Vert, HL)~as.factor(Edu), data=a2)
>
> #return results
> a4=list("Results of Educational Status MANOVA",
> print(d),
> summary(a3, test="Hotelling-Lawley"),
> summary(a3, test="Roy") ,
> summary(a3, test="Pillai"),
> summary(a3, test="Wilks"),
> summary.aov(a3)
> )
>
> print(a4)	
>
> }
>
> This function works as is, but I want to add the mean and standard  
> deviation to my table. When I add the following code to line 36  
> where I print "d"
> print(d, prmsd=TRUE)
>
> The numbers in my table disappear. When I use the same commands from  
> the command line, the same thing happens. After reading the manual,  
> I think the error might be due to the missing numbers in my  
> database, so I tried adding na.action to my set of commands:
>
> print(summary.formula(a2$Edu~a2$Kyph+a2$HL+a2$Vert, na.action,  
> method="reverse", overall=T, continuous=5, add=TRUE, test=T),  
> prmsd=TRUE)
>
> but then I get the following error:
> Error in as.data.frame.default(data, optional = TRUE) :
>  cannot coerce class '"function"' into a data.frame

It may be trying to do something with 'data' and doesn't find a 'data'
object until it gets to the 'data' function.
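
Another possible reading of that error is positional matching: an unnamed
second argument goes to whatever parameter comes second, and 'na.action'
on its own is a function in the stats package. The sketch below uses a
hypothetical stand-in, 'toy_summary', and assumes (as is common for
modelling functions, but check the help page) that the real function takes
'formula' first and 'data' second.

```r
# Hypothetical stand-in, NOT the real summary.formula: it only mimics an
# argument list that starts with formula, then data.
toy_summary <- function(formula, data = NULL, na.action = NULL, ...) {
  class(data)  # report what ended up in 'data'
}

# Unnamed second argument: matched to 'data' by position, so 'data'
# becomes the function stats::na.action.
unnamed <- toy_summary(y ~ x, na.action, method = "reverse")

# Named argument: goes to the 'na.action' parameter and 'data' stays NULL.
named <- toy_summary(y ~ x, na.action = na.omit, method = "reverse")

print(unnamed)
print(named)
```

If that is what happened, naming the argument and giving it a value
(na.action = <some function>) should at least get past the coercion error.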

>
> Any ideas?
>
>
> Also, does anyone know what kind of test statistic this function  
> calculates?

Huh. You do realize this function in the Hmisc package has a help page,
right?


> I compared the F and p values to a manual ANOVA but they were  
> different.
>
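
One possible reason for the mismatch, offered as an assumption to verify
on the help page: as far as I recall, the default test for continuous
variables with method="reverse" is rank-based (Wilcoxon or
Kruskal-Wallis) rather than a parametric ANOVA. The base-R sketch below,
on made-up data, only shows that the two kinds of test return different
statistics for the same variable; it does not reproduce the function's
own calculation.

```r
# Simulated data for illustration only; the group means are invented.
set.seed(1)
toy <- data.frame(
  Edu  = factor(rep(c("low", "mid", "high"), each = 10)),
  Kyph = c(rnorm(10, 30, 8), rnorm(10, 35, 8), rnorm(10, 40, 8))
)

# Parametric one-way ANOVA: F statistic on the raw values.
anova_fit <- anova(lm(Kyph ~ Edu, data = toy))
f_stat <- anova_fit[["F value"]][1]

# Rank-based Kruskal-Wallis test: chi-squared statistic on the ranks.
kw_fit  <- kruskal.test(Kyph ~ Edu, data = toy)
kw_stat <- unname(kw_fit$statistic)

print(f_stat)
print(kw_stat)
```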

I think you should break further questions down into components and post
something that is reproducible.
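
A minimal sketch of what that could look like: a small simulated data
frame that others can recreate by pasting. The column names come from
your post; the values, group labels and positions of the missing entries
are invented.

```r
# Simulated stand-in for the real database; all values are hypothetical.
set.seed(42)
toy <- data.frame(
  Edu  = factor(sample(c("none", "primary", "secondary"), 20, replace = TRUE)),
  Kyph = round(rnorm(20, mean = 35, sd = 10)),
  Vert = sample(1:5, 20, replace = TRUE),
  HL   = round(runif(20, min = 0, max = 60))
)

# Mimic the missing values mentioned in the question.
toy$Kyph[c(3, 11)] <- NA

# dput() prints a text version of the object that can be pasted into an
# email and re-read with no access to the original file.
dput(head(toy))
```

Follow that with the shortest call that still shows the problem and the
output of sessionInfo().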
####------------------------------------------------------------####
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

####------------------------------------------------------------####
>

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
