[R] Odp: Fw: R function for Gage R&R

Mon Aug 8 10:15:15 CEST 2011

Hi Elaine

I do not use it very often. I programmed it to mimic Minitab functions 
(partly) with some adons from czech statistics textbook written by 
M.Meloun (meloun  militky statistics - first hit in google)

Basically you can have your data in some data frame or they can be as 
separated vectors. The function itself expects input of 3 vectors, but you 
can easily to modify it for imput as three column data frame.

Here are some explanations of terms used. I post it also to Rlist for 
future reference.

operator is a person performing the task (lab worker, ...) - expected a 
factor, but is ok if numeric or character vector
vzorek is sample analysed - expected a factor, but is ok if numeric or 
character vector
vysledek is a result of analysis - expected numeric vector
sigma is some coefficient
DM is lower limit, HM is upper limit, toler is range either computed from 
DM and HM or set by you - addon from mentioned textbook
plotit is switch for enabling or disabling plot function - expected 
logical value
dig is number of digits for rounding result

The function is based on anova

Here is an example

 test<-expand.grid(operator=letters[1:4],vzorek=LETTERS[5:8]) 
 test<-sapply(test,rep,4)
 test<-as.data.frame(test)
set.seed(1)
test$result<-runif(64)
test$result
 [1] 0.26550866 0.37212390 0.57285336 0.90820779 0.20168193 0.89838968
 [7] 0.94467527 0.66079779 0.62911404 0.06178627 0.20597457 0.17655675
...

ORanal(test$lab.work, test$sample, test$result, dig=4)

$tabulka
          rozptyl smer.odch 5.15 sm.odch proc.rozpt. proc.sm.odch
Vzorky     0.0033    0.0573       0.2949      4.5105      21.2378
Oper       0.0005    0.0221       0.1137      0.6703       8.1872
Interakce  0.0031    0.0560       0.2886      4.3194      20.7832
Reprod     0.0027    0.0515       0.2653      3.6491      19.1026
Opak       0.0721    0.2685       1.3826     99.1386      99.5684
O&R        0.0694    0.2635       1.3569     95.4895      97.7188
Suma       0.0727    0.2696       1.3886    100.0000     100.0000

$vartab
                Df Sum Sq  Mean Sq F value Pr(>F)
vzorek           3 0.3359 0.111980  1.5537 0.2128
operator         3 0.2019 0.067311  0.9339 0.4316
vzorek:operator  9 0.5356 0.059514  0.8257 0.5957
Residuals       48 3.4596 0.072075 

$kategorie
[1] 1

$rozlisitelnost
[1] 0.516411

Here is:
in $tabulka table like in Minitab output
rozptyl = variance
smer.odch = standard deviation
5.15 smer.odch = standard deviation * 5.15
proc.rozpt = variance percentage
proc.sm.odch = standard deviation percentage

Vzorky = Samples
Oper = Operators
Interakce = Interaction
Reprod = Reproducibilty
Opak = Repeatability
O&R = sum of Reproducibility and Repeatability
Suma = Total

in $vartab
anova table (output from aov call)

in $kategorie
no of categories which can be estimated safely with method used for 
analysis

in $rozlisitelnost
some value from Meloun/Militky book

Basically the function takes input of 3 vectors of equal length(operator, 
sample, result) makes analysis of variance and takes third column from 
anova table

fit<-aov(result~lab.work*sample, data=test)

anova(fit)
Analysis of Variance Table

Response: result
                Df Sum Sq  Mean Sq F value Pr(>F)
lab.work         3 0.2019 0.067311  0.9339 0.4316
sample           3 0.3359 0.111980  1.5537 0.2128
lab.work:sample  9 0.5356 0.059514  0.8257 0.5957
Residuals       48 3.4596 0.072075 

anova(fit)[,3]
[1] 0.06731072 0.11197975 0.05951360 0.07207455

and do all required computation which is than put into result.

Do not ask me why it is computed like that  I made it about 10 years ago. 
I would need to go to textbook(s), find appropriate chapter and study it 
to be able to say why it is computed like that. If you have some older 
result from any textbook or package together with data, you can easily 
check if my function gives the same results.

Feel free to modify it for your purpose. Especially plotting is not very 
sophisticated so you can modify it to give you proper labeling. Regarding 
not many posts about GR&R analysis you need to ask in R help. Probably R 
is most often used for research or finance and for those areas such type 
of specialised analysis is not of much use. You need to have structured 
data, like in above example, which is quite often not the case in those 
areas.

If you needed some more explanation feel free to ask me.

Regards
Petr

----------------------------------------------------------

ORanal<-function(operator,vzorek,vysledek, sigma=5.15, DM=NULL ,HM=NULL, 
toler=NULL, plotit=F, dig=4)
{

if (!is.factor(vzorek)) vzorek<-factor(vzorek)
if (!is.factor(operator)) operator<-factor(operator)
if (is.null(toler)) toler<-abs(HM-DM)

fit<-aov(vysledek~vzorek*operator)

volop<-nlevels(operator[,drop=T])
volvzor<-nlevels(vzorek[,drop=T])
opak<-length(operator)/(volop*volvzor)
ctver<-anova(fit)[,3]

        result=NULL
        result[3]<-(ctver[3]-ctver[4])/opak
        result[5]<-ctver[4]
        result[1]<-(ctver[1]-ctver[4]-result[3]*opak)/(volop*opak)
        result[2]<-(ctver[2]-ctver[4]-result[3]*opak)/(volvzor*opak)

                if (result[2]<0) result[2]<-0

        result[4]<-result[2]+result[3]
        result[6]<-result[4]+result[5]
        result[7]<-result[1]+result[6]

result<-abs(result)
proc<-result/result[7]*100
smodch<-sqrt(result)
smodch5<-smodch*sigma
proc.smodch<-smodch/smodch[7]*100
proc.toler<-smodch5/toler*100

result<-cbind(result,smodch,smodch5,proc,proc.smodch,proc.toler)

result<-data.frame(result,row.names=c("Vzorky","Oper","Interakce","Reprod","Opak","O&R","Suma"))

meno<-c("rozptyl","smer.odch", "5.15 sm.odch" ,"proc.rozpt." 
,"proc.sm.odch", "toler.sm.odch")

velik<-dim(result)[2]

names(result)<-meno[1:velik]

suma<-summary(fit)

kateg<-trunc(smodch[1]/smodch[6]*1.41)+1

if (plotit) 

{

par(mfrow=c(3,2))
posice<-volvzor*1:volop
mat<-aggregate(vysledek,list(vzorek,operator),mean,na.rm=T)
mat1<-aggregate(vysledek,list(vzorek,operator),sd,na.rm=T)

#fig1

hmdm<-c(mean(mat$x)+(mean(mat1$x)/.9)/sqrt(opak)*3, 
mean(mat$x)-(mean(mat1$x)/.9)/sqrt(opak)*3)
        plot(mat$x,type="b",ylab="Prumery")
        abline(h=mean(mat$x),col=3)
        abline(h=hmdm,col=2)
        abline(v=posice+.5,col="grey",lwd=2,lty=2)
        text(posice-volvzor/2,max(mat$x),levels(operator[,drop=T]))

#fig2
# mat<-reshape(mat,direction="wide",idvar="Group.1",timevar="Group.2")
# 
matplot(as.matrix(mat[,2:(1+volop)]),type="b",pch=1:volop,col=1:volop,ylab="Prumer",axes=F)
#       axis(side=1,at=1:volvzor,labels=levels(vzorek[,drop=T]))
#       box()
#       axis(2)
# 
legend(1,min(mat[,2:(1+volop)]),levels(operator[,drop=T]),pch=1:volop,col=1:volop,cex=.8,horiz=T,yjust=0,xjust=0)

 mat<-reshape(mat,direction="wide",idvar="Group.1",timevar="Group.2")

matplot(as.matrix(mat[,2:(1+volop)]),type="b",pch=1:volop,col=1:volop,ylab="Prumer",axes=F)
        axis(side=1,at=1:volvzor,labels=levels(vzorek[,drop=T]))
        box()
        axis(2)

legend(1,min(mat[,2:(1+volop)]),levels(operator[,drop=T]),pch=1:volop,col=1:volop,cex=.8,horiz=T,yjust=0,xjust=0)

#fig3
hmdm<-c(mean(mat1$x)*sqrt(qchisq(.99865,opak,lower.tail=F)/(opak-1)), 
mean(mat1$x)*sqrt(qchisq(.99865,opak)/(opak-1)))

        plot(mat1$x,type="b",ylab="Smerod. odch")
        abline(h=mean(mat1$x),col=3)
        abline(h=hmdm,col=2)
        abline(v=posice+.5,col="grey",lwd=2,lty=2)
        text(posice-volvzor/2,max(mat1$x),levels(operator[,drop=T]))

#fig4
        boxplot(split(vysledek,operator),notch=T)
#fig5
        barplot(t(as.matrix(result[c(1,4,5,6),4:velik])),beside=T)
#fig6
        boxplot(split(vysledek,vzorek),notch=T)

par(mfrow=c(1,1))

mtext(paste("R+R analýza",  deparse(substitute(vysledek))), side=3, 
line=2, cex=1.5)

}
rozlis<-smodch[6]*qnorm(0.975)
list(tabulka=round(result,dig),vartab=suma,kategorie=kateg,rozlisitelnost=rozlis)

}

#--------------------------------------------------------------------------------------------------------

Elaine Jones <jones2 at us.ibm.com> napsal dne 05.08.2011 21:41:10:

> 
> Hi Petr,
> 
> I found this post on the R-help website:
> http://www.mail-archive.com/r-help@r-project.org/msg00447.html
> 
> A coworker just asked me to help him with some GR&R studies.  Could you
> tell me  about how your function works? How is the input data 
structured?
> What is vysledek? (the response variable?)
> 
> I was surprised to see very few posts about GR&R on the R-help website, 
and
> packages.  Any help you can offer is appreciated.
> 
>       [R] Odp: GR&R - Best methods in R
> 
> 
>       Petr PIKAL
>       Mon, 17 Sep 2007 01:04:27 -0700
> 
> 
>       Hi
> 
>       below you can find a function I wrote to mimic a Minitab R+R
>       procedure. It
>       is intended for myself only so it is not commented. If you want to
>       know
>       how it functions contact me off-list.
> 
>       Regards
> 
>       Petr
> 
> 
> 
> Sincerely,
> **************** Elaine McGovern Jones ************************
> 
>        Phone:  408  705-9588  Internal tieline: 587-9588
>        jones2 at us.ibm.com
> 
> 
> 
> 
> 
>