[R] random effects model

arun smartpink111 at yahoo.com
Sat Jan 12 18:28:37 CET 2013


HI,

Regarding the script I sent to you earlier, it should be modified a bit according to your dataset.  For example, the variable "Sex" was in column 2.  Also, your data had some missing values for the "Sex" column, which I removed before running the code.  
With respect to the Errors that you showed here, you were using different package and I don't have a clue what you were trying to do. 

I applied the same code to your dataset.  The results are as follows:

BP_2b<-read.csv("BP_2b.csv",sep="\t")
nrow(BP_2b)
#[1] 6898
BP_2bSexNoMV<-BP_2b[!is.na(BP_2b$Sex),]
 nrow(BP_2bSexNoMV)
#[1] 6260
 unique(BP_2bSexNoMV$Sex)
#[1] 2 1
library(car)
BP_2bSexNoMV$Sex<-recode(BP_2bSexNoMV$Sex,"1='Male';2='Female'") #assuming that 1=Male and 2=Female
 unique(BP_2bSexNoMV$Sex)
#[1] "Female" "Male"  

#whole dataset
 unlist(lapply(lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100),function(x) paste(x,"%",sep="")))
 #            Female                Male 
#"72.6552179656539%" "74.4740099009901%" 
#splitting the above into components:

 lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) head(x,2))
#$Female
#  CODEA    Sex MaternalAge Education Birthplace AggScore IntScore Obese14
#2     3 Female           3         3          1        0        0       0
#3     4 Female           3         6          1       NA       NA      NA
#  Overweight14 Overweight21 Obese21 hibp14 hibp21
#2            0            1       0      0      0
#3           NA           NA      NA     NA     NA

#$Male
#  CODEA  Sex MaternalAge Education Birthplace AggScore IntScore Obese14
#6     9 Male           3         6          2        0        0       0
#8    11 Male           3         4          1        0        0      NA
 # Overweight14 Overweight21 Obese21 hibp14 hibp21
#6            0            0       0      0      0
#8           NA            0       0     NA      0



 lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x)nrow(x))
#$Female
#[1] 3028
#
#$Male
#[1] 3232
lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) nrow(x[!complete.cases(x[,-2]),])) #rows which have at least one NA
#$Female
#[1] 2200
#
#$Male
#[1] 2407

 lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) head(x[!complete.cases(x[,-2]),],2))
#$Female
#   CODEA    Sex MaternalAge Education Birthplace AggScore IntScore Obese14
#3      4 Female           3         6          1       NA       NA      NA
#10    13 Female           3         4          2       NA       NA      NA
 #  Overweight14 Overweight21 Obese21 hibp14 hibp21
#3            NA           NA      NA     NA     NA
#10           NA           NA      NA     NA     NA

#$Male
#  CODEA  Sex MaternalAge Education Birthplace AggScore IntScore Obese14
#8    11 Male           3         4          1        0        0      NA
#9    12 Male           3         4          2       NA       NA      NA
#  Overweight14 Overweight21 Obese21 hibp14 hibp21
#8           NA            0       0     NA      0
#9           NA           NA      NA     NA     NA

lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100) #gives the percentage of rows of missing #values from the overall rows for Males and Females 
#$Female
#[1] 72.65522
#
#$Male
#[1] 74.47401

#iF you want the percentage from the total number rows in Males and Females (without NA's in the the Sex column)
 lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100)
#$Female
#[1] 35.14377
#
#$Male
#[1] 38.45048
unlist(lapply(lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100),function(x) paste(x,"%",sep="")))#paste the "%" to the numbers
#             Female                Male 
#"35.1437699680511%" "38.4504792332268%" 






#If it is to find the percentage of missing values for each variable in females and males:
res<-do.call(rbind,lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) paste((colSums(is.na(x[,-2]))/nrow(BP_2bSexNoMV))*100,"%",sep=""))) #If #you want percentage of missing values per category, replace by nrow(x)
 colnames(res)<-colnames(BP_2bSexNoMV)[-2]
res
#       CODEA MaternalAge Education           Birthplace         
#Female "0%"  "0%"        "1.03833865814696%" "1.18210862619808%"
#Male   "0%"  "0%"        "1.10223642172524%" "1.3258785942492%" 
 #      AggScore            IntScore            Obese14            
#Female "15.2076677316294%" "15.2076677316294%" "24.0894568690096%"
#Male   "16.5015974440895%" "16.5015974440895%" "25.814696485623%" 
#       Overweight14        Overweight21        Obese21            
#Female "24.0894568690096%" "23.3865814696486%" "23.3865814696486%"
#Male   "25.814696485623%"  "29.1533546325879%" "29.1533546325879%"
#       hibp14              hibp21             
#Female "24.1693290734824%" "31.3418530351438%"
#Male   "25.8466453674121%" "35.223642172524%" 

A.K.







________________________________
From: Usha Gurunathan <usha.nathan at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Cc: R help <r-help at r-project.org> 
Sent: Saturday, January 12, 2013 1:42 AM
Subject: Re: [R] random effects model


 Hi 

 I do want  percentages of the categories inthe whole data set. But, I am a bit unclear about this syntax. Can you explain please. This is the error message I got with your script?

Error: could not find function "Copy.of.BP_2". Not sure what, because the data frame was already loaded.
Also

I was trying out package: vmv( after installing)

as data(df,package, package="vmv")

In data(Copy.of.BP_2c, package = "vmv") : data set ‘Copy.of.BP_2c’ not foundtablemissing(df, sort by="variable")

Error in tablemissing(Copy.of.BP_2, sortby = "Sex") :  object 'tabfinal' not found 

## Same problem with "vim" package.

## What mistake could I have done?

Thanks.




On Sat, Jan 12, 2013 at 3:11 PM, arun <smartpink111 at yahoo.com> wrote:

HI,
>
>If you want to find out the percentage of missing values in the whole dataset in females and males:
> set.seed(51)
> dat1<-data.frame(Gender=rep(c("M","F"),each=10),V1=sample(c(1:3,NA),20,replace=TRUE),V2=sample(c(21:24,NA),20,replace=TRUE))
> unlist(lapply(lapply(split(dat1,dat1$Gender),function(x) (nrow(x[!complete.cases(x[,-1]),])/nrow(x))*100),function(x) paste(x,"%",sep="")))
>#    F     M
>#"20%" "70%"
>
>#If it is to find the percentage of missing values for each variable in females and males:
> res<-do.call(rbind,lapply(split(dat1,dat1$Gender),function(x) paste((colSums(is.na(x[,-1]))/nrow(x))*100,"%",sep="")))
> colnames(res)<-colnames(dat1)[-1]
> res
>#  V1    V2  
>#F "0%"  "20%"
>#M "50%" "20%"
>
>A.K.
>
>
>
>
>
>----- Original Message -----
>
>From: rex2013 <usha.nathan at gmail.com>
>To: r-help at r-project.org
>Cc:
>
>Sent: Friday, January 11, 2013 2:16 AM
>Subject: Re: [R] random effects model
>
>
>Hi AK
>
>Regarding the missing values, I would like to find out the patterns of
>missing values in my data set. I know the overall values for each variable.
>
>using
>
>colSums(is.na(df))
>
>                      but what I wanted is  to find out the percentages
>with each level of the variable with my dataset, as in if there is more
>missing data in females or males etc?.
>
>I installed "mi" package, but unable to produce a plot with it( i would
>also like to produce a plot). I searched the responses in the relevant
>sections in r but could n't find an answer.
>
>Thanks,
>
>
>
>
>
>On Wed, Jan 9, 2013 at 12:31 PM, arun kirshna [via R] <
>ml-node+s789695n4654996h3 at n4.nabble.com> wrote:
>
>> HI,
>>
>> In your dataset, the "exchangeable" or "compound symmetry" may work as
>> there are only two levels for time.  In experimental data analysis
>> involving a factor time with more than 2 levels, randomization of
>> combination of levels of factors applied to the subject/plot etc. gets
>> affected as time is unidirectional.  I guess your data is observational,
>> and with two time levels, it may not hurt to use "CS" as option, though, it
>> would help if you check different options.
>>
>> In the link I sent previously, QIC was used.
>> http://stats.stackexchange.com/questions/577/is-there-any-reason-to-prefer-the-aic-or-bic-over-the-other
>>
>> I am not sure whether AIC/BIC is better than QIC or viceversa.
>>
>> You could sent email to the maintainer of geepack (Jun Yan <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=0>>).
>
>>
>> Regarding the reference links,
>> You can check this link "www.jstatsoft.org/v15/i02/paper"; .  Other
>> references are in the paper.
>> "
>> 4.3. Missing values (waves)
>> In case of missing values, the GEE estimates are consistent if the values
>> are missing com-
>> pletely at random (Rubin 1976). The geeglm function assumes by default
>> that observations
>> are equally separated in time. Therefore, one has to inform the function
>> about different sep-
>> arations if there are missing values and other correlation structures than
>> the independence or
>> exchangeable structures are used. The waves arguments takes an integer
>> vector that indicates
>> that two observations of the same cluster with the values of the vector of
>> k respectively l have
>> a correlation of rkl ."
>>
>> Hope it helps.
>> A.K.
>>
>>
>>
>>
>> ----- Original Message -----
>
>> From: rex2013 <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=1>>
>>
>> To: [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=2>
>
>> Cc:
>> Sent: Tuesday, January 8, 2013 5:29 PM
>> Subject: Re: [R] random effects model
>>
>> Hi
>>
>> Thanks a lot, the corstr "exchangeable"does work. Didn't strike to me
>> for so long. Does the AIC value come out with the gee output?
>>
>> By reference, I meant reference to a easy-read paper or web address
>> that can give me knowledge about implications of missing data.
>>
>> Ta.
>>
>> On 1/8/13, arun kirshna [via R]
>> <[hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=3>>
>
>> wrote:
>>
>> >
>> >
>> > HI,
>> > BP.stack5 is the one without missing values.
>> > na.omit(....).  Otherwise, I have to use the option na.action=.. in the
>> > ?geese() statement
>> >
>> > You need to read about the correlation structures.  IN unstructured
>> option,
>> > more number of parameters needs to be estimated,  In repeated measures
>> > design, when the underlying structure is not known, it would be better
>> to
>> > compare using different options (exchangeable is similar to compound
>> > symmetry) and select the one which provide the least value for AIC or
>> BIC.
>> > Have a look at
>> >
>> >
>> http://stats.stackexchange.com/questions/21771/how-to-perform-model-selection-in-gee-in-r
>> > It's not clear to me  "reference to write about missing values".
>> > A.K.
>> >
>> >
>> >
>> >
>> > ----- Original Message -----
>
>> > From: Usha Gurunathan <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=4>>
>>
>> > To: arun <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=5>>
>>
>
>> > Cc:
>> > Sent: Monday, January 7, 2013 6:12 PM
>> > Subject: Re: [R] random effects model
>> >
>> > Hi AK
>> >
>> > 2)I shall try putting exch. and check when I get home. Btw, what is
>> > BP.stack5? is it with missing values or only complete cases?
>> >
>> > I guess I am still not clear about the unstructured and exchangeable
>> > options, as in which one is better.
>> >
>> > 1)Rgding the summary(p): NA thing, I tried putting one of my gee
>> equation.
>> >
>> > Can you suggest me a reference to write about" missing values and the
>> > implications for my results"
>> >
>> > Thanks.
>> >
>> >
>> >
>> > On 1/8/13, arun <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=6>>
>
>> wrote:
>> >> HI,
>> >>
>> >> Just to add:
>> >>
>> fit3<-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr="exch",scale.fix=TRUE)
>>
>> >> #works
>> >>  summary(fit3)$mean["p"]
>> >> #                             p
>> >> #(Intercept)         0.00000000
>> >> #MaternalAge4        0.49099242
>> >> #MaternalAge5        0.04686295
>> >> #time21              0.86164351
>> >> #MaternalAge4:time21 0.59258221
>> >> #MaternalAge5:time21 0.79909832
>> >>
>> >>
>> fit4<-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr="unstructured",scale.fix=TRUE)
>>
>> >> #when the correlation structure is changed to "unstructured"
>> >> #Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>> >>  # contrasts can be applied only to factors with 2 or more levels
>> >> #In addition: Warning message:
>> >> #In is.na(rows) : is.na() applied to non-(list or vector) of type
>> 'NULL'
>> >>
>> >>
>> >> Though, it works with data(Ohio)
>> >>
>> >>
>> fit1<-geese(resp~age+smoke+age:smoke,id=id,data=ohio1,family=binomial,corstr="unstructured",scale.fix=TRUE)
>>
>> >>  summary(fit1)$mean["p"]
>> >> #                      p
>> >> #(Intercept)  0.00000000
>> >> #age-1        0.60555454
>> >> #age0         0.45322698
>> >> #age1         0.01187725
>> >> #smoke1       0.86262269
>> >> #age-1:smoke1 0.17239050
>> >> #age0:smoke1  0.32223942
>> >> #age1:smoke1  0.36686706
>> >>
>> >>
>> >>
>> >> By checking:
>> >>  with(BP.stack5,table(MaternalAge,time))
>> >> #           time
>> >> #MaternalAge   14   21
>> >>   #        3 1104  864
>> >>    #       4  875  667
>> >>     #     5   67   53 #less number of observations
>> >>
>> >>
>> >>  BP.stack6 <- BP.stack5[order(BP.stack5$CODEA, BP.stack5$time),]
>> >>  head(BP.stack6)  # very few IDs with  MaternalAge==5
>> >> #       X CODEA Sex MaternalAge Education Birthplace AggScore IntScore
>> >> #1493 3.1     3   2           3         3          1        0        0
>> >> #3202 3.2     3   2           3         3          1        0        0
>> >> #1306 7.1     7   2           4         6          1        0        0
>> >> #3064 7.2     7   2           4         6          1        0        0
>> >> #1    8.1     8   2           4         4          1        0        0
>> >> #2047 8.2     8   2           4         4          1        0        0
>> >>  #         Categ time Obese Overweight hibp
>> >> #1493 Overweight   14     0          0    0
>> >> #3202 Overweight   21     0          1    0
>> >> #1306      Obese   14     0          0    0
>> >> #3064      Obese   21     1          1    0
>> >> #1        Normal   14     0          0    0
>> >> #2047     Normal   21     0          0    0
>> >> BP.stack7<-BP.stack6[BP.stack6$MaternalAge!=5,]
>> >>
>> >>
>> BP.stack7$MaternalAge<-factor(as.numeric(as.character(BP.stack7$MaternalAge)
>>
>> >>
>> >>
>> fit5<-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack7,family=binomial,corstr="unstructured",scale.fix=TRUE)
>>
>> >> #Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>> >>  # contrasts can be applied only to factors with 2 or more levels
>> >>
>> >>  with(BP.stack7,table(MaternalAge,time))  #It looks like the
>> combinations
>> >> are still there
>> >> #           time
>> >> #MaternalAge   14   21
>> >>  #         3 1104  864
>> >>    #       4  875  667
>> >>
>> >> It works also with corstr="ar1".   Why do you gave the option
>> >> "unstructured"?
>> >> A.K.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> ----- Original Message -----
>
>> >> From: rex2013 <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=7>>
>>
>> >> To: [hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=8>
>
>> >> Cc:
>> >> Sent: Monday, January 7, 2013 6:15 AM
>> >> Subject: Re: [R] random effects model
>> >>
>> >> Hi A.K
>> >>
>> >> Below is the comment I get, not sure why.
>> >>
>> >> BP.sub3 is the stacked data without the missing values.
>> >>
>> >> BP.geese3 <- geese(HiBP~time*MaternalAge,data=BP.sub3,id=CODEA,
>> >> family=binomial, corstr="unstructured", na.action=na.omit)Error in
>> >> `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>> >>   contrasts can be applied only to factors with 2 or more levels
>> >>
>> >> Even though age has 3 levels; time has 14 years & 21 years; HIBP is a
>> >> binary response outcome.
>> >>
>> >> 2) When you mentioned summary(m1)$mean["p"] what did the p mean? i
>> >> used this in one of the gee command, it produced NA as answer?
>> >>
>> >> Many thanks
>> >>
>> >>
>> >>
>> >> On Mon, Jan 7, 2013 at 5:26 AM, arun kirshna [via R] <
>> >> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=9>>
>
>> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I am  not very familiar with the geese/geeglm().  Is it from
>> >>> library(geepack)?
>> >>> Regarding your question:
>> >>> "
>> >>> Can you tell me if I can use the geese or geeglm function with this
>> data
>> >>> eg: : HIBP~ time* Age
>> >>> Here age is a factor with 3 levels, time: 2 levels, HIBP = yes/no.
>> >>>
>> >>> From your original data:
>> >>> BP_2b<-read.csv("BP_2b.csv",sep="\t")
>> >>> head(BP_2b,2)
>> >>> #  CODEA Sex MaternalAge Education Birthplace AggScore IntScore
>> Obese14
>> >>> #1     1  NA           3         4          1       NA       NA
>> NA
>> >>> #2     3   2           3         3          1        0        0
>>  0
>> >>>  # Overweight14 Overweight21 Obese21 hibp14 hibp21
>> >>> #1           NA           NA      NA     NA     NA
>> >>> #2            0            1       0      0      0
>> >>>
>> >>> If I understand your new classification:
>> >>> BP.stacknormal<- subset(BP_2b,Obese14==0 & Overweight14==0 &
>> Obese21==0
>> >>> &
>> >>> Overweight21==0)
>> >>> BP.stackObese <- subset(BP_2b,(Obese14==1& Overweight14==0 &
>> >>> Obese14==1&Overweight14==1)|(Obese14==1&Overweight14==1 & Obese21==1 &
>> >>> Overweight21==0)|(Obese14==1&Overweight14==0 & Obese21==0 &
>> >>> Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==1 &
>> >>> Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==1 &
>> >>> Overweight21==1)|(Obese14==0 & Overweight14==1 & Obese21==1
>> >>> &Overweight21==1)|(Obese14==1& Overweight14==1 & Obese21==1&
>> >>> Overweight21==1)) #check whether there are more classification that
>> fits
>> >>> to
>> >>> #Obese
>> >>>  BP.stackOverweight <- subset(BP_2b,(Obese14==0 & Overweight14==1 &
>> >>> Obese21==0 & Overweight21==1)|(Obese14==0 &Overweight14==1 &
>> Obese21==0
>> >>> &
>> >>> Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==0 &
>> >>> Overweight21==1))
>> >>> BP.stacknormal$Categ<-"Normal"
>> >>> BP.stackObese$Categ<-"Obese"
>> >>> BP.stackOverweight$Categ <- "Overweight"
>> >>>
>> >>>
>> BP.newObeseOverweightNormal<-na.omit(rbind(BP.stacknormal,BP.stackObese,BP.stackOverweight))
>>
>> >>>
>> >>>  nrow(BP.newObeseOverweightNormal)
>> >>> #[1] 1581
>> >>> BP.stack3 <-
>> >>>
>> reshape(BP.newObeseOverweightNormal,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21"),c("hibp14","hibp21")),v.names=c("Obese","Overweight","hibp"),direction="long")
>>
>> >>>
>> >>> library(car)
>> >>> BP.stack3$time<-recode(BP.stack3$time,"1=14;2=21")
>> >>> head(BP.stack3,2)
>> >>>   #  CODEA Sex MaternalAge Education Birthplace AggScore IntScore
>> Categ
>> >>> time
>> >>> #8.1     8   2           4         4          1        0        0
>> Normal
>> >>> 14
>> >>> #9.1     9   1           3         6          2        0        0
>> Normal
>> >>> 14
>> >>>   #  Obese Overweight hibp
>> >>> #8.1     0          0    0
>> >>>
>> >>> Now, your formula: (HIBP~time*Age), is it MaternalAge?
>> >>> If it is, it has three values
>> >>> unique(BP.stack3$MaternalAge)
>> >>> #[1] 4 3 5
>> >>> and for time (14,21) # If it says that geese/geeglm, contrasts could
>> be
>> >>> applied with factors>=2 levels, what is the problem?
>> >>> If you take "Categ" variable, it also has 3 levels (Normal, Obese,
>> >>> Overweight).
>> >>>
>> >>>  BP.stack3$MaternalAge<-factor(BP.stack3$MaternalAge)
>> >>>  BP.stack3$time<-factor(BP.stack3$time)
>> >>>
>> >>> library(geepack)
>> >>> For your last question about how to get the p-values:
>> >>> # Using one of the example datasets:
>> >>> data(seizure)
>> >>>      seiz.l <- reshape(seizure,
>> >>>                        varying=list(c("base","y1", "y2", "y3", "y4")),
>> >>>                        v.names="y", times=0:4, direction="long")
>> >>>      seiz.l <- seiz.l[order(seiz.l$id, seiz.l$time),]
>> >>>      seiz.l$t <- ifelse(seiz.l$time == 0, 8, 2)
>> >>>      seiz.l$x <- ifelse(seiz.l$time == 0, 0, 1)
>> >>>      m1 <- geese(y ~ offset(log(t)) + x + trt + x:trt, id = id,
>> >>>                  data=seiz.l, corstr="exch", family=poisson)
>> >>>      summary(m1)
>> >>>
>> >>>  summary(m1)$mean["p"]
>> >>> #                    p
>> >>> #(Intercept) 0.0000000
>> >>> #x           0.3347040
>> >>> #trt         0.9011982
>> >>> #x:trt       0.6236769
>> >>>
>> >>>
>> >>> #If you need the p-values of the scale
>> >>>    summary(m1)$scale["p"]
>> >>>  #                   p
>> >>> #(Intercept) 0.0254634
>> >>>
>> >>> Hope it helps.
>> >>>
>> >>> A.K.
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> ----- Original Message -----
>> >>> From: rex2013 <[hidden
>> >>> email]<http://user/SendEmail.jtp?type=node&node=4654795&i=0>>
>> >>>
>> >>> To: [hidden email]
>> >>> <http://user/SendEmail.jtp?type=node&node=4654795&i=1>
>> >>> Cc:
>> >>> Sent: Sunday, January 6, 2013 4:55 AM
>> >>> Subject: Re: [R] random effects model
>> >>>
>> >>> Hi A.K
>> >>>
>> >>> Regarding my question on comparing normal/ obese/overweight with blood
>> >>> pressure change, I did finally as per the first suggestion of stacking
>> >>> the
>> >>> data and creating a normal category . This only gives me a obese not
>> >>> obese
>> >>> 14, but when I did with the wide format hoping to  get  a
>> >>> obese14,normal14,overweight 14 Vs hibp 21, i could not complete any of
>> >>> the
>> >>> models.
>> >>> This time I classified obese=1 & overweight=1 as obese itself.
>> >>>
>> >>> Can you tell me if I can use the geese or geeglm function with this
>> data
>> >>> eg: : HIBP~ time* Age
>> >>> Here age is a factor with 3 levels, time: 2 levels, HIBP = yes/no.
>> >>>
>> >>> It says geese/geeglm: contrast can be applied only with factor with 2
>> or
>> >>> more levels. What is the way to overcome this. Can I manipulate the
>> data
>> >>> to
>> >>> make it work.
>> >>>
>> >>> I need to know if the demogrphic variables affect change in blood
>> >>> pressure
>> >>> status over time?
>> >>>
>> >>> How to get the p values with gee model?
>> >>>
>> >>> Thanks
>> >>> On Thu, Jan 3, 2013 at 5:06 AM, arun kirshna [via R] <
>> >>> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654795&i=2>>
>>
>> >>> wrote:
>> >>>
>> >>> > HI Rex,
>> >>> > If I take a small subset from your whole dataset, and go through
>> your
>> >>> > codes:
>> >>> > BP_2b<-read.csv("BP_2b.csv",sep="\t")
>> >>> >  BP.sub<-BP_2b[410:418,c(1,8:11,13)] #deleted the columns that are
>> not
>> >>> > needed
>> >>> >  BP.stacknormal<- subset(BP.subnew,Obese14==0 & Overweight14==0)
>> >>> > BP.stackObese <- subset(BP.subnew,Obese14==1)
>> >>> >  BP.stackOverweight <- subset(BP.subnew,Overweight14==1)
>> >>> > BP.stacknormal$Categ<-"Normal14"
>> >>> > BP.stackObese$Categ<-"Obese14"
>> >>> > BP.stackOverweight$Categ <- "Overweight14"
>> >>> >
>> >>>
>> BP.newObeseOverweightNormal<-rbind(BP.stacknormal,BP.stackObese,BP.stackOverweight)
>>
>> >>>
>> >>> >
>> >>> >  BP.newObeseOverweightNormal
>> >>> > #    CODEA Obese14 Overweight14 Overweight21 Obese21 hibp21
>> >>> > Categ
>> >>> > #411   541       0            0            0       0      0
>> >>> > Normal14
>> >>> > #415   545       0            0            1       1      1
>> >>> > Normal14
>> >>> > #418   549       0            0            1       0      0
>> >>> > Normal14
>> >>> > #413   543       1            0            1       1      0
>> >>> > Obese14
>> >>> > #417   548       0            1            1       0      0
>> >>> > Overweight14
>> >>> > BP.newObeseOverweightNormal$Categ<-
>> >>> > factor(BP.newObeseOverweightNormal$Categ)
>> >>> > BP.stack3 <-
>> >>> >
>> >>>
>> reshape(BP.newObeseOverweightNormal,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long")
>>
>> >>>
>> >>> >
>> >>> > library(car)
>> >>> > BP.stack3$time<-recode(BP.stack3$time,"1=14;2=21")
>> >>> > BP.stack3 #Here Normal14 gets repeated even at time==21.  Given that
>> >>> > you
>> >>> > are using the "Categ" and "time" #columns in the analysis, it will
>> >>> > give
>> >>> > incorrect results.
>> >>> > #      CODEA hibp21        Categ time Obese Overweight
>> >>> > #541.1   541      0     Normal14   14     0          0
>> >>> > #545.1   545      1     Normal14   14     0          0
>> >>> > #549.1   549      0     Normal14   14     0          0
>> >>> > #543.1   543      0      Obese14   14     1          0
>> >>> > #548.1   548      0 Overweight14   14     0          1
>> >>> > #541.2   541      0     Normal14   21     0          0
>> >>> > #545.2   545      1     Normal14   21     1          1
>> >>> > #549.2   549      0     Normal14   21     0          1
>> >>> > #543.2   543      0      Obese14   21     1          1
>> >>> > #548.2   548      0 Overweight14   21     0          1
>> >>> > #Even if I correct the above codes, this will give incorrect
>> >>> > results/(error as you shown) because the response variable (hibp21)
>> >>> > gets
>> >>> > #repeated when you reshape it from wide to long.
>> >>> >
>> >>> > The correct classification might be:
>> >>> > BP_2b<-read.csv("BP_2b.csv",sep="\t")
>> >>> >  BP.sub<-BP_2b[410:418,c(1,8:11,13)]
>> >>> >
>> >>>
>> BP.subnew<-reshape(BP.sub,idvar="CODEA",timevar="time",sep="",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long")
>>
>> >>>
>> >>> >
>> >>> > BP.subnew$time<-recode(BP.subnew$time,"1=14;2=21")
>> >>> >  BP.subnew<-na.omit(BP.subnew)
>> >>> >
>> >>> > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14 &
>> >>> > BP.subnew$Obese==0]<-"Overweight14"
>> >>> > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21 &
>> >>> > BP.subnew$Obese==0]<-"Overweight21"
>> >>> >  BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==14 &
>> >>> > BP.subnew$Overweight==0]<-"Obese14"
>> >>> >  BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==21 &
>> >>> > BP.subnew$Overweight==0]<-"Obese21"
>> >>> >  BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21&
>> >>> > BP.subnew$Obese==1]<-"ObeseOverweight21"
>> >>> >  BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14&
>> >>> > BP.subnew$Obese==1]<-"ObeseOverweight14"
>> >>> > BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0
>> >>> > &BP.subnew$time==14]<-"Normal14"
>> >>> >  BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0
>> >>> > &BP.subnew$time==21]<-"Normal21"
>> >>> >
>> >>> > BP.subnew$Categ<-factor(BP.subnew$Categ)
>> >>> > BP.subnew$time<-factor(BP.subnew$time)
>> >>> > BP.subnew
>> >>> > #      CODEA hibp21 time Obese Overweight             Categ
>> >>> > #541.1   541      0   14     0          0          Normal14
>> >>> > #543.1   543      0   14     1          0           Obese14
>> >>> > #545.1   545      1   14     0          0          Normal14
>> >>> > #548.1   548      0   14     0          1      Overweight14
>> >>> > #549.1   549      0   14     0          0          Normal14
>> >>> > #541.2   541      0   21     0          0          Normal21
>> >>> > #543.2   543      0   21     1          1 ObeseOverweight21
>> >>> > #545.2   545      1   21     1          1 ObeseOverweight21
>> >>> > #548.2   548      0   21     0          1      Overweight21
>> >>> > #549.2   549      0   21     0          1      Overweight21
>> >>> >
>> >>> > #NOw with the whole dataset:
>> >>> > BP.sub<-BP_2b[,c(1,8:11,13)] #change here and paste the above lines:
>> >>> >  head(BP.subnew)
>> >>> >     # CODEA hibp21 time Obese Overweight    Categ
>> >>> > #3.1      3      0   14     0          0 Normal14
>> >>> > #7.1      7      0   14     0          0 Normal14
>> >>> > #8.1      8      0   14     0          0 Normal14
>> >>> > #9.1      9      0   14     0          0 Normal14
>> >>> > #14.1    14      1   14     0          0 Normal14
>> >>> > #21.1    21      0   14     0          0 Normal14
>> >>> >
>> >>> > tail(BP.subnew)
>> >>> >   #     CODEA hibp21 time Obese Overweight             Categ
>> >>> > #8485.2  8485      0   21     1          1 ObeseOverweight21
>> >>> > #8506.2  8506      0   21     0          1      Overweight21
>> >>> > #8520.2  8520      0   21     0          0          Normal21
>> >>> > #8529.2  8529      1   21     1          1 ObeseOverweight21
>> >>> > #8550.2  8550      0   21     1          1 ObeseOverweight21
>> >>> > #8554.2  8554      0   21     0          0          Normal21
>> >>> >
>> >>> > summary(lme.1 <- lme(hibp21~time+Categ+ time*Categ,
>> >>> > data=BP.subnew,random=~1|CODEA, na.action=na.omit))
>> >>> > #Error in MEEM(object, conLin, control$niterEM) :
>> >>> >   #Singularity in backsolve at level 0, block 1
>> >>> > #May be because of the reasons I mentioned above.
>> >>> >
>> >>> > #YOu didn't mention the library(gee)
>> >>> > BP.gee8 <- gee(hibp21~time+Categ+time*Categ,
>> >>> > data=BP.subnew,id=CODEA,family=binomial,
>> >>> > corstr="exchangeable",na.action=na.omit)
>> >>> > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
>> >>> > #Error in gee(hibp21 ~ time + Categ + time * Categ, data =
>> BP.subnew,
>> >>> > id
>> >>> =
>> >>> > CODEA,  :
>> >>> >   #rank-deficient model matrix
>> >>> > With your codes, it might have worked, but the results may be
>> >>> > inaccurate
>> >>> > # After running your whole codes:
>> >>> >  BP.gee8 <- gee(hibp21~time+Categ+time*Categ,
>> >>> > data=BP.stack3,id=CODEA,family=binomial,
>> >>> > corstr="exchangeable",na.action=na.omit)
>> >>> > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
>> >>> > #running glm to get initial regression estimate
>> >>> >    #        (Intercept)                   time
>>  CategObese14
>> >>> >      #    -2.456607e+01           9.940875e-15
>>  2.087584e-13
>> >>> >     # CategOverweight14      time:CategObese14
>> time:CategOverweight14
>> >>> >       #    2.087584e-13          -9.940875e-15
>> -9.940875e-15
>> >>> > #Error in gee(hibp21 ~ time + Categ + time * Categ, data =
>> BP.stack3,
>> >>> > id
>> >>> =
>> >>> > CODEA,  :
>> >>> >  # Cgee: error: logistic model for probability has fitted value very
>> >>> close
>> >>> > to 1.
>> >>> > #estimates diverging; iteration terminated.
>> >>> >
>> >>> > In short, I think it would be better to go with the suggestion in my
>> >>> > previous email with adequate changes in "Categ" variable (adding
>> >>> > ObeseOverweight14, ObeseOverweight21 etc) as I showed here.
>> >>> >
>> >>> > A.K.
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> > ------------------------------
>> >>> >  If you reply to this email, your message will be added to the
>> >>> discussion
>> >>> > below:
>> >>> >
>> >>>
>> >>> > .
>> >>> > NAML<
>> >>>
>> http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>>
>> >>>
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> View this message in context:
>> >>>
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654776.html
>> >>> Sent from the R help mailing list archive at Nabble.com.
>> >>>     [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> [hidden email]
>> >>> <http://user/SendEmail.jtp?type=node&node=4654795&i=3>mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide
>> >>> http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >>>
>> >>>
>> >>> ______________________________________________
>> >>> [hidden email]
>> >>> <http://user/SendEmail.jtp?type=node&node=4654795&i=4>mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide
>> >>> http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >>>
>> >>>
>> >>> ------------------------------
>> >>>  If you reply to this email, your message will be added to the
>> >>> discussion
>> >>> below:
>> >>>
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654795.html
>> >>>  To unsubscribe from random effects model, click
>> >>> here<
>>
>> >>> .
>> >>> NAML<
>> http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>>
>> >>>
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> View this message in context:
>> >>
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654826.html
>> >> Sent from the R help mailing list archive at Nabble.com.
>> >>     [[alternative HTML version deleted]]
>> >>
>> >> ______________________________________________
>> >> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=10>mailing list
>
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >>
>> >>
>> >
>> >
>> > ______________________________________________
>> > [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=11>mailing list
>
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > If you reply to this email, your message will be added to the discussion
>> > below:
>> >
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654902.html
>> >
>> > To unsubscribe from random effects model, visit
>> >
>>
>> View this message in context:
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654986.html
>>
>> Sent from the R help mailing list archive at Nabble.com.
>>     [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=12>mailing list
>
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> ______________________________________________
>> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=13>mailing list
>
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> ------------------------------
>>  If you reply to this email, your message will be added to the discussion
>> below:
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654996.html
>>  To unsubscribe from random effects model, click here<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4654346&code=dXNoYS5uYXRoYW5AZ21haWwuY29tfDQ2NTQzNDZ8MjAyMjE1NDI0>
>
>> .
>> NAML<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>>
>
>
>
>
>--
>View this message in context: http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4655206.html
>
>Sent from the R help mailing list archive at Nabble.com.
>    [[alternative HTML version deleted]]
>
>______________________________________________
>R-help at r-project.org mailing list
>
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>
>




More information about the R-help mailing list