[R] random effects model
arun
smartpink111 at yahoo.com
Sat Jan 12 18:28:37 CET 2013
HI,
Regarding the script I sent to you earlier, it should be modified a bit according to your dataset. For example, the variable "Sex" was in column 2. Also, your data had some missing values for the "Sex" column, which I removed before running the code.
With respect to the Errors that you showed here, you were using different package and I don't have a clue what you were trying to do.
I applied the same code to your dataset. The results are as follows:
BP_2b<-read.csv("BP_2b.csv",sep="\t")
nrow(BP_2b)
#[1] 6898
BP_2bSexNoMV<-BP_2b[!is.na(BP_2b$Sex),]
nrow(BP_2bSexNoMV)
#[1] 6260
unique(BP_2bSexNoMV$Sex)
#[1] 2 1
library(car)
BP_2bSexNoMV$Sex<-recode(BP_2bSexNoMV$Sex,"1='Male';2='Female'") #assuming that 1=Male and 2=Female
unique(BP_2bSexNoMV$Sex)
#[1] "Female" "Male"
#whole dataset
unlist(lapply(lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100),function(x) paste(x,"%",sep="")))
# Female Male
#"72.6552179656539%" "74.4740099009901%"
#splitting the above into components:
lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) head(x,2))
#$Female
# CODEA Sex MaternalAge Education Birthplace AggScore IntScore Obese14
#2 3 Female 3 3 1 0 0 0
#3 4 Female 3 6 1 NA NA NA
# Overweight14 Overweight21 Obese21 hibp14 hibp21
#2 0 1 0 0 0
#3 NA NA NA NA NA
#$Male
# CODEA Sex MaternalAge Education Birthplace AggScore IntScore Obese14
#6 9 Male 3 6 2 0 0 0
#8 11 Male 3 4 1 0 0 NA
# Overweight14 Overweight21 Obese21 hibp14 hibp21
#6 0 0 0 0 0
#8 NA 0 0 NA 0
lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x)nrow(x))
#$Female
#[1] 3028
#
#$Male
#[1] 3232
lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) nrow(x[!complete.cases(x[,-2]),])) #rows which have at least one NA
#$Female
#[1] 2200
#
#$Male
#[1] 2407
lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) head(x[!complete.cases(x[,-2]),],2))
#$Female
# CODEA Sex MaternalAge Education Birthplace AggScore IntScore Obese14
#3 4 Female 3 6 1 NA NA NA
#10 13 Female 3 4 2 NA NA NA
# Overweight14 Overweight21 Obese21 hibp14 hibp21
#3 NA NA NA NA NA
#10 NA NA NA NA NA
#$Male
# CODEA Sex MaternalAge Education Birthplace AggScore IntScore Obese14
#8 11 Male 3 4 1 0 0 NA
#9 12 Male 3 4 2 NA NA NA
# Overweight14 Overweight21 Obese21 hibp14 hibp21
#8 NA 0 0 NA 0
#9 NA NA NA NA NA
lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100) #gives the percentage of rows of missing #values from the overall rows for Males and Females
#$Female
#[1] 72.65522
#
#$Male
#[1] 74.47401
#iF you want the percentage from the total number rows in Males and Females (without NA's in the the Sex column)
lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100)
#$Female
#[1] 35.14377
#
#$Male
#[1] 38.45048
unlist(lapply(lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100),function(x) paste(x,"%",sep="")))#paste the "%" to the numbers
# Female Male
#"35.1437699680511%" "38.4504792332268%"
#If it is to find the percentage of missing values for each variable in females and males:
res<-do.call(rbind,lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) paste((colSums(is.na(x[,-2]))/nrow(BP_2bSexNoMV))*100,"%",sep=""))) #If #you want percentage of missing values per category, replace by nrow(x)
colnames(res)<-colnames(BP_2bSexNoMV)[-2]
res
# CODEA MaternalAge Education Birthplace
#Female "0%" "0%" "1.03833865814696%" "1.18210862619808%"
#Male "0%" "0%" "1.10223642172524%" "1.3258785942492%"
# AggScore IntScore Obese14
#Female "15.2076677316294%" "15.2076677316294%" "24.0894568690096%"
#Male "16.5015974440895%" "16.5015974440895%" "25.814696485623%"
# Overweight14 Overweight21 Obese21
#Female "24.0894568690096%" "23.3865814696486%" "23.3865814696486%"
#Male "25.814696485623%" "29.1533546325879%" "29.1533546325879%"
# hibp14 hibp21
#Female "24.1693290734824%" "31.3418530351438%"
#Male "25.8466453674121%" "35.223642172524%"
A.K.
________________________________
From: Usha Gurunathan <usha.nathan at gmail.com>
To: arun <smartpink111 at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Saturday, January 12, 2013 1:42 AM
Subject: Re: [R] random effects model
Hi
I do want percentages of the categories inthe whole data set. But, I am a bit unclear about this syntax. Can you explain please. This is the error message I got with your script?
Error: could not find function "Copy.of.BP_2". Not sure what, because the data frame was already loaded.
Also
I was trying out package: vmv( after installing)
as data(df,package, package="vmv")
In data(Copy.of.BP_2c, package = "vmv") : data set ‘Copy.of.BP_2c’ not foundtablemissing(df, sort by="variable")
Error in tablemissing(Copy.of.BP_2, sortby = "Sex") : object 'tabfinal' not found
## Same problem with "vim" package.
## What mistake could I have done?
Thanks.
On Sat, Jan 12, 2013 at 3:11 PM, arun <smartpink111 at yahoo.com> wrote:
HI,
>
>If you want to find out the percentage of missing values in the whole dataset in females and males:
> set.seed(51)
> dat1<-data.frame(Gender=rep(c("M","F"),each=10),V1=sample(c(1:3,NA),20,replace=TRUE),V2=sample(c(21:24,NA),20,replace=TRUE))
> unlist(lapply(lapply(split(dat1,dat1$Gender),function(x) (nrow(x[!complete.cases(x[,-1]),])/nrow(x))*100),function(x) paste(x,"%",sep="")))
># F M
>#"20%" "70%"
>
>#If it is to find the percentage of missing values for each variable in females and males:
> res<-do.call(rbind,lapply(split(dat1,dat1$Gender),function(x) paste((colSums(is.na(x[,-1]))/nrow(x))*100,"%",sep="")))
> colnames(res)<-colnames(dat1)[-1]
> res
># V1 V2
>#F "0%" "20%"
>#M "50%" "20%"
>
>A.K.
>
>
>
>
>
>----- Original Message -----
>
>From: rex2013 <usha.nathan at gmail.com>
>To: r-help at r-project.org
>Cc:
>
>Sent: Friday, January 11, 2013 2:16 AM
>Subject: Re: [R] random effects model
>
>
>Hi AK
>
>Regarding the missing values, I would like to find out the patterns of
>missing values in my data set. I know the overall values for each variable.
>
>using
>
>colSums(is.na(df))
>
> but what I wanted is to find out the percentages
>with each level of the variable with my dataset, as in if there is more
>missing data in females or males etc?.
>
>I installed "mi" package, but unable to produce a plot with it( i would
>also like to produce a plot). I searched the responses in the relevant
>sections in r but could n't find an answer.
>
>Thanks,
>
>
>
>
>
>On Wed, Jan 9, 2013 at 12:31 PM, arun kirshna [via R] <
>ml-node+s789695n4654996h3 at n4.nabble.com> wrote:
>
>> HI,
>>
>> In your dataset, the "exchangeable" or "compound symmetry" may work as
>> there are only two levels for time. In experimental data analysis
>> involving a factor time with more than 2 levels, randomization of
>> combination of levels of factors applied to the subject/plot etc. gets
>> affected as time is unidirectional. I guess your data is observational,
>> and with two time levels, it may not hurt to use "CS" as option, though, it
>> would help if you check different options.
>>
>> In the link I sent previously, QIC was used.
>> http://stats.stackexchange.com/questions/577/is-there-any-reason-to-prefer-the-aic-or-bic-over-the-other
>>
>> I am not sure whether AIC/BIC is better than QIC or viceversa.
>>
>> You could sent email to the maintainer of geepack (Jun Yan <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=0>>).
>
>>
>> Regarding the reference links,
>> You can check this link "www.jstatsoft.org/v15/i02/paper"; . Other
>> references are in the paper.
>> "
>> 4.3. Missing values (waves)
>> In case of missing values, the GEE estimates are consistent if the values
>> are missing com-
>> pletely at random (Rubin 1976). The geeglm function assumes by default
>> that observations
>> are equally separated in time. Therefore, one has to inform the function
>> about different sep-
>> arations if there are missing values and other correlation structures than
>> the independence or
>> exchangeable structures are used. The waves arguments takes an integer
>> vector that indicates
>> that two observations of the same cluster with the values of the vector of
>> k respectively l have
>> a correlation of rkl ."
>>
>> Hope it helps.
>> A.K.
>>
>>
>>
>>
>> ----- Original Message -----
>
>> From: rex2013 <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=1>>
>>
>> To: [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=2>
>
>> Cc:
>> Sent: Tuesday, January 8, 2013 5:29 PM
>> Subject: Re: [R] random effects model
>>
>> Hi
>>
>> Thanks a lot, the corstr "exchangeable"does work. Didn't strike to me
>> for so long. Does the AIC value come out with the gee output?
>>
>> By reference, I meant reference to a easy-read paper or web address
>> that can give me knowledge about implications of missing data.
>>
>> Ta.
>>
>> On 1/8/13, arun kirshna [via R]
>> <[hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=3>>
>
>> wrote:
>>
>> >
>> >
>> > HI,
>> > BP.stack5 is the one without missing values.
>> > na.omit(....). Otherwise, I have to use the option na.action=.. in the
>> > ?geese() statement
>> >
>> > You need to read about the correlation structures. IN unstructured
>> option,
>> > more number of parameters needs to be estimated, In repeated measures
>> > design, when the underlying structure is not known, it would be better
>> to
>> > compare using different options (exchangeable is similar to compound
>> > symmetry) and select the one which provide the least value for AIC or
>> BIC.
>> > Have a look at
>> >
>> >
>> http://stats.stackexchange.com/questions/21771/how-to-perform-model-selection-in-gee-in-r
>> > It's not clear to me "reference to write about missing values".
>> > A.K.
>> >
>> >
>> >
>> >
>> > ----- Original Message -----
>
>> > From: Usha Gurunathan <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=4>>
>>
>> > To: arun <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=5>>
>>
>
>> > Cc:
>> > Sent: Monday, January 7, 2013 6:12 PM
>> > Subject: Re: [R] random effects model
>> >
>> > Hi AK
>> >
>> > 2)I shall try putting exch. and check when I get home. Btw, what is
>> > BP.stack5? is it with missing values or only complete cases?
>> >
>> > I guess I am still not clear about the unstructured and exchangeable
>> > options, as in which one is better.
>> >
>> > 1)Rgding the summary(p): NA thing, I tried putting one of my gee
>> equation.
>> >
>> > Can you suggest me a reference to write about" missing values and the
>> > implications for my results"
>> >
>> > Thanks.
>> >
>> >
>> >
>> > On 1/8/13, arun <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=6>>
>
>> wrote:
>> >> HI,
>> >>
>> >> Just to add:
>> >>
>> fit3<-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr="exch",scale.fix=TRUE)
>>
>> >> #works
>> >> summary(fit3)$mean["p"]
>> >> # p
>> >> #(Intercept) 0.00000000
>> >> #MaternalAge4 0.49099242
>> >> #MaternalAge5 0.04686295
>> >> #time21 0.86164351
>> >> #MaternalAge4:time21 0.59258221
>> >> #MaternalAge5:time21 0.79909832
>> >>
>> >>
>> fit4<-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr="unstructured",scale.fix=TRUE)
>>
>> >> #when the correlation structure is changed to "unstructured"
>> >> #Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>> >> # contrasts can be applied only to factors with 2 or more levels
>> >> #In addition: Warning message:
>> >> #In is.na(rows) : is.na() applied to non-(list or vector) of type
>> 'NULL'
>> >>
>> >>
>> >> Though, it works with data(Ohio)
>> >>
>> >>
>> fit1<-geese(resp~age+smoke+age:smoke,id=id,data=ohio1,family=binomial,corstr="unstructured",scale.fix=TRUE)
>>
>> >> summary(fit1)$mean["p"]
>> >> # p
>> >> #(Intercept) 0.00000000
>> >> #age-1 0.60555454
>> >> #age0 0.45322698
>> >> #age1 0.01187725
>> >> #smoke1 0.86262269
>> >> #age-1:smoke1 0.17239050
>> >> #age0:smoke1 0.32223942
>> >> #age1:smoke1 0.36686706
>> >>
>> >>
>> >>
>> >> By checking:
>> >> with(BP.stack5,table(MaternalAge,time))
>> >> # time
>> >> #MaternalAge 14 21
>> >> # 3 1104 864
>> >> # 4 875 667
>> >> # 5 67 53 #less number of observations
>> >>
>> >>
>> >> BP.stack6 <- BP.stack5[order(BP.stack5$CODEA, BP.stack5$time),]
>> >> head(BP.stack6) # very few IDs with MaternalAge==5
>> >> # X CODEA Sex MaternalAge Education Birthplace AggScore IntScore
>> >> #1493 3.1 3 2 3 3 1 0 0
>> >> #3202 3.2 3 2 3 3 1 0 0
>> >> #1306 7.1 7 2 4 6 1 0 0
>> >> #3064 7.2 7 2 4 6 1 0 0
>> >> #1 8.1 8 2 4 4 1 0 0
>> >> #2047 8.2 8 2 4 4 1 0 0
>> >> # Categ time Obese Overweight hibp
>> >> #1493 Overweight 14 0 0 0
>> >> #3202 Overweight 21 0 1 0
>> >> #1306 Obese 14 0 0 0
>> >> #3064 Obese 21 1 1 0
>> >> #1 Normal 14 0 0 0
>> >> #2047 Normal 21 0 0 0
>> >> BP.stack7<-BP.stack6[BP.stack6$MaternalAge!=5,]
>> >>
>> >>
>> BP.stack7$MaternalAge<-factor(as.numeric(as.character(BP.stack7$MaternalAge)
>>
>> >>
>> >>
>> fit5<-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack7,family=binomial,corstr="unstructured",scale.fix=TRUE)
>>
>> >> #Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>> >> # contrasts can be applied only to factors with 2 or more levels
>> >>
>> >> with(BP.stack7,table(MaternalAge,time)) #It looks like the
>> combinations
>> >> are still there
>> >> # time
>> >> #MaternalAge 14 21
>> >> # 3 1104 864
>> >> # 4 875 667
>> >>
>> >> It works also with corstr="ar1". Why do you gave the option
>> >> "unstructured"?
>> >> A.K.
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> ----- Original Message -----
>
>> >> From: rex2013 <[hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=7>>
>>
>> >> To: [hidden email]<http://user/SendEmail.jtp?type=node&node=4654996&i=8>
>
>> >> Cc:
>> >> Sent: Monday, January 7, 2013 6:15 AM
>> >> Subject: Re: [R] random effects model
>> >>
>> >> Hi A.K
>> >>
>> >> Below is the comment I get, not sure why.
>> >>
>> >> BP.sub3 is the stacked data without the missing values.
>> >>
>> >> BP.geese3 <- geese(HiBP~time*MaternalAge,data=BP.sub3,id=CODEA,
>> >> family=binomial, corstr="unstructured", na.action=na.omit)Error in
>> >> `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>> >> contrasts can be applied only to factors with 2 or more levels
>> >>
>> >> Even though age has 3 levels; time has 14 years & 21 years; HIBP is a
>> >> binary response outcome.
>> >>
>> >> 2) When you mentioned summary(m1)$mean["p"] what did the p mean? i
>> >> used this in one of the gee command, it produced NA as answer?
>> >>
>> >> Many thanks
>> >>
>> >>
>> >>
>> >> On Mon, Jan 7, 2013 at 5:26 AM, arun kirshna [via R] <
>> >> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=9>>
>
>> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I am not very familiar with the geese/geeglm(). Is it from
>> >>> library(geepack)?
>> >>> Regarding your question:
>> >>> "
>> >>> Can you tell me if I can use the geese or geeglm function with this
>> data
>> >>> eg: : HIBP~ time* Age
>> >>> Here age is a factor with 3 levels, time: 2 levels, HIBP = yes/no.
>> >>>
>> >>> From your original data:
>> >>> BP_2b<-read.csv("BP_2b.csv",sep="\t")
>> >>> head(BP_2b,2)
>> >>> # CODEA Sex MaternalAge Education Birthplace AggScore IntScore
>> Obese14
>> >>> #1 1 NA 3 4 1 NA NA
>> NA
>> >>> #2 3 2 3 3 1 0 0
>> 0
>> >>> # Overweight14 Overweight21 Obese21 hibp14 hibp21
>> >>> #1 NA NA NA NA NA
>> >>> #2 0 1 0 0 0
>> >>>
>> >>> If I understand your new classification:
>> >>> BP.stacknormal<- subset(BP_2b,Obese14==0 & Overweight14==0 &
>> Obese21==0
>> >>> &
>> >>> Overweight21==0)
>> >>> BP.stackObese <- subset(BP_2b,(Obese14==1& Overweight14==0 &
>> >>> Obese14==1&Overweight14==1)|(Obese14==1&Overweight14==1 & Obese21==1 &
>> >>> Overweight21==0)|(Obese14==1&Overweight14==0 & Obese21==0 &
>> >>> Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==1 &
>> >>> Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==1 &
>> >>> Overweight21==1)|(Obese14==0 & Overweight14==1 & Obese21==1
>> >>> &Overweight21==1)|(Obese14==1& Overweight14==1 & Obese21==1&
>> >>> Overweight21==1)) #check whether there are more classification that
>> fits
>> >>> to
>> >>> #Obese
>> >>> BP.stackOverweight <- subset(BP_2b,(Obese14==0 & Overweight14==1 &
>> >>> Obese21==0 & Overweight21==1)|(Obese14==0 &Overweight14==1 &
>> Obese21==0
>> >>> &
>> >>> Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==0 &
>> >>> Overweight21==1))
>> >>> BP.stacknormal$Categ<-"Normal"
>> >>> BP.stackObese$Categ<-"Obese"
>> >>> BP.stackOverweight$Categ <- "Overweight"
>> >>>
>> >>>
>> BP.newObeseOverweightNormal<-na.omit(rbind(BP.stacknormal,BP.stackObese,BP.stackOverweight))
>>
>> >>>
>> >>> nrow(BP.newObeseOverweightNormal)
>> >>> #[1] 1581
>> >>> BP.stack3 <-
>> >>>
>> reshape(BP.newObeseOverweightNormal,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21"),c("hibp14","hibp21")),v.names=c("Obese","Overweight","hibp"),direction="long")
>>
>> >>>
>> >>> library(car)
>> >>> BP.stack3$time<-recode(BP.stack3$time,"1=14;2=21")
>> >>> head(BP.stack3,2)
>> >>> # CODEA Sex MaternalAge Education Birthplace AggScore IntScore
>> Categ
>> >>> time
>> >>> #8.1 8 2 4 4 1 0 0
>> Normal
>> >>> 14
>> >>> #9.1 9 1 3 6 2 0 0
>> Normal
>> >>> 14
>> >>> # Obese Overweight hibp
>> >>> #8.1 0 0 0
>> >>>
>> >>> Now, your formula: (HIBP~time*Age), is it MaternalAge?
>> >>> If it is, it has three values
>> >>> unique(BP.stack3$MaternalAge)
>> >>> #[1] 4 3 5
>> >>> and for time (14,21) # If it says that geese/geeglm, contrasts could
>> be
>> >>> applied with factors>=2 levels, what is the problem?
>> >>> If you take "Categ" variable, it also has 3 levels (Normal, Obese,
>> >>> Overweight).
>> >>>
>> >>> BP.stack3$MaternalAge<-factor(BP.stack3$MaternalAge)
>> >>> BP.stack3$time<-factor(BP.stack3$time)
>> >>>
>> >>> library(geepack)
>> >>> For your last question about how to get the p-values:
>> >>> # Using one of the example datasets:
>> >>> data(seizure)
>> >>> seiz.l <- reshape(seizure,
>> >>> varying=list(c("base","y1", "y2", "y3", "y4")),
>> >>> v.names="y", times=0:4, direction="long")
>> >>> seiz.l <- seiz.l[order(seiz.l$id, seiz.l$time),]
>> >>> seiz.l$t <- ifelse(seiz.l$time == 0, 8, 2)
>> >>> seiz.l$x <- ifelse(seiz.l$time == 0, 0, 1)
>> >>> m1 <- geese(y ~ offset(log(t)) + x + trt + x:trt, id = id,
>> >>> data=seiz.l, corstr="exch", family=poisson)
>> >>> summary(m1)
>> >>>
>> >>> summary(m1)$mean["p"]
>> >>> # p
>> >>> #(Intercept) 0.0000000
>> >>> #x 0.3347040
>> >>> #trt 0.9011982
>> >>> #x:trt 0.6236769
>> >>>
>> >>>
>> >>> #If you need the p-values of the scale
>> >>> summary(m1)$scale["p"]
>> >>> # p
>> >>> #(Intercept) 0.0254634
>> >>>
>> >>> Hope it helps.
>> >>>
>> >>> A.K.
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> ----- Original Message -----
>> >>> From: rex2013 <[hidden
>> >>> email]<http://user/SendEmail.jtp?type=node&node=4654795&i=0>>
>> >>>
>> >>> To: [hidden email]
>> >>> <http://user/SendEmail.jtp?type=node&node=4654795&i=1>
>> >>> Cc:
>> >>> Sent: Sunday, January 6, 2013 4:55 AM
>> >>> Subject: Re: [R] random effects model
>> >>>
>> >>> Hi A.K
>> >>>
>> >>> Regarding my question on comparing normal/ obese/overweight with blood
>> >>> pressure change, I did finally as per the first suggestion of stacking
>> >>> the
>> >>> data and creating a normal category . This only gives me a obese not
>> >>> obese
>> >>> 14, but when I did with the wide format hoping to get a
>> >>> obese14,normal14,overweight 14 Vs hibp 21, i could not complete any of
>> >>> the
>> >>> models.
>> >>> This time I classified obese=1 & overweight=1 as obese itself.
>> >>>
>> >>> Can you tell me if I can use the geese or geeglm function with this
>> data
>> >>> eg: : HIBP~ time* Age
>> >>> Here age is a factor with 3 levels, time: 2 levels, HIBP = yes/no.
>> >>>
>> >>> It says geese/geeglm: contrast can be applied only with factor with 2
>> or
>> >>> more levels. What is the way to overcome this. Can I manipulate the
>> data
>> >>> to
>> >>> make it work.
>> >>>
>> >>> I need to know if the demogrphic variables affect change in blood
>> >>> pressure
>> >>> status over time?
>> >>>
>> >>> How to get the p values with gee model?
>> >>>
>> >>> Thanks
>> >>> On Thu, Jan 3, 2013 at 5:06 AM, arun kirshna [via R] <
>> >>> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654795&i=2>>
>>
>> >>> wrote:
>> >>>
>> >>> > HI Rex,
>> >>> > If I take a small subset from your whole dataset, and go through
>> your
>> >>> > codes:
>> >>> > BP_2b<-read.csv("BP_2b.csv",sep="\t")
>> >>> > BP.sub<-BP_2b[410:418,c(1,8:11,13)] #deleted the columns that are
>> not
>> >>> > needed
>> >>> > BP.stacknormal<- subset(BP.subnew,Obese14==0 & Overweight14==0)
>> >>> > BP.stackObese <- subset(BP.subnew,Obese14==1)
>> >>> > BP.stackOverweight <- subset(BP.subnew,Overweight14==1)
>> >>> > BP.stacknormal$Categ<-"Normal14"
>> >>> > BP.stackObese$Categ<-"Obese14"
>> >>> > BP.stackOverweight$Categ <- "Overweight14"
>> >>> >
>> >>>
>> BP.newObeseOverweightNormal<-rbind(BP.stacknormal,BP.stackObese,BP.stackOverweight)
>>
>> >>>
>> >>> >
>> >>> > BP.newObeseOverweightNormal
>> >>> > # CODEA Obese14 Overweight14 Overweight21 Obese21 hibp21
>> >>> > Categ
>> >>> > #411 541 0 0 0 0 0
>> >>> > Normal14
>> >>> > #415 545 0 0 1 1 1
>> >>> > Normal14
>> >>> > #418 549 0 0 1 0 0
>> >>> > Normal14
>> >>> > #413 543 1 0 1 1 0
>> >>> > Obese14
>> >>> > #417 548 0 1 1 0 0
>> >>> > Overweight14
>> >>> > BP.newObeseOverweightNormal$Categ<-
>> >>> > factor(BP.newObeseOverweightNormal$Categ)
>> >>> > BP.stack3 <-
>> >>> >
>> >>>
>> reshape(BP.newObeseOverweightNormal,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long")
>>
>> >>>
>> >>> >
>> >>> > library(car)
>> >>> > BP.stack3$time<-recode(BP.stack3$time,"1=14;2=21")
>> >>> > BP.stack3 #Here Normal14 gets repeated even at time==21. Given that
>> >>> > you
>> >>> > are using the "Categ" and "time" #columns in the analysis, it will
>> >>> > give
>> >>> > incorrect results.
>> >>> > # CODEA hibp21 Categ time Obese Overweight
>> >>> > #541.1 541 0 Normal14 14 0 0
>> >>> > #545.1 545 1 Normal14 14 0 0
>> >>> > #549.1 549 0 Normal14 14 0 0
>> >>> > #543.1 543 0 Obese14 14 1 0
>> >>> > #548.1 548 0 Overweight14 14 0 1
>> >>> > #541.2 541 0 Normal14 21 0 0
>> >>> > #545.2 545 1 Normal14 21 1 1
>> >>> > #549.2 549 0 Normal14 21 0 1
>> >>> > #543.2 543 0 Obese14 21 1 1
>> >>> > #548.2 548 0 Overweight14 21 0 1
>> >>> > #Even if I correct the above codes, this will give incorrect
>> >>> > results/(error as you shown) because the response variable (hibp21)
>> >>> > gets
>> >>> > #repeated when you reshape it from wide to long.
>> >>> >
>> >>> > The correct classification might be:
>> >>> > BP_2b<-read.csv("BP_2b.csv",sep="\t")
>> >>> > BP.sub<-BP_2b[410:418,c(1,8:11,13)]
>> >>> >
>> >>>
>> BP.subnew<-reshape(BP.sub,idvar="CODEA",timevar="time",sep="",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long")
>>
>> >>>
>> >>> >
>> >>> > BP.subnew$time<-recode(BP.subnew$time,"1=14;2=21")
>> >>> > BP.subnew<-na.omit(BP.subnew)
>> >>> >
>> >>> > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14 &
>> >>> > BP.subnew$Obese==0]<-"Overweight14"
>> >>> > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21 &
>> >>> > BP.subnew$Obese==0]<-"Overweight21"
>> >>> > BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==14 &
>> >>> > BP.subnew$Overweight==0]<-"Obese14"
>> >>> > BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==21 &
>> >>> > BP.subnew$Overweight==0]<-"Obese21"
>> >>> > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21&
>> >>> > BP.subnew$Obese==1]<-"ObeseOverweight21"
>> >>> > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14&
>> >>> > BP.subnew$Obese==1]<-"ObeseOverweight14"
>> >>> > BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0
>> >>> > &BP.subnew$time==14]<-"Normal14"
>> >>> > BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0
>> >>> > &BP.subnew$time==21]<-"Normal21"
>> >>> >
>> >>> > BP.subnew$Categ<-factor(BP.subnew$Categ)
>> >>> > BP.subnew$time<-factor(BP.subnew$time)
>> >>> > BP.subnew
>> >>> > # CODEA hibp21 time Obese Overweight Categ
>> >>> > #541.1 541 0 14 0 0 Normal14
>> >>> > #543.1 543 0 14 1 0 Obese14
>> >>> > #545.1 545 1 14 0 0 Normal14
>> >>> > #548.1 548 0 14 0 1 Overweight14
>> >>> > #549.1 549 0 14 0 0 Normal14
>> >>> > #541.2 541 0 21 0 0 Normal21
>> >>> > #543.2 543 0 21 1 1 ObeseOverweight21
>> >>> > #545.2 545 1 21 1 1 ObeseOverweight21
>> >>> > #548.2 548 0 21 0 1 Overweight21
>> >>> > #549.2 549 0 21 0 1 Overweight21
>> >>> >
>> >>> > #NOw with the whole dataset:
>> >>> > BP.sub<-BP_2b[,c(1,8:11,13)] #change here and paste the above lines:
>> >>> > head(BP.subnew)
>> >>> > # CODEA hibp21 time Obese Overweight Categ
>> >>> > #3.1 3 0 14 0 0 Normal14
>> >>> > #7.1 7 0 14 0 0 Normal14
>> >>> > #8.1 8 0 14 0 0 Normal14
>> >>> > #9.1 9 0 14 0 0 Normal14
>> >>> > #14.1 14 1 14 0 0 Normal14
>> >>> > #21.1 21 0 14 0 0 Normal14
>> >>> >
>> >>> > tail(BP.subnew)
>> >>> > # CODEA hibp21 time Obese Overweight Categ
>> >>> > #8485.2 8485 0 21 1 1 ObeseOverweight21
>> >>> > #8506.2 8506 0 21 0 1 Overweight21
>> >>> > #8520.2 8520 0 21 0 0 Normal21
>> >>> > #8529.2 8529 1 21 1 1 ObeseOverweight21
>> >>> > #8550.2 8550 0 21 1 1 ObeseOverweight21
>> >>> > #8554.2 8554 0 21 0 0 Normal21
>> >>> >
>> >>> > summary(lme.1 <- lme(hibp21~time+Categ+ time*Categ,
>> >>> > data=BP.subnew,random=~1|CODEA, na.action=na.omit))
>> >>> > #Error in MEEM(object, conLin, control$niterEM) :
>> >>> > #Singularity in backsolve at level 0, block 1
>> >>> > #May be because of the reasons I mentioned above.
>> >>> >
>> >>> > #YOu didn't mention the library(gee)
>> >>> > BP.gee8 <- gee(hibp21~time+Categ+time*Categ,
>> >>> > data=BP.subnew,id=CODEA,family=binomial,
>> >>> > corstr="exchangeable",na.action=na.omit)
>> >>> > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
>> >>> > #Error in gee(hibp21 ~ time + Categ + time * Categ, data =
>> BP.subnew,
>> >>> > id
>> >>> =
>> >>> > CODEA, :
>> >>> > #rank-deficient model matrix
>> >>> > With your codes, it might have worked, but the results may be
>> >>> > inaccurate
>> >>> > # After running your whole codes:
>> >>> > BP.gee8 <- gee(hibp21~time+Categ+time*Categ,
>> >>> > data=BP.stack3,id=CODEA,family=binomial,
>> >>> > corstr="exchangeable",na.action=na.omit)
>> >>> > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
>> >>> > #running glm to get initial regression estimate
>> >>> > # (Intercept) time
>> CategObese14
>> >>> > # -2.456607e+01 9.940875e-15
>> 2.087584e-13
>> >>> > # CategOverweight14 time:CategObese14
>> time:CategOverweight14
>> >>> > # 2.087584e-13 -9.940875e-15
>> -9.940875e-15
>> >>> > #Error in gee(hibp21 ~ time + Categ + time * Categ, data =
>> BP.stack3,
>> >>> > id
>> >>> =
>> >>> > CODEA, :
>> >>> > # Cgee: error: logistic model for probability has fitted value very
>> >>> close
>> >>> > to 1.
>> >>> > #estimates diverging; iteration terminated.
>> >>> >
>> >>> > In short, I think it would be better to go with the suggestion in my
>> >>> > previous email with adequate changes in "Categ" variable (adding
>> >>> > ObeseOverweight14, ObeseOverweight21 etc) as I showed here.
>> >>> >
>> >>> > A.K.
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> >
>> >>> > ------------------------------
>> >>> > If you reply to this email, your message will be added to the
>> >>> discussion
>> >>> > below:
>> >>> >
>> >>>
>> >>> > .
>> >>> > NAML<
>> >>>
>> http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>>
>> >>>
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> View this message in context:
>> >>>
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654776.html
>> >>> Sent from the R help mailing list archive at Nabble.com.
>> >>> [[alternative HTML version deleted]]
>> >>>
>> >>> ______________________________________________
>> >>> [hidden email]
>> >>> <http://user/SendEmail.jtp?type=node&node=4654795&i=3>mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide
>> >>> http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >>>
>> >>>
>> >>> ______________________________________________
>> >>> [hidden email]
>> >>> <http://user/SendEmail.jtp?type=node&node=4654795&i=4>mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide
>> >>> http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >>>
>> >>>
>> >>> ------------------------------
>> >>> If you reply to this email, your message will be added to the
>> >>> discussion
>> >>> below:
>> >>>
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654795.html
>> >>> To unsubscribe from random effects model, click
>> >>> here<
>>
>> >>> .
>> >>> NAML<
>> http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>>
>> >>>
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> View this message in context:
>> >>
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654826.html
>> >> Sent from the R help mailing list archive at Nabble.com.
>> >> [[alternative HTML version deleted]]
>> >>
>> >> ______________________________________________
>> >> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=10>mailing list
>
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >>
>> >>
>> >
>> >
>> > ______________________________________________
>> > [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=11>mailing list
>
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > If you reply to this email, your message will be added to the discussion
>> > below:
>> >
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654902.html
>> >
>> > To unsubscribe from random effects model, visit
>> >
>>
>> View this message in context:
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654986.html
>>
>> Sent from the R help mailing list archive at Nabble.com.
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=12>mailing list
>
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> ______________________________________________
>> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654996&i=13>mailing list
>
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> ------------------------------
>> If you reply to this email, your message will be added to the discussion
>> below:
>> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654996.html
>> To unsubscribe from random effects model, click here<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4654346&code=dXNoYS5uYXRoYW5AZ21haWwuY29tfDQ2NTQzNDZ8MjAyMjE1NDI0>
>
>> .
>> NAML<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>>
>
>
>
>
>--
>View this message in context: http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4655206.html
>
>Sent from the R help mailing list archive at Nabble.com.
> [[alternative HTML version deleted]]
>
>______________________________________________
>R-help at r-project.org mailing list
>
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>
>
More information about the R-help
mailing list