[R] Automating R for Hypothesis Testing
Rui Barradas
ruipbarradas at sapo.pt
Wed May 9 16:55:05 CEST 2012
Hello,
Yes, it does help. Now we can see your data and what you're doing.
What follows is a suggestion on what you could do, not full solution.
(You forgot to say what X1 is, but I don't think it's important to
understand the suggestion.)
(If I'm wrong, say something.)
milwaukeephos <- read.csv("milwaukeephos.csv", header=TRUE,
stringsAsFactors=FALSE)
# list of data.frames, one per month
ls1 <- split(milwaukeephos, milwaukeephos$month)
#--------- if you want to keep the models, not needed if you don't.
# (yoy probably don't)
modelH <- vector("list", 12)
modelHa <- vector("list", 12)
modelH2 <- vector("list", 12)
modelH2a <- vector("list", 12)
#--------- values to record, these are needed, create them beforehand.
chi_fm <- numeric(12)
chi_fms <- numeric(12)
#
seq_months <- c(1:12, 1) # wrap months around.
for(i in 1:12){
month_this <- seq_months[i]
month_next <- seq_months[i + 1]
lload <- c(ls1[[month_this]]$load_kg, ls1[[month_next]]$load_kg)
lflow <- c(ls1[[month_this]]$flow, ls1[[month_next]]$flow)
modelH[[i]] <- lm(lload ~ lflow)
# If you don't want to keep the models, use modelH only
# ( without [[i]] )
# and do the same with X1
# rest of your code for first test goes here
chi_fm[i] <- bfm %*% var_fm %*% (bunres_fm - bres_fm)
# and the same for the second test
chi_fms[i] <- ...etc...
}
Hope this helps,
Rui Barradas
meredith wrote
>
> dput: http://r.789695.n4.nabble.com/file/n4620188/milwaukeephos.csv
> milwaukeephos.csv
>
> # Feb-march
>> modelH_febmarch<-lm(llfeb_march~lffeb_march)
>>modelHa_febmarch<-lm(llfeb_march~X1feb_mar+lffeb_march)
>> anova(modelHa_febmarch)
>> coefficients(modelH_febmarch)
> (Intercept) lffeb_march
> -2.429890 1.172821
>> coefficients(modelHa_febmarch)
> (Intercept) X1feb_mar lffeb_march
> -2.8957776 -0.5272793 1.3016303
>> bres_fm<-matrix(c(-2.429890,0,1.172821),nrow=3)
>> bunres_fm<-matrix(c(-2.8957776,-0.5272793,1.3016303),nrow=3)
>>bfm<-t(bunres_fm-bres_fm)
>> fmvect<-seq(1,1,length=34)
>> X1a_febmar<-seq(0,0,length=9) # dummy variable step 1
>> X1b_febmar<-seq(1,1,length=25) # dummy variable step 2
>> X1feb_mar<-c(X1a_febmar,X1b_febmar) #dummy variable creation
> # Test Stat Equation for Chisq
>> fmxx<-cbind(fmvect,X1feb_mar,lffeb_march)
>> tfmx<-t(fmxx)
>> xcom_fm<-(tfmx %*% fmxx)
>> xinv_fm<-ginv(xcom_fm)
>> var_fm<-xinv_fm*0.307
>> chi_fm<-bfm %*% var_fm %*% (bunres_fm-bres_fm)
>> chi_fm # chisq value for recording
> if less than CV move onto to slope modification
>> modelH2_febmarch<-lm(llfeb_march~X3feb_march)
>> modelH2a_febmarch<-lm(llfeb_march~X3feb_march+X4feb_march)
>> anova(modelH2a_febmarch)
>> coefficients(modelH2_febmarch) # get coefficients to make beta vectors
>> for test
> (Intercept) X3feb_march
> 5.342130 1.172821
>> coefficients(modelH2a_febmarch)
> (Intercept) X3feb_march X4feb_march
> 5.2936263 1.0353752 0.2407557
> # Test Stat
>> bsres_fm<-matrix(c(5.342130,1.172821,0),nrow=3)
>> bsunres_fm<-matrix(c(5.2936263,1.0353752,0.2407557),nrow=3)
>> bsfm<-t(bsunres_fm-bsres_fm)
>> #X matrix
>> fmxs<-cbind(fmvect,X3feb_march,X4feb_march)
>> tfmxs<-t(fmxs)
>> xcoms_fm<-(tfmxs %*% fmxs)
>> xinvs_fm<-ginv(xcoms_fm)
>> var_fms<-xinvs_fm*0.341
>> chi_fms<-bsfm %*% var_fms %*% (bsunres_fm-bsres_fm)
>> chi_fms
> # Record Chisq value
>
> Does this help?
> Here lffeb_march is the combination of Feb and March log flows
> and llfeb_march is the combination of Feb and March log loads
> X3: lffeb_march-mean(feb_march)
> X4: X1*X3
>
> Thanks
>
> Rui Barradas wrote
>>
>> Hello,
>>
>> I'm not at all sure if I understand your problem. Does this describe it?
>>
>>
>> test first model for months 1 and 2
>> if test statistic less than critical value{
>> test second model for months 1 and 2
>> print results of the first and second tests? just one of them?
>> }
>> move on to months 2 and 3
>> etc, until months 12 and 1
>>
>>
>> Please post example data using dput(dataset).
>> Just copy it's output and paste it in your post.
>>
>> And example code, what you're already doing.
>> (Possibly simplified)
>>
>> Rui Barradas
>>
>>
>> meredith wrote
>>>
>>> R Users-
>>> I have been trying to automate a manual code that I have developed for
>>> calling in a .csv file, isolating certain rows and columns that
>>> correspond to specified months:
>>> something to the effect
>>> i=name.csv
>>> N=length(i$month)
>>> iphos1=0
>>> iphos2=0
>>> isphos3=0
>>> for i=1,N
>>> if month=1
>>> iphos1=iphos+1
>>> iphos1(iphos1)=i
>>>
>>> an so on to call out the months into there own arrays (unless there is a
>>> way I can wrap it into the next automation)
>>>
>>> Next: I would like to run a simple linear regression combining each of
>>> the months 1 by 1:
>>> for instance I want to run a regression on a combined model from months
>>> 1 and 2 and a dummy model for 1 and 2, compare them using a Chi-sq
>>> distribution, if Chi-sq is less than the Critical value, we accept and
>>> go on to test another set of models with both 1 and 2. If it rejects,
>>> then we proceed to months 2 and 3. If we move on to the second set on
>>> months 1 and 2, and the critical value is accepted, I want to print an
>>> accept or reject and move on to months 2 and 3, until finally comparing
>>> months 12-1 at the end.
>>> Is there a way to loop or automate this in R?
>>>
>>> Thanks
>>> Meredith
>>>
>>
>
> Rui Barradas wrote
>>
>> Hello,
>>
>> I'm not at all sure if I understand your problem. Does this describe it?
>>
>>
>> test first model for months 1 and 2
>> if test statistic less than critical value{
>> test second model for months 1 and 2
>> print results of the first and second tests? just one of them?
>> }
>> move on to months 2 and 3
>> etc, until months 12 and 1
>>
>>
>> Please post example data using dput(dataset).
>> Just copy it's output and paste it in your post.
>>
>> And example code, what you're already doing.
>> (Possibly simplified)
>>
>> Rui Barradas
>>
>>
>> meredith wrote
>>>
>>> R Users-
>>> I have been trying to automate a manual code that I have developed for
>>> calling in a .csv file, isolating certain rows and columns that
>>> correspond to specified months:
>>> something to the effect
>>> i=name.csv
>>> N=length(i$month)
>>> iphos1=0
>>> iphos2=0
>>> isphos3=0
>>> for i=1,N
>>> if month=1
>>> iphos1=iphos+1
>>> iphos1(iphos1)=i
>>>
>>> an so on to call out the months into there own arrays (unless there is a
>>> way I can wrap it into the next automation)
>>>
>>> Next: I would like to run a simple linear regression combining each of
>>> the months 1 by 1:
>>> for instance I want to run a regression on a combined model from months
>>> 1 and 2 and a dummy model for 1 and 2, compare them using a Chi-sq
>>> distribution, if Chi-sq is less than the Critical value, we accept and
>>> go on to test another set of models with both 1 and 2. If it rejects,
>>> then we proceed to months 2 and 3. If we move on to the second set on
>>> months 1 and 2, and the critical value is accepted, I want to print an
>>> accept or reject and move on to months 2 and 3, until finally comparing
>>> months 12-1 at the end.
>>> Is there a way to loop or automate this in R?
>>>
>>> Thanks
>>> Meredith
>>>
>>
>
--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4620696.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list