[R] Automating R for Hypothesis Testing
meredith
mmballar at mtu.edu
Thu May 10 16:40:36 CEST 2012
Rui-
Thanks this definitely helps, just one quick question. How would you code
the values of chi-fm and chi-fms to change based on the degrees of freedom
of each model H(i)?
Meredith
Rui Barradas wrote
>
> Hello,
>
> Yes, it does help. Now we can see your data and what you're doing.
> What follows is a suggestion on what you could do, not full solution.
> (You forgot to say what X1 is, but I don't think it's important to
> understand the suggestion.)
> (If I'm wrong, say something.)
>
>
> milwaukeephos <- read.csv("milwaukeephos.csv", header=TRUE,
> stringsAsFactors=FALSE)
> # list of data.frames, one per month
> ls1 <- split(milwaukeephos, milwaukeephos$month)
>
> #--------- if you want to keep the models, not needed if you don't.
> # (yoy probably don't)
> modelH <- vector("list", 12)
> modelHa <- vector("list", 12)
> modelH2 <- vector("list", 12)
> modelH2a <- vector("list", 12)
> #--------- values to record, these are needed, create them beforehand.
> chi_fm <- numeric(12)
> chi_fms <- numeric(12)
> #
> seq_months <- c(1:12, 1) # wrap months around.
> for(i in 1:12){
> month_this <- seq_months[i]
> month_next <- seq_months[i + 1]
>
> lload <- c(ls1[[month_this]]$load_kg, ls1[[month_next]]$load_kg)
> lflow <- c(ls1[[month_this]]$flow, ls1[[month_next]]$flow)
> modelH[[i]] <- lm(lload ~ lflow)
> # If you don't want to keep the models, use modelH only
> # ( without [[i]] )
> # and do the same with X1
>
> # rest of your code for first test goes here
> chi_fm[i] <- bfm %*% var_fm %*% (bunres_fm - bres_fm)
>
> # and the same for the second test
> chi_fms[i] <- ...etc...
> }
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> meredith wrote
>>
>> dput: http://r.789695.n4.nabble.com/file/n4620188/milwaukeephos.csv
>> milwaukeephos.csv
>>
>> # Feb-march
>>> modelH_febmarch<-lm(llfeb_march~lffeb_march)
>>>modelHa_febmarch<-lm(llfeb_march~X1feb_mar+lffeb_march)
>>> anova(modelHa_febmarch)
>>> coefficients(modelH_febmarch)
>> (Intercept) lffeb_march
>> -2.429890 1.172821
>>> coefficients(modelHa_febmarch)
>> (Intercept) X1feb_mar lffeb_march
>> -2.8957776 -0.5272793 1.3016303
>>> bres_fm<-matrix(c(-2.429890,0,1.172821),nrow=3)
>>> bunres_fm<-matrix(c(-2.8957776,-0.5272793,1.3016303),nrow=3)
>>>bfm<-t(bunres_fm-bres_fm)
>>> fmvect<-seq(1,1,length=34)
>>> X1a_febmar<-seq(0,0,length=9) # dummy variable step 1
>>> X1b_febmar<-seq(1,1,length=25) # dummy variable step 2
>>> X1feb_mar<-c(X1a_febmar,X1b_febmar) #dummy variable creation
>> # Test Stat Equation for Chisq
>>> fmxx<-cbind(fmvect,X1feb_mar,lffeb_march)
>>> tfmx<-t(fmxx)
>>> xcom_fm<-(tfmx %*% fmxx)
>>> xinv_fm<-ginv(xcom_fm)
>>> var_fm<-xinv_fm*0.307
>>> chi_fm<-bfm %*% var_fm %*% (bunres_fm-bres_fm)
>>> chi_fm # chisq value for recording
>> if less than CV move onto to slope modification
>>> modelH2_febmarch<-lm(llfeb_march~X3feb_march)
>>> modelH2a_febmarch<-lm(llfeb_march~X3feb_march+X4feb_march)
>>> anova(modelH2a_febmarch)
>>> coefficients(modelH2_febmarch) # get coefficients to make beta vectors
>>> for test
>> (Intercept) X3feb_march
>> 5.342130 1.172821
>>> coefficients(modelH2a_febmarch)
>> (Intercept) X3feb_march X4feb_march
>> 5.2936263 1.0353752 0.2407557
>> # Test Stat
>>> bsres_fm<-matrix(c(5.342130,1.172821,0),nrow=3)
>>> bsunres_fm<-matrix(c(5.2936263,1.0353752,0.2407557),nrow=3)
>>> bsfm<-t(bsunres_fm-bsres_fm)
>>> #X matrix
>>> fmxs<-cbind(fmvect,X3feb_march,X4feb_march)
>>> tfmxs<-t(fmxs)
>>> xcoms_fm<-(tfmxs %*% fmxs)
>>> xinvs_fm<-ginv(xcoms_fm)
>>> var_fms<-xinvs_fm*0.341
>>> chi_fms<-bsfm %*% var_fms %*% (bsunres_fm-bsres_fm)
>>> chi_fms
>> # Record Chisq value
>>
>> Does this help?
>> Here lffeb_march is the combination of Feb and March log flows
>> and llfeb_march is the combination of Feb and March log loads
>> X3: lffeb_march-mean(feb_march)
>> X4: X1*X3
>>
>> Thanks
>>
>> Rui Barradas wrote
>>>
>>> Hello,
>>>
>>> I'm not at all sure if I understand your problem. Does this describe it?
>>>
>>>
>>> test first model for months 1 and 2
>>> if test statistic less than critical value{
>>> test second model for months 1 and 2
>>> print results of the first and second tests? just one of them?
>>> }
>>> move on to months 2 and 3
>>> etc, until months 12 and 1
>>>
>>>
>>> Please post example data using dput(dataset).
>>> Just copy it's output and paste it in your post.
>>>
>>> And example code, what you're already doing.
>>> (Possibly simplified)
>>>
>>> Rui Barradas
>>>
>>>
>>> meredith wrote
>>>>
>>>> R Users-
>>>> I have been trying to automate a manual code that I have developed
>>>> for calling in a .csv file, isolating certain rows and columns that
>>>> correspond to specified months:
>>>> something to the effect
>>>> i=name.csv
>>>> N=length(i$month)
>>>> iphos1=0
>>>> iphos2=0
>>>> isphos3=0
>>>> for i=1,N
>>>> if month=1
>>>> iphos1=iphos+1
>>>> iphos1(iphos1)=i
>>>>
>>>> an so on to call out the months into there own arrays (unless there is
>>>> a way I can wrap it into the next automation)
>>>>
>>>> Next: I would like to run a simple linear regression combining each of
>>>> the months 1 by 1:
>>>> for instance I want to run a regression on a combined model from months
>>>> 1 and 2 and a dummy model for 1 and 2, compare them using a Chi-sq
>>>> distribution, if Chi-sq is less than the Critical value, we accept and
>>>> go on to test another set of models with both 1 and 2. If it rejects,
>>>> then we proceed to months 2 and 3. If we move on to the second set on
>>>> months 1 and 2, and the critical value is accepted, I want to print an
>>>> accept or reject and move on to months 2 and 3, until finally comparing
>>>> months 12-1 at the end.
>>>> Is there a way to loop or automate this in R?
>>>>
>>>> Thanks
>>>> Meredith
>>>>
>>>
>>
>> Rui Barradas wrote
>>>
>>> Hello,
>>>
>>> I'm not at all sure if I understand your problem. Does this describe it?
>>>
>>>
>>> test first model for months 1 and 2
>>> if test statistic less than critical value{
>>> test second model for months 1 and 2
>>> print results of the first and second tests? just one of them?
>>> }
>>> move on to months 2 and 3
>>> etc, until months 12 and 1
>>>
>>>
>>> Please post example data using dput(dataset).
>>> Just copy it's output and paste it in your post.
>>>
>>> And example code, what you're already doing.
>>> (Possibly simplified)
>>>
>>> Rui Barradas
>>>
>>>
>>> meredith wrote
>>>>
>>>> R Users-
>>>> I have been trying to automate a manual code that I have developed
>>>> for calling in a .csv file, isolating certain rows and columns that
>>>> correspond to specified months:
>>>> something to the effect
>>>> i=name.csv
>>>> N=length(i$month)
>>>> iphos1=0
>>>> iphos2=0
>>>> isphos3=0
>>>> for i=1,N
>>>> if month=1
>>>> iphos1=iphos+1
>>>> iphos1(iphos1)=i
>>>>
>>>> an so on to call out the months into there own arrays (unless there is
>>>> a way I can wrap it into the next automation)
>>>>
>>>> Next: I would like to run a simple linear regression combining each of
>>>> the months 1 by 1:
>>>> for instance I want to run a regression on a combined model from months
>>>> 1 and 2 and a dummy model for 1 and 2, compare them using a Chi-sq
>>>> distribution, if Chi-sq is less than the Critical value, we accept and
>>>> go on to test another set of models with both 1 and 2. If it rejects,
>>>> then we proceed to months 2 and 3. If we move on to the second set on
>>>> months 1 and 2, and the critical value is accepted, I want to print an
>>>> accept or reject and move on to months 2 and 3, until finally comparing
>>>> months 12-1 at the end.
>>>> Is there a way to loop or automate this in R?
>>>>
>>>> Thanks
>>>> Meredith
>>>>
>>>
>>
>
--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4623689.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list