[R] Automating R for Hypothesis Testing

Rui Barradas ruipbarradas at sapo.pt
Wed May 9 16:55:05 CEST 2012


Hello,

Yes, it does help. Now we can see your data and what you're doing.
What follows is a suggestion on what you could do, not a full solution.
(You forgot to say what X1 is, but I don't think it's important to
understand the suggestion.)
(If I'm wrong, say something.)


milwaukeephos <- read.csv("milwaukeephos.csv", header=TRUE,
stringsAsFactors=FALSE)
# list of data.frames, one per month
ls1 <- split(milwaukeephos, milwaukeephos$month)

#--------- only needed if you want to keep the models
#          (you probably don't)
modelH <- vector("list", 12)
modelHa <- vector("list", 12)
modelH2 <- vector("list", 12)
modelH2a <- vector("list", 12)
#--------- values to record, these are needed, create them beforehand.
chi_fm <- numeric(12)
chi_fms <- numeric(12)
#
seq_months <- c(1:12, 1) # wrap months around.
for(i in 1:12){
	month_this <- seq_months[i]
	month_next <- seq_months[i + 1]

	lload <- c(ls1[[month_this]]$load_kg, ls1[[month_next]]$load_kg)
	lflow <- c(ls1[[month_this]]$flow, ls1[[month_next]]$flow)
	modelH[[i]] <- lm(lload ~ lflow)
	# If you don't want to keep the models, assign to modelH only
	# (without [[i]])
	# and do the same for the model that includes X1

	# rest of your code for first test goes here
	chi_fm[i] <- bfm %*% var_fm %*% (bunres_fm - bres_fm)

	# and the same for the second test
	chi_fms[i] <- ...etc...
}
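
The loop body above can be filled in without typing the coefficients by hand. What follows is only a sketch under stated assumptions, not the original poster's code: pair_stat() is a hypothetical helper, logs are assumed to be taken of flow and load_kg, the dummy X1 is assumed to be 0 for the first month of the pair and 1 for the second, and the hard-coded 0.307 is assumed to be the residual variance of the model with X1. The quadratic form is kept exactly as written in the thread. The data are simulated only so that the example runs.

```r
# Hypothetical helper, not from the thread.
# Assumes positive columns `flow` and `load_kg` in both data frames.
pair_stat <- function(dat_this, dat_next) {
  lload <- log(c(dat_this$load_kg, dat_next$load_kg))
  lflow <- log(c(dat_this$flow, dat_next$flow))
  # Assumed dummy: 0 for the first month of the pair, 1 for the second
  X1 <- rep(c(0, 1), c(nrow(dat_this), nrow(dat_next)))

  modelH  <- lm(lload ~ lflow)       # restricted model
  modelHa <- lm(lload ~ X1 + lflow)  # unrestricted model

  bunres <- coef(modelHa)            # (Intercept), X1, lflow
  bres   <- c(coef(modelH)[1], 0, coef(modelH)[2])
  d      <- matrix(bunres - bres, ncol = 1)

  # vcov() is (X'X)^-1 * sigma^2 for a full-rank design; this stands in
  # for ginv(t(X) %*% X) * 0.307 under the assumption stated above
  var_b <- vcov(modelHa)
  drop(t(d) %*% var_b %*% d)
}

# Simulated data, only so the sketch runs; replace with milwaukeephos
set.seed(1)
demo <- data.frame(month = rep(1:12, each = 10),
                   flow  = exp(rnorm(120, 3, 0.5)))
demo$load_kg <- exp(-2.4 + 1.2 * log(demo$flow) + rnorm(120, 0, 0.5))

ls1 <- split(demo, demo$month)
seq_months <- c(1:12, 1)  # wrap months around
chi_fm <- numeric(12)
for (i in 1:12) {
  chi_fm[i] <- pair_stat(ls1[[seq_months[i]]], ls1[[seq_months[i + 1]]])
}
chi_fm
```

Putting the pair-specific work in one function keeps the loop short, and the same pattern would apply to the second test with X3 and X4 in place of X1 and lflow.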

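The accept/reject flow described in the thread (run the second test only when the first statistic is below the critical value, then move to the next pair of months) could be sketched as follows. The 5% level and 1 degree of freedom are assumptions, since the thread states neither; run_pairs(), demo_first() and demo_second() are hypothetical names, and the statistics they return are made up purely to exercise the control flow.

```r
# Assumed: 5% level, 1 degree of freedom (not stated in the thread)
crit <- qchisq(0.95, df = 1)

# first_test and second_test are functions of the pair index i that
# return the statistic for that pair (hypothetical interface)
run_pairs <- function(first_test, second_test, crit) {
  res <- data.frame(pair     = paste(1:12, c(2:12, 1), sep = "-"),
                    chi_fm   = NA_real_,
                    chi_fms  = NA_real_,
                    decision = NA_character_,
                    stringsAsFactors = FALSE)
  for (i in 1:12) {
    res$chi_fm[i] <- first_test(i)
    if (res$chi_fm[i] < crit) {
      # first test accepted: go on to the second test for the same pair
      res$chi_fms[i]  <- second_test(i)
      res$decision[i] <- if (res$chi_fms[i] < crit) "accept" else "reject"
    } else {
      # first test rejected: record it and move on to the next pair
      res$decision[i] <- "reject"
    }
  }
  res
}

# Made-up statistics, only to exercise the control flow
demo_first  <- function(i) c(0.5, 5.0)[(i %% 2) + 1]
demo_second <- function(i) 1.0

out <- run_pairs(demo_first, demo_second, crit)
print(out)
```

Collecting the results in a data frame rather than printing inside the loop leaves one row per pair of months, including the wrap-around pair 12-1.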

Hope this helps,

Rui Barradas


meredith wrote
> 
> dput:  http://r.789695.n4.nabble.com/file/n4620188/milwaukeephos.csv
> 
> # Feb-march
>> modelH_febmarch<-lm(llfeb_march~lffeb_march)
>> modelHa_febmarch<-lm(llfeb_march~X1feb_mar+lffeb_march)
>> anova(modelHa_febmarch)
>> coefficients(modelH_febmarch)
> (Intercept) lffeb_march 
>   -2.429890    1.172821 
>> coefficients(modelHa_febmarch)
> (Intercept)   X1feb_mar lffeb_march 
>  -2.8957776  -0.5272793   1.3016303 
>> bres_fm<-matrix(c(-2.429890,0,1.172821),nrow=3)
>> bunres_fm<-matrix(c(-2.8957776,-0.5272793,1.3016303),nrow=3)
>> bfm<-t(bunres_fm-bres_fm)
>> fmvect<-seq(1,1,length=34)
>> X1a_febmar<-seq(0,0,length=9) # dummy variable step 1
>> X1b_febmar<-seq(1,1,length=25) # dummy variable step 2
>> X1feb_mar<-c(X1a_febmar,X1b_febmar) #dummy variable creation
> # Test Stat Equation for Chisq
>> fmxx<-cbind(fmvect,X1feb_mar,lffeb_march)
>> tfmx<-t(fmxx)
>> xcom_fm<-(tfmx %*% fmxx)
>> xinv_fm<-ginv(xcom_fm)
>> var_fm<-xinv_fm*0.307
>> chi_fm<-bfm %*% var_fm %*% (bunres_fm-bres_fm)
>> chi_fm # chisq value for recording
> if less than CV, move on to slope modification
>> modelH2_febmarch<-lm(llfeb_march~X3feb_march)
>> modelH2a_febmarch<-lm(llfeb_march~X3feb_march+X4feb_march)
>> anova(modelH2a_febmarch)
>> coefficients(modelH2_febmarch) # get coefficients to make beta vectors for test
> (Intercept) X3feb_march 
>    5.342130    1.172821 
>> coefficients(modelH2a_febmarch)
> (Intercept) X3feb_march X4feb_march 
>   5.2936263   1.0353752   0.2407557 
> # Test Stat
>> bsres_fm<-matrix(c(5.342130,1.172821,0),nrow=3)
>> bsunres_fm<-matrix(c(5.2936263,1.0353752,0.2407557),nrow=3)
>> bsfm<-t(bsunres_fm-bsres_fm)
>> #X matrix
>> fmxs<-cbind(fmvect,X3feb_march,X4feb_march)
>> tfmxs<-t(fmxs)
>> xcoms_fm<-(tfmxs %*% fmxs)
>> xinvs_fm<-ginv(xcoms_fm)
>> var_fms<-xinvs_fm*0.341
>> chi_fms<-bsfm %*% var_fms %*% (bsunres_fm-bsres_fm)
>> chi_fms
> # Record Chisq value
> 
> Does this help?
> Here lffeb_march is the combination of Feb and March log flows
> and llfeb_march is the combination of Feb and March log loads
> X3: lffeb_march-mean(feb_march)
> X4: X1*X3
> 
> Thanks
> 
> Rui Barradas wrote
>> 
>> Hello,
>> 
>> I'm not at all sure if I understand your problem. Does this describe it?
>> 
>> 
>> test first model for months 1 and 2
>> if test statistic less than critical value{
>> 	test second model for months 1 and 2
>> 	print results of the first and second tests? just one of them?
>> }
>> move on to months 2 and 3
>> etc, until months 12 and 1
>> 
>> 
>> Please post example data using dput(dataset).
>> Just copy its output and paste it in your post.
>> 
>> And example code, what you're already doing.
>> (Possibly simplified)
>> 
>> Rui Barradas
>> 
>> 
>> meredith wrote
>>> 
>>> R Users-
>>>   I have been trying to automate a manual code that I have developed for
>>> calling in a .csv file, isolating certain rows and columns that
>>> correspond to specified months:
>>> something to the effect
>>> i=name.csv
>>> N=length(i$month)
>>> iphos1=0
>>> iphos2=0
>>> isphos3=0
>>> for i=1,N
>>>  if month=1
>>>     iphos1=iphos+1
>>>     iphos1(iphos1)=i
>>> 
>>> and so on to call out the months into their own arrays (unless there is a
>>> way I can wrap it into the next automation)
>>> 
>>> Next: I would like to run a simple linear regression combining each of
>>> the months 1 by 1:
>>> for instance I want to run a regression on a combined model from months
>>> 1 and 2 and a dummy model for 1 and 2, compare them using a Chi-sq
>>> distribution, if Chi-sq is less than the Critical value, we accept and
>>> go on to test another set of models with both 1 and 2. If it rejects,
>>> then we proceed to months 2 and 3.  If we move on to the second set on
>>> months 1 and 2, and the critical value is accepted, I want to print an
>>> accept or reject and move on to months 2 and 3, until finally comparing
>>> months 12-1 at the end.
>>> Is there a way to loop or automate this in R?
>>> 
>>> Thanks
>>> Meredith
>>> 
>> 
> 


--
View this message in context: http://r.789695.n4.nabble.com/Automating-R-for-Hypothesis-Testing-tp4618653p4620696.html
Sent from the R help mailing list archive at Nabble.com.


