[R] Theil test help
David Winsemius
dwinsemius at comcast.net
Fri Jul 17 00:49:59 CEST 2009
On Jul 16, 2009, at 5:53 PM, KARSTEN G HOLMQUIST wrote:
> Hello,
>
> I have a series of questions that I hope will be simple to answer.
> Basically I would like a code to do the following so that I can
> compute the distribution free test for the slope of a postulated
> regression line (Theil test). As I am testing the null hypothesis
> that slope = 0 against the general alternative the slope does not
> equal 0, it should be pretty straight-forward.
>
> I have a data frame as follows;
>
>
> index Score Mean_Temp SD_Mean_Temp
> i 10 3.7 0.65
> i+1 12 7.5 2.11
> i+2 6 10.3 0.85
> ... ... ... ...
> n 8 12.6 4.2
>
> The first question is how can I compute the n(n-1)/2 differences
> between Score i and Score j where i is from 1 to (n-1) and j is from
> (i+1) to n? I would then like R to spit out the differences in a
> file something like the following;
>
> i j Difference
> 1 2 2
> 1 3 -4
> ... .. ...
> (n-1) n D(n-1,n)
Without an example, I am going to suggest something:
dfg<-data.frame(index=1:10, Score=sample(1:20,
10),Mean_Temp=10*abs(rnorm(10)), SD_Mean_Temp=abs(rnorm(10)) )
>
The trick is to get the differences and here is one method:
mapply(FUN="-", dfg$Mean_Temp[combn(dfg$index,2)[1,] ], dfg
$Mean_Temp[combn(dfg$index,2)[2,] ])
Should be pretty obvious how to line up the indices since they are
just the arguments to "[".
> The second question is for each indexed sample from i to n, I would
> like to use something like rnorm(n, mean = 0, sd = 1) so that I
> could generate 1000 random draws from the distribution specified so
> that the arguments for rnorm () are as follows; mean = the value of
> Mean_Temp & sd = SD_Mean_Temp for indexed values i to n. I would,
> of course, then like R to spit out the massive table with 1000
> columns of randomly generated temperature values for each index i to
> n.
You can get 1000 such values with rnorm(1000, Mean_Temp, SD_Mean_Temp)
Perhaps something along these lines:
Mtx <- matrix( , ncol=1000, nrow=nrow(dfg))
Mtx[1:nrow(dfg), ] <- apply(dfg, 1, function(x) rnorm(1000, x[3],
x[4]) )
> Finally I would like to compute the slope estimator associated with
> the Theil statistic. All I need to do is compute the n(n-1)/n
^^^^^^^^^^==n, is that what you meant?
Or is it n(n-1)/2?
> individual slope values; Sij = (Yj -Yi)/(Xj-Xi) where the Ys are the
> Scores for samples i to n and the Xs are the set of randomly
> generated temperature values, as before, 1 less than or equal to i
> less than j less than or equal to n. The median is then the
> estimator of Beta (the slope). I would like to compute the slopes
> for all 1000 sets of randomly generated temperature values.
There is a package, mblm, that computes the Theil estimator. I don't
see why we should reinvent that wheel.
(Found via tsearching on theil statistic)
http://search.r-project.org/cgi-bin/namazu.cgi?query=theil+statistic&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp08&idxname=views
...and following what seemed like promising links
library(mblm)
--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
More information about the R-help
mailing list