[R] Simplify formula for heterogeneity
Stefaan Lhermitte
stefaan.lhermitte at biw.kuleuven.be
Thu May 26 17:06:22 CEST 2005
Dear R-ians,
I'm looking for a computationally simplified formula to calculate a
measure of heterogeneity (let's say H):
H = sqrt [ (Si (Sj (Xi - Xj)² ) ) /n ]
where:
sqrt = square root
Si = summation over i (= 1 to n)
Sj = summation over j (= 1 to n)
Xi = element of X with index i
Xj = element of X with index j
I can simplify the formula to:
H = sqrt [ ( 2 * n * Si (Xi²) - 2 * Si (Sj (Xi * Xj)) ) / n ]
Unfortunately this formula remains difficult to use in iterative
programming, because I have to keep every element of X to calculate H.
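
As a numerical check, the sketch below compares the definition of H with the expanded form, in which the term under the square root is ( 2 * n * Si(Xi²) - 2 * Si(Sj(Xi * Xj)) ) / n. The function names h_direct and h_expanded and the test vector are made up for illustration; both versions still build an n-by-n matrix from all elements of X.

```r
# Hypothetical helper names; X is any numeric vector.

# H straight from the definition: all pairwise differences Xi - Xj.
h_direct <- function(X) {
  n <- length(X)
  d <- outer(X, X, "-")          # n-by-n matrix of Xi - Xj
  sqrt(sum(d^2) / n)
}

# H from the expanded form; the cross term still needs every element.
h_expanded <- function(X) {
  n <- length(X)
  cross <- sum(outer(X, X))      # Si(Sj(Xi * Xj)); outer() multiplies by default
  sqrt((2 * n * sum(X^2) - 2 * cross) / n)
}

X <- c(2, 4, 4, 5, 7, 9)         # arbitrary example data
all.equal(h_direct(X), h_expanded(X))
```
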
I know a computationally simplified formula exists for the standard
deviation (sd) that is much easier to use in iterative programming.
Therefore I wondered if anybody knew of an analogous simplification
for H:
sd = sqrt [ ( Si( (Xi - mean(X))² ) ) / n ] -> simplified computation ->
sqrt [ ( n * Si( Xi² ) - ( Si( Xi ) )² ) / n² ]
This simplified formula is much easier to use in iterative programming,
since I don't have to keep every element of X.
E.g.: I have a vector X[1:10] and I have already calculated
Si( X[1:10]² ) (I will call this A) and Si( X[1:10] ) (I will call this B).
When X gets extended by 1 element (e.g. X[11]), it is fairly simple to
calculate sd(X[1:11]) without having to reuse the elements of X[1:10].
I just have to calculate (with n = 11):
sd = sqrt [ (n * (A + X[11]²) - (B + X[11])² ) / n² ]
This is fairly easy in an iterative process, since before we continue
with the next step we set:
A = (A + X[11]²)
B = (B + X[11])
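
The running update for sd described above can be sketched in R as follows. The helper update_sd, the state list and the example values are hypothetical names and data chosen for illustration. Note that this uses the 1/n definition from the formula above, whereas R's built-in sd() divides by n - 1, so the comparison is made against the 1/n form. The difference n * A - B² can also lose precision when the mean is large relative to the spread.

```r
# Hypothetical helper: folds one new value into the running sums.
# state is a list with n (count), A (sum of squares) and B (sum).
update_sd <- function(state, x_new) {
  n <- state$n + 1
  A <- state$A + x_new^2
  B <- state$B + x_new
  list(n = n, A = A, B = B, sd = sqrt((n * A - B^2) / n^2))
}

X <- c(2, 4, 4, 4, 5, 5, 7, 9, 1, 3)     # plays the role of X[1:10]
state <- list(n = length(X), A = sum(X^2), B = sum(X))

state <- update_sd(state, 6)             # X[11] = 6 arrives; X[1:10] is not reused

# Cross-check against the 1/n definition computed from all eleven elements
X11 <- c(X, 6)
all.equal(state$sd, sqrt(mean((X11 - mean(X11))^2)))
```
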
Can anybody help me to do something comparable for H? Any other help to
calculate H easily in an iterative process is also welcome!
Thanx in advance!
Kind regards,
Stef