[R] (no subject)

Richard A. O'Keefe ok at cs.otago.ac.nz
Fri Nov 12 03:04:02 CET 2004


	On 11-Nov-04 Wei Yang wrote:
	> Hi, 
	> 
	> I have a list of numbers. For each of the numbers, I take
	> the sum of squares of the numbers centered on the number chosen.
	> If it is less than a certain constant, I will take the
	> average of the numbers chosen.

Assuming I've understood this correctly, one approach is

    mean(v[k > sapply(v, function (x) sum((v-x)^2))])

where v is the vector of numbers
  and k is the "certain constant".
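For concreteness, here is how that one-liner behaves on a small made-up
vector (both the data and the constant k below are purely illustrative):

    v <- c(2.1, 1.9, 2.4, 8.0, 2.2)    # made-up data
    k <- 40                            # made-up "certain constant"
    sapply(v, function (x) sum((v-x)^2))
    ## the outlier 8.0 gives a large sum of squares (about 137),
    ## so only the four values near 2 are kept and averaged:
    mean(v[k > sapply(v, function (x) sum((v-x)^2))])    # 2.15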
However, this formulation takes O(length(v)^2) time, so it is not a
particularly efficient way to do it for long vectors; a linear-time
alternative is sketched below.
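If that matters, the quadratic cost can be avoided by expanding the
square, since sum((v-x)^2) = sum(v^2) - 2*x*sum(v) + length(v)*x^2.
A sketch, assuming v and k as in the example above:

    ## O(length(v)) version of the same selection: the inner sum is
    ## replaced by quantities computed once for the whole vector.
    ss <- sum(v^2) - 2 * v * sum(v) + length(v) * v^2
    mean(v[k > ss])    # same result as the sapply() form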

What is far more interesting to me is WHY this calculation is to be done.
If you think about it, sum((v-x)^2) is a quadratic in x that is smallest
when x is at mean(v), so the elements that pass the test are the ones
closest to the mean; if v is sorted, the "k > sapply(...)" part will be
FALSE... TRUE... FALSE...
so this is an arithmetic mean of a "central" subset of the values.  Why not
just use an ordinary trimmed mean (see ?mean for the trim= argument; a small
example follows at the end of this message)?  Or an M-estimator, if some
other robust estimate of location is wanted?
I ask this in all seriousness, because in the few quick experiments I tried,
this estimator was _further_ from the population mean than the classical mean.
Is that the point of it?
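
For comparison, the trimmed mean mentioned above looks like this (the
data here are simulated, purely to make the contrast visible):

    ## Hypothetical data: mostly near 0, with two gross outliers.
    set.seed(1)
    v <- c(rnorm(20), 10, -12)

    mean(v)              # classical mean, pulled around by the outliers
    mean(v, trim = 0.1)  # drops the lowest and highest 10% before averaging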



