[R] Remove data 3 standard deviations from the mean using R?
Berend Hasselman
bhh at xs4all.nl
Tue Apr 9 15:25:33 CEST 2013
On 09-04-2013, at 13:12, Lorna <lornam at essex.ac.uk> wrote:
> Hi Everyone,
>
> I have a very long list of data points (2300+) and I know from my histogram
> that there are outliers which are affecting my mean.
>
> I was wondering if anyone on here knows a way I can quickly get R to
> calculate and remove data which are more than 3 standard deviations from the
> mean? I am hoping this will tidy my data and give me a repeatable method of
> tidying for future data collection.
>
> If you do post code, please make it as user friendly as possible! I am not a
> very good programmer. I can load my data into R and do basic stats on it,
> however I haven't tried much else....
# some test data, plus its mean and standard deviation
testdata <- rnorm(100, 0, 5)
mean.td <- mean(testdata)
sd.td <- sd(testdata)
# threshold in standard deviations (set to 3.0 for your specific situation)
alpha <- 1.5
# logical index of the items that fall within the bounds
pidx <- (testdata < mean.td + alpha*sd.td) & (testdata > mean.td - alpha*sd.td)
# keep only those items
testdata[pidx]
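
Since a repeatable method was asked for, the same selection could be wrapped in a small function and reused on future data. This is only a sketch: the name remove_outliers, its argument k and the handling of missing values are assumptions made for illustration, not part of the original question.

```r
# Hypothetical helper (name and arguments are illustrative):
# keep only the values that lie strictly within k standard deviations of the mean.
remove_outliers <- function(x, k = 3) {
  m <- mean(x, na.rm = TRUE)              # mean, ignoring missing values
  s <- sd(x, na.rm = TRUE)                # standard deviation, ignoring missing values
  keep <- !is.na(x) & abs(x - m) < k * s  # TRUE for values inside the bounds
  x[keep]
}

# Example: 100 ordinary values plus one obvious outlier
set.seed(1)
mydata <- c(rnorm(100, 0, 5), 100)
cleaned <- remove_outliers(mydata)
length(mydata) - length(cleaned)          # number of values dropped
```

Note that the mean and standard deviation are computed from the data including the outliers, so very extreme values widen the bounds; running the function a second time on the cleaned data may remove further points.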
Berend