[R] grubbs test to detect all outliers
AbouEl-Makarim Aboueissa
@boue|m@k@r|m1962 @end|ng |rom gm@||@com
Fri Apr 28 15:09:50 CEST 2023
*R:* Grubbs test to detect all outliers per group for all columns in a
data frame
Dear all, good morning.
I have an example dataset with two factor columns (factor1 and factor2)
and five numeric columns (X, Y, Z, U, V). X and Y have the same length as
factor1; Z, U, and V have the same length as factor2. The dataset is
copied below. Please note that all columns contain NA values.
*Need help on this:*
Can the grubbs.test() function be used to detect all outliers and replace
them with NA, per group of factor1 for X and Y, and per group of factor2
for Z, U, and V? The columns in the data frame have different lengths, but
when I read the .csv file, R padded the shorter columns with NA values.
If you need the .csv data file, please let me know.
Thank you very much in advance for your help.
# install.packages("outliers")   # run once if the package is not installed
library(outliers)
datafortest <- read.csv("G:/data_for_test.csv", header = TRUE)
datafortest
# read.csv() already returns a data frame, so no data.frame() call is needed
datafortest$factor1 <- as.factor(datafortest$factor1)
datafortest$factor2 <- as.factor(datafortest$factor2)
str(datafortest)
##### I tried grubbs.test() on a single column of the data frame, but it
##### still does not work:
tests.for.outliers.X<- grubbs.test(datafortest$X, na.rm = TRUE, type=11)
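
A possible reason the call above fails: as far as the documented signature
goes, grubbs.test(x, type = 10, opposite = FALSE, two.sided = FALSE) has no
na.rm argument, so passing na.rm = TRUE would raise an "unused argument"
error. A minimal sketch, assuming that reading of the signature, drops the
NA values explicitly before testing. The vector x holds the first 14 X
values from the data posted in this message.

```r
library(outliers)

# First 14 X values from the posted data (factor1 groups 1 and 2).
x <- c(4455.077, 4348.031, 9999.789, 3813.139, 7512.65, 5642.667, 6684.386,
       5165.731, NA, 3259.241, 3288.383, 1997.878, 99990.61, 2655.977)

# Drop NAs explicitly instead of passing na.rm (assumed not to be accepted).
res <- grubbs.test(x[!is.na(x)], type = 10)

res$alternative  # names the value the test treats as the candidate outlier
res$p.value      # a small p-value suggests that value is an outlier
```

The result is an ordinary "htest" object, so the p-value can be compared
against whatever significance level is appropriate for the data.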
####################################
*grubbs.test() on a single vector works, but with type = 11 it only tests
whether the minimum and the maximum are outliers:*
xx999<-c(0.088,1,2,3,4,5,6,7,8,9,88,98,99)
grubbs.test(xx999, type=11)
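
For context, the type argument controls what a single call tests: type = 10
tests the one most extreme value, type = 11 tests the minimum and maximum
together, and type = 20 tests two outliers in the same tail. The sketch
below runs the first two on xx999 for comparison. A single call never
reports more than the one or two values named in its alternative, and
several large values close together (here 88, 98, and 99) can inflate the
standard deviation and mask one another.

```r
library(outliers)

xx999 <- c(0.088, 1, 2, 3, 4, 5, 6, 7, 8, 9, 88, 98, 99)

# type = 10: is the single most extreme value an outlier?
one_outlier <- grubbs.test(xx999, type = 10)

# type = 11: are the minimum and the maximum both outliers?
two_opposite <- grubbs.test(xx999, type = 11)

# Each result names the value(s) tested and gives a p-value.
one_outlier$alternative
one_outlier$p.value
two_opposite$alternative
two_opposite$p.value
```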
With many thanks
Abou
factor1          X      Y  factor2      Z     U       V
      1   4455.077    888        1    999    NA     999
      1   4348.031    333        1    475    NA     240
      1   9999.789    618        1    507   252     394
      1   3813.139    417        1    603   332     265
      1    7512.65    344        1    442   216      NA
      1   5642.667     NA        1    486   217     275
      1   6684.386    341        1    927   698     479
      2   5165.731    999        1    971   311     562
      2         NA    265        1    388   999     512
      2   3259.241    557        2    888   444     777
      2   3288.383    234        2    514    NA     322
      2   1997.878    383        2    409   311      NA
      2   99990.61     NA        2    546   327     728
      2   2655.977     NA        2    523   228     653
      3    3189.49   7777        2    313   456     450
      3   1826.851    287        2    296   412     576
      3   4386.002    352        2    320   251      NA
      3   3295.091    308        2    388   888   396.5
      3   2120.902    526        3   9999   398     888
      3         NA    489        3    677   438     307
      3   2056.123    291        3    555   428     219
      3   1995.088    444        3     NA   319      NA
      3         NA    349        3    479    NA     321
      3   2539.873    333        3    257   406     417
                                 3    313   334     409
                                 3    296   465     546
                                 3    320   180     523
                                 3    388   999     313
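
One common heuristic for flagging more than one outlier is to repeat the
one-outlier test (type = 10): while the test is significant, set the most
extreme value to NA and test again, separately within each group. The
sketch below is a rough illustration under stated assumptions, not a
definitive method. grubbs_to_na() is a hypothetical helper (it is not part
of the outliers package), and alpha = 0.05 and the minimum group size of 3
are arbitrary choices. Repeated testing also inflates the false-positive
rate and can still miss outliers that mask each other. The data are the
factor1, X, and Y columns posted above; Z, U, and V could be handled the
same way with factor2 as the grouping variable.

```r
library(outliers)

# Hypothetical helper: repeatedly apply the one-outlier Grubbs test and
# replace the most extreme value with NA while the test is significant.
grubbs_to_na <- function(x, alpha = 0.05, min_n = 3) {
  repeat {
    ok <- !is.na(x)
    # Stop when too few values remain or all remaining values are equal.
    if (sum(ok) < min_n || sd(x[ok]) == 0) break
    p <- grubbs.test(x[ok], type = 10)$p.value
    if (is.na(p) || p >= alpha) break
    # The value farthest from the mean is the one the test examined.
    x[which.max(abs(x - mean(x[ok])))] <- NA
  }
  x
}

# factor1, X, and Y as posted (24 rows).
d <- data.frame(
  factor1 = rep(1:3, times = c(7, 7, 10)),
  X = c(4455.077, 4348.031, 9999.789, 3813.139, 7512.65, 5642.667,
        6684.386, 5165.731, NA, 3259.241, 3288.383, 1997.878, 99990.61,
        2655.977, 3189.49, 1826.851, 4386.002, 3295.091, 2120.902, NA,
        2056.123, 1995.088, NA, 2539.873),
  Y = c(888, 333, 618, 417, 344, NA, 341, 999, 265, 557, 234, 383, NA, NA,
        7777, 287, 352, 308, 526, 489, 291, 444, 349, 333)
)

# ave() applies the helper within each factor1 group and keeps row order.
cleaned <- d
cleaned[c("X", "Y")] <- lapply(d[c("X", "Y")], function(col) {
  ave(col, d$factor1, FUN = grubbs_to_na)
})

# Rows where a value was newly replaced by NA.
d[is.na(cleaned$X) & !is.na(d$X), c("factor1", "X")]
d[is.na(cleaned$Y) & !is.na(d$Y), c("factor1", "Y")]
```

Using ave() keeps each column the same length as the original, so the
flagged values become NA in place and the remaining rows stay aligned.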
______________________
*AbouEl-Makarim Aboueissa, PhD*
*Professor, Mathematics and Statistics*
*Graduate Coordinator*
*Department of Mathematics and Statistics*
*University of Southern Maine*