[R] remove outlier
David Winsemius
dwinsemius at comcast.net
Fri Jan 2 18:09:33 CET 2015
On Jan 2, 2015, at 4:58 AM, Methekar, Pushpa (GE Transportation, Non-GE) wrote:
> Hi,
> I am working on a function.
>
> rm.outliers = function(dataset,model){
> dataset$predicted = predict(model)
> dataset$stdres = rstudent(model)
> m = 1
> for(i in 1:length(dataset$stdres)){
> dataset$outlier_counter[i] = if(dataset$stdres[i] >= 3 |
> dataset$stdres[i] <= -3) {m}
> else{0}
> }
> j = length(which(dataset$outlier_counter >= 1))
> while(j>=1){
> print(dataset[which(dataset$outlier_counter >= 1),])
> dataset = dataset[which(dataset$outlier_counter == 0),]
> dataset$predicted = predict(model)
> dataset$stdres = rstudent(model)
> m = m+1
> for(k in 1:length(dataset$stdres)){
> dataset$outlier_counter[k] = if(dataset$stdres[k] >= 3 |
> dataset$stdres[k] <= -3) {m} else{0}
> }
> j = length(which(dataset$outlier_counter >= 1))
> }
> return(dataset)
> }
> When I pass
> rm.outliers(xsys,fitted.modely1.temp.l)
> fitted.modely1.temp.l is my linear model.
> It shows me an error like
>
> Error in `$<-.data.frame`(`*tmp*`, "predicted", value = c(0.306726561735386, :
>
> replacement has 731 rows, data has 717
When you get a mismatch of "data" and replacement lengths like that, it suggests you have NA values in some of the model variables. If that's the case, then the absence of an example means we are not able to demonstrate that effect, but you should be able to make a more modest example and test that hypothesis.
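A minimal sketch of such a test, on made-up data rather than your xsys (the names toy, fit and fit2 are hypothetical): it compares the number of rows in the data frame with the number of values predict() returns, and shows what happens when a row with an NA is silently dropped by lm().

```r
# Toy data (hypothetical, not the poster's xsys): one NA in a predictor.
set.seed(1)
toy <- data.frame(x = c(1:9, NA), y = rnorm(10))

# With the default na.action (na.omit), lm() drops the row containing the NA.
fit <- lm(y ~ x, data = toy)

nrow(toy)                  # rows in the data frame
length(predict(fit))       # fitted values, one per row actually used in the fit
sum(!complete.cases(toy))  # rows with at least one NA

# Assigning the shorter vector back into the data frame fails with a
# "replacement has ... rows, data has ..." error; try() captures it here.
res <- try(toy$predicted <- predict(fit), silent = TRUE)
inherits(res, "try-error")

# With na.exclude, predict() and rstudent() are padded with NA so that
# their length matches the original number of rows.
fit2 <- lm(y ~ x, data = toy, na.action = na.exclude)
toy$predicted <- predict(fit2)
toy$stdres <- rstudent(fit2)
```

Running the same nrow() versus length(predict()) comparison on your own data frame and model object will show which of the two is longer, and by how much.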
>
> Called from: `$<-`(`*tmp*`, "predicted", value = c(0.306726561735386, 0.306726561)
>
>
>
> Help me out
>
>
> Xsys data has 331 rows and 18 columns
>
> [[alternative HTML version deleted]]
>
This is a plain text mailing list. Please read the Posting Guide more thoroughly.
--
David.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA