[R] Thank you your help.

arun smartpink111 at yahoo.com
Mon Jan 28 15:48:51 CET 2013



Hi,
temp3<- read.table(text="
ID CTIME WEIGHT
HM001 1223 24.0
HM001 1224 25.2
HM001 1225 23.1
HM001 1226 NA
HM001 1227 32.1
HM001 1228 32.4
HM001 1229 1323.2
HM001 1230 27.4
HM001 1231 22.4236 #changed here to test the previous solution
",sep="",header=TRUE,stringsAsFactors=FALSE)
 tempnew<- na.omit(temp3)


 grep("\\d{4}",temp3$WEIGHT) 
#[1] 7 9 #not correct
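Row 9 is picked up because the unanchored pattern finds four consecutive digits anywhere in the value, including after the decimal point. A small sketch on two made-up strings (the names w, unanchored and anchored are only for illustration) shows the difference when the pattern is anchored at the start:

```r
# Illustrative values only: one misread weight and one normal weight
# that happens to have four decimal places.
w <- c("1323.2", "22.4236")

# Unanchored: four digits in a row anywhere in the string.
unanchored <- grepl("\\d{4}", w)

# Anchored: four or more digits at the start, i.e. before the decimal point.
anchored <- grepl("^\\d{4,}", w)

print(unanchored)
print(anchored)
```

Only the anchored pattern separates the misread value from the normal one.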


temp3[,3][grep("^\\d{4,}",temp3$WEIGHT)]<-NA #match numbers with 4 or more digits before the decimal point
tail(temp3)
#     ID CTIME  WEIGHT
#4 HM001  1226      NA
#5 HM001  1227 32.1000
#6 HM001  1228 32.4000
#7 HM001  1229      NA
#8 HM001  1230 27.4000
#9 HM001  1231 22.4236

Based on the variance range you gave, you could set an upper limit, for example 50, and use:
tempnew$WEIGHT<- ifelse(tempnew$WEIGHT>50,NA,tempnew$WEIGHT)
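If values below the accepted range should be dropped as well, the same idea extends to two limits. A minimal sketch, assuming the accepted range is 10 to 50 as mentioned in the thread; lower, upper and the small vector w are names made up for this example:

```r
# Hypothetical limits taken from the range mentioned in the thread.
lower <- 10
upper <- 50

# Small made-up vector: normal values, a missing value and a misread value.
w <- c(24.0, 25.2, NA, 1323.2, 27.4)

# Flag values outside [lower, upper]. Comparisons with NA give NA,
# so which() is used to keep only the positions that are definitely TRUE.
out_of_range <- which(w < lower | w > upper)
w[out_of_range] <- NA

print(w)
```

Existing NA values are left untouched, and only the misread value is replaced.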
A.K.





________________________________
From: 남윤주 <jamansymptom at naver.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Monday, January 28, 2013 2:20 AM
Subject: Re: Thank you your help.



Thank you for your reply again. Your understanding is exactly right.
I attached a picture that shows the dataset.
'weight' is the dependent variable, and CTIME is the time in hours and minutes. This data will accumulate over years.
As for the accepted variance range, it would be from 10 to 50.
Actually, I am a Java programmer, so I am new to the R language.
Could you give me an example of how to use the grep function?
-----Original Message-----
From: "arun"<smartpink111 at yahoo.com> 
To: "jamansymptom at naver.com"<jamansymptom at naver.com>; 
Cc: 
Sent: 2013-01-28 (Mon) 15:27:12
Subject: Re: Thank you your help.

Hi,
Your original post said that
"...it was evaluated from 20kg -40kg. But By some errors, it is evaluated 2000 kg".

So, my understanding was that you occasionally get readings of 2000, or in the range 2000-4000, in place of 20-40 due to some misreading.

If your dataset contains observed values, strange values and NA, and you want to replace the strange values with NA, could you mention the range of the strange values?  If the strange values fall anywhere between 1000 and 9999, they should get replaced with the ?grep() solution.  But, if it depends upon something else, you need to specify.  Also, regarding the variance, what is your accepted range of variance?
A.K.





----- Original Message -----
From: "jamansymptom at naver.com" <jamansymptom at naver.com>
To: smartpink111 at yahoo.com
Cc: 
Sent: Monday, January 28, 2013 1:15 AM
Subject: Thank you your help.

Thank you for answering my question.
It is not exactly what I want. I should have described the situation in detail.
There is a sensor that records data every minute, and that data is accumulated and becomes part of the dataset.
The dataset contains observed values, strange values and NA.
Namely, I am not sure where the strange values will occur,
and I can't predict when they will occur.

I need a procedure that works as follows.
1. Using some method, set the accepted range of variance.
2. Using a for(i) statement, check whether variance(weight) is in the range.
3. When the variance is out of range, set weight[i] to NA.
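
The three steps above can be sketched in R roughly as follows. This is only an illustration under assumptions: the limits 10 and 50, the sample readings, and the names lower, upper and weight are made up, and "variance" is read here as the allowed range of the value itself.

```r
# Step 1: set the accepted range (assumed limits, for illustration).
lower <- 10
upper <- 50

# Made-up sensor readings, including a missing and a misread value.
weight <- c(24.0, 25.2, 23.1, NA, 32.1, 1323.2, 27.4)

# Steps 2 and 3: check each reading and replace out-of-range ones with NA.
# Readings that are already NA are skipped.
for (i in seq_along(weight)) {
  if (!is.na(weight[i]) && (weight[i] < lower || weight[i] > upper)) {
    weight[i] <- NA
  }
}

print(weight)
```

In R the explicit loop is usually replaced by a vectorized comparison, but the loop form maps directly onto the numbered steps.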

Thank you. 


