[R] Data Checking

Ott Toomet siim at obs.ee
Mon Mar 25 11:04:20 CET 2002


 On Sun, 24 Mar 2002, Uwe Ligges wrote:

  |Ko-Kang Kevin Wang wrote:
  |> 
  |> Hi,
  |> 
  |> This is a simple question with if elseif....however I am having trouble
  |> constructing the solution for some reason.
  |> 
  |> Suppose I have a data set with 3 variables, a, b and c say.  Let's say c
  |> is the sum of a and b.  So:
  |>   a  b  c
  |>   1  2  3
  |>   2  3  5
  |>   3  4  7
  |>   .  .  .
  |>   .  .  .
  |>   .  .  .
  |> 
  |> Suppose that I know there have been some data entry errors and I want to
  |> check if ALL values in c are really the sum of a and b, and if not, print
  |> out the whole line (i.e. all values of a, b and c in that row).
  |> 
  |> Any help on how I can write this if elseif block will be appreciated!
  |
  |Assume your "data set" is a data.frame():
  |
  | X <- data.frame(a=1:5, b=2:6, ce=c(3,5,7,6,11)) 
  | # Let's call it a, b, ce --- c already is a function
  | # Now get the rows with errors:
  | X[X$a + X$b != X$ce, ]
  |

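I recommend not using == or != to test equality of real (floating-point)
numbers.  These operators are reliable _only_ for integers.  In general, on
digital computers you cannot count on two real numbers that should be equal
actually comparing as equal, because most decimal fractions have no exact
binary representation.  Here is a simple example (R 1.4.0, P-II, linux):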

> 1 + 2
[1] 3
> 1 + 2 == 3
[1] TRUE
> 1.1 + 2.2
[1] 3.3
> 1.1 + 2.2 == 3.3
[1] FALSE
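
To see why the last comparison fails, it may help to print the same values with more digits than R shows by default. The following is only a small sketch; the names `lhs`, `rhs` and `gap` are made up for illustration and are not from the thread.

```r
# Illustration only: `lhs`, `rhs` and `gap` are hypothetical names.
lhs <- 1.1 + 2.2
rhs <- 3.3

# With the default 7 significant digits both values print as 3.3 ...
print(lhs)
print(rhs)

# ... but with 17 significant digits the stored binary values can be told apart.
print(lhs, digits = 17)
print(rhs, digits = 17)

# The gap is tiny but not exactly zero, which is why == reports FALSE.
gap <- lhs - rhs
print(gap)
print(lhs == rhs)
```

The gap is of the same order as .Machine$double.eps, so any comparison that allows a small tolerance treats the two values as equal.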

So, using Uwe's example, you should write something like

> tolerance <- .Machine$double.eps ^ 0.5
> X[abs(X$a + X$b - X$ce) > tolerance,]
  a b ce
4 4 5  6
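
The tolerance check can also be wrapped in a small function so that it is easy to reuse on other columns. This is a minimal sketch, assuming the data frame from Uwe's example; the helper name `sum_mismatch` is hypothetical and not part of R. The last line shows the built-in all.equal(), which compares numbers with a tolerance of about the same size by default.

```r
# Uwe's example data; row 4 contains the deliberate entry error.
X <- data.frame(a = 1:5, b = 2:6, ce = c(3, 5, 7, 6, 11))

# Hypothetical helper (not from the thread): TRUE where `total` differs
# from `x + y` by more than `tolerance`.
sum_mismatch <- function(x, y, total,
                         tolerance = .Machine$double.eps ^ 0.5) {
  abs(x + y - total) > tolerance
}

# Select and print the rows where ce is not the sum of a and b.
bad_rows <- X[sum_mismatch(X$a, X$b, X$ce), ]
print(bad_rows)

# The scalar example again: all.equal() compares with a tolerance, and
# isTRUE() turns its result into a plain TRUE or FALSE.
print(isTRUE(all.equal(1.1 + 2.2, 3.3)))
```

If any of the columns can contain missing values, the logical index will contain NA and the subset will include all-NA rows; wrapping the condition in which() is one way to drop them.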


Regards,

Ott Toomet

