[R] rowSums - am I getting something wrong?
rex.dwyer at syngenta.com
Mon Mar 7 15:28:51 CET 2011
Hi Thomas,
Several of us explained this in different ways just last week, so you might search the archive. Floating point numbers are an approximate representation of real numbers. Most values that can be written exactly as decimal fractions (0.1, for example) cannot be written exactly as binary fractions, so they are stored with a small rounding error. So the sum 0.6+0.3+0.1 is NOT clearly 1.0 once the values are stored in binary.
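To see the stored approximations directly, something like the following sketch may help. It only uses base R (seq, sprintf, all.equal); the variable name vals is just for illustration. It prints the values that display as 0.1, 0.3 and 0.6 with 20 decimal places and compares them with the literals.

```r
# Build the sequence the same way as in the original question.
a <- seq(0, 1, 0.1)

# The three elements that print as 0.1, 0.3 and 0.6.
vals <- a[c(2, 4, 7)]

# Default printing rounds to 7 significant digits and hides the error.
print(vals)

# Asking for 20 decimal places shows the stored binary approximations.
print(sprintf("%.20f", vals))

# Exact comparison against the literals: not every element need match.
print(vals == c(0.1, 0.3, 0.6))

# Comparison within a tolerance treats the sum as equal to 1.
print(isTRUE(all.equal(sum(vals), 1)))
```

The point is only that what R prints is a rounded view of what is stored, so two values that print identically can still differ in the last bits.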
You can use signif and round to work around this:
> a = seq(0,1,0.1)
> a
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> a[7]-0.6
[1] 1.110223e-16
>
> 1-(a[4]+a[7]+a[2])
[1] -2.220446e-16
> b = rev(seq(1,0,-0.1))
> b
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> a-b
[1] 0.000000e+00 2.775558e-17 5.551115e-17 1.110223e-16 1.110223e-16
[6] 0.000000e+00 1.110223e-16 1.110223e-16 0.000000e+00 0.000000e+00
[11] 0.000000e+00
> round(a-b,10)
[1] 0 0 0 0 0 0 0 0 0 0 0
> round(a,10)-round(b,10)
[1] 0 0 0 0 0 0 0 0 0 0 0
>
The first commandment of floating point programming is
THOU SHALT NOT TEST WHETHER TWO FP NUMBERS ARE EQUAL
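One way to follow that rule in the rowSums case is to compare against the limit plus a small tolerance instead of against the limit itself. This is only a sketch: the matrix is built as in the original question, and the tolerance tol = 1e-8 and the names keep and m_ok are my own assumptions, so pick a tolerance that suits your data.

```r
# Rebuild the 1331 x 3 matrix from the original question.
a <- seq(0, 1, 0.1)
m <- matrix(nrow = 1331, ncol = 3)
m[, 1] <- rep(a, 121)
m[, 2] <- rep(a, 11, each = 11)
m[, 3] <- rep(a, 1, each = 121)

q <- rowSums(m)

# Assumed tolerance: far larger than the rounding error (about 1e-16)
# and far smaller than the grid spacing (0.1).
tol <- 1e-8

# Keep rows whose sum is at most 1, allowing for rounding error.
keep <- q <= 1 + tol
m_ok <- m[keep, , drop = FALSE]

# Row 161 (0.6, 0.3, 0.1) is now retained.
print(keep[161])

# For a single pair of numbers, all.equal does a tolerant comparison.
print(isTRUE(all.equal(q[161], 1)))
```

Rounding first, as in round(q, 10) > 1, should give the same selection here; the tolerance form just makes the allowed error explicit.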
HTH
Rex
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Thomas.Salvesen at syngenta.com
Sent: Monday, March 07, 2011 2:09 AM
To: r-help at r-project.org
Subject: [R] rowSums - am I getting something wrong?
I am trying to construct a data set with some sequences for example:
a = seq(0,1,0.1)
m = matrix(nrow = 1331, ncol = 3)
m[,1] = rep(a,121)
m[,2] = rep(a,11,each = 11)
m[,3] = rep(a,1,each = 121)
I realize that there may be better ways of doing this, but this approach demonstrates the problem I'm having.
I then want to get the sum of each row and delete any row with a sum greater than 1. However, I have a problem with rows containing any combination of the values 0.6, 0.3 and 0.1: the sum of these is clearly 1, yet a request for the rows with a sum greater than 1 returns rows with these values. Row 161 is the first row containing these values:
[161,] 0.6 0.3 0.1
which(rowSums(m)>1)
> [53] 119 120 121 132 142 143 152 153 154 161 162
As far as I can tell this only affects combinations of 0.6, 0.3 and 0.1 (though I haven't checked every value in the matrix)
If I try the following:
q=rowSums(m)
which(q>1)
>[53] 119 120 121 132 142 143 152 153 154 161 162
But if I add and subtract 1 from this:
q=q+1
q=q-1
which(q>1)
[53] 119 120 121 132 142 143 152 153 154 162
What exactly is going on here? I don't have the problem with other combinations (e.g. 0.7, 0.2, 0.1). I assume that there is something about the data format that I don't understand, but if I make a data frame of the matrix I find the same effect.
Any help would be great
Tom
message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.