[R] 'mean' and 'sd' calculations do not match

Petr Pikal petr.pikal at precheza.cz
Thu Dec 8 14:18:12 CET 2005


Hi

You are seeing the difference between factors and numbers.

Columns showing <NA> are factors.
Columns showing NA are numeric.

You can check this with

str(chemicS)

which will reveal the structure of your data.
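
As a rough illustration of the NA versus <NA> distinction, here is a minimal sketch on made-up data. The data frame `d` and its values are hypothetical and are not taken from the poster's file; the factor column is built explicitly so the example behaves the same regardless of how a given R version imports strings.

```r
# Toy data frame (hypothetical values): one numeric column, one factor column
d <- data.frame(
  EC  = c(630, NA, 1210),
  NO2 = factor(c("0.001", NA, "0.14"))
)

print(d)  # the missing value prints as NA under EC and as <NA> under NO2
str(d)    # reports EC as num and NO2 as Factor

# Programmatic check of which columns are factors
is_factor <- sapply(d, is.factor)
print(is_factor)
```

Running the same `sapply(chemicS, is.factor)` check on the real data should point at the columns that were not read as numbers.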

So either convert the factors with
as.numeric(as.character())
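
A small sketch of why the as.character() step matters, assuming a factor `f` with made-up values: calling as.numeric() directly on a factor returns the internal level codes, not the numbers that were in the file.

```r
# Hypothetical factor holding numbers stored as labels
f <- factor(c("0.001", "0.14", NA, "0.01"))

wrong <- as.numeric(f)                # internal level codes, not the values
right <- as.numeric(as.character(f))  # labels first, then numbers

print(wrong)
print(right)
```

If a label is not a valid number, the second form turns it into NA with a warning, which is a handy way to find the entry that made the column a factor in the first place.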

or read the data in with the columns forced to numeric (see the
colClasses argument)

?read.table
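
One possible way to do this is sketched below. The inline CSV text is invented for illustration and only mimics the column names from the post; `colClasses = "numeric"` is recycled over all columns here, which assumes every column is meant to be numeric.

```r
# Hypothetical stand-in for the CSV file, with -9999 as the missing-value code
csv <- "EC,NO3,NO2,NH4
630,33,0.001,0.01
-9999,26.66,-9999,-9999
1210,14,0.001,0.01"

# Force every column to numeric while still mapping -9999 to NA
d <- read.table(text = csv, header = TRUE, sep = ",",
                na.strings = "-9999", colClasses = "numeric")

str(d)  # all four columns should now be num
```

With colClasses set this way, read.table stops with an error if a column contains an entry that is not a number, so a stray value is reported at import time.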

HTH
Petr





On 8 Dec 2005 at 11:50, Ulrich Leopold wrote:

From:           	Ulrich Leopold <uleopold at science.uva.nl>
To:             	R-help <R-help at stat.math.ethz.ch>
Organization:   	University of Amsterdam
Date sent:      	Thu, 08 Dec 2005 11:50:25 +0100
Subject:        	[R] 'mean' and 'sd' calculations do not match

> Dear list,
> 
> I am using R 2.1.1 on a Fedora 3 Linux, 32 bit PC.
> 
> If I compute the aggregated mean and the standard deviation I get
> standard deviation values for factors where the mean was not computed.
> It seems to me that this is somehow related to the NA values. But I
> don't quite understand what is going wrong?
> 
> Could it be related to the data import already? Some of the imported
> data got the character strings NA and others <NA>. But they are
> defined from the same values, -9999.  
> 
> I used the code below. Below the code are parts of the results.
> 
> Cheers, Ulrich
> 
> Data import:
> 
> chemicS <- read.table("ChemieUlli_4_Quellen.csv", header = TRUE,
>                       sep = ",", na.strings = "-9999")
> 
> Count EC        NO3    NO2    NH4
> 3504  630.0000  33.00  0.001  0.01 
> 3505        NA  26.66   <NA>  <NA> 
> 3506        NA   0.72   <NA>  <NA> 
> 3507        NA     NA   <NA>  <NA> 
> 3508        NA     NA   <NA>  <NA> 
> 3509        NA     NA   <NA>  <NA> 
> 3510 1210.0000  14.00  0.001  0.01 
> 3511 1265.0000  12.00  0.001  0.01 
> 3512 1400.0000  14.00  0.001  0.01 
> 3513 1427.0000  12.00  0.001  0.01 
> 3514 1410.0000   7.00      0     0 
> 3515 1520.0000   8.00  0.001  0.01 
> 3516 1470.0000   7.60      0     0 
> 3517 1170.0000  10.00  0.001  0.01 
> 3518 4570.0000  20.00  0.001  0.45 
> 3519 8560.0000   0.50   0.14  0.31 
> 3520  708.0000  39.00  0.001  0.01 
> 3521  833.0000  40.00   0.01  0.01 
> 3522        NA     NA   <NA>  <NA> 
> 
> Computing the mean:
> 
> aggregate(chemicS$EC, by = list(east=chemicS$EST, north=chemicS$NORD),
> FUN = mean)
> 
> Count   east    north   Mean
> 350    89885   103160  318.50000
> 351    55870   103510  400.00000
> 352    82570   104845  637.33333
> 353    79119   107433         NA
> 354    79160   107462  362.77778
> 355    83010   108990         NA
> 356    82810   109010         NA
> 357    69135   112992         NA
> 358    55490   120140  142.25000
> 359    56580   120600         NA
> 360    56582   120607         NA
> 361    58050   125350         NA
> 362    58059   125360         NA
> 363    60360   128191         NA
> 364    65448   128293  252.50000
> 365  65472.5 128308.1         NA
> 366    61412   131141         NA
> 
> Computing the standard deviation:
> 
> aggregate(chemicS$EC, by = list(east=chemicS$EST, north=chemicS$NORD),
> FUN = sd, na.rm = TRUE)
> 
> Count  east    north     Stdev.
> 350    89885   103160    4.9497475
> 351    55870   103510           NA
> 352    82570   104845   19.6553640
> 353    79119   107433           NA
> 354    79160   107462   73.6745848
> 355    83010   108990           NA
> 356    82810   109010   15.6950098
> 357    69135   112992           NA
> 358    55490   120140    5.3150729
> 359    56580   120600           NA
> 360    56582   120607   22.4435801
> 361    58050   125350           NA
> 362    58059   125360   23.3108523
> 363    60360   128191   20.9789577
> 364    65448   128293   10.6066017
> 365  65472.5 128308.1           NA
> 366    61412   131141    8.6184556
> 

Petr Pikal
petr.pikal at precheza.cz



