[R] How to join data.frames and vectors of different length, in an intelligent way?
Chuck Cleland
ccleland at optonline.net
Tue Jun 10 16:24:43 CEST 2008
You could put the group averages back into dafSamp using ave():
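# Sample data; data.frame(cbind(...)) gives the default column names X1 (year) and X2 (value)
dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),
c(117,73,92,113,80,78,98,106,99)))
# Per-year mean of X2, repeated on every row belonging to that year
dafSamp$Ay <- ave(dafSamp$X2, dafSamp$X1, FUN=mean)
# Multiply each value by (year mean / global mean), as described in the question
dafSamp$vecAA <- dafSamp$X2 * (dafSamp$Ay / mean(dafSamp$X2))
dafSamp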
    X1  X2       Ay     vecAA
1 1972 117 117.0000 143.92640
2 1984  73  89.5000  68.69334
3 1969  92  92.0000  88.99065
4 1976 113 103.3333 122.76869
5 1999  80  80.0000  67.28972
6 1996  78  78.0000  63.96729
7 1976  98 103.3333 106.47196
8 1984 106  89.5000  99.74650
9 1976  99 103.3333 107.55841
?ave
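
If you would rather keep the separate Ay table from aggregate() and look the yearly means up from it, match() is one way to line up the two objects of different length. This is only a sketch under assumptions: the column names X1/X2 are set explicitly here for readability, and idx is a helper name introduced for illustration. Group.1 and x are the default names aggregate() gives when by= is an unnamed list.

```r
# Same sample data as in the question, with explicit column names
dafSamp <- data.frame(X1 = c(1972,1984,1969,1976,1999,1996,1976,1984,1976),
                      X2 = c(117,73,92,113,80,78,98,106,99))

Ag <- mean(dafSamp$X2)                                   # global average
Ay <- aggregate(x = dafSamp$X2, by = list(dafSamp$X1),   # one row per year
                FUN = mean)

# For each row of dafSamp, find the row of Ay that holds its year
idx <- match(dafSamp$X1, Ay$Group.1)

# Ay$x[idx] now has one yearly mean per sample, so the lengths agree
vecAA <- dafSamp$X2 * Ay$x[idx] / Ag

# Cross-check against the ave() approach shown above
stopifnot(isTRUE(all.equal(vecAA,
                           dafSamp$X2 * ave(dafSamp$X2, dafSamp$X1) / Ag)))
```

Unlike comparing the two columns with ==, which recycles the shorter vector, match() returns exactly one index per sample, so the lookup works for any number of rows.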
On 6/10/2008 9:05 AM, Hvidberg, Martin wrote:
> I have a data set something like this:
>
> "YYYY", "Value"
> 1972 , 117
> 1984 , 73
> 1969 , 92
> 1976 , 113
> 1999 , 80
> 1996 , 78
> 1976 , 98
> 1984 , 106
> 1976 , 99
>
> It could be created with:
>
>> dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),c(117,73,92,113,80,78,98,106,99)))
>
> The real dataset is of course much larger, approx. 100,000 samples.
>
> I need to adjust each value to remove any tendency of some years generally having higher values and others lower, since this is an unwanted artifact from different measuring traditions.
>
> My plan is to generate an average for each year Ay, as well as a global average Ag. Then each value should be multiplied by Ay/Ag.
>
> I can make the averages like this:
>
>> Ag <- mean(dafSamp[,2])
>> Ag
> [1] 95.11111
>
>> Ay <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='mean')
>> Ay
>   Group.1        x
> 1    1969  92.0000
> 2    1972 117.0000
> 3    1976 103.3333
> 4    1984  89.5000
> 5    1996  78.0000
> 6    1999  80.0000
>
> To see how many samples from each year I could write:
>
>> Cy <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='length')
>> Cy
>   Group.1 x
> 1    1969 1
> 2    1972 1
> 3    1976 3
> 4    1984 2
> 5    1996 1
> 6    1999 1
>
> I would like to create a new vector with the adjusted values (dafSamp[,2] * Ay (for the relevant year) / Ag).
>
> I tried to write:
>
> vecAA <- dafSamp[,2] * Ay[which(Ay[,1]==dafSamp[,1]),2] / Ag
>
> but the result is all NAs :-( I might have seen that coming: the vectors are not the same length...
>
> Question: How do I go about making such a calculation?
>
> :-) Martin Hvidberg
>
> Here is the code in full, if you want to try it...
>
> dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),c(117,73,92,113,80,78,98,106,99)))
> Ag <- mean(dafSamp[,2])
> Ag
> Ay <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='mean')
> Ay
> Cy <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='length')
> Cy
> vecAA <- dafSamp[,2] * Ay[which(Ay[,1]==dafSamp[,1]),2] / Ag
>
> University of Aarhus <http://www.au.dk/en> Danmarks Miljøundersøgelser <http://www.dmu.dk/>
>
> Hvidberg, Martin <http://www2.dmu.dk/1_Om_DMU/2_medarbejdere/cv/employee2_NH.asp?PersonID=MHV>
> Senior Geographer (Climatology, Spatial modeling) <http://www.geogr.ku.dk/>
> N 55°41m43.48s E 12°06m05.13s ETRS89
> National Environmental Research Inst. <http://www.dmu.dk/International/>
> P.O. Box 358
> Frederiksborgvej 399
> DK-4000 Roskilde
> Martin.Hvidberg at dmu.dk
> www.dmu.dk/AtmosphericEnvironment/ tel:
> fax: +45 46 30 11 55
> +45 46 30 12 14
>
--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894