[R] Formula in a data-frame

Rui Barradas ruipbarradas at sapo.pt
Tue Sep 18 13:49:41 CEST 2012


Hello,

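1) Instead of computing TFrequency and TVolume as you did, try the 
following. ave() returns the group total repeated on every row of the 
group, so the results have the same length as Frequency and Volume.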


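# Per-species totals, repeated on every row of that species
TF <- with(Frequency, ave(Frequency, Specie, FUN = sum))
TV <- with(Volume, ave(Volume, Specie, FUN = sum))
# Proportion of each food item within its species
Fi <- with(Frequency, Frequency/TF)
Vi <- with(Volume, Volume/TV)

# sum() here runs over all rows, i.e. over both species together
Importance <- Fi*Vi/sum(Fi*Vi)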
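
To see why ave() avoids the length mismatch, here is a minimal sketch. The
Frequency data.frame below is typed by hand from the posted data (it is what
the ddply() call should return, up to row order), so treat it as an
illustration and not as the actual object.

```r
# Counts per species and food item, typed by hand from the posted data
Frequency <- data.frame(
  Specie    = c("Astyanax", "Astyanax", "Astyanax", "Astyanax",
                "Schizodon", "Schizodon", "Schizodon"),
  Fooditem  = c("aquatical_insect", "terrestrial_insect", "un_insect",
                "vegetal", "alga", "sediment", "vegetal"),
  Frequency = c(1, 1, 1, 2, 1, 4, 2)
)

# ave() returns one value per input row: the total of the row's species
TF <- with(Frequency, ave(Frequency, Specie, FUN = sum))
TF    # 5 5 5 5 7 7 7

# Same length as the Frequency column, so the division is element-wise
Fi <- Frequency$Frequency / TF
```

Because TF has one entry per row, no merge is needed before dividing.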

2) If you keep TFrequency and TVolume, you can solve the problem of the 
differing numbers of rows with merge(), which repeats each per-species 
total on every matching row.

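?merge
m1 <- merge(Frequency, Volume)            # by Specie and Fooditem
m2 <- merge(m1, TFrequency)               # by Specie; adds TF
m3 <- merge(m2, TVolume, by = 'Specie')   # Volume clashes: Volume.x, Volume.y

Fi <- with(m3, Frequency / TF)
Vi <- with(m3, Volume.x / Volume.y)       # item volume / species total
Importance <- Fi*Vi/sum(Fi*Vi)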
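
A self-contained sketch of the merge() route, in case you want to run it
without plyr. The aggregate() calls are base-R stand-ins I am assuming are
equivalent to your four ddply() calls; the data is retyped from your dput()
output, with plain character columns instead of factors.

```r
# Data retyped from the dput() output in the original post
dieta <- data.frame(
  Specie     = rep(c("Schizodon", "Astyanax"), times = c(7, 5)),
  Fooditem   = c("vegetal", "sediment", "vegetal", "alga", "sediment",
                 "sediment", "sediment", "terrestrial_insect", "vegetal",
                 "aquatical_insect", "vegetal", "un_insect"),
  Occurrence = 1L,
  Volume     = c(0.05, 0.6, 0.15, 0.05, 0.9, 0.3, 0.9,
                 0.1, 0.85, 0.05, 0.9, 0.85)
)

# Assumed base-R equivalents of the ddply() summaries
Frequency <- aggregate(Occurrence ~ Specie + Fooditem, data = dieta, FUN = sum)
names(Frequency)[3] <- "Frequency"
Volume <- aggregate(Volume ~ Specie + Fooditem, data = dieta, FUN = sum)
TFrequency <- aggregate(Frequency ~ Specie, data = Frequency, FUN = sum)
names(TFrequency)[2] <- "TF"
TVolume <- aggregate(Volume ~ Specie, data = dieta, FUN = sum)

m1 <- merge(Frequency, Volume)            # joins on Specie and Fooditem
m2 <- merge(m1, TFrequency)               # joins on Specie
m3 <- merge(m2, TVolume, by = "Specie")   # duplicate name gets suffixes
names(m3)                                 # includes Volume.x and Volume.y
nrow(m3)                                  # one row per species/food item pair
```

The two-row totals are recycled onto the seven detail rows by the join, which
is what makes the element-wise division possible.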

3) Maybe you can combine both approaches and use the data.frame 'm1' to 
hold the result, as in

m1$Importance <- ...etc...
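
One possible way to fill in that last line, as a sketch only. The m1 below is
typed by hand from the posted data. Note one assumption that differs from the
code in 1) and 2): the denominator is summed within each species, since the
question asks for the index "for each specie". Replace the ave() call in the
last line with sum(FiVi) to get the pooled version instead.

```r
# m1 as merge(Frequency, Volume) would give it; values typed by hand
m1 <- data.frame(
  Specie    = c("Astyanax", "Astyanax", "Astyanax", "Astyanax",
                "Schizodon", "Schizodon", "Schizodon"),
  Fooditem  = c("aquatical_insect", "terrestrial_insect", "un_insect",
                "vegetal", "alga", "sediment", "vegetal"),
  Frequency = c(1, 1, 1, 2, 1, 4, 2),
  Volume    = c(0.05, 0.10, 0.85, 1.75, 0.05, 2.70, 0.20)
)

# Proportions within each species
m1$Fi <- with(m1, Frequency / ave(Frequency, Specie, FUN = sum))
m1$Vi <- with(m1, Volume / ave(Volume, Specie, FUN = sum))

# Assumption: normalise within species, so each species sums to 1
FiVi <- m1$Fi * m1$Vi
m1$Importance <- FiVi / ave(FiVi, m1$Specie, FUN = sum)

m1[order(m1$Specie, -m1$Importance), ]
```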

Hope this helps,

Rui Barradas



On 18-09-2012 05:48, Raoni Rodrigues wrote:
> Hello all,
>
> I'm new to R, and I have a data frame like this (dput information below):
>
>       Specie           Fooditem Occurrence Volume
> 1  Schizodon            vegetal          1   0.05
> 2  Schizodon           sediment          1   0.60
> 3  Schizodon            vegetal          1   0.15
> 4  Schizodon               alga          1   0.05
> 5  Schizodon           sediment          1   0.90
> 6  Schizodon           sediment          1   0.30
> 7  Schizodon           sediment          1   0.90
> 8   Astyanax terrestrial_insect          1   0.10
> 9   Astyanax            vegetal          1   0.85
> 10  Astyanax   aquatical_insect          1   0.05
> 11  Astyanax            vegetal          1   0.90
> 12  Astyanax          un_insect          1   0.85
>
>
> For each species, I have to calculate a food item importance index, that is:
>
> Fi x Vi / Sum (Fi x Vi)
>
> Fi = percentage frequency of occurrence of a food item
> Vi = percentage volume of a food item
>
> So, using the ddply function from plyr, I was able to calculate the total
> frequency of occurrence and total volume of each food item, using:
>
> Frequency = ddply (dieta, c('Specie','Fooditem') , summarise,
> Frequency = sum (Occurrence))
>
> Volume = ddply (dieta, c('Specie','Fooditem') , summarise, Volume =
> sum (Volume))
>
> and to calculate the total frequency and total volume for a given species:
>
> TFrequency = ddply (Frequency, 'Specie' , summarise, TF = sum (Frequency))
>
> TVolume = ddply (dieta, c('Specie') , summarise, Volume = sum (Volume))
>
> but since they have different lengths, I could not use them together to
> create the percentages needed in my formula.
>
> Any suggestions?
>
> Thanks in advance for your help and attention,
>
> Raoni
>
> dput (diet)
>
> structure(list(Specie = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 1L, 1L, 1L, 1L, 1L), .Label = c("Astyanax", "Schizodon"), class = "factor"),
>      Fooditem = structure(c(6L, 3L, 6L, 1L, 3L, 3L, 3L, 4L, 6L,
>      2L, 6L, 5L), .Label = c("alga", "aquatical_insect", "sediment",
>      "terrestrial_insect", "un_insect", "vegetal"), class = "factor"),
>      Occurrence = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>      1L), Volume = c(0.05, 0.6, 0.15, 0.05, 0.9, 0.3, 0.9, 0.1,
>      0.85, 0.05, 0.9, 0.85)), .Names = c("Specie", "Fooditem",
> "Occurrence", "Volume"), class = "data.frame", row.names = c(NA,
> -12L))
>
> sessionInfo()
> R version 2.15.1 (2012-06-22)
> Platform: i386-pc-mingw32/i386 (32-bit)
> Windows XP



