Hello,
1) Instead of computing TFrequency and TVolume like you have, try the
following.
TF <- with(Frequency, ave(Frequency, Specie, FUN = sum))
TV <- with(Volume, ave(Volume, Specie, FUN = sum))
Fi <- with(Frequency, Frequency/TF)
Vi <- with(Volume, Volume/TV)
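# divide by the per-specie sum, since the index is wanted for each specie
Importance <- Fi*Vi/ave(Fi*Vi, Frequency$Specie, FUN = sum)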
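For reference, here is a self-contained sketch of the ave() idea that runs without plyr. It rebuilds the data from the dput output in the question, and uses base aggregate() as a stand-in for the ddply() calls (my substitution, not what the question used). The object name 'tab' is made up for the example, and normalising within each specie in the last step is my reading of "for each specie" in the question.

```r
# Data copied from the dput output in the question.
dieta <- data.frame(
  Specie = rep(c("Schizodon", "Astyanax"), times = c(7, 5)),
  Fooditem = c("vegetal", "sediment", "vegetal", "alga", "sediment",
               "sediment", "sediment", "terrestrial_insect", "vegetal",
               "aquatical_insect", "vegetal", "un_insect"),
  Occurrence = 1L,
  Volume = c(0.05, 0.6, 0.15, 0.05, 0.9, 0.3, 0.9, 0.1, 0.85, 0.05, 0.9, 0.85)
)

# One row per Specie/Fooditem pair, with summed occurrences and volumes
# (base-R stand-in for the two ddply() calls; 'tab' is a hypothetical name).
tab <- aggregate(cbind(Occurrence, Volume) ~ Specie + Fooditem,
                 data = dieta, FUN = sum)
names(tab)[names(tab) == "Occurrence"] <- "Frequency"

# ave() returns the group total repeated on every row of the group,
# so the totals always have the same length as the table itself.
TF <- ave(tab$Frequency, tab$Specie, FUN = sum)
TV <- ave(tab$Volume, tab$Specie, FUN = sum)
Fi <- tab$Frequency / TF
Vi <- tab$Volume / TV

# Assumption: the index is normalised within each specie.
FiVi <- Fi * Vi
tab$Importance <- FiVi / ave(FiVi, tab$Specie, FUN = sum)
tab[order(tab$Specie, -tab$Importance), ]
```

Because ave() keeps the original length, no merging is needed in this variant.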
2) If you keep TFrequency and TVolume, you can solve the problem of the
differing numbers of rows with merge()
?merge
m1 <- merge(Frequency, Volume)
m2 <- merge(m1, TFrequency)
m3 <- merge(m2, TVolume, by = 'Specie')
Fi <- with(m3, Frequency / TF)
Vi <- with(m3, Volume.x / Volume.y)
Importance <- Fi*Vi/ave(Fi*Vi, m3$Specie, FUN = sum)  # per-specie sum
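The Volume.x and Volume.y names above come from merge() itself: when both inputs have a non-key column with the same name, it appends suffixes to tell them apart. A small sketch with made-up toy tables ('items', 'totals' and 'joined' are hypothetical names, and the numbers are not from the question) shows the default behaviour and the 'suffixes' argument:

```r
# Toy tables (hypothetical values) that share the non-key column 'Volume'.
items  <- data.frame(Specie = c("A", "A", "B"), Volume = c(1, 3, 2))
totals <- data.frame(Specie = c("A", "B"), Volume = c(4, 2))

# Default: the clashing columns come back as Volume.x and Volume.y.
m <- merge(items, totals, by = "Specie")
names(m)

# Explicit suffixes make the role of each column easier to read.
joined <- merge(items, totals, by = "Specie",
                suffixes = c(".item", ".total"))
joined$Vi <- joined$Volume.item / joined$Volume.total
joined
```

Naming the total column differently in TVolume (for example TV, as was done with TF) would avoid the suffixes altogether.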
3) You could also combine both approaches and keep everything in the
data.frame 'm1', ending with
m1$Importance <- ...etc...
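One possible way to fill that in, offered as a sketch rather than the definitive answer: merge the two summaries once, then use ave() on the merged table. The Frequency and Volume tables below are typed in by hand from the sums of the data in the question, so that the example runs without plyr. The helper columns Fi and Vi, and the per-specie normalisation, are my assumptions.

```r
# Per-specie, per-food-item summaries, typed in from the question's data
# (these are the values the two ddply() calls would return).
Frequency <- data.frame(
  Specie = rep(c("Astyanax", "Schizodon"), times = c(4, 3)),
  Fooditem = c("aquatical_insect", "terrestrial_insect", "un_insect",
               "vegetal", "alga", "sediment", "vegetal"),
  Frequency = c(1, 1, 1, 2, 1, 4, 2)
)
Volume <- data.frame(
  Specie = Frequency$Specie,
  Fooditem = Frequency$Fooditem,
  Volume = c(0.05, 0.10, 0.85, 1.75, 0.05, 2.70, 0.20)
)

# merge() joins on the columns the two tables share: Specie and Fooditem.
m1 <- merge(Frequency, Volume)

# Inside with(m1, ...), Frequency and Volume refer to the columns of m1.
m1$Fi <- with(m1, Frequency / ave(Frequency, Specie, FUN = sum))
m1$Vi <- with(m1, Volume / ave(Volume, Specie, FUN = sum))

# Assumption: the index is normalised within each specie.
m1$Importance <- with(m1, Fi * Vi / ave(Fi * Vi, Specie, FUN = sum))
m1
```

Everything then lives in one data.frame, and the Importance values sum to 1 within each specie.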
Hope this helps,
Rui Barradas
On 18-09-2012 05:48, Raoni Rodrigues wrote:
Hello all,
I'm new to R, and I have a data frame like this (dput information below):
   Specie    Fooditem           Occurrence Volume
1  Schizodon vegetal                     1   0.05
2  Schizodon sediment                    1   0.60
3  Schizodon vegetal                     1   0.15
4  Schizodon alga                        1   0.05
5  Schizodon sediment                    1   0.90
6  Schizodon sediment                    1   0.30
7  Schizodon sediment                    1   0.90
8  Astyanax  terrestrial_insect          1   0.10
9  Astyanax  vegetal                     1   0.85
10 Astyanax  aquatical_insect            1   0.05
11 Astyanax  vegetal                     1   0.90
12 Astyanax  un_insect                   1   0.85
For each species, I have to calculate a food item importance index, defined as:
Fi x Vi / Sum (Fi x Vi)
Fi = percentage frequency of occurrence of a food item
Vi = percentage volume of a food item
So, using the ddply() function from plyr, I was able to calculate the total
frequency of occurrence and total volume of each food item, using:
Frequency = ddply(dieta, c('Specie', 'Fooditem'), summarise,
                  Frequency = sum(Occurrence))
Volume = ddply(dieta, c('Specie', 'Fooditem'), summarise,
               Volume = sum(Volume))
and then the total frequency and total volume for each species:
TFrequency = ddply(Frequency, 'Specie', summarise, TF = sum(Frequency))
TVolume = ddply(dieta, c('Specie'), summarise, Volume = sum(Volume))
But since these have different numbers of rows, I could not use them
together to compute the percentages needed in my formula.
Any suggestions?
Thanks in advance for your help and attention,
Raoni
dput (diet)
structure(list(Specie = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L), .Label = c("Astyanax", "Schizodon"), class = "factor"),
Fooditem = structure(c(6L, 3L, 6L, 1L, 3L, 3L, 3L, 4L, 6L,
2L, 6L, 5L), .Label = c("alga", "aquatical_insect", "sediment",
"terrestrial_insect", "un_insect", "vegetal"), class = "factor"),
Occurrence = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L), Volume = c(0.05, 0.6, 0.15, 0.05, 0.9, 0.3, 0.9, 0.1,
0.85, 0.05, 0.9, 0.85)), .Names = c("Specie", "Fooditem",
"Occurrence", "Volume"), class = "data.frame", row.names = c(NA,
-12L))
sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-pc-mingw32/i386 (32-bit)
Windows XP
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help