Hello -
Altaweel, Mark R. wrote:
Hi,
I have data stored in a list that I would like to aggregate and
perform some basic stats. However, I would like to apply conditional
statements so that not all the data are used. Basically, I want to
get a specific variable, do some basic functions (such as a mean),
but only get the data in each element's data that match the
condition. The code I used is below:
result <- sapply(res, function(.df) {
  # res is the list containing file data
  # only have the mean function calculate on values greater than 0
  if (.df$Volume > 0) mean(.df$Volume)
})
I did get a numeric output; however, when I checked the output value
the conditional was ignored (i.e. it did not do anything to the
calculation)
I also obtained these warning statements:
Warning messages:
1: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used
2: In if (.df$Volume > 0) mean(.df$Volume) :
  the condition has length > 1 and only the first element will be used
Please let me know what I am doing wrong and how I can apply a
conditional statement inside the sapply function.
Before you think about sapply, what would you do if you had just one
element of this list? Write a function to do that.
You wouldn't do:
if(x$Volume > 0)
mean(x$Volume)
because x$Volume > 0 creates a logical vector with one element per
value in x$Volume. When that vector has length greater than 1, "if"
looks only at its first element and issues the warning you saw.
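To see why "if" complains, it may help to look at what the comparison
returns on its own. The sketch below uses a made-up data frame x with
hypothetical Volume values; it is not the poster's data. Note also that
in R 4.2.0 and later a condition of length greater than 1 is an error
rather than a warning.

```r
# Hypothetical single list element, for illustration only
x <- data.frame(Volume = c(0, 5, 10, 0, 3))

# The comparison is element-wise, so it returns one logical per value
cond <- x$Volume > 0
print(cond)          # FALSE TRUE TRUE FALSE TRUE
print(length(cond))  # 5

# "if" expects a single TRUE/FALSE, which is why a length-5 condition
# triggers the warning (or, in newer versions of R, an error)
```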
You might do,
mean(x$Volume[x$Volume > 0])
and turn it into a function.
Then use sapply.
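Put together, those steps might look something like the sketch below.
The list res here is a small made-up stand-in for the list of data
frames, and the function name mean_positive_volume is just an
illustrative choice.

```r
# Hypothetical stand-in for the list of data frames read from files
res <- list(
  a = data.frame(Volume = c(0, 2, 4)),
  b = data.frame(Volume = c(-1, 0, 6, 10))
)

# Mean of the strictly positive Volume values in one data frame.
# Logical indexing keeps only the elements where the condition is TRUE.
mean_positive_volume <- function(df) {
  mean(df$Volume[df$Volume > 0])
}

# Apply the single-element function to every element of the list
result <- sapply(res, mean_positive_volume)
print(result)  # a = 3, b = 8
```

If an element has no positive values, the subset is empty and mean()
returns NaN. If Volume contains NA, the result is NA unless you pass
na.rm = TRUE to mean().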
Hopefully that gets you started!
Erik
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help