On 11/01/2017 5:33 AM, Alex Ivan Howard wrote:
Dear R Team
The following line returns 0 (zero) as its answer:
sum(c(NA_real_, NA_real_, NA_real_, NA_real_), na.rm = TRUE)
One would, however, have expected it to return 'NaN', as is the case with
function 'mean':
mean(c(NA_real_, NA_real_, NA_real_, NA_real_), na.rm = TRUE)
[1] NaN
The two expressions are long versions of
sum(numeric())
mean(numeric())
It is reasonable that an empty sum is zero. The mean is 0/0, so NaN is
reasonable.
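A small sketch of that reasoning, added for illustration (the variable names kept, a and b are only placeholders): dropping the NAs leaves a zero-length vector, and zero is the additive identity, so a sum split into pieces still adds up when one piece is empty.

```r
x <- c(NA_real_, NA_real_, NA_real_, NA_real_)
kept <- x[!is.na(x)]        # what na.rm = TRUE leaves behind
length(kept)                # 0
sum(kept)                   # 0, the additive identity
mean(kept)                  # NaN, i.e. 0 / 0

# An empty sum of 0 keeps sum(c(a, b)) equal to sum(a) + sum(b),
# even when one of the pieces is empty.
a <- c(1, 2, 3)
b <- numeric()
sum(c(a, b)) == sum(a) + sum(b)   # TRUE
```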
If this doesn't suit your needs, then you should put in special checks
for empty datasets.
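One possible form of such a check, as a minimal sketch only: sum_or_na is a hypothetical helper name, not a base R function, and returning NA_real_ for "nothing to sum" is an assumption about what the caller wants.

```r
# Hypothetical wrapper: returns NA instead of 0 when there are no
# non-missing values to add up. Not part of base R.
sum_or_na <- function(x) {
  # all(is.na(x)) is also TRUE for a zero-length x, so a truly empty
  # input is treated the same way as an all-NA input.
  if (all(is.na(x))) {
    return(NA_real_)
  }
  sum(x, na.rm = TRUE)
}

sum_or_na(c(NA_real_, NA_real_))  # NA
sum_or_na(c(1, NA, 2))            # 3
sum_or_na(numeric())              # NA
```

Note that is.na() is also TRUE for NaN, so a vector holding only NaN values would give NA here as well; adjust the test if that distinction matters for your data.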
Duncan Murdoch
The problem in other words:
I have a vector filled with missing numbers. I run the 'sum' function on
it, but instruct it to remove all missing values first. Consequently, the
sum function is left with an empty numeric vector. There is nothing to sum
over, so should it really be able to return a concrete numeric value?
Shouldn't it thus rather return either NA ('unknown'/'missing') or - in the
fashion of the mean function - NaN ('not a number')?
With the current state of affairs, the sum function poses the grave danger
of introducing zeros to one's data (and subsequently other values as well,
as soon as the zeros get taken up in further calculations).
I hope my e-mail finds you well and I wish the R team all of the best for
2017 :)
Kind regards
Alex I. Howard
Web: www.nova.org.za
Phone: +27 (0) 44 695 0749
VoiP: +27 (0) 87 751 3490
Fax: +27 (0) 86 538 7958
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel