On 9/18/21 5:28 AM, Leonard Mada via R-help wrote:
Hello Andrew,
I am adding this information for completeness (so other users can get a
better understanding):
If we want to perform a survival analysis, then the interval should be
closed on the right, but we should also include the first time point (as
per Intention-to-Treat):
[0, 4], (4, 8], (8, 12], (12, 16]
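A minimal sketch of how such intervals can be obtained with cut(). The vector x and the break points below are illustrative assumptions, not data from the thread:

```r
x <- c(0, 2, 4, 5, 8, 12, 16)   # hypothetical time points
breaks <- seq.int(0, 16, 4)

# right = TRUE closes every interval on the right;
# include.lowest = TRUE additionally closes the first interval on the left,
# so the time point 0 is kept instead of becoming NA.
f <- cut(x, breaks, right = TRUE, include.lowest = TRUE)

levels(f)
table(f, useNA = "ifany")
```

With include.lowest = FALSE (the default), the value 0 would fall outside every bin and be returned as NA.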
Perhaps you and Andrew should take this discussion off list...
Bert Gunter
"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
On Fri, Sep 17, 2021 at 3:45 PM Leonard Mada via R-he
The warning should be in cut() => .bincode().
It should be generated whenever a real value (excluding NA, NaN and
+/-Inf) is not included in any of the bins.
If the user writes a script and doesn't want any warnings, they can set
warn = FALSE. But otherwise it would be very helpful to catch
Why would you want to merge different factors?
It makes no sense on real data. Even if some names are the same, the
factors are not the same!
The only real-data application that springs to mind is censoring (right
or left, depending on the choice): but here we have both open and closed
interv
Your objection that "the user has to suspect that some values were not
included" applies equally to your proposed warn option. There are a lot of ways
to introduce NAs... in real projects all analysts should be suspecting this
problem.
On September 17, 2021 3:01:35 PM PDT, Leonard Mada via R
Hello Andrew,
But "cut" generates factors. In most cases with real data one expects the
ends of the interval to be included as well: the argument "include.lowest"
is both ugly and too long.
[The test code on the ftable thread contains this error! I have run
into this error a couple of times.]
The
I disagree, I don't really think it's too long or ugly, but if you think it
is, you could abbreviate it as 'i'.
x <- 0:20
breaks1 <- seq.int(0, 16, 4)
breaks2 <- seq.int(0, 20, 4)
data.frame(
cut(x, breaks1, right = FALSE, i = TRUE),
cut(x, breaks2, right = FALSE, i = TRUE),
check.nam
While it is not explicitly mentioned anywhere in the documentation for
.bincode, I suspect 'include.lowest = FALSE' is the default to keep the
definitions of the bins consistent. For example:
x <- 0:20
breaks1 <- seq.int(0, 16, 4)
breaks2 <- seq.int(0, 20, 4)
cbind(
.bincode(x, breaks1, right
Thank you Andrew.
Is there any reason not to make include.lowest = TRUE the default?
Regarding the NA:
The user still has to suspect that some values were not included and run
that test.
Leonard
On 9/18/2021 12:53 AM, Andrew Simmons wrote:
Regarding your first point, argument 'include.lowest' already handles this
specific case, see ?.bincode
Your second point, maybe it could be helpful, but since both 'cut.default'
and '.bincode' return NA if a value isn't within a bin, you could make
something like this on your own.
Might be worth
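The do-it-yourself check mentioned above could be sketched roughly as follows. The function name cut_warn and its warn argument are hypothetical, chosen only for illustration; they are not part of base R:

```r
# cut_warn() is a hypothetical wrapper, not an existing base R function.
cut_warn <- function(x, breaks, ..., warn = TRUE) {
  res <- cut(x, breaks, ...)
  # Finite inputs that came back as NA were not included in any bin;
  # NA, NaN and +/-Inf inputs are deliberately ignored here.
  missed <- is.finite(x) & is.na(res)
  if (warn && any(missed)) {
    warning(sum(missed), " finite value(s) not included in any bin")
  }
  res
}

x <- 0:20
# Default cut() settings: right = TRUE, include.lowest = FALSE,
# so 0 and 17:20 fall outside the bins (0, 4], ..., (12, 16].
res <- cut_warn(x, seq.int(0, 16, 4), warn = FALSE)
sum(is.na(res))
```

Calling cut_warn(x, seq.int(0, 16, 4)) without warn = FALSE would return the same factor and additionally raise the warning.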
Hello List members,
the following improvements would be useful for function cut (and .bincode):
1.) Argument: Include extremes
extremes = TRUE
if(right == FALSE) {
# include also right for last interval;
} else {
# include also left for first interval;
}
2.) Argument: warn = TRUE
Warn