[R] as.factor and floating point numbers

Tobias Fellinger Wed, 25 Jan 2023 01:03:34 -0800

Hello,

I'm encountering the following error:


In a package for survival analysis that I use, a data.frame is created: one 
column is created by applying unique to the event times, while the others are 
created by running table on the event times and the treatment arm.

When two event times are very close together, they are put in the same factor 
level when coerced to factor, while unique returns both values, so the columns 
end up with different lengths.

Try this to reproduce: 
x <- c(1, 1+.Machine$double.eps)
unique(x)
table(x)
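
As far as I can tell, the mismatch arises because table coerces its argument to 
a factor, and the factor levels are built from as.character(x), which keeps 
only about 15 significant digits, whereas unique compares the doubles exactly. 
A small check of that assumption, in base R only:

```r
x <- c(1, 1 + .Machine$double.eps)

# unique() compares the doubles exactly, so both values survive
length(unique(x))        # 2

# as.factor() labels levels via as.character(), which prints both values as "1"
as.character(x)
nlevels(as.factor(x))    # 1

# table() goes through the same factor coercion, so it has a single cell
length(table(x))         # 1

# Printing with 17 significant digits shows the values really do differ
sprintf("%.17g", x)
```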

Is there a general best practice to deal with such issues?

Should calling table on floats be avoided in general?

What can one use instead? 

One could easily iterate over the unique values and compare each of them with 
the whole vector, but that takes N*N comparisons, compared to N*log(N) when 
sorting first and exploiting the fact that the vector is sorted.
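
A minimal sketch of counting without the factor coercion, so that the counts 
stay aligned with unique. It assumes exact comparison of doubles is what is 
wanted; the vectors time and arm are made-up example data, not the package's 
actual inputs. match compares doubles exactly, like unique, and rle on the 
sorted vector is the sort-first variant mentioned above.

```r
# Hypothetical example data: two event times one ulp apart, plus a tied time
time <- c(1, 1 + .Machine$double.eps, 2, 2)
arm  <- c("A", "B", "A", "B")

# Distinct event times, compared exactly as doubles
u <- sort(unique(time))

# Index of each observation's event time within u (exact comparison as well)
idx <- match(time, u)

# Number of observations per distinct event time
n_event <- tabulate(idx, nbins = length(u))

# Counts per event time and arm; the factor is built from integer indices,
# so no information is lost when the levels are converted to character
by_arm <- table(time = factor(idx, levels = seq_along(u)), arm = arm)

# All columns now have length(u) rows
res <- data.frame(time = u,
                  n    = n_event,
                  A    = as.vector(by_arm[, "A"]),
                  B    = as.vector(by_arm[, "B"]))

# Sort-first alternative: run lengths of the sorted vector give the same counts
r <- rle(sort(time))
stopifnot(identical(r$values, u), identical(r$lengths, n_event))
```

Note that the time column of res still prints as 1, 1, 2 at the default number 
of digits, even though the first two values differ.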

I think for my purposes I'll round to a hundredth of a day before calling the 
function, but any advice on avoiding this issue and writing more fault-tolerant 
code is greatly appreciated.
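
The rounding workaround would look roughly like this (a sketch with made-up 
event times; the two-digit precision is my choice for data measured in days):

```r
# Hypothetical event times in days, some differing only far below 0.01 days
time <- c(1, 1 + .Machine$double.eps, 2.004, 2.0041)

# Round to a hundredth of a day before any tabulation
time_r <- round(time, 2)

# unique() and table() now agree on the number of distinct event times
length(unique(time_r))
length(table(time_r))
```

This deliberately merges event times that fall into the same hundredth of a 
day, so it only fits when that resolution is acceptable for the analysis.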

all the best, Tobias



______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
