Re: [R] multi-column factor

2012-09-17 Thread Hadley Wickham
If you have a million levels, is it really necessary to use a factor? I'm not sure what advantages it will have over a string in this circumstance (especially since you don't seem to know the levels a priori but have to learn them from the data). Hadley On Sunday, September 16, 2012, Sam Steingol

Re: [R] multi-column factor

2012-09-16 Thread Rui Barradas
Hello, The obvious simplification is to call union() only once. With 10M rows it should save time. Then I've asked myself whether unique() wouldn't be faster. f1 <- function(x){ x[[1]] <- factor(x[[1]], levels = union(x[[1]], x[[2]])) x[[2]] <- factor(x[[2]], levels = union(x[[1]], x
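
The simplification described above (computing the shared level set once and reusing it for both columns) might look like the sketch below. The message is cut off in this listing, so the function names f2 and f3, the toy data frame d, and the assumption that x holds two character columns are illustrative only, not the code from the original post. Which variant is faster on 10M rows would have to be timed, for example with system.time().

```r
# Hypothetical sketch: build the level set once, then apply it to both columns.
# Assumes x is a data frame (or list) whose first two elements are character vectors.
f2 <- function(x) {
  lev <- union(x[[1]], x[[2]])            # single call to union()
  x[[1]] <- factor(x[[1]], levels = lev)
  x[[2]] <- factor(x[[2]], levels = lev)
  x
}

# Variant using unique() on the concatenated columns instead of union().
f3 <- function(x) {
  lev <- unique(c(x[[1]], x[[2]]))
  x[[1]] <- factor(x[[1]], levels = lev)
  x[[2]] <- factor(x[[2]], levels = lev)
  x
}

# Toy data standing in for the real 10M-row input.
d <- data.frame(a = c("x", "y", "x"),
                b = c("y", "z", "z"),
                stringsAsFactors = FALSE)
r2 <- f2(d)
r3 <- f3(d)
```

Both columns of the result share one set of levels, so their integer codes are directly comparable.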