I have a data.table with dimensions 100 by 10^7. When I do

    trainIndex <- caret::createDataPartition(df$status, p = .9, list = FALSE, times = 1)
    outerTrain <- df[trainIndex]
    outerTest <- df[-trainIndex]

subsetting the rows of df takes over 20 minutes. What is the best way to subset this efficiently?

Thanks!