You have 10^7 columns? That process is bound to be slow.

On April 13, 2018 5:31:32 PM PDT, Jack Arnestad <jackarnes...@gmail.com> wrote:
>I have a data.table with dimensions 100 by 10^7.
>
>When I do
>
>    trainIndex <-
>        caret::createDataPartition(
>            df$status,
>            p = .9,
>            list = FALSE,
>            times = 1
>        )
>    outerTrain <- df[trainIndex]
>    outerTest <- df[-trainIndex]
>
>Subsetting the rows of df takes over 20 minutes.
>
>What is the best way to efficiently subset this?
>
>Thanks!
>
>______________________________________________
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
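
A rough sketch of why the column count matters, and one possible way around it. A data.table (like a data.frame) is a list of column vectors, so taking a row subset means subsetting each of the 10^7 columns separately. If the columns are all numeric (an assumption, the original post does not say), holding the values in a plain matrix lets R take the rows in a single indexing operation. The example below uses a much smaller stand-in with 1e4 columns, and a plain random split from base sample() rather than the stratified index from caret::createDataPartition(); the names m, status, trainStatus and testStatus are made up for illustration.

```r
# Small stand-in for the real data: 100 rows, 1e4 columns (the original has 1e7).
set.seed(1)
n_rows <- 100
n_cols <- 1e4
m <- matrix(rnorm(n_rows * n_cols), nrow = n_rows, ncol = n_cols)

# Hypothetical outcome variable, kept outside the numeric matrix.
status <- factor(sample(c("a", "b"), n_rows, replace = TRUE))

# Plain random 90/10 split of the row numbers. An index returned by
# caret::createDataPartition(..., list = FALSE) could be used instead
# after converting it with as.vector().
trainIndex <- sort(sample(n_rows, size = round(0.9 * n_rows)))

# One indexing operation on the whole matrix, instead of one per column.
outerTrain <- m[trainIndex, , drop = FALSE]
outerTest  <- m[-trainIndex, , drop = FALSE]

# The outcome is split with the same index.
trainStatus <- status[trainIndex]
testStatus  <- status[-trainIndex]
```

At the full size a 100 by 10^7 numeric matrix needs roughly 8 GB, and each subset is a further copy, so this only helps if memory allows. Storing the data the other way round (10^7 rows by 100 columns) is another option if the analysis permits it.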
-- 
Sent from my phone. Please excuse my brevity.