[R] Efficient way to subset rows in R for dataset with 10^7 columns
Jeff Newmiller
jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Sat Apr 14 03:08:27 CEST 2018
You have 10^7 columns? That process is bound to be slow.
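
A data.table (like a data.frame) is a list of column vectors, so taking a row subset means subsetting each of the 10^7 columns separately. If the feature columns are all of one type, a matrix may be cheaper to subset, because it is stored as a single vector. The following is only a rough sketch under that assumption: the sizes are scaled down, the names `features`, `trainX`, `testX`, `trainY` and `testY` are made up for illustration, and a base-R stratified sample stands in for caret::createDataPartition().

```r
# Scaled-down stand-in for the real data (hypothetical sizes):
# 100 rows, 1e4 numeric feature columns, plus a two-class outcome.
set.seed(1)
n_rows <- 100
n_cols <- 1e4
status <- factor(rep(c("a", "b"), each = n_rows / 2))
features <- matrix(rnorm(n_rows * n_cols), nrow = n_rows, ncol = n_cols)

# Stand-in for caret::createDataPartition(status, p = .9, list = FALSE):
# draw 90% of the row indices within each class.
rows_by_class <- split(seq_len(n_rows), status)
trainIndex <- sort(unlist(
  lapply(rows_by_class, function(idx) sample(idx, size = round(0.9 * length(idx)))),
  use.names = FALSE
))

# Row subsetting on a matrix is one operation, not one per column.
trainX <- features[trainIndex, , drop = FALSE]
testX  <- features[-trainIndex, , drop = FALSE]
trainY <- status[trainIndex]
testY  <- status[-trainIndex]
```

With an index that really comes from createDataPartition(..., list = FALSE), which returns a one-column matrix, wrapping it in as.vector() before indexing keeps the intent explicit. This approach does not apply if the columns have mixed types, since a matrix would coerce them all to a single type.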
On April 13, 2018 5:31:32 PM PDT, Jack Arnestad <jackarnestad using gmail.com> wrote:
>I have a data.table with dimensions 100 by 10^7.
>
>When I do
>
>    trainIndex <-
>      caret::createDataPartition(
>        df$status,
>        p = .9,
>        list = FALSE,
>        times = 1
>      )
>    outerTrain <- df[trainIndex]
>    outerTest <- df[-trainIndex]
>
>Subsetting the rows of df takes over 20 minutes.
>
>What is the best way to efficiently subset this?
>
>Thanks!
>