[R] How to speed this up?
Jen
plessthanpointohfive at gmail.com
Tue Feb 28 17:22:30 CET 2017
Hi, I'm trying to generate 2.5 million mobile phone numbers (MPNs). The code
below generates a random sample of 250K MPNs for Morocco. It takes about 10
minutes.
I need to generate 2.5 million. I ran the full 2.5 million once and it took
about 45 hours.
Is there a way to speed this up?
Thanks,
Jen
# generate random sample of mobile phone numbers (MPNs) - Morocco
# Mobile phone number format: +212-6xx-xxxxxx
library(data.table)
# country code
cc <- "+212"
# prefixes
IAM <- data.table(matrix(c(610, 611, 613, 615, 616,
                           618, 641, 642, 648, 650, 651, 652, 653,
                           654, 655, 658, 659, 661, 662, 666, 667,
                           668, 670, 671, 672, 673,
                           676, 677, 678), dimnames=list(NULL, "IAM")))
Medi <- data.table(matrix(c(612, 614, 617, 619, 644,
                            645, 649, 656, 657, 660, 663, 664, 665,
                            669, 674, 675, 679), dimnames=list(NULL, "Medi")))
MOROC <- data.table(matrix(c(636, 637), dimnames=list(NULL, "MOROC")))
# combine into one named list of prefix vectors, one element per operator
mno <- c(IAM, Medi, MOROC)
# generate MPNs
MPN <- NULL
system.time(for (i in 1:250000){
  # randomly select one operator, then one prefix from that operator's list
  prefix <- sapply(mno[floor(runif(1, 1, length(mno)+1))],
                   function(x) sample(x, 1))
  MNO <- names(prefix)
  # randomly generate 6 digits between 0 and 9, inclusive
  # (the upper bound must be 10; floor(runif(6, 0, 9)) never returns 9)
  nums <- floor(runif(6, 0, 10))
  # concatenate
  tmp <- c(paste(c(cc, prefix, nums), sep="", collapse=""), MNO)
  MPN[[i]] <- tmp
})
# unlist into a two-column table: the number (MPN) and its operator (MNO)
df <- data.table(matrix(unlist(MPN), nrow=length(MPN), ncol=2, byrow=TRUE,
                        dimnames = list(seq(1, length(MPN), 1), c("MPN", "MNO"))))
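
For comparison, here is a minimal sketch of a vectorized version that draws all numbers in one pass instead of looping row by row. It is only one possible approach, under these assumptions: base R only (no data.table), the operator is still chosen uniformly first and then a prefix uniformly within that operator, duplicates are not removed (as in the loop above), and the last six digits are drawn from 0 to 9 inclusive. The function name gen_mpn and the plain-list form of mno are hypothetical, introduced only for this illustration.

```r
# Prefixes per operator, as a plain named list (same values as above)
mno <- list(
  IAM   = c(610, 611, 613, 615, 616, 618, 641, 642, 648, 650, 651, 652, 653,
            654, 655, 658, 659, 661, 662, 666, 667, 668, 670, 671, 672, 673,
            676, 677, 678),
  Medi  = c(612, 614, 617, 619, 644, 645, 649, 656, 657, 660, 663, 664, 665,
            669, 674, 675, 679),
  MOROC = c(636, 637)
)

# gen_mpn is a hypothetical helper, not part of any package
gen_mpn <- function(n, mno, cc = "+212") {
  # choose an operator index for every row at once (uniform over operators)
  op <- sample.int(length(mno), n, replace = TRUE)

  # choose a prefix within each operator; index with sample.int so that a
  # one-element prefix vector would still be handled correctly
  prefix <- numeric(n)
  for (k in seq_along(mno)) {
    idx <- which(op == k)
    pick <- sample.int(length(mno[[k]]), length(idx), replace = TRUE)
    prefix[idx] <- mno[[k]][pick]
  }

  # six-digit subscriber part: an integer in 0..999999, zero-padded
  tail6 <- sprintf("%06d", sample.int(1000000L, n, replace = TRUE) - 1L)

  data.frame(MPN = paste0(cc, prefix, tail6),
             MNO = names(mno)[op],
             stringsAsFactors = FALSE)
}

# same sample size as the loop above
mpn <- gen_mpn(250000, mno)
```

The only remaining loop runs once per operator (three iterations), not once per phone number, so the work per number is done inside vectorized calls. If a data.table is wanted, the result can be converted afterwards.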
More information about the R-help
mailing list