[R-pkg-devel] Fast Matrix Serialization in R?

Sameh Abdulah @@meh@@bdu|@h @end|ng |rom k@u@t@edu@@@
Thu May 9 05:20:58 CEST 2024


Hi,

I need to serialize a 20K x 20K matrix and save it as a binary file. This process is about 4x slower in R than in Python.

I'm not sure how best to optimize the code below. Is it possible to parallelize the serialization function to improve performance?


  m <- 20000   # matrix dimension (m x m)
  n <- m^2     # total number of elements
  cat("Generating matrices ... ")
  INI.TIME <- proc.time()
  A <- matrix(runif(n), ncol = m)
  END_GEN.TIME <- proc.time()
  arg_ser <- serialize(object = A, connection = NULL)

  END_SER.TIME <- proc.time()
  con <- file(description = "matrix_file", open = "wb")
  writeBin(object = arg_ser, con = con)
  close(con)
  END_WRITE.TIME <- proc.time()
  con <- file(description = "matrix_file", open = "rb")
  par_raw <- readBin(con, what = raw(), n = file.info("matrix_file")$size)
  END_READ.TIME <- proc.time()
  B <- unserialize(connection = par_raw)
  close(con)
  END_DES.TIME <- proc.time()
  TIME <- END_GEN.TIME - INI.TIME
  cat("Generation time:", TIME[3], "seconds.\n")

  TIME <- END_SER.TIME - END_GEN.TIME
  cat("Serialization time:", TIME[3], "seconds.\n")

  TIME <- END_WRITE.TIME - END_SER.TIME
  cat("Writing time:", TIME[3], "seconds.\n")

  TIME <- END_READ.TIME - END_WRITE.TIME
  cat("Read time:", TIME[3], "seconds.\n")

  TIME <- END_DES.TIME - END_READ.TIME
  cat("Deserialize time:", TIME[3], "seconds.\n")
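
In case it helps to reproduce this quickly, here is a minimal sketch of the same round trip at a much smaller size, with a check that the matrix read back matches the original. The dimension (200) and the temporary file path are illustrative assumptions, not the settings used in the timings above.

```r
# Scaled-down round trip: serialize -> write -> read -> unserialize.
# m = 200 and the tempfile path are illustrative choices only.
m <- 200
A <- matrix(runif(m^2), ncol = m)
path <- tempfile(fileext = ".bin")

# Serialize to a raw vector in memory, then write it out in one call.
ser <- serialize(A, connection = NULL)
con <- file(path, open = "wb")
writeBin(ser, con)
close(con)

# Read the whole file back as raw bytes and rebuild the matrix.
con <- file(path, open = "rb")
raw_in <- readBin(con, what = raw(), n = file.info(path)$size)
close(con)
B <- unserialize(raw_in)

# The round trip should be lossless.
stopifnot(identical(A, B))
unlink(path)
```

The same structure applies at 20K x 20K; only m changes.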




Best,
--Sameh
