[R-SIG-Win] R4.2.3 slower than R4.1.3 on Windows only

Tomas Kalibera
Wed May 17 16:07:11 CEST 2023


On 4/18/23 14:16, Fredrik Skoog wrote:
> Hi,
>
> If you run:
>
> library(microbenchmark)
> m <- matrix(rnorm(28000000), nrow=7000, byrow=TRUE)
> rownames(m) <- rownames(m, do.NULL = FALSE, prefix = "this is a row name")
> colnames(m) <- colnames(m, do.NULL = FALSE, prefix = "this is a column name")
> microbenchmark(df <- as.data.frame(m, keep.rownames=TRUE), times=10)
>
> The results show worse performance (and bigger variations) in R 4.2.3
> compared to v4.1.3. v4.2.0 also shows worse performance, so it looks like
> the issue affects 4.2.0 and later. On Linux it's all good, so it seems to
> be a Windows-only issue.
>
> Version 4.2.3
> ==============
>
> Run 1
> ------
> Unit: seconds
>                                          expr      min       lq     mean   median       uq      max neval
>  df <- as.data.frame(m, keep.rownames = TRUE) 1.324839 2.411304 2.760553 2.593452 3.290228 4.263175    10
>
> Run 2
> ------
> Unit: milliseconds
>                                          expr      min       lq     mean   median       uq      max neval
>  dt <- as.data.frame(m, keep.rownames = TRUE) 967.5651   1054.8 1155.453 1149.767 1194.742  1451.14    10
>
>
> Version 4.1.3
> ===============
>
> Run 1:
> ------
>
> Unit: milliseconds
>                                          expr      min       lq     mean   median       uq      max neval
>  df <- as.data.frame(m, keep.rownames = TRUE) 274.5478 298.2477 320.3988 320.9164 342.8119 375.6841    10
>
> Run 2:
> -------
> Unit: milliseconds
>                                          expr      min       lq     mean   median       uq      max neval
>  df <- as.data.frame(m, keep.rownames = TRUE) 278.5369 310.0312 313.0745 313.3275 320.0294 343.7539    10
>
> I have tried it on two different machines, with the same result.
>
> -----
>
> The above example is just something simple that exposes the issue, but
> as.data.table behaves similarly. Also, it shows huge variations in time.
> We had a script that ran in 12 minutes in v3.6.3; it takes 18 minutes
> with v4.2.3 and around 9 minutes with v4.1.3.
>
> Has anyone else noticed this? I noticed in the release notes that Doug
> Lea's malloc was replaced in v4.2.0, and that's a Windows-only change.

Thanks for the report. I confirm the slowdown with this example and I 
confirm it is due to the change in memory allocator: I've switched my 
working copy of R-devel back to the original version of dlmalloc, which 
removed the slowdown.
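
For anyone who wants to repeat the comparison without installing
microbenchmark, the following is a minimal sketch using only base R. The
helper name time_as_data_frame and its arguments n_row, n_col and reps are
hypothetical, and the default matrix is much smaller than the one in the
report, so absolute timings will differ; the dimensions from the report
would be n_row = 7000, n_col = 4000. Running the same script under each
installed R version should show whether the spread between min and max
changes.

```r
# Hypothetical helper (not part of R): time as.data.frame() on a numeric
# matrix with long dimnames, as in the reported example.
# n_row, n_col and reps are illustrative defaults, not values from the report.
time_as_data_frame <- function(n_row = 700, n_col = 400, reps = 5) {
  m <- matrix(rnorm(n_row * n_col), nrow = n_row, byrow = TRUE)
  rownames(m) <- rownames(m, do.NULL = FALSE, prefix = "this is a row name")
  colnames(m) <- colnames(m, do.NULL = FALSE, prefix = "this is a column name")

  elapsed <- vapply(seq_len(reps), function(i) {
    gc()  # collect garbage first so repetitions start from a similar state
    # Same call as in the report; record wall-clock seconds only.
    system.time(df <- as.data.frame(m, keep.rownames = TRUE))[["elapsed"]]
  }, numeric(1))

  c(min = min(elapsed), median = median(elapsed), max = max(elapsed))
}

res <- time_as_data_frame()
print(R.version.string)  # record which R build produced the timings
print(res)               # elapsed seconds: min, median, max
```

The summary is deliberately limited to min, median and max, since the
report is about both the typical time and its variation between runs.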

Windows 10 (build 19041 and later) allows an application to choose the
more recent SegmentHeap allocator instead of the default Low Fragmentation
Heap allocator. With this example it gives almost the same performance as
the original version of dlmalloc, without the maintenance overhead of
using a custom allocator, so this might be one possible solution.

Best
Tomas

>
> Best regards,
>
> Fredrik