[R-SIG-Mac] Poor plotting performance on Mac OS X
peter dalgaard
pdalgd at gmail.com
Mon Jul 10 23:35:26 CEST 2017
Thanks Don,
Can you perhaps make that a little closer to being reproducible for people who might want to try it on newer hardware? (packages, location of input file).
(And by the way, xterm? What did Terminal.app do to you?)
This, along with a thread from 2014 that Google dug up, strongly suggests that quartz() is the culprit. It might not be our fault though, as it seems that "matplotlib" (Python) has had similar problems. In the latter case, it seems to have worked to "switch to the agg backend", whatever that means. (AGG is AntiGrain Geometry, which is apparently so stable that it hasn't changed since 2006. Its author died suddenly in 2013, it seems.)
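A minimal sketch of what a self-contained version of such a comparison could look like, assuming synthetic polygons are an acceptable stand-in for the shapefile (they may not reproduce the slowdown). The helper names make_rings, draw_rings and time_device are hypothetical and appear nowhere in the thread. Only base R graphics are used, and pdf() serves as the baseline device:

```r
# Synthetic stand-in for a polygon-heavy map layer (hypothetical data,
# not the ABS shapefile): n_poly closed rings with n_vert vertices each.
make_rings <- function(n_poly = 200, n_vert = 500) {
  theta <- seq(0, 2 * pi, length.out = n_vert)
  lapply(seq_len(n_poly), function(i) {
    cx <- (i - 1) %% 20          # lay the rings out on a 20-column grid
    cy <- (i - 1) %/% 20
    r <- 0.4 + 0.05 * sin(7 * theta)
    cbind(x = cx + r * cos(theta), y = cy + r * sin(theta))
  })
}

# Draw all rings on the current device, either with one polygon() call
# per ring or with a single polypath() call using NA-separated sub-paths.
draw_rings <- function(rings, use_polypath = FALSE) {
  xy <- do.call(rbind, rings)
  plot(range(xy[, "x"]), range(xy[, "y"]), type = "n", asp = 1,
       xlab = "", ylab = "")
  if (use_polypath) {
    sep <- lapply(rings, function(m) rbind(m, c(NA, NA)))
    all_xy <- do.call(rbind, sep)
    all_xy <- all_xy[-nrow(all_xy), ]   # drop the trailing NA row
    polypath(all_xy[, 1], all_xy[, 2], col = "grey80", border = "black")
  } else {
    for (m in rings) polygon(m[, 1], m[, 2], col = "grey80", border = "black")
  }
  invisible(NULL)
}

# Time one drawing pass on a device opened by open_dev(); the device is
# closed again even if drawing fails.
time_device <- function(open_dev, rings, use_polypath = FALSE) {
  open_dev()
  on.exit(dev.off())
  system.time(draw_rings(rings, use_polypath))[["elapsed"]]
}

rings <- make_rings()
pdf_file <- tempfile(fileext = ".pdf")
timings <- c(
  pdf_polygon  = time_device(function() pdf(pdf_file), rings, FALSE),
  pdf_polypath = time_device(function() pdf(pdf_file), rings, TRUE)
)
print(timings)

# On a Mac, the same helper can be pointed at the on-screen devices, e.g.
#   time_device(function() quartz(), rings)
#   time_device(function() x11(type = "cairo"), rings, TRUE)
# quartz() exists only in macOS builds of R, so it is left commented out.
```

Passing the device as a function keeps the drawing code identical across devices, so any difference in elapsed time should come from the device itself rather than from the plotting call.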
-pd
> On 10 Jul 2017, at 19:59 , MacQueen, Don <macqueen1 at llnl.gov> wrote:
>
> For what it's worth, here is my experience on a late 2013 Mac Pro.
> (I normally run R from an xterm shell within an X Windows context)
>
> The best performance for displaying the image on-screen uses cairographics and Polypath. It's the only one fast enough to be satisfactory for interactive use, in my opinion [though x11(type='Xlib') could be tolerated]. pdf() is even faster.
>
> -Don
>
>> nsw <- readOGR('data','SA3_2016_AUST', stringsAsFactors=FALSE)
> OGR data source with driver: ESRI Shapefile
> Source: "data", layer: "SA3_2016_AUST"
> with 358 features
> It has 9 fields
> Warning message:
> In readOGR("data", "SA3_2016_AUST", stringsAsFactors = FALSE) :
> Dropping null geometries: 93, 94, 161, 162, 245, 246, 275, 276, 311, 312, 328, 329, 339, 340, 351, 352, 357, 358
>
> ## The "original" X windows device (does not support Polypath)
>> x11(type='Xlib')
>> system.time( plot(nsw, usePolypath=FALSE) )
> user system elapsed
> 0.882 0.512 99.618
>
> ## a more "modern" version of the X windows device
>> x11(type='cairo')
>> system.time( plot(nsw, usePolypath=FALSE) )
> user system elapsed
> 2.233 5.007 68.410
>
>
> ## Polypath gives large improvement
>> x11(type='cairo')
>> system.time( plot(nsw, usePolypath=TRUE) )
> user system elapsed
> 1.772 0.461 5.785
>
> ## Same R session as above, but using the quartz device
>> quartz()
>> system.time( plot(nsw) )
> user system elapsed
> 1135.606 1.556 1137.445
>
>> pdf('test.pdf')
>> system.time( plot(nsw) )
> user system elapsed
> 2.029 0.200 2.248
>> dev.off()
>
>
>> sessionInfo()
> R version 3.4.1 (2017-06-30)
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> Running under: OS X El Capitan 10.11.6
>
> Matrix products: default
> BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
> LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] rgdal_1.2-8 sp_1.2-5
>
> loaded via a namespace (and not attached):
> [1] compiler_3.4.1 tools_3.4.1 grid_3.4.1 lattice_0.20-35
>
>
> --
> Don MacQueen
>
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
>
>
> On 7/10/17, 4:40 AM, "R-SIG-Mac on behalf of Ashley Betts" <r-sig-mac-bounces at r-project.org on behalf of Ashley.Betts at saltbushsoftware.com> wrote:
>
> Hi Peter, outputting to PDF made a huge difference! It ran for only 15 seconds and there was no trailing unresponsive prompt. The PDF ended up being around 2 MB and opened and displayed almost immediately in Preview.
>
> I did watch the system when I was outputting to the quartz device last time and the process was steadily consuming a single core. When I watched it on the Windows machines I saw it use most of the cores but not to capacity. After seeing this I read up on multithreading libraries on Mac and finally set up R to use the Accelerate library. I've verified this library is being loaded by sampling the process. It makes zero difference, however, and the print(plt) still only consumes a single core.
>
> Regards,
>
> Ashley
>
>> On 10 Jul 2017, at 6:21 PM, peter dalgaard <pdalgd at gmail.com> wrote:
>>
>> Pretty clear that the process is getting stuck in Apple-graphics land, then. This could be inefficiency of the device driver, but also just ... Apple. Could you try running the same thing to a PDF (AFAIR, just open the device with pdf(file="myplot.pdf"), then print(plt), then dev.off()). It would be good to know if this is fast, and also whether viewing the resulting PDF in Preview is slow (in which case it is Not Our Problem).
>>
>> Also, does running the Activity Monitor give any clues? Like, perhaps you are running out of memory.
>>
>> -pd
>>
>>> On 10 Jul 2017, at 00:05 , Ashley Betts <Ashley.Betts at saltbushsoftware.com> wrote:
>>>
>>> Oh yes, sorry about that. I originally had screen shots attached showing the timings but the email ended up being too large. All of the time is in the print. Nearly all other commands run within seconds. Oddly, after approximately half an hour the prompt returns, at which point I get one Sys.time() to execute, but then the prompt hangs for the best part of an hour and a half when I enter the second Sys.time().
>>>
>>> I tried to profile but that failed. I tried sampling the process a number of times and every time I sampled, execution was buried in CGContextDrawPath:
>>> GEPolygon (in libR.dylib) + 127 [0x101cb54df] engine.c:0
>>> + 2502 clipPolygon (in libR.dylib) + 571 [0x101cb574b] engine.c:1080
>>> + 2502 CGContextDrawPath (in CoreGraphics) + 181 [0x7fff8d433e59]
>>> + 2502 ripc_DrawPath (in libRIP.A.dylib) + 417 [0x7fff8ec631a3]
>>> + 2502 ripc_Render (in libRIP.A.dylib) + 380 [0x7fff8ec4f750]
>>> + 2502 RIPRenderCoverage (in libRIP.A.dylib) + 1844 [0x7fff8ec4ff84]
>>>
>>>
>>> Regards,
>>>
>>> Ashley
>>>
>>> <macplottimes.jpg>
>>>
>>>
>>>> On 9 Jul 2017, at 9:35 PM, peter dalgaard <pdalgd at gmail.com> wrote:
>>>>
>>>> Hmm, you're not telling us much about where the time is being spent. Some more detailed timing using system.time() could be useful.
>>>>
>>>> If it is a graphics device issue, I would expect almost everything in the final print(plt). You could try switching graphics device, e.g. to pdf() which should be pretty much the same on all platforms. You might also try creating PDF files on one machine and displaying on the other.
>>>>
>>>> -pd
>>>>
>>>>> On 9 Jul 2017, at 12:45 , Ashley Betts <Ashley.Betts at saltbushsoftware.com> wrote:
>>>>>
>>>>> Hi All,
>>>>> I'm quite new to R and recently started investigating its geospatial plotting capabilities via ggplot2. I started by using some of the publicly available datasets from the Australian Bureau of Statistics. Plotting the Level 3 Statistical Area boundaries took over 2 hours on my 2012 MacBook Pro. As there were over 3M rows in the fortify'ed data frame, I initially thought this was just how long it must take. I then ran the exact same script on my work laptop, which is similarly spec'ed, and it ran in approximately 30 seconds. This now has me extremely disappointed in the performance on the Mac, which is where I use R the most.
>>>>>
>>>>> I changed my BLAS library to the Accelerate library on a whim that this might make a difference. It did not. Whilst I primarily use RStudio, I also ran the same script in R.app, and if there was any improvement it was not noticeable. I did notice in the Windows run that it seemed to use multiple cores (which is what made me investigate the BLAS change), whilst the Mac seems to stay bound to a single core. My initial thoughts were that it must be something to do with ggplot, but after sampling the rsession process a number of times (see attached Sample of rsession.txt) it appears to be spending most of its time in CGContextDrawPath in Apple's CoreGraphics, so I assume it is a graphics-related issue.
>>>>>
>>>>> I'm running R 3.4 on my Mac and 3.3.2 on the Windows machine. I've attached the script and have screen dumps of the process sample text and a number of others which I can supply if helpful in analysing the issue. Could someone possibly let me know if this is a PEBCAK issue or an actual problem with R? If the latter, how do I go about getting the issue resolved?
>>>>>
>>>>> The SA3 boundary data is available here:
>>>>>
>>>>> http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/1270.0.55.001July%202016?OpenDocument
>>>>>
>>>>> as 'Statistical Area Level 3 (SA3) ASGS Ed 2016 Digital Boundaries in ESRI Shapefile Format'
>>>>>
>>>>> Regards,
>>>>>
>>>>> Ashley
>>>>>
>>>>> <aus_pop_analysis.R>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> R-SIG-Mac mailing list
>>>>> R-SIG-Mac at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mac
>>>>
>>>> --
>>>> Peter Dalgaard, Professor,
>>>> Center for Statistics, Copenhagen Business School
>>>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>>>> Phone: (+45)38153501
>>>> Office: A 4.23
>>>> Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
>>>
>>> Ashley Betts
>>>
>>> Saltbush Software
>>> Excellence in Software Engineering Practices
>>>
>>> email: Ashley.Betts at saltbushsoftware.com
>>> Ashley.Betts at sbsw.com.au
>>> web: http://www.saltbushsoftware.com
>>> http://www.sbsw.com.au
>>>
>>
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com