[Rd] Display lines of code from the top-level script or subscript in non-interactive R Session with Rprof
Alexander Keth
@|ex@nder@keth @end|ng |rom |venturegroup@com
Wed Aug 3 10:44:55 CEST 2022
Hello there,
I am running R in a production environment. My goal is to profile all production jobs, which run in non-interactive R sessions via Rscript, and to get a report of the form "job-xyz ran for xxx amount of time and spent yyy seconds executing line #" (for every line of code). In general, the R code consists of a main script which calls various subscripts. The jobs make heavy use of external packages (e.g. dplyr, DBI, data.table and so on).
I re-installed all packages with --with-keep.source. Subscripts are sourced in the main script via eval(parse("path/to/subscript.R")) to enable line profiling with Rprof. The call to Rprof is Rprof("rprof.out", line.profiling = TRUE, memory.profiling = TRUE).
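A minimal sketch of the setup described above, so the moving parts are explicit. The subscript contents and the temporary file names are hypothetical stand-ins. One assumption worth checking is that the parsed expressions actually carry source references: the keep.source option defaults to interactive(), so it is FALSE under Rscript unless it is set explicitly.

```r
# Hypothetical subscript, written to a temp file so the sketch is self-contained.
sub_path <- tempfile(fileext = ".R")
writeLines(c("x <- 1:10", "y <- cumsum(x)"), sub_path)

# Helper (hypothetical name): does a parsed expression vector carry srcrefs?
has_srcref <- function(exprs) !is.null(attr(exprs, "srcref"))

# Without source references there are no line numbers for Rprof to record.
exprs_without <- parse(sub_path, keep.source = FALSE)

# Request them explicitly; options(keep.source = TRUE) at the top of the
# main script would have the same effect for later parse() calls.
exprs_with <- parse(sub_path, keep.source = TRUE)

# Profile the evaluation of the subscript with line profiling enabled.
# memory.profiling = TRUE can be added as in the post if the R build supports it.
prof_path <- tempfile(fileext = ".out")
Rprof(prof_path, line.profiling = TRUE)
eval(exprs_with)
Rprof(NULL)  # stop profiling
```

The toy subscript finishes far too quickly to produce samples; the point is only the order of the calls and the srcref check.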
Unfortunately, because the code relies so heavily on packages, most of the code lines in the Rprof output refer to source code inside those packages and not to the 'top-level' source code in the main script or the subscripts. So far the only solution I have come up with is to scrape the Rprof output using the profile package (https://github.com/r-prof/profile), extract the top-level function calls from each call stack (after removing the top-level eval calls) and auto-magically match them against the function calls made in the main script and subscripts. However, this process is obviously not perfect and very error-prone...
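For illustration, a different way to do the attribution is to skip the function-name matching and use the file#line tokens in the raw Rprof output directly: for each sample, take the outermost token that belongs to a user script. This is only a sketch under stated assumptions. The function name top_level_line_times and the sample lines are hypothetical, and the parsing assumes the Rprof text layout as I understand it (a header with sample.interval in microseconds, "#File N: path" lines, and stacks written innermost call first).

```r
# Attribute each sample to the outermost source line inside a user script.
# prof_lines: character vector such as readLines("rprof.out")
# user_files: paths of the main script and subscripts (matched by basename,
#             which is a simplification)
top_level_line_times <- function(prof_lines, user_files) {
  # Sampling interval in microseconds, taken from the header line.
  interval_us <- as.numeric(
    sub(".*sample\\.interval=([0-9]+).*", "\\1", prof_lines[1])
  )

  # File table entries: "#File 1: /path/to/file.R"
  body <- prof_lines[-1]
  is_file <- grepl("^#File [0-9]+: ", body)
  file_idx <- sub("^#File ([0-9]+): .*$", "\\1", body[is_file])
  file_path <- sub("^#File [0-9]+: ", "", body[is_file])
  user_idx <- file_idx[basename(file_path) %in% basename(user_files)]

  # Drop quoted function names, then collect the file#line tokens per sample.
  stripped <- gsub('"[^"]*"', "", body[!is_file])
  refs <- regmatches(stripped, gregexpr("[0-9]+#[0-9]+", stripped))

  # Stacks are innermost first, so the last user-script token is the outermost.
  outermost <- vapply(refs, function(r) {
    r <- r[sub("#.*", "", r) %in% user_idx]
    if (length(r)) r[length(r)] else NA_character_
  }, character(1))
  outermost <- outermost[!is.na(outermost)]  # samples with no user line are dropped

  if (!length(outermost)) {
    return(data.frame(file = character(), line = integer(), seconds = numeric()))
  }
  counts <- table(outermost)
  key <- names(counts)
  data.frame(
    file = file_path[match(sub("#.*", "", key), file_idx)],
    line = as.integer(sub(".*#", "", key)),
    seconds = as.numeric(counts) * interval_us / 1e6,
    stringsAsFactors = FALSE
  )
}

# Hand-written sample in the assumed format (not real profiler output).
sample_prof <- c(
  "memory profiling: line profiling: sample.interval=20000",
  "#File 1: main.R",
  "#File 2: pkg/R/helpers.R",
  ':100:200:300:0:2#40 "helper_inner" 2#12 "helper" 1#3 "eval" "eval" ',
  ':100:200:300:0:2#41 "helper_inner" 2#12 "helper" 1#3 "eval" "eval" ',
  ':100:200:300:0:"Sys.sleep" 1#5 "eval" "eval" '
)
res <- top_level_line_times(sample_prof, "main.R")
```

Here res holds one row per line of main.R, with the number of samples converted to seconds. Time spent inside package code is charged to the script line that led to it, which is the per-line figure the report needs.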
Is there any better way to do things?
Cheers,
Alex