[Rd] Minor inconsistencies in tools:::funAPI()

Ivan Krylov |kry|ov @end|ng |rom d|@root@org
Mon Jul 15 16:31:31 CEST 2024


Hi all,

I've noticed some peculiarities in the tools:::funAPI output that
complicate its programmatic use a bit.

 - Is the output meant to list remapped symbol names (with Rf_ or the
   Fortran underscore), or unmapped names (without Rf_ or the
   underscore)?

I see that the functions marked in WRE are almost all (except
Rf_installChar and Rf_installTrChar) unmapped. This makes a lot of
sense because some of those interfaces (e.g. CONS(), CHAR(),
NOT_SHARED()) are C preprocessor macros, not functions. I also see that
installTrChar is not explicitly marked.

Are we allowed to call tools:::unmap(tools:::funAPI()$name) and
consider the return value to be the list of all unmapped APIs, despite,
e.g., installTrChar not being explicitly marked?

 - Should R_PV be an @apifun if it's currently caught by checks in
   sotools.R?

 - Should R_FindSymbol be commented /* Not API */ if it's marked as
   @apifun in WRE and not caught by sotools.R? It is currently used by 8
   CRAN packages.

 - The names 'select', 'delztg' from R_ext/Lapack.h are function
   pointer arguments, not functions or type declarations. They are
   being found because funcRegexp is written to match incomplete
   function declarations (e.g. when they end up being split over
   multiple lines, like in R_ext/Lapack.h), and function pointer
   argument declarations look sufficiently similar.
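
To illustrate the last point, here is a minimal sketch of how a
line-based pattern that tolerates incomplete declarations can also pick
up a function pointer argument. The pattern and the sample lines are
hypothetical simplifications for illustration; this is not the actual
funcRegexp from tools, and example_solver is an invented name.

```r
# Hypothetical header fragment: one complete declaration, and one
# declaration split over several lines with a function pointer argument.
lines <- c(
  "double R_pow(double x, double y);",
  "void example_solver(const char *job,",
  "    int (*select)(const double *, const double *),",
  "    int *n);"
)

# Simplified, assumed pattern (not tools' funcRegexp): a return type,
# then a name that may be wrapped as (*name), then an opening
# parenthesis. No closing parenthesis or semicolon is required, so
# declarations that continue on the next line still match.
pattern <- "^\\s*(?:\\w+[\\s*]+)+\\(?\\*?(\\w+)\\)?\\s*\\("

m <- regmatches(lines, regexec(pattern, lines, perl = TRUE))
# Take the captured name from each line, or NA where nothing matched
found <- vapply(
  m,
  function(x) if (length(x) >= 2) x[2] else NA_character_,
  character(1),
  USE.NAMES = FALSE
)
print(found)
```

With this assumed pattern, the third line should report "select" as if
it were a function name, because nothing on that line alone
distinguishes it from the start of a declaration.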

A relatively compact (but still brittle) way to match function
declarations in C header files is shown at the end of this message. I
have confirmed that compared to tools:::getFunsHdr, the only extraneous
symbols that it finds in preprocessed headers are "R_SetWin32",
"user_unif_rand", "user_unif_init", "user_unif_nseed",
"user_unif_seedloc", "user_norm_rand", which are special-cased in
tools:::getFunsHdr, and the only symbols it doesn't find are "select"
and "delztg" in R_ext/Lapack.h, which we should not be finding.

# "Bird's eye" view, gives unmapped names on non-preprocessed headers
getdecl <- function(file, lines = readLines(file)) {
	# have to combine to perform multi-line matches
	lines <- paste(c(lines, ''), collapse = '\n')
	# first eat the C comments, dotall but non-greedy match
	lines <- gsub('(?s)/\\*.*?\\*/', '', lines, perl = TRUE)
	# C++-style comments too, multiline not dotall
	lines <- gsub('(?m)//.*$', '', lines, perl = TRUE)
	# drop all preprocessor directives
	lines <- gsub('(?m)^\\s*#.*$', '', lines, perl = TRUE)

	rx <- r"{(?xs)
		(?!typedef)(?<!\w) # please no typedefs
		# return type with attributes
		(
			# words followed by whitespace or stars
			(?: \w+ (?:\s+ | \*)+)+
		)
		# function name, assumes no extra whitespace
		(
			\w+\(\w+\) # macro call
			| \(\w+\)  # in parentheses
			| \w+      # a plain name
		)
		# arguments: non-greedy match inside parentheses
		\s* \( (.*?) \) \s* # using dotall here
		# will include R_PRINTF_FORMAT(1,2 but we don't care
		# finally terminated by semicolon
		;
	}"

	regmatches(lines, gregexec(rx, lines, perl = TRUE))[[1]][3,]
}

# Preprocess then extract remapped function names like getFunsHdr
getdecl2 <- function(file)
	file |>
	readLines() |>
	grep('^\\s*#\\s*error', x = _, value = TRUE, invert = TRUE) |>
	tools:::ccE() |>
	getdecl(lines = _)
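
For anyone who wants to check the comment- and directive-stripping
passes in getdecl() in isolation, here is a minimal sketch that applies
the same three gsub() calls to a small in-memory header. The header
content is made up for illustration.

```r
# A tiny in-memory "header" (hypothetical content, for illustration only)
hdr <- c(
  "/* a block comment",
  "   spanning two lines */",
  "#include <stddef.h>",
  "int foo(int x); // trailing comment",
  "double bar(void);"
)

# Combine into one string so patterns can match across line breaks
txt <- paste(c(hdr, ""), collapse = "\n")
# (?s): '.' also matches newlines; non-greedy '.*?' stops at the first '*/'
txt <- gsub("(?s)/\\*.*?\\*/", "", txt, perl = TRUE)
# (?m): '$' matches at each line end, so only the rest of that line is removed
txt <- gsub("(?m)//.*$", "", txt, perl = TRUE)
# (?m): '^' matches at each line start; drops preprocessor directives
txt <- gsub("(?m)^\\s*#.*$", "", txt, perl = TRUE)

cat(txt)
```

Only the two declarations should remain, which is the text the large
regular expression in getdecl() is then matched against.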

-- 
Best regards,
Ivan


