[R] failure with merge

William Dunlap wdunlap at tibco.com
Thu Jul 14 17:12:17 CEST 2016


This looks like a common problem with do.call("order", dataFrame).
do.call() passes the column names along as argument names, so a column
whose name matches a formal argument of order() (here, "method") is taken
as that argument instead of as a sort key.  The solution is to use
do.call("order", unname(dataFrame)).
E.g.,
  > do.call("order", tuneAcc)
  Error in match.arg(method) : 'arg' must be NULL or a character vector
  > do.call("order", unname(tuneAcc))
  [1] 1 2
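
For anyone who wants to reproduce the name collision without the full
tuneAcc object, here is a minimal sketch. The data frame `df` and the
variables `named` and `unnamed` are made up for illustration; only order(),
do.call(), unname() and tryCatch() are base R.

```r
# A small data frame with a column called "method", which is also the name
# of a formal argument of order().
df <- data.frame(select = c(TRUE, FALSE),
                 method = factor(c("GCV.Cp", "GCV.Cp")))

# do.call() uses the column names as argument names, so this call is
# effectively order(select = <logical>, method = <factor>).  The factor is
# handed to match.arg() inside order(), which rejects it.
named <- tryCatch(do.call("order", df),
                  error = function(e) conditionMessage(e))

# With the names dropped, every column is passed positionally through `...`
# and is used as a sort key.
unnamed <- do.call("order", unname(df))

print(named)
print(unnamed)
```

Wrapping the first call in tryCatch() just keeps the script running so
that both results can be compared side by side.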

This is probably buried in the code for merge().  You can work around
it by changing the name of the "method" column.
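
A rough sketch of that renaming workaround follows. The helper rename_col()
and the temporary column name "method_" are hypothetical, and the data
frames are cut-down stand-ins for the ones in the original post (RMSE values
rounded, other columns dropped).

```r
# Cut-down stand-ins for the two data frames in the original post.
tuneAcc <- data.frame(select = c(FALSE, TRUE),
                      method = factor(c("GCV.Cp", "GCV.Cp")),
                      RMSE   = c(29.21, 28.97))
finalTune <- data.frame(select   = TRUE,
                        method   = factor("GCV.Cp"),
                        Selected = "*")

# Hypothetical helper: rename one column of a data frame.
rename_col <- function(d, from, to) {
  names(d)[names(d) == from] <- to
  d
}

# Rename "method" in both inputs so that no key column shares a name with
# an argument of order(), merge, then restore the original name.
x <- rename_col(tuneAcc,   "method", "method_")
y <- rename_col(finalTune, "method", "method_")
merged <- merge(x = x, y = y, all.x = TRUE)
merged <- rename_col(merged, "method_", "method")

print(merged)
```

Both inputs have to be renamed the same way, since merge() picks its key
columns from the names the two data frames have in common.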


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Jul 14, 2016 at 7:51 AM, Max Kuhn <mxkuhn at gmail.com> wrote:

> I am merging two data frames:
>
> tuneAcc <- structure(list(select = c(FALSE, TRUE), method =
> structure(c(1L, 1L), .Label = "GCV.Cp", class = "factor"), RMSE =
> c(29.2102056093962, 28.9743318817886), Rsquared =
> c(0.0322612161559773, 0.0281713457306074), RMSESD = c(0.981573768028697,
> 0.791307778398384), RsquaredSD = c(0.0388188469162352,
> 0.0322578925071113)),
> .Names = c("select", "method", "RMSE", "Rsquared", "RMSESD",
> "RsquaredSD"),
> class = "data.frame", row.names = 1:2)
>
> finalTune <- structure(list(select = TRUE, method = structure(1L,
> .Label = "GCV.Cp", class = "factor"), Selected = "*"), .Names =
> c("select", "method", "Selected"), row.names = 2L, class = "data.frame")
>
> using
>
>    merge(x = tuneAcc, y = finalTune, all.x = TRUE)
>
> The error is
>
>   "Error in match.arg(method) : 'arg' must be NULL or a character vector"
>
> This is R version 3.3.1 (2016-06-21), Platform: x86_64-apple-darwin13.4.0
> (64-bit), Running under: OS X 10.11.5 (El Capitan).
>
> <some digging>
>
> These do not stop execution:
>
>   merge(x = tuneAcc, y = finalTune)
>   merge(x = tuneAcc, y = finalTune, all.x = TRUE, sort = FALSE)
>
> The latter produces (what I consider to be) incorrect results.
>
> Walking through the code, the original call with just `all.x = TRUE` fails
> when sorting at the line:
>
>   res <- res[if (all.x || all.y)
>     do.call("order", x[, seq_len(l.b), drop = FALSE]) else
>      sort.list(bx[m$xi]), , drop = FALSE]
>
> Specifically, on the `do.call` bit. For these data:
>
>   Browse[3]> x
>   select method RMSE Rsquared RMSESD RsquaredSD
>   2 TRUE GCV.Cp 28.97433 0.02817135 0.7913078 0.03225789
>   1 FALSE GCV.Cp 29.21021 0.03226122 0.9815738 0.03881885
>
>
>   Browse[3]> x[, seq_len(l.b), drop = FALSE]
>   select method
>   2 TRUE GCV.Cp
>   1 FALSE GCV.Cp
>
> and this line executes:
>
>   Browse[3]> order(x[, seq_len(l.b), drop = FALSE])
>   [1] 1 2 3 4
>
> although nrow(x) is 2, so four indices cannot be a valid row ordering.
>
> Calling it this way stops execution:
>
> Browse[3]> do.call("order", x[, seq_len(l.b), drop = FALSE])
> Error in match.arg(method) : 'arg' must be NULL or a character vector
>
> Thanks,
>
> Max