[R-pkg-devel] help/advice on debugging
Ivan Krylov
kry|ov@r00t @end|ng |rom gm@||@com
Sun Jul 10 17:54:16 CEST 2022
On Sat, 9 Jul 2022 16:29:57 -0400
Ben Bolker <bbolker using gmail.com> wrote:
> The problem is in vignette rebuilding, errors of this form in both
> of the package vignettes:
>
> Can't join on `x$entrezgene_id` x `y$entrezgene_id` because of
> incompatible types.
> ℹ `x$entrezgene_id` is of type <double>>.
> ℹ `y$entrezgene_id` is of type <character>>.
I'd hazard a guess that both vignettes crash in a call to
entrez_to_symbol (a direct one or via rename_genes). Specifically, its
first argument (`x`) is converted to numeric, then the following
happens:
  df <- data.frame(entrezgene_id = x)
  df <- dplyr::left_join(df, gene_info, by = "entrezgene_id")
gene_info is obtained above, using the following:
  gene_info <- get_biomart_mapping(species, symbol_name, dir_save,
                                   verbose) %>%
    dplyr::group_by(entrezgene_id) %>%
    dplyr::summarise(dplyr::across(dplyr::everything(), dplyr::first))
get_biomart_mapping accesses the Internet using biomaRt::getBM if it
can, but otherwise uses a copy of the information for the human genome
cached inside the package.
There doesn't seem to be any mention of special cases for
"entrezgene_id" in the code of the biomaRt package. biomaRt::getBM
POSTs XML queries to ensembl.org/biomart/martservice?... and parses the
resulting tab-separated values using read.table.
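
To illustrate why the parsing step matters, here is a minimal sketch of
how read.table infers a column's type from its contents. The data,
the `symbol` column and the parse_tsv helper are made up for
illustration, and the arguments biomaRt actually passes to read.table
are not reproduced here:

```r
# Hypothetical tab-separated responses; the values are made up.
tsv_numeric <- "entrezgene_id\tsymbol\n1\tA1BG\n2\tA2M\n"
tsv_mixed   <- "entrezgene_id\tsymbol\n1\tA1BG\nabc\tA2M\n"

# Hypothetical helper: parse tab-separated text with a header row.
parse_tsv <- function(txt) {
  read.table(text = txt, header = TRUE, sep = "\t",
             stringsAsFactors = FALSE)
}

numeric_result <- parse_tsv(tsv_numeric)
mixed_result   <- parse_tsv(tsv_mixed)

# The column type depends on whether every value parses as a number.
print(class(numeric_result$entrezgene_id))
print(class(mixed_result$entrezgene_id))
```

A single non-numeric value is enough to make the whole column
character, which dplyr::left_join would then refuse to match against a
double column.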
My guess is, ensembl.org started returning something that isn't a
number in the entrezgene_id column, and you were the first one to
rebuild the vignette and notice that.
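
One way to check this guess would be to look for values in the column
that do not convert to a number. The sketch below uses a made-up data
frame as a stand-in for what get_biomart_mapping returns; the `symbol`
column and all values are hypothetical:

```r
# Hypothetical stand-in for the result of get_biomart_mapping().
gene_info <- data.frame(
  entrezgene_id = c("1", "2", "abc", NA),
  symbol = c("A1BG", "A2M", "X", "Y"),
  stringsAsFactors = FALSE
)

ids <- gene_info$entrezgene_id
# Conversion warnings are expected here, so silence them.
as_num <- suppressWarnings(as.numeric(ids))
# Keep values that are present but fail to convert to a number.
offending <- unique(ids[!is.na(ids) & is.na(as_num)])
print(offending)
```

If something like this turns up offending values in the real data, the
join keys could be coerced to a common type on both sides, or the
offending rows handled explicitly, before calling dplyr::left_join.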
--
Best regards,
Ivan