[Rd] conflict between rJava and data.table

Matthew Dowle mdowle at mdowle.plus.com
Fri Mar 1 23:03:09 CET 2013


On 01.03.2013 20:19, Simon Urbanek wrote:
> On Mar 1, 2013, at 11:40 AM, Matthew Dowle wrote:
>
>> On 01.03.2013 16:13, Simon Urbanek wrote:
>>> On Mar 1, 2013, at 8:03 AM, Matthew Dowle wrote:
>>>
>>>>
>>>> Simon Urbanek wrote :
>>>>> Can you elaborate on the details as to where this will be a problem?
>>>>> Packages should not be affected since they should be importing the
>>>>> namespaces from the packages they use, so the only problem would be
>>>>> in a package that uses both data.table and rJava, and this is easily
>>>>> resolved in the namespace of such a package. So there is no technical
>>>>> reason why you can't have multiple definitions of J - that's what
>>>>> namespaces are for.
>>>>
>>>> Right. It's users using J() in their own code, iiuc. rJava's manual
>>>> says "J is the high-level access to Java."  When they use J() on its
>>>> own they probably want the rJava one, but if data.table is higher on
>>>> the search path they get that one.
>>>> They don't want to have to write out rJava::J(...).
>>>>
>>>> It is not just rJava but package XLConnect, too. If there's a better
>>>> way I would be interested, but I didn't mind removing J from
>>>> data.table.
>>>>
>>>
>>> For packages there is really no issue - if something breaks in
>>> XLConnect then the authors are probably importing the wrong function
>>> in their namespace (I still didn't see a reproducible example,
>>> though). The only difference is for interactive use so not having
>>> conflicting J() [if possible] would be actually useful there, since
>>> J() in rJava is primarily intended for interactive use.
>>
>> Yes that's what I wrote above isn't it? i.e.
>>
>>> It's users using J() in their own code, iiuc.
>>> "J is the high-level access to Java."
>>
>> Not just interactive use (i.e. at the R prompt) but inside their
>> functions and scripts, too.
>> Although, I don't know the rJava package at all. So why J() might be
>> used for interactive use but not in functions and scripts isn't clear
>> to me.
>> Any use of J from example(J) will serve as a reproducible example;
>> e.g.,
>>
>>    library(rJava)          # load rJava first
>>    library(data.table)     # then data.table
>>    J("java.lang.Double")
>>
>> There is no error or warning, but the user would be returned a 1 row
>> 1 column data.table rather than something related to Java. Then the
>> errors/warnings follow from there.
>>
>> The user can either load the packages the other way around, or, use ::
>>
>>    library(rJava)                  # load rJava first
>>    library(data.table)             # then data.table
>>    rJava::J("java.lang.Double")    # ok now
>>
>
> Matt,
>
> there are two entirely separate uses
>
> a) interactive use
> b) use in packages
>
> you are describing a) and as I said in the latter part above J() in
> rJava is meant for that so it would be useful to not have a conflict
> there.

Yes (a) is the problem. Good, so I did the right thing in July 2012
by starting to deprecate J in data.table when this problem was first
reported.

> However, in my first part of the e-mail I was referring to b) where
> there is no conflict, because packages define which package a symbol
> will come from, so the user search path plays no role. Today, all
> packages should be using imports so search path pollution should no
> longer be an issue, so the order in which the user attached packages
> to their search path won't affect the functionality of the packages
> (that's why namespaces are mandatory). Therefore, if XLConnect breaks
> (again, I don't know, I didn't see it) due to the order on the search
> path, it indicates there is a bug in its namespace as it's
> apparently importing the wrong J - it should be importing it from
> rJava and not data.table. Is that more clear?

Yes, thanks. (b) isn't a problem. rJava and XLConnect aren't breaking,
the users aren't reporting that. It's merely problem (a); e.g. where
end users of both rJava and data.table use J() in their own code.
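
To make (a) concrete, here is a minimal base-R sketch that needs neither
package installed. The two lists are hypothetical stand-ins for rJava and
data.table (the "fake:" search path names and the function bodies are
invented for illustration). It only shows how attach order decides which
J() a bare call finds, and how a user can pin the one they mean.

```r
# Hypothetical stand-ins: each "package" offers a function called J.
pkg_java  <- list(J = function(x) paste("java class:", x))
pkg_table <- list(J = function(x) data.frame(V1 = x))

# Attach in the same order as the example: the Java one first, then the
# table one. Each attach() goes in at position 2, so the later one sits
# ahead of the earlier one on the search path and masks it.
attach(pkg_java,  name = "fake:rJava")
attach(pkg_table, name = "fake:data.table")

# A bare J() silently resolves to the most recently attached definition.
masked <- class(J("java.lang.Double"))

# Pin the intended J in the global environment, which is searched
# before anything that has been attached.
J <- get("J", pos = "fake:rJava")
pinned <- class(J("java.lang.Double"))

# Clean up the search path and the pinned alias.
detach("fake:data.table", character.only = TRUE)
detach("fake:rJava", character.only = TRUE)
rm(J)
```

With the real packages the equivalent of the pinning step would be
J <- rJava::J at the top of the user's script, or writing rJava::J(...)
each time. For case (b), a package that wants rJava's J would declare
importFrom(rJava, J) in its NAMESPACE file, so attach order never
matters there.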

>
> Cheers,
> Simon
>
>
>>
>>>
>>> Cheers,
>>> Simon
>>>
>>>
>>>> Bunny/Matt,
>>>>
>>>> To add to Steve's reply here's some background. This is well
>>>> documented in NEWS and Googling "data.table J rJava" and similar
>>>> returns useful links to NEWS and datatable-help (so you shouldn't
>>>> have needed to post to r-devel).
>>>>
>>>> From 1.8.2 (Jul 2012) :
>>>>
>>>> o  The J() alias is now deprecated outside DT[...], but will still
>>>>  work inside DT[...], as in DT[J(...)].
>>>>  J() is conflicting with function J() in package XLConnect (#1747)
>>>>  and rJava (#2045). For data.table to change is easier, with some
>>>>  efficiency advantages too. The next version of data.table will
>>>>  issue a warning from J() when used outside DT[...]. The version
>>>>  after will remove it. Only then will the conflict with rJava and
>>>>  XLConnect be resolved.
>>>>  Please use data.table() directly instead of J(), outside DT[...].
>>>>
>>>> From 1.8.4 (Nov 2012) :
>>>>
>>>> o  J() now issues a warning (when used *outside* DT[...]) that using
>>>>  it outside DT[...] is deprecated. See item below in v1.8.2.
>>>>  Use data.table() directly instead of J(), outside DT[...]. Or,
>>>>  define an alias yourself. J() will continue to work *inside*
>>>>  DT[...] as documented.
>>>>
>>>> From 1.8.7 (soon to be on CRAN) :
>>>>
>>>> o  The J() alias is now removed *outside* DT[...], but will still
>>>>  work inside DT[...]; i.e., DT[J(...)] is fine. As warned in v1.8.2
>>>>  (see below in this file) and deprecated with warning() in v1.8.4.
>>>>  This resolves the conflict with function J() in package XLConnect
>>>>  (#1747) and rJava (#2045).
>>>>  Please use data.table() directly instead of J(), outside DT[...].
>>>>
>>>> Matthew
>>>>
>>>>
>>>>
>>
>>


