[R] "too large for hashing"
Adam D. I. Kramer
adik-rhelp at ilovebacon.org
Thu Apr 5 21:07:15 CEST 2012
Thanks for your response, Duncan.
x$eventtype is a "character" vector (because the same hashing error
occurred when I tried to read.table() in the first place specifying
colClasses = c(..., "factor", ...)).
x really is that long:
> dim(x)
[1] 1093574297 12
...the x$eventtype field has three unique values.
(I'm currently using a workaround: building a numeric column from a
chain of ifelse() calls, then setting its class() to "factor" and setting
the labels manually.)
--Adam
On Thu, 5 Apr 2012, Duncan Murdoch wrote:
> On 05/04/2012 2:03 PM, Adam D. I. Kramer wrote:
>> Hello,
>>
>> I'm doing some analysis on a rather large data set. In this case,
>> some simple commands are failing. For example, this one:
>>
>> > x$eventtype<- factor(x$eventtype)
>> Error in unique.default(x) : length 1093574297 is too large for hashing
>>
>> ...I think this is a bug, because "hashing" should not be required for the
>> "factor" function. Am I right? The whole column does not need to be hashed,
>> only the unique keys. Sure, there is the potential to overflow the key
>> register, but this error should be thrown only if that occurs, no?
>
> It looks as though the error is coming when unique() tries to determine the
> unique levels in the argument, but really there's no way to answer your
> question without more information. What type of object is x$eventtype? Is
> it really 1093574297 elements long? How many unique values does it have?
>
> Duncan Murdoch
>