[Rd] Creating a Factor Object in C code?

Simon Urbanek simon.urbanek at r-project.org
Fri Dec 28 01:45:12 CET 2012


On Dec 27, 2012, at 5:43 PM, Rory Winston wrote:

> Hi Simon
> 
> Thanks for the clarification - makes sense, and I now think you're right - it is probably better to avoid an automatic factor conversion and let the user convert explicitly if necessary. And you are right, I did abuse the term factor when referring to varchar - instead of factor, I really meant something like 'internalized strings' a la Java (i.e. like a factor but with no ordering or distinct levels attributes).
> 

FWIW all strings are internalized in R (for some years now) - hence character vectors are very memory-efficient and essentially what you were looking for.

Cheers,
Simon
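
For readers unfamiliar with the term, here is a minimal plain-C sketch of what internalizing (interning) strings means: equal strings are stored once and every occurrence shares the same pointer. It illustrates the idea only and is not R's implementation; the names `intern`, `pool` and `POOL_MAX` are hypothetical, and a linear scan stands in for the hash table a real cache would normally use.

```c
#include <stdlib.h>
#include <string.h>

#define POOL_MAX 64            /* hypothetical fixed capacity, for brevity */

static char *pool[POOL_MAX];   /* one stored copy per distinct string */
static size_t pool_len = 0;

/* Return the shared copy of s, storing it on first sight.
   Returns NULL if the pool is full or allocation fails. */
const char *intern(const char *s) {
    for (size_t i = 0; i < pool_len; i++)
        if (strcmp(pool[i], s) == 0)
            return pool[i];            /* already stored: reuse it */
    if (pool_len == POOL_MAX)
        return NULL;
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    pool[pool_len++] = copy;
    return copy;
}
```

With such a pool, a vector holding the same value many times costs one pointer per element plus a single copy of the text, which is why no factor-like encoding is needed just to save memory.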

> 
> 
> On 27/12/2012, at 5:47 PM, Simon Urbanek <simon.urbanek at r-project.org> wrote:
> 
>> varchars are character strings. A factor consists of an index and a level set, so if your DB doesn't keep those separate, it is not a factor (and below you suggest it doesn't). Even if the DB supports ordered and unordered sets, the drivers typically only return the strings anyway, so you don't get at the set (without querying the schema). To make the point concrete - you have a factor if a column can consist of the values A,A,B,B with a level set of A,B,C (i.e. C is not used, so it is extra information that you cannot express in a character vector). If you have neither the levels information nor the order, then it's just a character vector.
> 
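
The A,A,B,B example with levels A,B,C can be sketched in plain C as integer codes plus a separate level set. This is only a model of the idea, not R's C API: the type `factor_t` and the functions `factor_label` and `factor_unused_levels` are hypothetical names. The codes are 1-based here to mirror how R numbers factor levels.

```c
#include <stddef.h>

/* Hypothetical model of a factor: integer codes plus a level set. */
typedef struct {
    const int *codes;            /* one code per value, 1-based index into levels */
    size_t n;                    /* number of values */
    const char *const *levels;   /* the level set, possibly with unused entries */
    size_t n_levels;
} factor_t;

/* Label of value i, or NULL if i or its code is out of range. */
const char *factor_label(const factor_t *f, size_t i) {
    if (i >= f->n)
        return NULL;
    int code = f->codes[i];
    if (code < 1 || (size_t)code > f->n_levels)
        return NULL;
    return f->levels[code - 1];
}

/* Count the levels that no value refers to. */
size_t factor_unused_levels(const factor_t *f) {
    size_t unused = 0;
    for (size_t l = 0; l < f->n_levels; l++) {
        int seen = 0;
        for (size_t i = 0; i < f->n; i++) {
            if (f->codes[i] >= 1 && (size_t)f->codes[i] == l + 1) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            unused++;
    }
    return unused;
}
```

For codes {1, 1, 2, 2} and levels {"A", "B", "C"}, the labels decode to A,A,B,B while the level C stays in the level set unused. A plain array of the four strings has no place to record C, which is the extra information a database driver returning only strings would lose.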


