On Dec 27, 2012, at 5:43 PM, Rory Winston wrote:

> Hi Simon
> 
> Thanks for the clarification - that makes sense, and I now think you're right: 
> it is probably better to avoid an automatic factor conversion and let the user 
> explicitly convert if necessary. And you are right, I did abuse the term 
> factor when referring to varchar - instead of factor, I really meant 
> something like 'internalized strings' a la Java (i.e. like a factor, but with no 
> ordering or distinct levels attribute).
> 

FWIW all strings are internalized in R (for some years now) - hence character 
vectors are very memory-efficient and essentially what you were looking for.
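
A rough way to see this from the R prompt (a sketch only; the data and the 
variable names are made up for illustration, and the exact byte counts depend 
on platform and R version, so none are quoted here):

```r
# Hypothetical data: 300,000 elements but only three distinct strings.
x <- rep(c("alpha", "beta", "gamma"), times = 1e5)

# Each element refers to a cached string, so the size reported for the
# whole vector should stay far below 300,000 independent copies.
size_vec <- as.numeric(object.size(x))
size_one <- as.numeric(object.size("alpha"))  # one-element vector, for scale
print(size_vec)
print(length(x) * size_one)

# A factor stores integer codes plus the level set; converting it back
# to character gives the original vector again.
f <- factor(x)
print(as.numeric(object.size(f)))
print(identical(as.character(f), x))
```

The comparison with a factor is only there for scale: the character vector 
needs no explicit level bookkeeping to avoid duplicating the strings.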

Cheers,
Simon

> 
> 
> On 27/12/2012, at 5:47 PM, Simon Urbanek <simon.urba...@r-project.org> wrote:
> 
>> varchars are character strings. A factor consists of an index and a level set, so 
>> if your DB doesn't keep those separate, it is not a factor (and below you 
>> suggest it doesn't). Even if the DB supports ordered and unordered sets, the 
>> drivers typically return only the strings anyway, so you don't get at the 
>> set (without querying the schema). To illustrate: with a factor you can 
>> have a column consisting of the values A,A,B,B and a level set of A,B,C (i.e. C 
>> is not used, so it is extra information that you cannot express in a 
>> character string). If you have neither the level information nor the order, 
>> then it's just a character vector.
> 
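
The A,A,B,B example above can be tried directly in R. This is a minimal 
sketch: the variable names are illustrative, and the character vector stands 
in for what a DB driver would hand back for a varchar column.

```r
# What a driver would typically return for the column: plain strings.
vals <- c("A", "A", "B", "B")

# Explicit conversion by the user, supplying the level set separately.
f <- factor(vals, levels = c("A", "B", "C"))

print(levels(f))      # level set, including the unused level C
print(as.integer(f))  # index part: positions into the level set
print(table(f))       # C is counted, with zero occurrences

# Going through a character vector drops the extra information:
# levels rebuilt from the strings alone no longer include C.
ch <- as.character(f)
print(levels(factor(ch)))

# Ordering is likewise something the strings cannot carry by themselves.
o <- factor(vals, levels = c("A", "B", "C"), ordered = TRUE)
print(o[1] < o[3])
```

The level set and the order both have to come from somewhere other than the 
returned strings, for example from the schema, which is why the conversion is 
left to the user.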

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
