On Dec 27, 2012, at 5:43 PM, Rory Winston wrote:

> Hi Simon
>
> Thanks for the clarification - makes sense and I now think you're right -
> probably better to avoid an automatic factor conversion and let the user
> explicitly convert if necessary. And you are right, I did abuse the term
> factor when referring to varchar - instead of factor, I really meant
> something like 'internalized strings' a la Java (i.e. like a factor but with no
> ordering or distinct levels attributes).
>
FWIW all strings are internalized in R (for some years now) - hence character
vectors are very memory-efficient and essentially what you were looking for.

Cheers,
Simon

>
>
> On 27/12/2012, at 5:47 PM, Simon Urbanek <simon.urba...@r-project.org> wrote:
>
>> varchars are character strings. Factors consist of an index and a level set, so
>> if your DB doesn't keep those separate, it is not a factor (and below you
>> suggest it doesn't). Even if the DB supports ordered and unordered sets, the
>> drivers typically only return the strings anyway, so you don't get at the
>> set (without querying the schema). To make a point - a factor is if you can
>> have a column consisting of values A,A,B,B and a level set of A,B,C (i.e. C
>> is not used so it is extra information that you cannot express in a
>> character string). If you have neither the levels information nor the order,
>> then it's just a character vector.
>
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel