Your complaint is based on what you think a factor should be rather than what 
it actually is and how it works.  The trick with R (BTW I think it's version 
2.12.x rather than 12.x at this stage...) is learning to work *with* it as it 
is rather than making it work the way you would like it to do.

Factors are a bit tricky.  They are numeric objects (integer codes with a set 
of labels attached), even if arithmetic on them is inhibited.

> f <- factor(letters)
> is.numeric(f)  ## this looks strange
[1] FALSE
> mode(f)        ## but at a lower level
[1] "numeric"
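A short sketch of what sits underneath, added for illustration rather than taken from the session above.  It uses only base R; the variable name `codes` is made up for the example.

```r
f <- factor(letters)

typeof(f)             # storage type of a factor: "integer"
class(f)              # "factor"

codes <- unclass(f)   # drop the class to expose the integer codes
head(as.integer(f))   # 1 2 3 4 5 6
head(levels(f))       # "a" "b" "c" "d" "e" "f"

# The labels you see printed are the levels indexed by the codes.
identical(levels(f)[as.integer(f)], as.character(f))
```

The point is that the labels live in the "levels" attribute, and the vector itself holds only the positions into that attribute.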

Take a simple example.  

> x <- structure(1:26, names = sample(letters))
> x
 h  o  u  w  l  z  a  j  e  n  k  i  s  v  t  g  f  x  c  b  y  d  m  q  p  r 
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 

If you use the factor f as an index, it behaves as a numeric vector of indices:

> x[f]
 h  o  u  w  l  z  a  j  e  n  k  i  s  v  t  g  f  x  c  b  y  d  m  q  p  r 
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 

That's just the way it is.  That's the reality.  You have to learn to deal with 
it. 
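With factor(letters) the codes happen to be 1:26, so the effect is easy to miss.  A smaller sketch, with made-up data where the codes and the names disagree, may make it plainer (the names `y` and `g` are assumptions for this example only):

```r
y <- c(b = 10, c = 20, a = 30)
g <- factor(c("c", "a"))   # levels are "a", "c", so the codes are 2, 1

as.integer(g)              # 2 1

y[g]                       # indexes by code: elements 2 and 1, named c and b
y[as.character(g)]         # indexes by label: elements named c and a
```

The two subscripts agree on the first element only by accident; the second one differs.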

Sometimes a factor behaves as a character string vector, when no other 
interpretation would make sense.  e.g.

> which(f == "o")
[1] 15
>

but in other cases it does not.  In this case you can make the coercion 
explicit of course, if that is your bent:

> which(as.character(f) == "o")
[1] 15
> 

but here there is no need.  There are cases where you *do* need to make an 
explicit coercion, though, if you want it to behave as a character string 
vector, and indexing is one:

> x[as.character(f)]
 a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z 
 7 20 19 22  9 17 16  1 12  8 11  5 23 10  2 25 24 26 13 15  3 14  4 18 21  6 
>  
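As far as I know base R has no option that makes factor subscripts warn, so if you want a warning you would have to wrap the indexing yourself.  A minimal sketch, assuming a hypothetical helper called `index_checked` (not part of R or of any package) and made-up data:

```r
# Hypothetical helper: warn when a factor is used as a subscript,
# then index by its labels rather than by its integer codes.
index_checked <- function(x, i) {
  if (is.factor(i)) {
    warning("factor used as a subscript; indexing by labels, not codes")
    i <- as.character(i)
  }
  x[i]
}

y <- c(b = 10, c = 20, a = 30)
g <- factor(c("c", "a"))

# Gives a warning and returns the elements named c and a.
res <- suppressWarnings(index_checked(y, g))
res
```

Whether the helper should warn and carry on, as here, or stop outright is a design choice; swapping warning() for stop() gives the stricter behaviour.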

If you want factors to behave universally as character string vectors, the 
solution is not to use factors at all but to use character string vectors 
instead.  You can get away with a surprising amount this way.  e.g. character 
string vectors used in model formulae are silently coerced to factors, anyway.  
What you need to learn is how to read in data frames keeping the character 
string columns "as is" and stop them from being made into factors at that 
stage.  That is a lesson for another day...
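
For the impatient, a sketch of one way to do it, not taken from the message above.  It relies on the stringsAsFactors argument of read.csv; the column names and values are made up, and the text argument used to supply inline data assumes a reasonably recent R.  From R 4.0.0 onwards stringsAsFactors = FALSE is the default in any case.

```r
# Made-up data supplied inline instead of from a file.
csv_text <- "id,group\n1,ctl\n2,trt\n3,ctl"

d_chr <- read.csv(text = csv_text, stringsAsFactors = FALSE)
class(d_chr$group)    # "character"

d_fac <- read.csv(text = csv_text, stringsAsFactors = TRUE)
class(d_fac$group)    # "factor"

# A character column can be converted later, when a factor is really wanted.
d_chr$group_f <- factor(d_chr$group)
levels(d_chr$group_f) # "ctl" "trt"
```

The as.is argument of read.table and read.csv offers finer control, column by column, if only some columns should stay as character.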

Bill Venables.


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of WB Kloke
Sent: Monday, 14 February 2011 8:31 PM
To: r-h...@stat.math.ethz.ch
Subject: [R] How to get warning about implicit factor to integer coercion?

Is there a way in R (12.x) to avoid the implicit coercion of factors to integers
in the context of subscripts?

If this is not possible, is there a way to get at least a warning, if any
coercion of this type happens, given that the action of this coercion is almost
never what is wanted?

Of course, in the rare case that as.integer() is applied explicitly onto a
factor, the warning is not needed, but certainly not as disastrous as in the
other case.

Probably, in a future version of R, an option similar to that for partial
matches of subscripts might be useful to change the default from maximal
unsafety to safe use.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
