Bert,
Thank you for correcting my inaccuracy. A quick look at the original
question might help you understand what I meant:
d <- data.frame(x = c(0, 1))
d <- data.frame(d, y = c(0, 1))
names(d)[2] <- "a.-5"
d
  x a.-5
1 0    0
2 1    1
d1 <- data.frame(d, y = c(0, 1))
d1
  x a..5 y
1 0    0 0
2 1    1 1
d2 <- data.frame(d, y = c(0, 1), check.names = FALSE)
d2
  x a.-5 y
1 0    0 0
2 1    1 1
With check.names=TRUE (the default), the dash is converted to a period. With
check.names=FALSE, the dash is preserved. So the dash is not a problem
per se, because data.frame() doesn't throw an error or warning in this case.
Then my question is, why is it converted? To avoid problems with other
functions? To avoid confusion and mischief as you mentioned because it
is the symbol for subtraction? If it can be that problematic, why not
just not allow it at all? I guess there are reasons for these behaviors
and I am curious to learn more about the logic behind it.
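
For reference, a minimal sketch of the conversion itself. According to
?data.frame, check.names=TRUE adjusts the names with make.names(), which
replaces every character that is not allowed in a syntactic name with a
period. The object names (fixed, d_ok, d_raw) are made up for illustration.

```r
# make.names() is the adjustment step documented in ?data.frame
fixed <- make.names(c("x", "a.-5", "a-3"))
print(fixed)

# With the default check.names = TRUE the column gets a syntactic name,
# so it can be used directly after `$`
d_ok <- data.frame(x = c(0, 1), "a-3" = c(0, 1))
print(names(d_ok))
print(d_ok$a.3)

# With check.names = FALSE the dash survives, but the column then has
# to be accessed with backticks or with [[ ]]
d_raw <- data.frame(x = c(0, 1), "a-3" = c(0, 1), check.names = FALSE)
print(names(d_raw))
print(d_raw$`a-3`)
print(d_raw[["a-3"]])
```

So the dash is allowed in the sense that the data frame can store it, but
every later access has to quote the name, which is presumably why the
default is to convert it.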
Actually, I find that data.frame() can be confusing. On the one hand, it
accepts unquoted names for columns, as in your first example. On the
other hand, it rejects them when they could be ambiguous, as in your
second example. I am definitely not experienced enough to judge whether
the behavior makes sense or not, but I am curious to know why quoted
strings are not required in data.frame().
This behavior would be consistent, and therefore easier to understand
for beginners, I think.
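
A small sketch of where the rejection seems to happen, assuming only base
R: the unquoted form fails when the call is parsed, before data.frame()
ever runs, whereas a quoted or backticked name is a legal argument name.
The object names (parsed, d_q, d_b) are hypothetical.

```r
# The unquoted form is a syntax error: a-3 is an expression, not a name.
# parse() fails, and tryCatch() turns the failure into a marker string.
parsed <- tryCatch(
  parse(text = "data.frame(a-3 = 1:3)"),
  error = function(e) "parse error"
)
print(parsed)

# Quotes or backticks both give a legal argument name
d_q <- data.frame("a-3" = 1:3, check.names = FALSE)
d_b <- data.frame(`a-3` = 1:3, check.names = FALSE)
print(identical(names(d_q), names(d_b)))
```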
Thank you for your insights,
Ivan
On 24/01/12 16:53, Bert Gunter wrote:
Ivan:
On Tue, Jan 24, 2012 at 6:47 AM, Ivan Calandra
<[email protected]> wrote:
By "it works anyway", I mean that you can have a dash in a column name,
there is no error or even warning.
I guess that some functions would throw an error or warning, depending on
the requirements, but data.frame() doesn't.
This is false. Please don't guess. Read the Help pages.
data.frame(a = 1:3) #fine
data.frame(a-3 = 1:3) # Error: unexpected '=' in "data.frame(a-3 ="
The name is **NOT** OK. However,
data.frame("a-3" = 1:3) # fine
  a.3
1   1
2   2
3   3
## A quoted character string can be used as a column name
## The name will be changed to a legal name unless:
data.frame("a-3" = 1:3,check.names=FALSE)
  a-3
1   1
2   2
3   3
However, as is obvious, there is much mischief possible from such
practices, so that they are best avoided.
-- Bert
Ivan
On 24/01/12 15:35, David Winsemius wrote:
On Jan 24, 2012, at 4:44 AM, Ivan Calandra wrote:
Hi Mark,
I cannot tell you why (maybe someone else can), but the check.names
argument to data.frame() interprets "a.-5" as an invalid name and converts
it to a valid one. What I don't understand is why it isn't "valid" since it
works anyway.
The dash is not a valid character for column names. What do you mean by
"it works anyway"?
--
Ivan CALANDRA
Université de Bourgogne
UMR CNRS/uB 6282 Biogéosciences
6 Boulevard Gabriel
21000 Dijon, FRANCE
+33(0)3.80.39.63.06
[email protected]
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.