On Jul 26, 2010, at 10:56 AM, Steffen Uhlig wrote:
Dear David, Petr, and Alain,
Thank you very much for your fast responses. It's a typical
"handbook-not-read" error on my side. I will dig deeper into the
plot functions and the assignment of data. I was not aware that
the vector "a" is handled as a factor with 10 levels.
Thanks for your suggestions and hints!
You can prevent that behavior and instead get a character vector (at
least from functions that return such) by using stringsAsFactors =
FALSE within the data.frame call. You also have the option of setting
that globally via options(), which at least one well-known institution
has adopted as the default policy for its work.
?data.frame
?options
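A minimal sketch of the per-call variant, reusing the vectors from the
example further down the thread. The names e_fac and e_chr are
illustrative only. Both calls pass stringsAsFactors explicitly, so the
result should not depend on the session default (which changed to FALSE
in R 4.0.0).

```r
a <- c("a","b","c","d","e","f","g","h","i","j")
b <- rep(c("a","b"), 5)
d <- rnorm(10)

# Character columns are converted to factors
e_fac <- data.frame(a, b, d, stringsAsFactors = TRUE)

# Character columns stay character vectors
e_chr <- data.frame(a, b, d, stringsAsFactors = FALSE)

str(e_fac)   # columns a and b are reported as Factor
str(e_chr)   # columns a and b are reported as chr
```

Note that a character column is not a numeric x variable either; this
only controls how the strings are stored in the data.frame.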
--
David
Best regards,
/steffen
On 26.07.2010 14:30, David Winsemius wrote:
On Jul 26, 2010, at 7:38 AM, Steffen Uhlig wrote:
Hello,
my data.frame is sort of a collection of process values, i.e. a huge
run chart. It consists of a time-stamp in the first column (date as
string), factors in the following columns (used for subset filtering),
and some process-data columns.
Below are two examples showing the problems that occur when plotting.
First, the example that works fine:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
a = c(1:10) # create a vector of integers
b = rep(c("a","b"),5) # create a vector of chars, used
# as factor-levels
d = rnorm(10) # some random numbers
e = data.frame(a,b,d) # connect to a data.frame
You've gotten several answers, but none have addressed an aspect of R
behavior that took me longer to appreciate than it perhaps should have.
The "b" column inside the "e" data.frame is now a factor column. I
mention that because you later referred to it as a "string", which it
is not. It is an integer vector with an associated character vector of
levels that the integers index.
Many of the functions that you might think would "work" on "strings"
will give either errors or unexpected results when applied to factors.
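# A minimal sketch of that point. The vector f is hypothetical and is not
# part of the original example.
f <- factor(c("10", "5", "20"))   # levels are the sorted unique strings
typeof(f)                         # "integer": a factor stores integer codes
levels(f)                         # the character labels the codes index
as.integer(f)                     # returns the codes, not the numbers 10, 5, 20
as.integer(as.character(f))       # going through the labels recovers 10, 5, 20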
e.1 = subset(e, b=="a") # create two subsets
e.2 = subset(e, b=="b")
plot(d~a, e.1, pch=3, col=2) # plot first data-subset
points(d~a, e.2, pch=4, col=3) # plot the 2nd one
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All looks fine in these plots.
However, after changing the content of vector "a" to a set of strings,
the following happens:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
a = c("a","b","c","d","e","f","g","h","i","j")
e = data.frame(a,b,d) # re-build data.frame
e.1 = subset(e, b=="a") # create two subsets
e.2 = subset(e, b=="b")
plot(d~a, e.1, pch=3, col=2)
points(d~a, e.2, pch=4, col=3)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The plot command produces horizontal lines instead of dots. This seems
to happen when the x-axis contains strings rather than numbers. Is
there a way out?
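One possible reading of the symptom, offered as a hedged sketch rather
than as the answer given on the list: when the x variable in a plot
formula is a factor, plot() draws box plots, and with a single
observation per level each box collapses to a horizontal line. The
workaround below plots the integer codes of the factor and labels the
axis with the level names. The name x_pos and the call to set.seed()
are illustrative assumptions, and "a" is created as an explicit factor
so the sketch does not depend on the stringsAsFactors default.

```r
set.seed(1)                        # assumption: only for reproducibility
a <- factor(c("a","b","c","d","e","f","g","h","i","j"))
b <- factor(rep(c("a","b"), 5))
d <- rnorm(10)
e <- data.frame(a, b, d)

x_pos <- as.integer(e$a)           # integer codes, one per level of a

# Empty frame with numeric x positions and no default x-axis
plot(x_pos, e$d, type = "n", xaxt = "n", xlab = "a", ylab = "d")

# Label the tick marks with the level names instead of the codes
axis(1, at = seq_along(levels(e$a)), labels = levels(e$a))

# Add each subset as points, with the symbols and colours used above
with(subset(e, b == "a"), points(as.integer(a), d, pch = 3, col = 2))
with(subset(e, b == "b"), points(as.integer(a), d, pch = 4, col = 3))
```

subset() keeps the unused levels of "a", so the integer codes in each
subset still match the tick positions of the full data.frame.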
Best regards,
/Steffen
--
Steffen Uhlig, PhD
Mechatronik und Sensortechnik
HTW des Saarlandes
Goebenstraße 40
66117 Saarbrücken
Tel.: +49 (0) 681 58 67 274
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.