On May 25, 2011, at 1:16 PM, Scott Hatcher wrote:
Hello Dr. Winsemius,
First of all, thank you for your prompt and helpful reply. Also, for
providing something I hoped would be produced from joining this
mailing list: a means of discovering incredibly useful packages such
as the "reshape2" one you have introduced me to.
I have a follow up question to your solution (which should produce
exactly what I need):
when I run the cast function to reassemble the data frame I get:
I used `dcast`.
Error in names(data) <- array_names(res$labels[[2]]) :
'names' attribute [7] must be the same length as the vector [1]
And I obviously didn't get that error, so there might be a difference
in either the code (which you did not show), or the data (which you
did not offer in a reproducible form).
This signaled to me that the function was returning 7 values where
it expected only 1. To test this I applied a summary function "mean"
to the cast, and the result processed (however it only produced NA's
because my values were class:factors). What I don't understand is
where these multiple values are coming from; there should be only a
single value corresponding to the 4 id.vars given in the cast
function (STN_ID,YEAR,MM,variable).
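The two symptoms described above can be checked separately. What follows is a minimal base R sketch on made-up data, not a diagnosis of the actual error: the name `mtst` follows the example later in the thread, and the repeated last row is invented for illustration. It assumes the cause is a repeated identifier combination, which the thread does not establish.

```r
# Hypothetical molten data; the last row deliberately repeats an id combination.
mtst <- data.frame(
  STN_ID   = 2402594,
  YEAR     = 1997,
  MM       = c(9, 9, 9, 9),
  ELEM     = c(1, 2, 1, 1),
  variable = factor(c("X1", "X1", "X2", "X2")),
  value    = factor(c("-00233", "-00339", "-00204", "-00204"))
)

# Flag every row whose identifier combination occurs more than once.
# Such rows compete for a single output cell when the data are cast.
ids <- c("STN_ID", "YEAR", "MM", "variable", "ELEM")
is_dup <- duplicated(mtst[, ids]) | duplicated(mtst[, ids], fromLast = TRUE)
dups <- mtst[is_dup, ]
print(dups)

# Factors must go through character first; as.numeric() applied directly
# to a factor returns the internal level codes, not the printed values.
mtst$value_num <- as.numeric(as.character(mtst$value))
print(mtst$value_num)
```

If `dups` comes back empty on the real data, duplicated identifiers are not the explanation and the cast formula itself is the next thing to inspect.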
If you want further effort you should address the inadequacies of your
question. It is very possible that you will need to acquaint yourself
with the use of either `dump` or `dput`.
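For readers unfamiliar with `dput`, a short sketch of how it makes data reproducible in a message. The data frame below is a hypothetical stand-in with only a few of the columns; on the real data one would typically call `dput(head(tst))` and paste the printed text into the email.

```r
# Hypothetical stand-in for the first rows of the real data frame.
tst <- data.frame(
  STN_ID = c(2402594, 2402594),
  YEAR   = c(1997, 1997),
  MM     = c(9, 10),
  ELEM   = c(1, 1),
  X1     = c("-00233", "-00003"),
  stringsAsFactors = FALSE
)

# dput() prints an R expression that rebuilds the object,
# including the class of every column.
dput(tst)

# Round trip: capture the printed text and evaluate it again,
# which is what a reader of the message would do.
txt <- paste(capture.output(dput(tst)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
print(isTRUE(all.equal(tst, rebuilt)))
```

Because the column classes travel with the text, a reader can see at once whether the values are character, factor, or numeric.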
--
David.
Thanks again for your help,
Scott Hatcher
On 24/05/2011 5:16 PM, David Winsemius wrote:
On May 24, 2011, at 3:03 PM, Scott Hatcher wrote:
Hello,
I have a large data frame that is organized by date in a peculiar way. I
am seeking advice on how to transform the data into a format that is of
more use to me.
The data is organized as follows:
  STN_ID  YEAR MM ELEM X1       X2       X3       X4     X5     X6     X7
1 2402594 1997  9    1 *-00233* *-00204* *-00119* -00190 -00251 -00243 -00249
2 2402594 1997 10    1 -00003   -00005   -00001   -00039 -00031 -00036 -00033
3 2402594 1997 11    1 000025   000065   000070   000069 000115 000072 000093
Where "MM" is the month of the year, and ELEM is the variable that the
values in the X* columns describe (in the actual data there are 31 X
columns, one for each day of the month). The values in bold (marked here
with asterisks) are the values that are transferred into the small chart
below (which is the result I hope to get). This is to give a sense of how
the data is picked out of the original data frame.
assuming this dataframe is named 'tst':
require(reshape2)
mtst <- melt(tst[, 1:7], id.vars=1:4)  # Only select idvars and X1:X3
str(mtst)
#----------
'data.frame': 54 obs. of 6 variables:
$ STN_ID : num 2402594 2402594 2402594 2402594 2402594 ...
$ YEAR : num 1997 1997 1997 1997 1998 ...
$ MM : num 9 10 11 12 1 2 3 4 5 9 ...
$ ELEM : num 1 1 1 1 1 1 1 1 1 2 ...
$ variable: Factor w/ 3 levels "X1","X2","X3": 1 1 1 1 1 1 1 1 1 1 ...
$ value : chr "-00233" "-00003" "000025" "000160" ...
dcast(mtst, STN_ID +YEAR+ MM + variable ~ ELEM)
#---------
   STN_ID YEAR MM variable      1      2
1 2402594 1997  9       X1 -00233 -00339
2 2402594 1997  9       X2 -00204 -00339
3 2402594 1997  9       X3 -00119 -00343
4 2402594 1997 10       X1 -00003 -00207
5 2402594 1997 10       X2 -00005 -00289
6 2402594 1997 10       X3 -00001 -00278
7 2402594 1997 11       X1 000025 -00242
snipped output
I would like to organize the data so it looks like this:
   STN_ID YEAR MM DAY  ELEM1  ELEM2
1 2402594 1997  9  X1 -00233 -00339
2 2402594 1997  9  X2 -00204 000077
3 2402594 1997  9  X3 -00119 000030
Where is that second column coming from? I don't see it in the data
example.
Such that I create a new column named "DAY" that is made up of the
numbers following "X" in the original data.frame columns. Also, the ELEM
values are converted to columns and parsed with the ELEM code (in this
case 1 and 2).
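The two finishing steps described above (a DAY column taken from the digits after "X", and ELEM codes turned into column names) can be sketched in base R. This is one possible approach under stated assumptions: the `wide` data frame is a hand-built stand-in for the first three rows of the cast output shown earlier, and the names `wide`, `elem_cols`, and `result` are illustrative only.

```r
# Stand-in for the cast result shown earlier (first three rows only).
wide <- data.frame(
  STN_ID   = 2402594,
  YEAR     = 1997,
  MM       = 9,
  variable = factor(c("X1", "X2", "X3")),
  `1`      = c("-00233", "-00204", "-00119"),
  `2`      = c("-00339", "-00339", "-00343"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# DAY is the number that follows "X" in the old column name.
wide$DAY <- as.integer(sub("^X", "", as.character(wide$variable)))

# Prefix the ELEM codes so the new column names are syntactic.
elem_cols <- c("1", "2")
names(wide)[match(elem_cols, names(wide))] <- paste0("ELEM", elem_cols)

# The readings are stored as text; convert them to numbers.
wide$ELEM1 <- as.numeric(wide$ELEM1)
wide$ELEM2 <- as.numeric(wide$ELEM2)

result <- wide[, c("STN_ID", "YEAR", "MM", "DAY", "ELEM1", "ELEM2")]
print(result)
```

Storing DAY as an integer rather than as "X1", "X2", ... keeps it usable for building dates from YEAR, MM, and DAY later on.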
I have tried to split apart the columns, transform them, and bind them
back together, but my ability to do so just isn't there yet. I am still
fairly new to R, and would really appreciate some help in working
towards organizing this data frame.
Thanks in advance,
Scott Hatcher
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT