On Dec 13, 2012, at 9:16 AM, Nathan Miller wrote:

Hi all,

I have played a bit with the "reshape" package and function along with
"melt" and "cast", but I feel I still don't have a good handle on how to use them efficiently. Below I have included an application of "reshape" that
is rather clunky and I'm hoping someone can offer advice on how to use
reshape (or melt/cast) more efficiently.


You do realize that the 'reshape' function is _not_ in the reshape package, right? And also that the reshape package has been superseded by the reshape2 package?

--
David.


#For this example I am using climate change data available on-line

file <- "http://processtrends.com/Files/RClimate_consol_temp_anom_latest.csv"
clim.data <- read.csv(file, header=TRUE)

library(lubridate)
library(reshape)

#I've been playing with the lubridate package a bit to work with dates, but
#as the climate dataset only uses year and month I have
#added a "day" to each entry in the "yr_mn" column and then used "dym" from
#lubridate to generate the POSIXlt formatted dates in
#a new column clim.data$date

clim.data$yr_mn<-paste("01", clim.data$yr_mn, sep="")
clim.data$date<-dym(clim.data$yr_mn)

#Now to the reshape. The dataframe is in a wide format. The columns GISS,
#HAD, NOAA, RSS, and UAH are all different sources
#from which the global temperature anomaly has been calculated since 1880
#(actually only 1978 for RSS and UAH). What I would like to
#do is plot the temperature anomaly vs date and use ggplot to facet by the
#different data source (GISS, HAD, etc.). Thus I need the
#data in long format with a date column, a temperature anomaly column, and
#a data source column. The code below works, but it's
#really very clunky and I'm sure I am not using these tools as efficiently
#as I can.

#The varying=list(3:7) specifies the columns in the dataframe that
#correspond to the sources (GISS, etc.), though then in the resulting
#reshaped dataframe the sources are numbered 1-5, so I have to reassign
#their names. In addition, the original dataframe has
#additional data columns I do not want and so after reshaping I create
#another dataframe with just the columns I need, and
#then I have to rename them so that I can keep track of what everything is.
#Whew! Not the most elegant of code.

d<-reshape(clim.data, varying=list(3:7),idvar="date",
v.names="anomaly",direction="long")

d$time<-ifelse(d$time==1,"GISS",d$time)
d$time<-ifelse(d$time==2,"HAD",d$time)
d$time<-ifelse(d$time==3,"NOAA",d$time)
d$time<-ifelse(d$time==4,"RSS",d$time)
d$time<-ifelse(d$time==5,"UAH",d$time)

new.data<-data.frame(d$date,d$time,d$anomaly)
names(new.data)<-c("date","source","anomaly")
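
For comparison, here is a minimal sketch of one possibly tidier variant, run on a made-up toy data frame rather than the real file. It relies on the "times", "timevar" and "drop" arguments that the base reshape() function also accepts, so the sources are labelled, the new column is named, and the unwanted columns are discarded in a single call. The toy values and the "extra" column are invented purely for illustration; only the column names GISS, HAD, NOAA, RSS and UAH come from the description above.

```r
# Toy stand-in for clim.data: values and the 'extra' column are invented
toy <- data.frame(
  date  = as.Date(c("2012-01-01", "2012-02-01")),
  yr_mn = c("01201201", "01201202"),
  GISS  = c(0.1, 0.2),
  HAD   = c(0.3, 0.4),
  NOAA  = c(0.5, 0.6),
  RSS   = c(0.7, 0.8),
  UAH   = c(0.9, 1.0),
  extra = c(1, 2)
)

# Names of the wide columns to stack
sources <- c("GISS", "HAD", "NOAA", "RSS", "UAH")

long <- reshape(toy,
                varying   = list(sources),      # columns to stack, given by name
                v.names   = "anomaly",          # name of the stacked value column
                timevar   = "source",           # name of the label column
                times     = sources,            # labels to use instead of 1-5
                idvar     = "date",
                drop      = c("yr_mn", "extra"),# columns to discard before reshaping
                direction = "long")

# Replace the generated row names with plain row numbers
rownames(long) <- NULL
```

In this sketch the result already has the three columns date, source and anomaly, so the ifelse() relabelling and the separate new.data step are not needed.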

I realize this is a mess, though it works. I think with just some help on how better to work this example I'll probably get over the learning hump and actually figure out how to use these data manipulation functions more
cleanly.

Any advice or assistance would be appreciated.
Thanks,
Nate

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Alameda, CA, USA
