Re: [R] More efficient use of reshape?

David Winsemius Thu, 13 Dec 2012 09:51:30 -0800


On Dec 13, 2012, at 9:16 AM, Nathan Miller wrote:

Hi all,

I have played a bit with the "reshape" package and function, along with
"melt" and "cast", but I feel I still don't have a good handle on how to
use them efficiently. Below I have included an application of "reshape" that
is rather clunky, and I'm hoping someone can offer advice on how to use
reshape (or melt/cast) more efficiently.

You do realize that the 'reshape' function is _not_ in the reshape package, right? And also that the reshape package has been superseded by the reshape2 package?
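
As a rough sketch only, using a made-up toy data frame (`wide`, with invented values and just three of the source columns) rather than the actual climate file: the base `reshape` function accepts `timevar` and `times` arguments, so the long rows can be labelled with the source names directly instead of being numbered and renamed afterwards.

```r
# Toy wide data frame standing in for clim.data (values are invented)
wide <- data.frame(
  date = as.Date(c("2000-01-01", "2000-02-01")),
  GISS = c(0.10, 0.20),
  HAD  = c(0.30, 0.40),
  UAH  = c(0.50, 0.60)
)
src <- c("GISS", "HAD", "UAH")

long <- reshape(wide,
                varying   = list(src),  # columns to stack into one
                v.names   = "anomaly",  # name of the stacked value column
                timevar   = "source",   # name of the label column
                times     = src,        # labels used instead of 1, 2, 3
                idvar     = "date",
                direction = "long")

# Drop the generated row names and fix the column order
rownames(long) <- NULL
long <- long[, c("date", "source", "anomaly")]
```

With the real data, `src` would presumably be the five source column names, and any unwanted extra columns would need to be dropped before or after reshaping.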


--
David.

#For this example I am using climate change data available on-line

file <- "http://processtrends.com/Files/RClimate_consol_temp_anom_latest.csv"
clim.data <- read.csv(file, header=TRUE)

library(lubridate)
library(reshape)
#I've been playing with the lubridate package a bit to work with dates, but
#as the climate dataset only uses year and month I have
#added a "day" to each entry in the "yr_mn" column and then used "dym" from
#lubridate to generate the POSIXlt formatted dates in
#a new column clim.data$date

clim.data$yr_mn<-paste("01", clim.data$yr_mn, sep="")
clim.data$date<-dym(clim.data$yr_mn)
#Now to the reshape. The dataframe is in a wide format. The columns GISS,
#HAD, NOAA, RSS, and UAH are all different sources
#from which the global temperature anomaly has been calculated since 1880
#(actually only 1978 for RSS and UAH). What I would like to
#do is plot the temperature anomaly vs date and use ggplot to facet by the
#different data source (GISS, HAD, etc.). Thus I need the
#data in long format with a date column, a temperature anomaly column, and
#a data source column. The code below works, but it's
#really very clunky and I'm sure I am not using these tools as efficiently
#as I can.

#The varying=list(3:7) specifies the columns in the dataframe that
#correspond to the sources (GISS, etc.), though then in the resulting
#reshaped dataframe the sources are numbered 1-5, so I have to reassign
#their names. In addition, the original dataframe has
#additional data columns I do not want and so after reshaping I create
#another dataframe with just the columns I need, and
#then I have to rename them so that I can keep track of what everything is.
#Whew! Not the most elegant of code.

d<-reshape(clim.data, varying=list(3:7),idvar="date",
v.names="anomaly",direction="long")

d$time<-ifelse(d$time==1,"GISS",d$time)
d$time<-ifelse(d$time==2,"HAD",d$time)
d$time<-ifelse(d$time==3,"NOAA",d$time)
d$time<-ifelse(d$time==4,"RSS",d$time)
d$time<-ifelse(d$time==5,"UAH",d$time)

new.data<-data.frame(d$date,d$time,d$anomaly)
names(new.data)<-c("date","source","anomaly")
I realize this is a mess, though it works. I think with just some help on
how better to work this example I'll probably get over the learning hump
and actually figure out how to use these data manipulation functions more
cleanly.

Any advice or assistance would be appreciated.
Thanks,
Nate


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Alameda, CA, USA
