Dear all,

I am using R to process a large amount of telemetry data, stored as one file per day. Each file (an xlsx file) contains 2 columns, the first for sst readings and the second for chl readings, and 72360 rows, each corresponding to the centre of a cell in my study area. The columns have no headings. Many cells contain placeholder readings (-999.0000000). What I want to do is merge the files by month and by season, replace the placeholder values with NA, and then calculate the row averages for both sst and chl. I have stored the files in the directory C:/TEMP. This directory contains 12 subfolders, January to December, and each subfolder contains one file per day of the month (e.g. 31 files for January, 28 for February, and so on).

I already have commands that work properly, but I would really like to know whether it is possible to reduce their number and, perhaps, to run some of them automatically. At the moment I work month by month, as follows (I am aware this is not the most elegant way to do it; I'm new to R and for the moment "elegance & style" is not my main goal):

>setwd("C:/Temp/January09") # to set my working directory
>library(xlsx) # to load the "xlsx" library necessary to handle the original *.xlsx files
>list.jan09<-list.files("C:/Temp/January09", full.names=TRUE)
>read.all.jan09<-lapply(list.jan09, read.xlsx, 1, header=FALSE)
>daily.all.jan09<-do.call("cbind",read.all.jan09) # to create a data frame containing all my data
>daily.sst.jan09<-daily.all.jan09[,seq(from=1,to=61,by=2)] # to create a second data frame containing only sst readings (the first column of each daily file); it has 31 columns and 72360 rows
>daily.chl.jan09<-daily.all.jan09[,seq(from=2,to=62,by=2)] # to create a third data frame containing only chl readings (the second column of each daily file); it has 31 columns and 72360 rows
>daily.sst.jan09<-replace(daily.sst.jan09,daily.sst.jan09==-999.0000000,NA) # to replace -999.0000000 values with NA
>jan09_avgsst<-rowMeans(daily.sst.jan09,na.rm=TRUE) # to create a vector containing the mean sst value of each row; na.rm=TRUE skips the NA values
>write.xlsx(jan09_avgsst, "C:/Users/AAA/Desktop/Data/january09_avgsst.xlsx") # to store the sst vector
>daily.chl.jan09<-replace(daily.chl.jan09,daily.chl.jan09==-999.0000000,NA) # to replace -999.0000000 values with NA
>jan09_avgchl<-rowMeans(daily.chl.jan09,na.rm=TRUE) # to create a vector containing the mean chl value of each row; na.rm=TRUE skips the NA values
>write.xlsx(jan09_avgchl, "C:/Users/AAA/Desktop/Data/january09_avgchl.xlsx") # to store the chl vector

I repeat these same commands for every month and for each season (January-March; April-June; July-September; October-December), so the whole thing is rather redundant.

How can I speed up the process, reduce the number of commands and perhaps automate them? Many thanks for your help.
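
To make the question more concrete, here is a minimal sketch of the kind of automation meant: the per-month steps wrapped in functions and applied to a list of folders. The function names (average_days, read_folder, process), the folder naming scheme ("January09" to "December09" under C:/Temp) and the choice to return both averages in one data frame are assumptions made for illustration only. It relies on read.xlsx from the xlsx package, as in the commands above.

```r
# Placeholder value used in the files for missing readings.
MISSING <- -999

# Hypothetical helper: bind one column (1 = sst, 2 = chl) from every daily
# data frame into a cells-by-days matrix and turn placeholders into NA.
to_matrix <- function(days, col, missing = MISSING) {
  m <- do.call(cbind, lapply(days, function(d) as.numeric(d[[col]])))
  m[!is.na(m) & m == missing] <- NA
  m
}

# Hypothetical helper: per-cell averages over a list of daily data frames.
# A cell with no valid reading on any day yields NaN.
average_days <- function(days, missing = MISSING) {
  data.frame(avg_sst = rowMeans(to_matrix(days, 1, missing), na.rm = TRUE),
             avg_chl = rowMeans(to_matrix(days, 2, missing), na.rm = TRUE))
}

# Read every xlsx file in one folder (requires the xlsx package).
read_folder <- function(dir) {
  files <- list.files(dir, pattern = "\\.xlsx$", full.names = TRUE)
  lapply(files, xlsx::read.xlsx, sheetIndex = 1, header = FALSE)
}

# Average over one or more folders (one month, or the months of a season).
process <- function(folders, root = "C:/Temp") {
  days <- unlist(lapply(file.path(root, folders), read_folder),
                 recursive = FALSE)
  average_days(days)
}

# Assumed folder names, and seasons as consecutive groups of three months.
months  <- paste0(month.name, "09")
seasons <- split(months, rep(1:4, each = 3))

# Example calls (commented out because they need the real files):
# monthly  <- lapply(months, process)
# seasonal <- lapply(seasons, process)
```

Each element of monthly or seasonal would then be a data frame with one row per cell, which could be written out with write.xlsx in a second loop.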

Cheers,
Nino

--
Nino Pierantonio

Mobile: +39 349.532.9370
Skype: pierantonio_nino


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
