Dear sir,
At the outset, I sincerely apologize for replying a bit late, as I was out of
the office. Thank you for the guidance you extended in response to my earlier
mail regarding "Writing a single output file", where I was trying to read
multiple output files and create a single output data.frame. However, I think
things are not working, as I describe below -
# Your code
setwd('/temp')
fileNames <- list.files(pattern = "file.*.csv")
input <- do.call(rbind, lapply(fileNames, function(.name)
{
.data <- read.table(.name, header = TRUE, as.is = TRUE)
.data$file <- .name
.data
}))
# This produces the following output, which contains only two columns; moreover,
# date and yield_rate are clubbed together.
date.yield_rate file
1 12/23/10,5.25 file1.csv
2 12/22/10,5.19 file1.csv
3 12/23/10,4.16 file2.csv
4 12/22/10,4.59 file2.csv
5 12/23/10,6.15 file3.csv
6 12/22/10,6.41 file3.csv
7 12/23/10,8.15 file4.csv
8 12/22/10,8.68 file4.csv
# and NOT the kind of output given below, where date and yield_rate are in
# separate columns.
> input
date yield_rate file
1 12/23/2010 5.25 file1.csv
2 12/22/2010 5.19 file1.csv
3 12/23/2010 5.25 file2.csv
4 12/22/2010 5.19 file2.csv
5 12/23/2010 5.25 file3.csv
6 12/22/2010 5.19 file3.csv
7 12/23/2010 5.25 file4.csv
8 12/22/2010 5.19 file4.csv
So when I tried the following code to produce the required result, it threw an
error.
require(reshape)
in.melt <- melt(input, measure = 'yield_rate')
> in.melt <- melt(input, measure = 'yield_rate')
Error: measure variables not found in data: yield_rate
# So I tried
in.melt <- melt(input, measure = 'date.yield_rate')
cast(in.melt, date.yield_rate ~ file)
> cast(in.melt, date ~ file)
Error: Casting formula contains variables not found in molten data: date
# If I try to change it as
cast(in.melt, date.yield_rate ~ file) # Gives following error.
Error: Casting formula contains variables not found in molten data:
date.yield_rate
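One possible cause, offered tentatively: the clubbed values such as "12/23/10,5.25" suggest the files are comma-separated, and read.table() splits on whitespace unless sep is given, so each line lands in a single column named date.yield_rate. The sketch below reproduces this with two hypothetical sample files written to a temporary directory (the file contents and the tmp path are assumptions for illustration, not the real data), and then reads them again with sep = ",".

```r
# Hypothetical sample files in a temporary directory, so the sketch is self-contained.
tmp <- tempfile("yield_demo")
dir.create(tmp)
writeLines(c("date,yield_rate", "12/23/10,5.25", "12/22/10,5.19"),
           file.path(tmp, "file1.csv"))
writeLines(c("date,yield_rate", "12/23/10,4.16", "12/22/10,4.59"),
           file.path(tmp, "file2.csv"))

fileNames <- list.files(tmp, pattern = "^file.*\\.csv$", full.names = TRUE)

# With the default separator (whitespace), each comma-separated line stays in one column.
clubbed <- read.table(fileNames[1], header = TRUE, as.is = TRUE)
names(clubbed)   # "date.yield_rate"

# With sep = "," (read.csv() uses this by default) the two fields are split.
input <- do.call(rbind, lapply(fileNames, function(.name) {
  .data <- read.table(.name, header = TRUE, sep = ",", as.is = TRUE)
  .data$file <- basename(.name)   # add file name to the data
  .data
}))
names(input)     # "date" "yield_rate" "file"
```

If this is indeed the cause, it would also explain the cast() errors: after melting with measure = 'date.yield_rate', that column is carried as 'variable' and 'value' in the molten data, so neither date nor date.yield_rate is available to the casting formula.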
Sir, it would be a great help if you could guide me, and once again I sincerely
apologize for replying so late.
Regards
Amy
--- On Thu, 12/23/10, jim holtman <[email protected]> wrote:
From: jim holtman <[email protected]>
Subject: Re: [R] Writing a single output file
To: "Amy Milano" <[email protected]>
Cc: [email protected]
Date: Thursday, December 23, 2010, 1:39 PM
This should get you close:
> # get file names
> setwd('/temp')
> fileNames <- list.files(pattern = "file.*.csv")
> fileNames
[1] "file1.csv" "file2.csv" "file3.csv" "file4.csv"
> input <- do.call(rbind, lapply(fileNames, function(.name){
+ .data <- read.table(.name, header = TRUE, as.is = TRUE)
+ # add file name to the data
+ .data$file <- .name
+ .data
+ }))
> input
date yield_rate file
1 12/23/2010 5.25 file1.csv
2 12/22/2010 5.19 file1.csv
3 12/23/2010 5.25 file2.csv
4 12/22/2010 5.19 file2.csv
5 12/23/2010 5.25 file3.csv
6 12/22/2010 5.19 file3.csv
7 12/23/2010 5.25 file4.csv
8 12/22/2010 5.19 file4.csv
> require(reshape)
> in.melt <- melt(input, measure = 'yield_rate')
> cast(in.melt, date ~ file)
date file1.csv file2.csv file3.csv file4.csv
1 12/22/2010 5.19 5.19 5.19 5.19
2 12/23/2010 5.25 5.25 5.25 5.25
>
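If the reshape package is not at hand, a base-R sketch along the following lines may give the same wide layout. It assumes every file shares a date column; the outputs list is a hypothetical stand-in for the data frames read from the individual files, and the yield_rate1, yield_rate2, ... names follow the layout requested in the original question.

```r
# Hypothetical stand-ins for the data frames read from each output file.
outputs <- list(
  data.frame(date = c("12/23/2010", "12/22/2010"), yield_rate = c(5.25, 5.19)),
  data.frame(date = c("12/23/2010", "12/22/2010"), yield_rate = c(4.16, 4.59)),
  data.frame(date = c("12/23/2010", "12/22/2010"), yield_rate = c(6.15, 6.41))
)

# Rename yield_rate to yield_rate1, yield_rate2, ... so the columns stay distinct.
renamed <- lapply(seq_along(outputs), function(i) {
  d <- outputs[[i]]
  names(d)[names(d) == "yield_rate"] <- paste0("yield_rate", i)
  d
})

# Merge pairwise on the shared date column; this works for any number of files.
wide <- Reduce(function(x, y) merge(x, y, by = "date"), renamed)
wide
```

merge() sorts the result by the key column, and the dates here are compared as text, so converting them with as.Date() first may be preferable when the files span more than one month or year.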
On Thu, Dec 23, 2010 at 8:07 AM, Amy Milano <[email protected]> wrote:
> Dear R helpers!
>
> Let me first wish all of you "Merry Christmas and Very Happy New year 2011"
>
> "Christmas day is a day of Joy and Charity,
> May God make you rich in both" - Phillips Brooks
>
> ##
> ----------------------------------------------------------------------------------------------------------------------------
>
> I have a process which generates a number of outputs. The R code for it
> is given below.
>
> for(i in 1:n)
> {
> write.csv(output[i], file = paste("output", i, ".csv", sep = ""), row.names = FALSE)
> }
>
> Depending on the value of 'n', I get different output files.
>
> Suppose n = 3; that means I have three output csv files, viz.
> 'output1.csv', 'output2.csv' and 'output3.csv'
>
> output1.csv
> date yield_rate
> 12/23/2010 5.25
> 12/22/2010 5.19
> .................................
> .................................
>
>
> output2.csv
> date yield_rate
> 12/23/2010 4.16
> 12/22/2010 4.59
> .................................
> .................................
>
> output3.csv
> date yield_rate
> 12/23/2010 6.15
> 12/22/2010 6.41
> .................................
> .................................
>
> Thus all the output files have the same column names, viz. Date and yield_rate.
> Also, I do need these files individually too.
>
> My further requirement is to have a single dataframe as given below.
>
> Date          yield_rate1   yield_rate2   yield_rate3
> 12/23/2010    5.25          4.16          6.15
> 12/22/2010    5.19          4.59          6.41
> ...............................................................................................
> ...............................................................................................
>
> where yield_rate1 = output1$yield_rate and so on.
>
> One way is to simply create a dataframe as
>
> df = data.frame(Date = read.csv('output1.csv')$Date, yield_rate1 =
> read.csv('output1.csv')$yield_rate, yield_rate2 =
> read.csv('output2.csv')$yield_rate,
> yield_rate3 = read.csv('output3.csv')$yield_rate)
>
> However, the problem arises when I do not know how many output files there
> are, as n can be 5 or even 100.
>
> So is it possible to write a loop or function which will enable me to
> read the 'n' files individually and then, keeping "Date" common, pick up only
> the yield_rate data from each output file?
>
> Thanking in advance for any guidance.
>
> Regards
>
> Amy
>
>
>
>
>
> [[alternative HTML version deleted]]
>
>
>
______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?