Thanks, both of you, this is really helpful!

Rui, this is a very neat, clear, and effective way to do what I need to do.
It is essentially the same idea I was attempting with a for loop, but
expressed as a function.  I really appreciate the additional functions that
you taught me as well, and I will probably end up using something very close
to what you gave me.  Out of curiosity, though: aside from its neatness, is
it faster and more "R-like" to apply functions to a vector rather than to
use a loop?

Sam

On Tue, Oct 2, 2012 at 8:05 PM, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:

> File operations are not vectorizable. About the only thing you can do for
> the iterating through files part might be to use lapply instead of a for
> loop, but that is mostly a style change.
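
A minimal sketch of the lapply-versus-loop point above, not taken from the thread. The IDs, the summarise_id helper, its reader argument, and the CSV stand-in files are all assumptions, chosen so the example runs without real DBF files. For actual DBF files, read.dbf from the foreign package could be passed as the reader instead.

```r
# Build two tiny stand-in files in a temporary directory, mirroring the
# "folder1/<ID>/folder2/blah.dbf" layout (CSV here, purely for illustration).
root <- file.path(tempdir(), "folder1")
ids  <- c("1AB", "1AC")                      # hypothetical IDs from "Master"
for (id in ids) {
  dir.create(file.path(root, id, "folder2"), recursive = TRUE,
             showWarnings = FALSE)
  write.csv(data.frame(run = 1:3, Stat = c(10, 30, 20)),
            file.path(root, id, "folder2", "blah.csv"), row.names = FALSE)
}

# Read one ID's file and reduce it to a one-row summary.
# `reader` and `file` are hypothetical arguments; foreign::read.dbf could be
# supplied as `reader` for real DBF files.
summarise_id <- function(id, reader = read.csv, file = "blah.csv") {
  d <- reader(file.path(root, id, "folder2", file))
  data.frame(ID = id, n_runs = nrow(d), Stat_50 = median(d$Stat))
}

# for-loop version: preallocate a list and fill it by index
out_loop <- vector("list", length(ids))
for (i in seq_along(ids)) out_loop[[i]] <- summarise_id(ids[i])

# lapply version: same work and same result, with less bookkeeping
out_apply <- lapply(ids, summarise_id)

# Stack the one-row summaries into a single table
result <- do.call(rbind, out_apply)
print(result)
```

Both versions read each file once, so any difference is mainly one of style; the per-file work lives in one function either way.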
>
> Once you have read the dbf files there will probably be vector functions
> you can use (quantile). Off the top of my head I don't know a function that
> tells you which value corresponds to a particular quantile, but you can
> probably sort the data with order(), find the value whose ecdf is just
> below your target with which.max, and look at the row number of that value.
>
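> x <- rnorm(11)
> names(x) <- seq(x)   # label each value with its original row number
> xs <- x[order(x)]    # sort ascending; the names keep the original rows
> # first sorted position whose ecdf reaches 0.9, mapped back to its row
> Row90 <- as.numeric(names(xs)[which.max(0.9 <= seq(xs)/length(xs))])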
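
The same idea can be sketched as a small reusable function, applied to the example data from the original question. The name value_at_quantile and its arguments are hypothetical, and the rule used (take the first sorted position whose empirical CDF reaches p) is only one of several quantile definitions, so results may differ from the default of quantile().

```r
# Return the run number and statistic at the first sorted position whose
# empirical CDF value (position / n) reaches p.
# With tied statistics, the run that order() places first among the ties wins.
value_at_quantile <- function(run, stat, p) {
  ord      <- order(stat)                  # row indices, ascending by stat
  ecdf_pos <- seq_along(ord) / length(ord) # ecdf value at each sorted position
  k        <- which.max(ecdf_pos >= p)     # first position reaching p
  c(run = run[ord[k]], stat = stat[ord[k]])
}

# Example data copied from the DBF listing in the original question
run  <- 1:11
stat <- c(10, 10, 999999, 100000000000, 100000000, 9999999999,
          100000000, 10, 10, 10, 1000000)

value_at_quantile(run, stat, 0.5)   # run 3, stat 999999
value_at_quantile(run, stat, 0.9)   # run 6, stat 9999999999
```

With 11 rows, the 0.5 target lands on the 6th sorted value (the usual median for an odd count) and the 0.9 target on the 10th.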
>
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> Sam Asin <asin....@gmail.com> wrote:
>
> >Hello,
> >
> >Sorry if this process is too simple for this list.  I know I can do it, but
> >I always read online about how when using R one should always try to avoid
> >loops and use vectors.  I am wondering if there exists a more "R friendly"
> >way to do this than to use for loops.
> >
> >I have a dataset that has a list of "ID"s.  Let's call this dataset "Master".
> >
> >Each of these "ID"s has an associated DBF file.  The DBF files each have
> >the same title, and they are each located in a directory path that
> >includes, as one of the folder names, the "ID".
> >
> >These DBF files have 2 columns of interest.  One is the "run number"; the
> >other is the "statistic."  I'm interested in the median and 90th percentile
> >of the "statistic" as well as their corresponding run numbers.  Ultimately,
> >I want a table that consists of
> >
> >ID   Run_50th  Stat_50  Run_90  Stat_90
> >1AB  5         102010   3       144376
> >1AC  3         999999   6       999999999
> >
> >etc.
> >
> >Where I currently have a dataset that has
> >
> >ID
> >1AB
> >1AC
> >
> >etc.
> >
> >And there are several DBF files that are in folders i.e.
> >"folder1/1AC/folder2/blah.dbf"
> >
> >This dbf looks like
> >
> >run   Stat
> >
> >1     10
> >2     10
> >3     999999
> >4     100000000000
> >5     100000000
> >6     9999999999
> >7     100000000
> >8     10
> >9     10
> >10    10
> >11    1000000
> >
> >
> >I know I could do this with a loop, but I can't see the efficient, R way.
> >I was hoping that you experienced R programmers could give me some
> >pointers on the most efficient way to achieve this result.
> >
> >Sam
> >
> >       [[alternative HTML version deleted]]
> >
> >______________________________________________
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
>

