I recently came to appreciate the power of R for statistical analysis -- mainly for post-processing data from large-scale simulations -- and have been converting many of my existing Python (SciPy) scripts to R and/or Perl.
In the middle of this conversion, I revisited the problem of curve fitting for simulation data with multiple observations per x value, resulting from repetitions.

In the past, I first processed the simulation data (i.e., the multiple y's from repetitions) to get a mean with a confidence interval for each value of x (the independent variable), and then applied a spline procedure to those mean values only (i.e., unique pairs (x_i, y_i) for i = 1, 2, ...) to get a smoothed curve. Because of the rather large confidence intervals, however, the resulting curves were not smooth enough for my purpose, so I had to fix the functional form to an exponential and use least squares to fit its parameters to the data.

Yet from a plot with confidence intervals, it is fairly easy to draw a smooth curve through the data by eye. So I am now thinking of applying a spline (or whatever regression procedure suits this purpose) directly to the simulation data with repetitions, rather than to the means. The simulation data in this case look like this (assuming three repetitions):

  # x   y
    1   1.2
    1   0.9
    1   1.3
    2   2.2
    2   1.7
    2   2.0
    ... ....

So my idea is to let the spline procedure handle the fluctuations in the data (i.e., across repetitions) by itself. But I wonder whether applying a spline procedure directly to data with multiple observations makes sense from a statistical (i.e., theoretical) point of view.

It may be a stupid question and quite obvious to many, but personally I don't know where to start. It would be greatly appreciated if anyone could shed some light on this.

Many thanks in advance,
Joseph
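P.S. To make the question concrete, here is a minimal sketch of the two approaches in R. It is only an illustration under assumptions of mine: the data are synthetic (an exponential trend plus noise, not my real simulation output), smooth.spline() from the stats package stands in for "a spline procedure", and df = 4 is an arbitrary amount of smoothing.

```r
# Synthetic stand-in for the simulation output (hypothetical values):
# 10 values of x, 3 repetitions each, exponential trend plus noise.
set.seed(1)
x <- rep(1:10, each = 3)
y <- exp(0.3 * x) + rnorm(length(x), sd = 0.5)
sim <- data.frame(x = x, y = y)

# Approach 1 (what I did before): average the repetitions for each x,
# then smooth the means only.
means <- aggregate(y ~ x, data = sim, FUN = mean)
fit_means <- smooth.spline(means$x, means$y, df = 4)

# Approach 2 (what I am considering): pass all repetitions to the
# spline procedure and let it deal with the repeated x values.
fit_raw <- smooth.spline(sim$x, sim$y, df = 4)

# Evaluate both fitted curves on a common grid for comparison.
grid <- seq(1, 10, by = 0.5)
pred_means <- predict(fit_means, grid)$y
pred_raw <- predict(fit_raw, grid)$y
print(data.frame(x = grid, from_means = pred_means, from_raw = pred_raw))
```

Both calls run without complaint, which is exactly why I would like to understand whether the second one is statistically sound before relying on it.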