Frankly speaking, I am not looking for such a framework. The system I'm studying is a communication network (like an M/M/1 queue, but far too complicated to analyze mathematically using classical queueing theory), and the conclusion I want to draw is qualitative rather than quantitative -- a high-level comparative study of various network architectures based on the "equivalence principle" (a concept specific to networking, not in the general sense).

What I want in this regard is a smooth, strictly increasing (hence one-to-one) function built out of simulation data, because later in my processing I need the inverse of that curve to find the x value for a given y value. That was, in fact, the reason I used exponential (i.e., monotonically increasing) curve fitting.

Even though I don't need a statistical inference framework for my work, I want to make sure that my use of regression/curve fitting techniques with my simulation data (as a tool for getting the mentioned curve) is proper and a usual practice among experts like you. To find an answer to my question, I dug through the Internet a great deal but have found no clear explanation so far.

Your suggestions and the examples you provide (always!) are much appreciated, but I am still not sure that using those regression procedures with the kind of data I described is the right thing to do.

Again, many thanks for your prompt and kind answers,
Joseph

On Tue, Apr 27, 2010 at 8:46 PM, Gabor Grothendieck
<ggrothendi...@gmail.com> wrote:
> If you are looking for a framework for statistical inference you could
> look at additive models as in the mgcv package which has a book
> associated with it if you need more info. e.g.
>
> library(mgcv)
> fm <- gam(dist ~ s(speed), data = cars)
> summary(fm)
> plot(dist ~ speed, cars, pch = 20)
> fm.ci <- with(predict(fm, se = TRUE), cbind(0, -2*se.fit, 2*se.fit) + c(fit))
> matlines(cars$speed, fm.ci, lty = c(1, 2, 2), col = c(1, 2, 2))
>
>
> On Tue, Apr 27, 2010 at 3:07 PM, Kyeong Soo (Joseph) Kim
> <kyeongsoo....@gmail.com> wrote:
>> Hello Gabor,
>>
>> Many thanks for providing actual examples for the problem!
>>
>> In fact I know how to apply and generate plots using various R
>> functions including loess, lowess, and smooth.spline procedures.
>>
>> My question, however, is whether applying those procedures directly on
>> the data with multiple observations/duplicate points(?) is on a
>> sound basis or not.
>>
>> Before asking my question to the list, I checked the smooth.spline manual
>> pages and found a mention of the "cv" option in relation to duplicate
>> points, but I'm not sure "duplicate points" in the manual has the same
>> meaning as "multiple observations" in my case. To me, the manual seems
>> a bit unclear in this regard.
>>
>> Looking at the "cars" data, I found it has multiple points with the same
>> "speed" but different "dist", which is exactly what I mean by multiple
>> observations, but am still not sure.
>>
>> Regards,
>> Joseph
>>
>>
>> On Tue, Apr 27, 2010 at 7:35 PM, Gabor Grothendieck
>> <ggrothendi...@gmail.com> wrote:
>>> This will compute a loess curve and plot it:
>>>
>>> example(loess)
>>> plot(dist ~ speed, cars, pch = 20)
>>> lines(cars$speed, fitted(cars.lo))
>>>
>>> Also this directly plots it but does not give you the values of the
>>> curve separately:
>>>
>>> library(lattice)
>>> xyplot(dist ~ speed, cars, type = c("p", "smooth"))
>>>
>>>
>>>
>>> On Tue, Apr 27, 2010 at 1:30 PM, Kyeong Soo (Joseph) Kim
>>> <kyeongsoo....@gmail.com> wrote:
>>>> I recently came to realize the true power of R for statistical
>>>> analysis -- mainly for post-processing of data from large-scale
>>>> simulations -- and have been converting many of my existing
>>>> Python (SciPy) scripts to ones based on R and/or Perl.
>>>>
>>>> In the middle of this conversion, I revisited the problem of curve
>>>> fitting for simulation data with multiple observations resulting from
>>>> repetitions.
>>>>
>>>> In the past, I first processed simulation data (i.e., multiple y's
>>>> from repetitions) to get a mean with a confidence interval for a given
>>>> value of x (independent variable) and then applied a spline procedure
>>>> to those mean values only (i.e., unique pairs of (x_i, y_i) for i=1,
>>>> 2, ...) to get a smoothed curve.
>>>> Because of rather large confidence
>>>> intervals, however, the resulting curves were hardly smooth enough for
>>>> my purpose, so I had to fix the function to an exponential and use
>>>> least-squares methods to fit its parameters to the data.
>>>>
>>>> From a plot with confidence intervals, it's rather easy for one to
>>>> visually and manually(?) figure out a smoothed curve for it.
>>>> So I'm thinking right now of directly applying a spline (or whatever
>>>> regression procedure suits this purpose) to the simulation data with
>>>> repetitions rather than to the means. The simulation data in this case
>>>> looks like this (assuming three repetitions):
>>>>
>>>> # x    y
>>>> 1      1.2
>>>> 1      0.9
>>>> 1      1.3
>>>> 2      2.2
>>>> 2      1.7
>>>> 2      2.0
>>>> ...    ....
>>>>
>>>> So my idea is to let the spline procedure handle the fluctuations in
>>>> the data (i.e., in repetitions) by itself.
>>>> But I wonder whether this direct application of spline procedures to
>>>> data with multiple observations makes sense from the statistical
>>>> analysis (i.e., theoretical) point of view.
>>>>
>>>> It may be a stupid question and quite obvious to many, but personally
>>>> I don't know where to start.
>>>> It would be greatly appreciated if anyone could shed some light on
>>>> this.
>>>>
>>>> Many thanks in advance,
>>>> Joseph
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
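
As a rough sketch of the two steps discussed in this thread, using only base R: the data below are simulated purely for illustration (an assumed exponential trend with Gaussian noise, three repetitions per x), and the names fit_raw, fit_mean, curve_fn and inverse_curve are made up here, not taken from any package. The first part compares smooth.spline() applied to the raw repetitions with the same call applied to per-x means weighted by the number of repetitions. The second part builds a non-decreasing curve with isoreg() followed by a monotone Hermite spline, then inverts it numerically with uniroot(). This is one possible approach under those assumptions, not a statement of established practice.

```r
# Hypothetical simulation output: 10 x values, 3 repetitions each.
# The trend exp(0.3 * x) and the noise level are invented for illustration.
set.seed(1)
x <- rep(1:10, each = 3)
y <- exp(0.3 * x) + rnorm(length(x), sd = 0.3)

# Part 1: raw repetitions versus weighted means.
# smooth.spline() pools observations that share an x value, so fitting the
# raw data should be comparable to fitting the per-x means with weights
# equal to the number of repetitions.
fit_raw <- smooth.spline(x, y, df = 4)

ux   <- sort(unique(x))
ybar <- as.numeric(tapply(y, x, mean))    # mean of y at each distinct x
n    <- as.numeric(tapply(y, x, length))  # number of repetitions at each x
fit_mean <- smooth.spline(ux, ybar, w = n, df = 4)

# Largest disagreement between the two fitted curves at the design points.
max_diff <- max(abs(predict(fit_raw, ux)$y - predict(fit_mean, ux)$y))

# Part 2: a non-decreasing curve and its numerical inverse.
# isoreg() forces the means to be non-decreasing (pool-adjacent-violators);
# splinefun(method = "monoH.FC") then interpolates them with a monotone
# cubic Hermite spline (Fritsch-Carlson).
iso      <- isoreg(ux, ybar)
curve_fn <- splinefun(ux, iso$yf, method = "monoH.FC")

# Inverse: find the x in the observed range where the curve equals y0.
# y0 must lie between curve_fn(min(ux)) and curve_fn(max(ux)).
inverse_curve <- function(y0) {
  uniroot(function(t) curve_fn(t) - y0, interval = range(ux), tol = 1e-9)$root
}

y0 <- curve_fn(5.5)
x0 <- inverse_curve(y0)  # recovers the x value that produced y0
```

If the isotonic step pools neighbouring means into a flat stretch, the curve is not one-to-one there and uniroot() returns only one of the possible x values, so it is worth checking iso$yf for ties before relying on the inverse.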