Lorenzo Isella wrote:
Dear All,
Apologies if this is too simple for this list.
Let us assume that you have an instrument measuring particle distributions.
The output is a set of counts {n_i} corresponding to a set of average
sizes {d_i}.
The {d_i} are spaced either linearly or logarithmically between d_min
and d_max.
There is no access to further detailed information about the
distribution of the measured sizes, but at least you know enough to
plot n(d_i) (number of counts as a function of particle size).
If you can fit the {n_i} to a known distribution (e.g. normal or
lognormal), then you can choose a new set of average sizes, {D_i} and
plot the corresponding n_i(D_i).
But what if the initial {n_i} do not follow a known distribution and
you still want to calculate n(D_i)?
Off the top of my head, I think that whatever I do must conserve the
original total number of observations N=\sum_i{n_i}, but this does not
constrain the problem very much.
Any suggestion is welcome.
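One distribution-free way to move counts onto a new size grid while conserving N is to interpolate the cumulative count and difference it at the new bin edges. The sketch below is only an illustration and rests on assumptions not stated in the post: the bin edges are known (the instrument reports only average sizes, so the edges here are made up), counts are spread uniformly within each original bin, and the new grid covers the same range. The names rebin_counts and cumulative_at are hypothetical.

```python
from bisect import bisect_right

def cumulative_at(x, edges, cum):
    """Cumulative count at size x, linearly interpolated between bin edges."""
    if x <= edges[0]:
        return 0.0
    if x >= edges[-1]:
        return cum[-1]
    i = bisect_right(edges, x) - 1          # index of the old bin containing x
    frac = (x - edges[i]) / (edges[i + 1] - edges[i])
    return cum[i] + frac * (cum[i + 1] - cum[i])

def rebin_counts(old_edges, counts, new_edges):
    """Redistribute counts onto new_edges, assuming uniform density per old bin."""
    cum = [0.0]                             # cum[k] = counts below old_edges[k]
    for n in counts:
        cum.append(cum[-1] + n)
    c = [cumulative_at(x, old_edges, cum) for x in new_edges]
    return [c[k + 1] - c[k] for k in range(len(new_edges) - 1)]

# Made-up example: three original bins, four new bins over the same range.
old_edges = [0.0, 1.0, 2.0, 4.0]
counts = [10, 30, 20]
new_edges = [0.0, 0.5, 1.5, 3.0, 4.0]
new_counts = rebin_counts(old_edges, counts, new_edges)
print(new_counts, sum(new_counts))
```

For logarithmically spaced sizes the same idea can be applied to log(d) instead of d. The uniform-within-bin assumption is the weak point: a smoother interpolant of the cumulative count would give different n(D_i) with the same total.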
Hi Lorenzo,
You should probably be aware that both the position and spacing of
category boundaries can have a large effect on parameter location tests
carried out on the categorized data. See:
Wainer, H., Gessaroli, M. & Verdi, M. (2006) Finding what is not there
through the unfortunate binning of results: The Mendel effect.
Chance, 19(1): 49-52.
Lemon, J. (2009) On the perils of categorizing responses. Tutorials in
Quantitative Methods for Psychology, 5(1): 35-39.
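As a toy illustration of the boundary effect (not taken from either paper), the sketch below bins the same made-up values with the same bin width but two different origins, then estimates the mean from the bin midpoints. The data and the helper name binned_mean are hypothetical.

```python
import math

def binned_mean(values, origin, width):
    """Mean estimated from bin midpoints, for bins starting at origin."""
    mids = []
    for v in values:
        k = math.floor((v - origin) / width)   # index of the bin holding v
        mids.append(origin + (k + 0.5) * width)
    return sum(mids) / len(mids)

# Made-up sizes that cluster just above whole numbers.
values = [1.05, 1.10, 1.15, 2.05, 2.10, 3.05]
raw_mean = sum(values) / len(values)

mean_origin_0 = binned_mean(values, origin=0.0, width=1.0)
mean_origin_half = binned_mean(values, origin=0.5, width=1.0)
print(raw_mean, mean_origin_0, mean_origin_half)
```

With these values the two binned estimates fall on opposite sides of the raw mean, even though only the position of the boundaries changed. The same applies to any new grid {D_i} chosen for the rebinned counts.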
Jim
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.