[LabPlot2] [Bug 474732] Number of generated values from generate data with stdev or similar column statistic functions seems to be wrong

bugzilla_noreply Sun, 24 Sep 2023 14:15:21 -0700

https://bugs.kde.org/show_bug.cgi?id=474732


--- Comment #2 from pcfreak...@gmail.com ---
(In reply to Alexander Semke from comment #1)
> (In reply to pcfreak115 from comment #0)
> > STEPS TO REPRODUCE
> > 1. Create column with numbers in it
> > 2. Create second column in another table
> > 3. For this second column, use Generate Data and then the function stdev(x),
> > where for x the column from step 1 is used
> > 
> > OBSERVED RESULT
> > The second column contains exactly as many entries as there are in the first
> > column
> > [...]
> Right now we adjust the size of the target column/spreadsheet which makes
> sense in very many cases. But your points are valid, of course. In case
> spreadsheets with different sizes are involved, we should warn the user and
> also allow to decide whether the sizes should be adjusted or not. We'll
> implement this.
> 
> May I ask you about your scenario? Why do you need to have the standard
> deviation 50 times?

In my scenario I recorded some background noise with my experiment and put all
the values into one column. Then I did the actual measurement, but with a
smaller number of samples. I then wanted to use the standard deviation of the
background noise as the error for my actual measurements. As far as I know, I
have to fill another column, parallel to the measurements, with the errors so
that every measurement gets its proper error bar in a plot.
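
The scenario above can be sketched in a few lines of Python. This is only an
illustration with made-up values and names, not LabPlot code, and it assumes
the sample standard deviation is the one wanted:

```python
import statistics

# Hypothetical data: many background-noise samples, fewer actual measurements.
noise = [0.12, -0.08, 0.05, -0.11, 0.09, -0.03, 0.07, -0.10]
measurements = [4.98, 5.03, 5.01]

# stdev is a Column -> Number function: it collapses the whole noise column
# into one value (sample standard deviation is an assumption here).
sigma = statistics.stdev(noise)

# The error column runs parallel to the measurements, so its length is set
# by the target (3 rows), not by the source column (8 rows).
errors = [sigma] * len(measurements)
```

The point is the last line: the aggregate is repeated to fill the target,
instead of the target being resized to the length of the source.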

This is, of course, a relatively simple case where I could also just copy the
value of the standard deviation and generate constant values in the error
column. But I believe in more complicated cases this might also become a lot
more inconvenient. (Think automated templates which can be reused if an
experiment is repeated without having to manually copy some specific value,
etc...) 

Also, for the usual functions (i.e. Number -> Number, applied to each entry,
like sqrt(x)) the resizing behavior, where the number of entries in the target
column is set by the source column, makes sense to me. But stdev is a function
Column -> Number, and here I believe the number of entries in the target
column should be set by the target itself. Composite/chained functions like
stdev(x)*sqrt(x)*y*stdev(x*y), where x and y correspond to differently sized
columns/tables, might make this more complicated, though. (Not something I
actually need, just another example. Speaking of which, I haven't checked
whether something like stdev(x*y) is even possible right now?***) In this
example I would expect the target size to be expanded to fit sqrt(x)*y, while
the standard deviations should just act as scalar factors applied to the
entire target column.


*** stdev(x*y) doesn't make sense for particular values of x and y, because
that would just be 0; it only makes sense with x and y as references to
columns. While writing this I checked, and I get the first behavior, i.e.
stdev(x*y) = 0, which is kind of inconsistent: stdev(x) is the standard
deviation of the entire column referenced by x, while stdev(x*y) is the
standard deviation of the single value x*y. However, stdev(x*y) could be
rephrased with another column z generated by the function z=x*y, and then
stdev(z) does not give the same result as stdev(x*y). I believe this could be
another bug, which is probably common to all functions Column -> Number.
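
The discrepancy can be reproduced outside LabPlot with a small Python sketch.
It only mimics the two behaviors described above and is not how LabPlot
evaluates expressions; the values are made up, and the population standard
deviation is used because it is defined for a single value:

```python
import statistics

# Hypothetical columns of equal length.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 0.5, 1.0, 3.0]

# Route 1: materialise z = x*y as a column first, then aggregate that column.
z = [xi * yi for xi, yi in zip(x, y)]
stdev_of_column = statistics.pstdev(z)

# Route 2: evaluate the expression row by row, so the aggregate only ever
# sees the single value x*y of the current row. The spread of one value is 0.
stdev_per_row = [statistics.pstdev([xi * yi]) for xi, yi in zip(x, y)]
```

Route 1 gives one non-zero number for the whole column, while route 2 gives 0
in every row, which matches the stdev(x*y) = 0 result seen here.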

It seems like functions which are not "bijective" with regard to the index in
the source and target columns make the Generate Data function system a lot
more complicated... I have some more thoughts on this, but they are kinda
half-baked, so I don't want to share them right now.

Sorry for the wall of text. I hope it makes sense, at least.

