And with "equally spaced" I obviously meant "of equal size". It's getting too hot in the office here...
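
To make "of equal size" concrete, here is a minimal sketch of splitting the sorted x-values into three groups by position. The data are made up for illustration (this is NOT the dfr from this thread), the variable names are my own, and the sketch deliberately ignores how ties at the group boundaries should be handled, which is part of what the quantile discussion is about.

```r
# Hypothetical data: 18 x-values with some ties (not the dfr from this thread).
x <- sort(c(1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 8, 8, 9, 10, 10, 11, 12, 13))
n <- length(x)

# Size of the two outer groups: one third of the points, rounded down.
k <- floor(n / 3)

left   <- x[seq_len(k)]          # the k smallest values
middle <- x[seq(k + 1, n - k)]   # whatever remains in between
right  <- x[seq(n - k + 1, n)]   # the k largest values

sizes <- c(left = length(left), middle = length(middle), right = length(right))
print(sizes)

# Medians of the outer groups; the resistant line is built from these.
med_left  <- median(left)
med_right <- median(right)
```

Note that median() here is just the ordinary sample median; which median or quantile type is most appropriate for the fit is a separate question.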

On Wed, May 31, 2017 at 4:39 PM, Joris Meys <jorism...@gmail.com> wrote:

> Seriously, if a method gives a wrong result, it's wrong. line() does NOT
> implement the algorithm of Tukey, not even after the patch. We're not
> discussing Excel here, are we?
>
> The method of Tukey is rather clear, and it is NOT using the default
> quantile definition from the quantile function. Actually, it doesn't even
> use quantiles to define the groups. It just says that the groups should be
> more or less equally spaced. As the method of Tukey relies on the medians
> of the subgroups, it would make sense to pick a method that is
> approximately unbiased with regard to the median. That would be type 8
> imho.
>
> To get the size of the outer groups, Tukey would've been more than happy
> with a:
>
> > floor(length(dfr$time) / 3)
> [1] 6
>
> There you have the size of your left and right group, and now we can
> discuss which median type should be used for the robust fitting.
>
> But I honestly cannot understand why anyone in his right mind would
> defend a method that is clearly wrong while not working at Microsoft's
> spreadsheet department.
>
> Cheers
> Joris
>
> On Wed, May 31, 2017 at 4:03 PM, Serguei Sokol <so...@insa-toulouse.fr>
> wrote:
>
>> On 31/05/2017 at 15:40, Joris Meys wrote:
>>
>>> OTOH,
>>>
>>> > sapply(1:9, function(i){
>>> + sum(dfr$time <= quantile(dfr$time, 1./3., type = i))
>>> + })
>>> [1] 8 8 6 6 6 6 8 6 6
>>>
>>> Only the default (type = 7) and the first two types give the result
>>> line() gives now. I think there are plenty of reasons why any of
>>> the other 6 types might be better suited to Tukey's method.
>>>
>>> So to my mind, changing the definition of line() to give sensible output
>>> that is in accordance with the theory does not imply any inconsistency
>>> with the quantile definition in R. At least not with 6 out of the 9
>>> different ones ;-)
>>>
>> Nice shot.
>> But OTOE (on the other end ;)
>> > sapply(1:9, function(i){
>> + sum(dfr$time >= quantile(dfr$time, 2./3., type = i))
>> + })
>> [1] 8 8 8 8 6 6 8 6 6
>>
>> Here "8" gains 5 votes against 4 for "6". Two defector methods
>> (types 3 and 4) changed their point count between the two ends and
>> should be discarded. Which leaves us with the score 3:4, still in favor
>> of "6", but the default method should prevail in my opinion.
>>
>> Serguei.
>>
>
> --
> Joris Meys
> Statistical consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Mathematical Modelling, Statistics and Bio-Informatics
>
> tel : +32 (0)9 264 61 79
> joris.m...@ugent.be
> -------------------------------
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel : +32 (0)9 264 61 79
joris.m...@ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel