Martin,

There's also the work of a former PhD student in our Dept:

http://arxiv.org/pdf/1007.1032.pdf

Matias




On 24/09/2014 1:16 AM, Martin Maechler wrote:
Rolf Turner <r.tur...@auckland.ac.nz>
     on Wed, 24 Sep 2014 18:43:34 +1200 writes:
     > On 24/09/14 17:31, Mohan Radhakrishnan wrote:
     >> Hi,
     >>
     >> I have streaming data (1 TB) that can't fit in memory. Is
     >> there a way for me to find the median of these streaming
     >> integers, assuming I can fit only a small part in memory?
     >> This is about the statistical approach to finding the median
     >> of a large number of values when I can inspect only a
     >> part of them due to memory constraints.

     > You cannot, I'm pretty sure, calculate the median
     > recursively.  However, there are "approximate" recursive
     > median algorithms which provide an estimate of location
     > that has the same asymptotic properties as the median.

     > See:

     > * U. Holst, Recursive estimators of location.
     > Commun. Statist. Theory Meth., vol. 16, 1987,
     > pp. 2201--2226.

     > and

     > * Murray A. Cameron and T. Rolf Turner, Recursive location
     > and scale estimators, Commun. Statist. Theory Meth.,
     > vol. 22, 1993, pp. 2503--2515.
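
To make the idea of a "recursive" location estimate concrete, here is a minimal sketch of a textbook stochastic-approximation (Robbins-Monro type) median update. It is not claimed to be the estimator from either paper cited above. The function name `recursive_median` and the tuning constant `step` are assumptions made for illustration. The estimate moves up or down by a shrinking amount depending on whether each new value lies above or below it, so only one number is held in memory.

```python
def recursive_median(stream, step=1.0):
    """One-pass running estimate of the median of an iterable.

    Illustrative sketch only: `step` is a hypothetical tuning constant
    and should be roughly on the scale of the spread of the data.
    Returns None for an empty stream.
    """
    estimate = None
    for n, x in enumerate(stream, start=1):
        if estimate is None:
            # Start from the first observation.
            estimate = float(x)
            continue
        # +1 if x is above the current estimate, -1 if below, 0 if equal.
        direction = (x > estimate) - (x < estimate)
        # Step size shrinks like 1/n so the estimate settles down.
        estimate += (step / n) * direction
    return estimate
```

The result is an approximation, not the exact sample median, and its quality depends on `step`: a value far too small for the spread of the data makes the estimate adapt very slowly.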

This is really interesting to me, thank you, Rolf!

OTOH,

1) Has your proposal ever been provided in R?
    I'd be happy to add it to the robustX
    (http://cran.ch.r-project.org/web/packages/robustX) or even
    robustbase (http://cran.ch.r-project.org/web/packages/robustbase) package.

2) Would anybody know of more recent research on the subject?
    (I quickly "googled around" and found research geared more
     to the time series situation, which is more involved anyway.)

    --> Hence CC'ing the experts' list  R-SIG-robust


Martin Maechler,  ETH Zurich


     > cheers,
     > Rolf Turner

     > --
     > Rolf Turner Technical Editor ANZJS
