Hello Nicholas,

Looks like we are at about the same point. Here is my branch: https://github.com/m-khl/solr-patches/tree/streaming (there are only two commits on top of it). And here is the test: https://github.com/m-khl/solr-patches/blob/streaming/solr/core/src/test/org/apache/solr/response/ResponseStreamingTest.java it streams increasing int ids without keeping them in heap. There is also some unnecessary stuff with Digits(), but it's a work in progress, you know.
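To make the idea concrete, here is a hypothetical, Solr-free sketch of what that test exercises (the names `DocCallback` and `streamingCallback` are my own, not from the branch): each matching doc's PK is written straight to the response stream as it is collected, so heap usage stays constant regardless of result size.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

public class StreamingCollectorSketch {

    /** Minimal stand-in for Lucene's per-doc collect() callback. */
    interface DocCallback {
        void collect(int pk) throws IOException;
    }

    /** Streams each PK as it is collected; memory use is O(1), not O(results). */
    static DocCallback streamingCallback(OutputStream out) {
        PrintWriter w = new PrintWriter(out, true); // autoflush on each println
        return pk -> w.println(pk);
    }

    public static void main(String[] args) throws IOException {
        // stands in for the stolen servlet output stream
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DocCallback cb = streamingCallback(out);
        // docs arrive in increasing-PK order, so the output is sorted "for free"
        for (int pk = 1; pk <= 5; pk++) {
            cb.collect(pk);
        }
        System.out.print(out.toString());
    }
}
```

In the real branch the callback would be a PostFilter's delegating collector and `out` the servlet stream, but the memory argument is the same: nothing is accumulated between docs.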
My use case is really specific:
* Say we have documents inserted with increasing PKs; I then want to search them and retrieve the results ordered by PK.
* In this case I get them sorted for free: they are already in order.
* I can also handle pretty huge results, because I don't need to hold them in the response; I just stream them into the output as they occur during the search.

Core points:
* https://github.com/m-khl/solr-patches/blob/streaming/solr/core/src/java/org/apache/solr/servlet/ResponseStreamingRequestParsers.java steals the servlet output stream and puts it into the request.
* A Solr component adds a PostFilter to the chain, which routes the collected docs into a DocSetStreamer: https://github.com/m-khl/solr-patches/blob/streaming/solr/core/src/java/org/apache/solr/handler/component/ResponseStreamerComponent.java
* The DocSetStreamer writes the collected docs' PKs into the output stream: https://github.com/m-khl/solr-patches/blob/streaming/solr/core/src/java/org/apache/solr/response/DocSetStreamer.java

It seems this can handle huge results with very little memory. It should be damn scalable. So, my plan is to output a response that is readable by the built-in distributed search. But I'm stuck for some reason.

About chunked encoding I have a kind of common-sense consideration: I guess there is a buffer behind the servlet output stream; when the output fits, the container forms an HTTP response with a Content-Length header, and on overflow it switches to chunked encoding. So I believe (but could be wrong, or even lying) that this machinery should stay behind the curtain.

WDYT?

On Thu, Mar 15, 2012 at 4:17 AM, Nicholas Ball <nicholas.b...@nodelay.com> wrote:

> Hello all,
>
> I've been working on a plugin with a custom component and a few handlers
> for a research project. Its aim is to do some interesting distributed
> work; however, I seem to have hit a road block when trying to respond to
> a client's request in multiple steps. Not even sure if this is possible
> with Solr, but after no luck on the IRC channel, I thought I'd ask here.
>
> What I'd like to achieve is to have the requestHandler return results to
> a user as soon as it has data available, then continue processing or
> performing other distributed calls, and then return some more data, all
> on the same single client request.
>
> Now, my understanding is that Solr does some kind of streaming. I'm not
> sure how it's technically done over HTTP in Solr, so any information
> would be useful. I believe something like this would work well, but
> again I'm not sure:
>
> http://en.m.wikipedia.org/wiki/Chunked_transfer_encoding
>
> I also came across this issue/feature request in JIRA, but I'm not
> completely sure what the conclusion was or how someone might do/use
> this. Is it even relevant to what I'm looking for?
>
> https://issues.apache.org/jira/browse/SOLR-578
>
> Thank you very much for any help and time you can spare!
>
> Nicholas (incunix)

-- 
Sincerely yours,
Mikhail Khludnev
Lucid Certified Apache Lucene/Solr Developer
Grid Dynamics
<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>