In this scenario the /export handler continues to export results until it
encounters a "Broken Pipe" exception. That exception is trapped and ignored
rather than logged, since a client disconnecting early is not considered an
error.
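
Roughly, the pattern is the following (a simplified sketch in Java, not the
actual /export handler code):

// Keep writing documents until the client disconnects, then trap and ignore
// the resulting "Broken Pipe" IOException instead of logging it.
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class ExportHandlerSketch {
    void writeAll(Iterator<String> docs, OutputStream out) {
        try {
            while (docs.hasNext()) {
                out.write(docs.next().getBytes(StandardCharsets.UTF_8));
                out.write('\n');
            }
            out.flush();
        } catch (IOException brokenPipe) {
            // Client closed the connection early - treated as a normal
            // condition: trapped and ignored, not logged.
        }
    }
}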

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, May 12, 2017 at 2:10 PM, Susmit Shukla <shukla.sus...@gmail.com>
wrote:

> Hi,
>
> I have a question regarding the Solr /export handler. Here is the scenario -
> I want to use the /export handler because I only need sorted data and this
> is the fastest way to get it. I am doing multi-level joins over streams
> backed by the /export handler. I know the number of top-level records to be
> retrieved, but not the number for each individual stream rolling up to the
> final result.
> I observed that calling close() on an /export stream is very expensive: it
> reads the stream to the very end of the hits. If there are 100 million hits
> per stream and the first 1k records are found after the joins, calling
> close() at that point can take many minutes or even hours to finish.
> Currently I have put the close() call in a different thread - basically fire
> and forget. But the cluster is heavily strained by the unnecessary reads.
>
> Internally, streaming uses HttpClient's ChunkedInputStream, which has to be
> drained in the close() call. But from the server's point of view, it should
> stop sending more data once close() has been issued. The read() inside
> ChunkedInputStream's close() method is indistinguishable from a real read().
> It would be very useful if the /export handler stopped sending more data
> after close().
>
> Another option would be to use the /select handler and get into the
> business of managing a custom cursor mark that is based on the stream sort
> and is advanced until the required records are fetched at the topmost level.
>
> Any thoughts?
>
> Thanks,
> Susmit
>
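
For reference, a rough sketch of the fire-and-forget close described in the
quoted message, assuming a TupleStream that has already been built from a
streaming-expression join (class and method names are illustrative only):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.TupleStream;

class TopNFromStream {
    // Single background thread that drains and closes streams off the hot path.
    private static final ExecutorService closer = Executors.newSingleThreadExecutor();

    static List<Tuple> readTopN(TupleStream stream, int n) throws IOException {
        List<Tuple> results = new ArrayList<>();
        stream.open();
        try {
            for (int i = 0; i < n; i++) {
                Tuple t = stream.read();
                if (t.EOF) {
                    break;                  // stream exhausted before n tuples
                }
                results.add(t);
            }
        } finally {
            // close() drains the rest of the chunked HTTP response, which can
            // take a very long time for large result sets, so hand it off.
            closer.submit(() -> {
                try {
                    stream.close();
                } catch (IOException ignore) {
                    // best effort; nothing useful to do here
                }
            });
        }
        return results;
    }
}

And a minimal sketch of the /select + cursorMark alternative mentioned at the
end of the quoted message, using standard SolrJ (the collection name and the
"id" sort field are placeholders):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.params.CursorMarkParams;

class CursorMarkSketch {
    static List<SolrDocument> fetchTopN(SolrClient client, int n)
            throws SolrServerException, IOException {
        List<SolrDocument> collected = new ArrayList<>();
        SolrQuery q = new SolrQuery("*:*");
        q.setRows(1000);
        q.setSort(SolrQuery.SortClause.asc("id")); // sort must include uniqueKey
        String cursorMark = CursorMarkParams.CURSOR_MARK_START;
        while (collected.size() < n) {
            q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
            QueryResponse rsp = client.query("collection1", q);
            collected.addAll(rsp.getResults());
            String next = rsp.getNextCursorMark();
            if (cursorMark.equals(next)) {
                break;                             // cursor stopped moving: done
            }
            cursorMark = next;
        }
        return collected;
    }
}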
