Hello fellow solr-users!

Currently, if I do an HTTP request to receive some data via streaming
expressions, like:

curl --data-urlencode 'expr=search(test,
                                   q="foo_s:*",
                                   fl="foo_s",
                                   sort="foo_s asc",
                                   qt="/export")' \
     http://localhost:8983/solr/test/stream

I get all results at once. This becomes obvious if I introduce a
one-second sleep in CloudSolrStream: with three documents, the
request takes about three seconds, and all three docs arrive together
at the end of those three seconds.

Instead, I would like to get documents in a more "streaming" way. For
example: after X seconds, give me what you already have. Or, when a
Y-sized buffer fills up, give me all the tuples you have, then resume.

Any ideas/opinions in terms of how I could achieve this? With or
without changing Solr's code?

Here's what I have so far:
- This is normal with non-chunked HTTP/1.1: you get all results at
once. If I revert this patch [1] and get Solr to use chunked encoding,
I get partial results each time a certain amount of data accumulates,
which seems to be somewhere between 16KB and 32KB.
- I assume that amount is a buffer size, but I haven't found a way to
change it so far. I've tried calling Jetty's
response.setBufferSize() in HttpSolrCall (maybe the wrong place to do
it?) and also tried changing the default 8KB buffer in FastWriter.
- Manually flushing the writer (in JSONResponseWriter) gives the
expected results (in combination with chunking).
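
One way to check when tuples actually reach the client, with and
without chunking, is to timestamp each response line as it is read.
The following is only a rough sketch: ArrivalTimer and timestampLines
are hypothetical names, it needs Java 11+ for java.net.http, and it
assumes the response body is line-oriented enough that line arrival
times are meaningful.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side helper, not part of Solr or SolrJ.
class ArrivalTimer {

    // Reads the body line by line and prefixes each line with the
    // number of milliseconds elapsed since reading started.
    static List<String> timestampLines(InputStream body) throws IOException {
        long start = System.nanoTime();
        List<String> out = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(body, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
                out.add(elapsedMs + "ms " + line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Same expression and URL as the curl command above.
        String expr = "search(test, q=\"foo_s:*\", fl=\"foo_s\", "
                + "sort=\"foo_s asc\", qt=\"/export\")";
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://localhost:8983/solr/test/stream"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "expr=" + URLEncoder.encode(expr, StandardCharsets.UTF_8)))
                .build();
        // ofInputStream() hands over the body as it arrives instead of
        // waiting for the complete response.
        HttpResponse<InputStream> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofInputStream());
        timestampLines(response.body()).forEach(System.out::println);
    }
}
```

If everything arrives at once, all lines should show roughly the same
elapsed time. With chunking plus flushing, the times should spread
out.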

The thing is, even if I manage to change the buffer size, I assume
that will apply to all requests (not just streaming expressions).
Ideally it would be configurable per request. As for manual flushing,
that would require changes to the streaming expressions themselves.
Would that be the way to go? What do you think?
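
A minimal sketch of the flushing policy described above (flush after X
milliseconds, or once Y characters have been written), in plain Java.
ThresholdFlushingWriter is a hypothetical helper, not an existing Solr
class. Where it would be wired in (for example, around the writer used
by JSONResponseWriter) and how the two thresholds would be read from a
request are assumptions left open here.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.Writer;

// Hypothetical helper, not part of Solr: flushes the wrapped writer
// once maxBufferedChars have been written since the last flush, or
// once maxDelayMillis have elapsed since the last flush.
class ThresholdFlushingWriter extends FilterWriter {
    private final int maxBufferedChars;
    private final long maxDelayNanos;
    private int charsSinceFlush = 0;
    private long lastFlushNanos = System.nanoTime();

    ThresholdFlushingWriter(Writer out, int maxBufferedChars, long maxDelayMillis) {
        super(out);
        this.maxBufferedChars = maxBufferedChars;
        this.maxDelayNanos = maxDelayMillis * 1_000_000L;
    }

    @Override
    public void write(int c) throws IOException {
        super.write(c);
        afterWrite(1);
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        super.write(cbuf, off, len);
        afterWrite(len);
    }

    @Override
    public void write(String str, int off, int len) throws IOException {
        super.write(str, off, len);
        afterWrite(len);
    }

    @Override
    public void flush() throws IOException {
        super.flush();
        // Reset both counters so the next flush is measured from here.
        charsSinceFlush = 0;
        lastFlushNanos = System.nanoTime();
    }

    // Checks both thresholds after every write and flushes if either is hit.
    private void afterWrite(int len) throws IOException {
        charsSinceFlush += len;
        boolean sizeReached = charsSinceFlush >= maxBufferedChars;
        boolean timeReached = System.nanoTime() - lastFlushNanos >= maxDelayNanos;
        if (sizeReached || timeReached) {
            flush();
        }
    }
}
```

Because the thresholds are constructor arguments, they could in
principle be chosen per request rather than globally. Note that the
time check only runs when something is written, so a stalled stream
would not flush on its own; that would need a separate timer.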

[1] https://issues.apache.org/jira/secure/attachment/12787283/SOLR-8669.patch

Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
