Hello fellow solr-users! Currently, if I do an HTTP request to receive some data via streaming expressions, like:
curl --data-urlencode 'expr=search(test, q="foo_s:*", fl="foo_s", sort="foo_s asc", qt="/export")' http://localhost:8983/solr/test/stream

...I get all results at once. This is more obvious if I simply introduce a one-second sleep in CloudSolrStream: with three documents, the request takes about three seconds, and I get all three docs only after those three seconds.

Instead, I would like to get documents in a more "streaming" way: for example, after X seconds, give me whatever you already have; or when a buffer of size Y fills up, send me all the tuples collected so far, then resume. Any ideas/opinions on how I could achieve this, with or without changing Solr's code?

Here's what I have so far:
- this is normal with non-chunked HTTP/1.1: you get all results at once
- if I revert this patch [1] so that Solr uses chunked encoding, I get partial results at what seems to be a fixed interval of somewhere between 16KB and 32KB of output
- I couldn't find a way to manually change what I assume is the buffer size behind that interval. I've tried calling Jetty's response.setBufferSize() in HttpSolrCall (maybe the wrong place to do it?) and also changing the default 8KB buffer in FastWriter, with no effect so far
- manually flushing the writer (in JSONResponseWriter) gives the expected results, in combination with chunking

The thing is, even if I manage to change the buffer size, I assume it will apply to all requests, not just streaming expressions. Ideally it would be configurable per request. As for manual flushing, that would require changes to the streaming expressions themselves. Would that be the way to go? What do you think?

[1] https://issues.apache.org/jira/secure/attachment/12787283/SOLR-8669.patch

Best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/
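P.S. To illustrate the chunked-encoding behavior I'm after, here's a minimal, self-contained JDK-only sketch (not Solr code; class names and the /stream path are made up for the demo). With the JDK's built-in HttpServer, sending a response length of 0 switches the response to Transfer-Encoding: chunked, and each flush() pushes a chunk to the client immediately instead of waiting for the full response:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class ChunkedDemo {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/stream", exchange -> {
            // Response length 0 => Transfer-Encoding: chunked
            exchange.sendResponseHeaders(200, 0);
            try (OutputStream os = exchange.getResponseBody()) {
                for (int i = 0; i < 3; i++) {
                    os.write(("{\"doc\":" + i + "}\n").getBytes("UTF-8"));
                    os.flush(); // each flush emits a chunk right away
                }
            }
        });
        server.start();
        int port = server.getAddress().getPort();

        // Client side: the header confirms the response is chunked,
        // and lines arrive as they are flushed, not all at the end.
        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://localhost:" + port + "/stream").openConnection();
        System.out.println("encoding=" + conn.getHeaderField("Transfer-Encoding"));
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line);
            }
        }
        server.stop(0);
    }
}
```

This is roughly what reverting the patch gets us on the Solr side, except there the flush happens only when Jetty's internal buffer fills up, hence the 16-32KB granularity.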
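P.P.S. The "flush after X seconds or Y bytes" policy I have in mind could live in a Writer wrapper rather than in the response writers themselves. Below is a sketch under that assumption; ThresholdFlushWriter is a hypothetical name, not an existing Solr or Jetty class, and the thresholds would ideally come from per-request parameters:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical policy wrapper: flush the underlying writer once
// `sizeThreshold` chars have accumulated since the last flush, or
// `intervalMillis` have elapsed, whichever comes first.
class ThresholdFlushWriter extends Writer {
    private final Writer out;
    private final int sizeThreshold;
    private final long intervalMillis;
    private int pending = 0;
    private long lastFlush = System.currentTimeMillis();

    ThresholdFlushWriter(Writer out, int sizeThreshold, long intervalMillis) {
        this.out = out;
        this.sizeThreshold = sizeThreshold;
        this.intervalMillis = intervalMillis;
    }

    @Override
    public void write(char[] cbuf, int off, int len) throws IOException {
        out.write(cbuf, off, len);
        pending += len;
        if (pending >= sizeThreshold
                || System.currentTimeMillis() - lastFlush >= intervalMillis) {
            flush();
        }
    }

    @Override
    public void flush() throws IOException {
        out.flush();
        pending = 0;
        lastFlush = System.currentTimeMillis();
    }

    @Override
    public void close() throws IOException {
        flush();
        out.close();
    }
}

public class FlushDemo {
    public static void main(String[] args) throws IOException {
        // Count how often the wrapped writer actually gets flushed.
        final int[] flushes = {0};
        Writer sink = new StringWriter() {
            @Override
            public void flush() {
                flushes[0]++;
            }
        };
        // Flush every 10 chars; time-based flushing disabled for the demo.
        try (Writer w = new ThresholdFlushWriter(sink, 10, Long.MAX_VALUE)) {
            for (int i = 0; i < 3; i++) {
                w.write("{\"foo_s\":\"bar\"}\n"); // 16 chars >= 10 -> flush
            }
        }
        // 3 threshold-triggered flushes + 1 final flush on close()
        System.out.println("flushes=" + flushes[0]);
    }
}
```

The open question remains where such a wrapper would be installed so that it applies only to streaming-expression requests rather than to every response.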