I’ll give that a shot.

Not sure if range queries work on a UUID field, but I have thought of 
segmenting the ID space and running parallel queries on those.
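
A rough sketch of that segmenting idea, assuming the IDs are stored as lowercase hex strings in a string field named "id" where lexicographic range queries behave as expected (I have not confirmed this on a UUID field). The function name segment_queries is made up for illustration. It splits the ID space by leading hex digits, so one digit gives 16 ranges that could each go to their own thread:

```python
def segment_queries(digits=1, field="id"):
    """Build Solr range queries that partition hex IDs by leading digits.

    Assumes IDs are lowercase hex strings compared lexicographically.
    The first range is open below and the last is open above, so every
    ID lands in exactly one range.
    """
    n = 16 ** digits
    # Zero-padded hex prefixes: "0".."f" for digits=1, "00".."ff" for digits=2.
    bounds = [format(i, "0{}x".format(digits)) for i in range(n)]
    queries = []
    for i in range(n):
        lower = "*" if i == 0 else bounds[i]
        if i == n - 1:
            # Last segment: inclusive lower bound, open upper bound.
            queries.append(field + ":[" + lower + " TO *]")
        else:
            # Inclusive lower bound, exclusive upper bound.
            queries.append(field + ":[" + lower + " TO " + bounds[i + 1] + "}")
    return queries


if __name__ == "__main__":
    for q in segment_queries(1):
        print(q)
```

Each query string would go in q (or fq) for one worker. Whether the segments are evenly loaded depends on how the IDs are distributed.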

Right now it is pulling over 1.6 million docs per hour, so that is bearable. 
Making it 4X or 16X faster would be nice, though.
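
For reference, the range-paging loop Erick describes in the quoted message below could look roughly like this. It is only a sketch: fetch_page is a hypothetical callable that sends the params to one replica and returns the list of docs, and the IDs are assumed not to need query escaping.

```python
def dump_by_id(fetch_page, rows=1000, field="id"):
    """Yield all docs from one core, paging by ID instead of by start offset.

    fetch_page(params) is a hypothetical callable that runs the query
    against a single replica and returns a list of dicts with the ID field.
    """
    last_id = None
    while True:
        if last_id is None:
            q = "*:*"
        else:
            # Exclusive lower bound: resume just after the last ID seen.
            q = field + ":{" + last_id + " TO *]"
        params = {
            "q": q,
            "rows": rows,
            "sort": field + " asc",
            "distrib": "false",
        }
        docs = fetch_page(params)
        if not docs:
            break
        for doc in docs:
            yield doc
        last_id = docs[-1][field]
```

One generator per replica, each in its own thread, would give the one-thread-per-shard dump.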

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 10, 2020, at 2:19 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> 
> Not sure whether cursormark respects distrib=false, although I can easily see 
> there being “complications” here.
> 
> Hmmm, whenever I try to use distrib=false, I usually fire the query at the 
> specific replica rather than use the shards parameter. IDK whether that’ll 
> make any difference.
> 
> https://node:port/solr/collection1_shard1_replica1/query?distrib=false…….
> 
> You could also make it simpler...
> 
> q=id:{last_id_from_last_packet TO *]&rows=some_reasonable_number&sort=id asc&distrib=false
> 
> That doesn’t pay the penalty of having a huge start param.
> 
> Again, I’d use the specific replica rather than shards parameter, but just 
> because I’ve never tried it with shards….
> 
> Best,
> Erick
> 
>> On Feb 10, 2020, at 1:30 PM, Walter Underwood <wun...@wunderwood.org> wrote:
>> 
>> I tried to get fancy and dump our content with one thread per shard, but it 
>> did distributed search anyway. I specified the shard using the “shards” 
>> param and set distrib=false.
>> 
>> Is this a bug or expected behavior in 6.6.2? I did not see it mentioned in 
>> the docs.
>> 
>> It is working fine with a single thread and distributed search. Should have 
>> followed the old Kernighan and Plauger rule, “Make it right before you make 
>> it faster.”
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
> 
