Thanks Hoss. I am actually OK with that. I think a max of something like 50,000 results from each shard would be reasonable, since my check takes about 1s for 50,000 records. I'll give this a whirl and see how it goes.

On Mon, Aug 29, 2011 at 6:46 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:
>
> : Also I see that this is before sorting, is there a way to do something
> : similar after sorting? The reason is that I'm ok with the total
> : result not being completely accurate so long as the first say 10 pages
> : are accurate. The results could get more accurate as you page through
> : them though. Does that make sense?
>
> munging results after sorting is dangerous in the general case, but if you
> have a specific usecase where you're okay with only guaranteeing accurate
> results up to result #X, then you might be able to get away with something
> like...
>
> * custom SearchComponent
> * configure to run after QueryComponent
> * in prepare, record the start & rows params, and replace them with 0 &
>   (MAX_PAGE_NUM * rows)
> * in process, iterate over the DocList and build up your own new
>   DocSlice based on the docs that match your special criteria - then use the
>   original start/rows to generate a subset and return that
>
> ...getting this to play nicely with stuff like faceting may be possible with
> more work, and manipulation of the DocSet (assuming you're okay with the
> facet counts only being as accurate as the DocList is -- filtered
> up to row X).
>
> it could fail miserably with distributed search since you have no idea
> how many results will pass your filter.
>
> (note: this is all off the top of my head ... no idea if it would actually
> work)
>
> -Hoss
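
For reference, here is a rough sketch of the paging logic described in the quoted steps, written in plain Python rather than against the Solr API. It only illustrates the idea: over-fetch from 0 up to MAX_PAGE_NUM * rows, filter the sorted docs, then apply the original start/rows. The names widen_window, filter_and_page and passes_check are made up for illustration, and MAX_PAGE_NUM = 10 is just the "first 10 pages" figure from the question. None of this is actual SearchComponent code.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: only the first 10 pages need to be accurate.
MAX_PAGE_NUM = 10


def widen_window(start: int, rows: int) -> Tuple[int, int]:
    """'prepare' step: drop the requested start and over-fetch from 0."""
    return 0, MAX_PAGE_NUM * rows


def filter_and_page(
    sorted_docs: Sequence[int],
    passes_check: Callable[[int], bool],
    start: int,
    rows: int,
) -> List[int]:
    """'process' step: keep docs that pass the check (order preserved),
    then cut the page the client originally asked for."""
    kept = [doc for doc in sorted_docs if passes_check(doc)]
    return kept[start:start + rows]


# Example: the client asks for start=10, rows=10 (the second page).
orig_start, orig_rows = 10, 10
fetch_start, fetch_rows = widen_window(orig_start, orig_rows)

# Stand-in for the sorted result list; doc ids 0..999 already in sort order.
all_sorted = list(range(1000))
fetched = all_sorted[fetch_start:fetch_start + fetch_rows]

# Hypothetical check: only even doc ids pass.
page = filter_and_page(fetched, lambda d: d % 2 == 0, orig_start, orig_rows)
# page is [20, 22, 24, 26, 28, 30, 32, 34, 36, 38]
```

Note that if fewer than start + rows docs survive the filter inside the widened window, the page comes back short or empty. That is the accuracy trade-off mentioned above, and it is also why the distributed case is hard to bound.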