Thanks for the reply. The fields I want are indexed, but how would I go directly at the fields I want?
In regards to indexing the auth tokens, I've thought about this and am trying to confirm whether it's reasonable given our constraints.

On Mon, Aug 29, 2011 at 8:20 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> Yeah, loading the document inside a Collector is a
> definite no-no. Have you tried going directly
> at the fields you want (assuming they're
> indexed)? That *should* be much faster, but
> whether it'll be fast enough is a good question. I'm
> thinking some of the Terms methods here. You
> *might* get some joy out of making sure lazy
> field loading is enabled (and make sure the
> fields you're accessing for your logic are
> indexed), but I'm not entirely sure about
> that bit.
>
> This kind of problem is sometimes handled
> by indexing "auth tokens" with the documents
> and including an OR clause on the query
> with the authorizations for a particular
> user, but that works best if there is an upper
> limit (in the 100s) of tokens that a user can possibly
> have; often this works best with some kind of
> grouping. Making this work when a user can
> have tens of thousands of auth tokens is...er...
> contra-indicated...
>
> Hope this helps a bit...
> Erick
>
> On Sun, Aug 28, 2011 at 11:59 PM, Jamie Johnson <jej2...@gmail.com> wrote:
>> Just a bit more information. Inside my class which extends
>> FilteredDocIdSet, all of the time seems to be spent
>> retrieving the document from the readerCtx, i.e. doing this:
>>
>> Document doc = readerCtx.reader.document(docid);
>>
>> If I comment this out and just return true, things fly along as I
>> expect. My query is returning a total of 2 million documents also.
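For reference, the auth-token approach Erick describes would mean indexing the tokens on each document and restricting results with a filter query, rather than post-filtering per hit. A minimal sketch of such a request, assuming a multivalued field named `auth_token` and illustrative token values (neither appears in the thread):

```
# Hypothetical Solr request parameters: the user's query plus a filter
# query (fq) matching any of that user's auth tokens. Field and token
# names are assumptions for illustration only.
q=<the user's query>&fq=auth_token:(tokenA OR tokenB OR tokenC)
```

As Erick notes, this only stays practical while the per-user token list is small (hundreds, not tens of thousands), since every token becomes a clause in the filter.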
>>
>> On Sun, Aug 28, 2011 at 11:39 AM, Jamie Johnson <jej2...@gmail.com> wrote:
>>> I have a need to post-process Solr results based on some access
>>> controls which are set up outside of Solr. Currently we've written
>>> something that extends SearchComponent, and in the prepare method I'm
>>> doing something like this:
>>>
>>> QueryWrapperFilter qwf = new QueryWrapperFilter(rb.getQuery());
>>> Filter filter = new CustomFilter(qwf);
>>> FilteredQuery fq = new FilteredQuery(rb.getQuery(), filter);
>>> rb.setQuery(fq);
>>>
>>> Inside my CustomFilter I have a FilteredDocIdSet which checks whether the
>>> document should be returned. This works as I expect, but for some
>>> reason it is very, very slow. Even if I take out any of the machinery
>>> that does logic with the document and only return true in the
>>> FilteredDocIdSet's match method, the query still takes an inordinate
>>> amount of time compared to not including this custom filter. So my
>>> question: is this the most appropriate way of handling this? What
>>> performance should be expected out of such a setup? Any
>>> information/pointers would be greatly appreciated.
>>>
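One concrete way to "go directly at the indexed fields" in this era of Lucene, instead of calling `reader.document(docid)` per hit, is to un-invert the field once per reader with the FieldCache and test the cached value in `match()`. This is only a sketch under stated assumptions: the field name `acl`, the `isAuthorized` helper, and the class name are all hypothetical, and FieldCache only works for single-token, single-valued indexed fields.

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.FilteredDocIdSet;

// Sketch against Lucene 3.x APIs (the versions current in this thread).
public class AclDocIdSet extends FilteredDocIdSet {
    private final String[] acls; // one indexed value per docid, loaded once

    public AclDocIdSet(DocIdSet inner, IndexReader reader) throws IOException {
        super(inner);
        // FieldCache un-inverts the indexed terms of "acl" (field name is
        // an assumption) into an array: O(maxDoc) once per reader, then
        // O(1) array lookups per match() call, with no stored-field I/O.
        this.acls = FieldCache.DEFAULT.getStrings(reader, "acl");
    }

    @Override
    protected boolean match(int docid) {
        return isAuthorized(acls[docid]);
    }

    // Placeholder for the external access-control check described in
    // the thread; the real logic lives outside Solr.
    private boolean isAuthorized(String acl) {
        return acl != null;
    }
}
```

The trade-off is memory: the cache holds one entry per document per reader, but it avoids the per-hit stored-document load that dominated the profile above.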