Thanks for the reply. The fields I want are indexed, but how would I go directly at the fields I want?
In regards to indexing the auth tokens, I've thought about this and am trying to confirm whether it's reasonable given our constraints.

On Mon, Aug 29, 2011 at 8:20 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> Yeah, loading the document inside a Collector is a
> definite no-no. Have you tried going directly
> at the fields you want (assuming they're
> indexed)? That *should* be much faster, but
> whether it'll be fast enough is a good question. I'm
> thinking some of the Terms methods here. You
> *might* get some joy out of making sure lazy
> field loading is enabled (and make sure the
> fields you're accessing for your logic are
> indexed), but I'm not entirely sure about
> that bit.
>
> This kind of problem is sometimes handled
> by indexing "auth tokens" with the documents
> and including an OR clause on the query
> with the authorizations for a particular
> user, but that works best if there is an upper
> limit (in the 100s) of tokens that a user can possibly
> have; often this works best with some kind of
> grouping. Making this work when a user can
> have tens of thousands of auth tokens is...er...
> contra-indicated...
>
> Hope this helps a bit...
> Erick
>
> On Sun, Aug 28, 2011 at 11:59 PM, Jamie Johnson <jej2...@gmail.com> wrote:
>> Just a bit more information. Inside my class which extends
>> FilteredDocIdSet, all of the time seems to be spent
>> retrieving the document from the readerCtx, i.e. doing this:
>>
>> Document doc = readerCtx.reader.document(docid);
>>
>> If I comment this out and just return true, things fly along as I
>> expect. My query is returning a total of 2 million documents also.
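For reference, the auth-token approach Erick describes would mean indexing the tokens on each document and restricting results with a filter query, rather than post-filtering per hit. A minimal sketch of such a request, assuming a multivalued field named `auth_token` and illustrative token values (neither appears in the thread):

```
# Hypothetical Solr request parameters: the user's query plus a filter
# query (fq) matching any of that user's auth tokens. Field and token
# names are assumptions for illustration only.
q=<the user's query>&fq=auth_token:(tokenA OR tokenB OR tokenC)
```

As Erick notes, this only stays practical while the per-user token list is small (hundreds, not tens of thousands), since every token becomes a clause in the filter.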
>>
>> On Sun, Aug 28, 2011 at 11:39 AM, Jamie Johnson <jej2...@gmail.com> wrote:
>>> I have a need to post-process Solr results based on some access
>>> controls which are set up outside of Solr. Currently we've written
>>> something that extends SearchComponent, and in the prepare method I'm
>>> doing something like this:
>>>
>>> QueryWrapperFilter qwf = new QueryWrapperFilter(rb.getQuery());
>>> Filter filter = new CustomFilter(qwf);
>>> FilteredQuery fq = new FilteredQuery(rb.getQuery(), filter);
>>> rb.setQuery(fq);
>>>
>>> Inside my CustomFilter I have a FilteredDocIdSet which checks whether the
>>> document should be returned. This works as I expect, but for some
>>> reason it is very, very slow. Even if I take out any of the machinery
>>> that does logic with the document and only return true in the
>>> FilteredDocIdSet's match method, the query still takes an inordinate
>>> amount of time compared to not including this custom filter. So my
>>> question: is this the most appropriate way of handling this? What
>>> performance should be expected out of such a setup? Any
>>> information/pointers would be greatly appreciated.
>>>
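One concrete way to "go directly at the indexed fields" in this era of Lucene, instead of calling `reader.document(docid)` per hit, is to un-invert the field once per reader with the FieldCache and test the cached value in `match()`. This is only a sketch under stated assumptions: the field name `acl`, the `isAuthorized` helper, and the class name are all hypothetical, and FieldCache only works for single-token, single-valued indexed fields.

```java
import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.FieldCache;
import org.apache.lucene.search.FilteredDocIdSet;

// Sketch against Lucene 3.x APIs (the versions current in this thread).
public class AclDocIdSet extends FilteredDocIdSet {
    private final String[] acls; // one indexed value per docid, loaded once

    public AclDocIdSet(DocIdSet inner, IndexReader reader) throws IOException {
        super(inner);
        // FieldCache un-inverts the indexed terms of "acl" (field name is
        // an assumption) into an array: O(maxDoc) once per reader, then
        // O(1) array lookups per match() call, with no stored-field I/O.
        this.acls = FieldCache.DEFAULT.getStrings(reader, "acl");
    }

    @Override
    protected boolean match(int docid) {
        return isAuthorized(acls[docid]);
    }

    // Placeholder for the external access-control check described in
    // the thread; the real logic lives outside Solr.
    private boolean isAuthorized(String acl) {
        return acl != null;
    }
}
```

The trade-off is memory: the cache holds one entry per document per reader, but it avoids the per-hit stored-document load that dominated the profile above.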