Actually, after thinking for a bit, it makes sense to apply the post filter everywhere, otherwise I wouldn't be able to know the number of results overall (something I unfortunately really need).
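
A minimal sketch of that trade-off, in plain Java with no Solr dependency. The class and method names (`PostFilterPaging`, `exhaustive`, `lazy`) and the permission rule are hypothetical and only illustrate the idea: checking every matched document gives an exact total, while stopping once the page is full is cheaper but leaves the total unknown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class PostFilterPaging {
    /** One page of readable documents, the readable total (-1 if unknown), and the number of checks made. */
    public static final class Page {
        public final List<Integer> docs;
        public final int totalVisible;
        public final int checks;

        Page(List<Integer> docs, int totalVisible, int checks) {
            this.docs = docs;
            this.totalVisible = totalVisible;
            this.checks = checks;
        }
    }

    /** Runs the permission check on every matched document, so the total is exact. */
    public static Page exhaustive(List<Integer> matches, IntPredicate canRead, int start, int rows) {
        List<Integer> page = new ArrayList<>();
        int visible = 0;
        int checks = 0;
        for (int doc : matches) {
            checks++;
            if (!canRead.test(doc)) continue;
            if (visible >= start && page.size() < rows) page.add(doc);
            visible++;
        }
        return new Page(page, visible, checks);
    }

    /** Stops as soon as the requested page is full: fewer checks, but no total. */
    public static Page lazy(List<Integer> matches, IntPredicate canRead, int start, int rows) {
        List<Integer> page = new ArrayList<>();
        int visible = 0;
        int checks = 0;
        for (int doc : matches) {
            if (page.size() >= rows) break;
            checks++;
            if (!canRead.test(doc)) continue;
            if (visible >= start) page.add(doc);
            visible++;
        }
        return new Page(page, -1, checks);
    }

    public static void main(String[] args) {
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) matches.add(i);
        IntPredicate canRead = doc -> doc % 3 != 0; // hypothetical permission rule

        Page full = exhaustive(matches, canRead, 0, 1);
        Page quick = lazy(matches, canRead, 0, 1);
        System.out.println("exhaustive: docs=" + full.docs + " total=" + full.totalVisible + " checks=" + full.checks);
        System.out.println("lazy:       docs=" + quick.docs + " checks=" + quick.checks);
    }
}
```

With `start=0` and `rows=1`, the exhaustive version still checks all 10,000 matches, which mirrors the behaviour seen with the `*:*` query.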
Anyways, thank you Timothy

Colin Hebert

On 28 February 2013 17:38, Colin Hebert <hebert.co...@gmail.com> wrote:
> I know that the query selects everything, which is why I made this
> request to test my solution.
> If a user makes a query with a very large number of results with
> paging, I expected the post filter to be executed only when necessary
> (as it can be expensive).
>
> Colin
>
>
> On 28 February 2013 17:25, Timothy Potter <thelabd...@gmail.com> wrote:
>> Hi Colin,
>>
>> Your query is *:* so that is every document. Try a query that only
>> matches a small subset and see if you get different results.
>>
>> Cheers,
>> Tim
>>
>> On Thu, Feb 28, 2013 at 8:17 AM, Colin Hebert <hebert.co...@gmail.com> wrote:
>>> Thank you Timothy,
>>>
>>> With the pointers you gave me (and the help of this article
>>> http://searchhub.org/2012/02/22/custom-security-filtering-in-solr/ ) I
>>> managed to draft my own filter, but it seems that it doesn't work
>>> quite as I expected.
>>>
>>> Here is what I've done so far:
>>> https://github.com/ColinHebert/Sakai-Solr/tree/permission/permission/solr/src/main/java/org/sakaiproject/search/solr/permission/filter
>>>
>>> But it seems that the filter is applied to every document matched by
>>> the query (rather than only to the range of documents I asked
>>> for).
>>>
>>> I've done some tests with 10k+ documents and the query
>>> /select?q=*%3A*&fq={!sakai%20userId=admin}&tv=false&start=0&rows=1
>>> takes ages to execute (and in my application I can see that Solr is
>>> trying to apply the filter on absolutely every document).
>>>
>>> Cheers,
>>> Colin
>>> Colin Hebert
>>>
>>>
>>> On 26 February 2013 15:30, Timothy Potter <thelabd...@gmail.com> wrote:
>>>> Hi Colin,
>>>>
>>>> I think a filter is definitely the way to go. Moreover, you should
>>>> look into Solr's PostFilter concept which is intended to work with
>>>> "expensive" filters.
>>>> Have a look at Yonik's blog post on this topic:
>>>> http://yonik.com/posts/advanced-filter-caching-in-solr/
>>>>
>>>> Cheers,
>>>> Tim
>>>>
>>>> On Tue, Feb 26, 2013 at 7:24 AM, Colin Hebert <hebert.co...@gmail.com>
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> I'm having trouble figuring out the right approach when it comes to
>>>>> filtering results for security reasons.
>>>>>
>>>>> I work on an application that contains documents that are not
>>>>> accessible to everyone, so I want to filter the search results based
>>>>> on whether the user making the search query has the right to read
>>>>> each document.
>>>>> To do that, right now, I have a filter on the application side that
>>>>> checks, for each document returned by a search query, whether it is
>>>>> accessible to the current user, and removes it from the result list if
>>>>> it isn't.
>>>>>
>>>>> That isn't really optimal as you might get a result page with 7
>>>>> results instead of 10 because some results were removed (and if you're
>>>>> smart enough you can figure out the content of those hidden documents
>>>>> by doing many search queries).
>>>>>
>>>>> So I can think of two solutions. Either I code a paging system in my
>>>>> application that will take care of those holes in the result list, but
>>>>> that adds quite a lot of work that could be useless if Solr can take
>>>>> care of it.
>>>>> The second solution is having Solr filter those results before
>>>>> sending them back.
>>>>>
>>>>> The second solution seems a bit cleaner to me, but I'm not sure if
>>>>> it is good practice or not.
>>>>>
>>>>> The permission system in the application is a bit 'wild': some
>>>>> permissions are based on the day of the week, others on the existence
>>>>> or not of another document, so I can't really get out of this
>>>>> situation by storing more information in the index and using standard
>>>>> filters.
>>>>> If creating a custom filter in Solr isn't too bad, what I was thinking
>>>>> of would require the Solr server to make a request to the application
>>>>> to check if the user (given as a parameter in the query) can access
>>>>> the document (and that should be done on each document).
>>>>> Note that I will have to do that security check anyway, so the time
>>>>> to do a security check isn't (or at least shouldn't be) relevant to
>>>>> the performance of one solution over the other.
>>>>> What will have an impact though is the fact that the Solr server has
>>>>> to make a request to the application (network connection) for each
>>>>> document.
>>>>>
>>>>> Colin Hebert
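
One possible way to soften the per-document network cost raised in the original question is to send permission checks in batches. This is not something proposed in the thread, only a sketch under assumptions: `PermissionService`, its `readable` method, and `BatchedPermissionCheck` are hypothetical names standing in for whatever endpoint the application would expose, and the fake service in `main` exists only to count calls.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BatchedPermissionCheck {
    /** Hypothetical remote endpoint: returns the subset of docIds the user may read. */
    public interface PermissionService {
        Set<String> readable(String userId, List<String> docIds);
    }

    /** Filters docIds with one service call per batch instead of one per document. */
    public static List<String> filter(PermissionService service, String userId,
                                      List<String> docIds, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be positive");
        List<String> allowed = new ArrayList<>();
        for (int from = 0; from < docIds.size(); from += batchSize) {
            int to = Math.min(from + batchSize, docIds.size());
            List<String> batch = docIds.subList(from, to);
            Set<String> ok = service.readable(userId, batch);
            for (String id : batch) {
                if (ok.contains(id)) allowed.add(id); // preserves the original order
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake service for illustration: hides ids ending in "0" and counts calls.
        PermissionService fake = (user, ids) -> {
            calls[0]++;
            Set<String> ok = new HashSet<>();
            for (String id : ids) {
                if (!id.endsWith("0")) ok.add(id);
            }
            return ok;
        };
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 25; i++) ids.add("doc" + i);

        List<String> visible = filter(fake, "admin", ids, 10);
        System.out.println(visible.size() + " visible after " + calls[0] + " service calls");
    }
}
```

Whether batching fits depends on how the filter receives documents; inside Solr, documents reach a post filter one at a time, so batching would need some buffering on the Solr side.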