I know that the query selects everything; that is why I used it to
test my solution.
If a user makes a query that returns a very large number of results
and pages through them, I expected the post filter to be executed only
when necessary (as it can be expensive).
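
For reference, the behaviour I am seeing can be reproduced with the
rough sketch below. It is plain Java, not Solr code: PageCollector and
FilteringCollector are made-up stand-ins for a result collector and a
post filter that delegates to it, and the "every even document is
readable" rule is only a placeholder for the real permission check. My
assumption is that the filter has to look at every match because the
total hit count is computed after filtering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Simplified stand-ins for a collector chain; these are NOT Solr classes.
class PostFilterCostSketch {

    /** Keeps only the requested page of documents, but counts every hit it receives. */
    static final class PageCollector {
        final int start;
        final int rows;
        int numFound = 0;
        final List<Integer> page = new ArrayList<>();

        PageCollector(int start, int rows) {
            this.start = start;
            this.rows = rows;
        }

        void collect(int doc) {
            if (numFound >= start && page.size() < rows) {
                page.add(doc);
            }
            numFound++;
        }
    }

    /** Plays the role of the post filter: checks each document, then delegates. */
    static final class FilteringCollector {
        final IntPredicate canRead;
        final PageCollector delegate;
        int checks = 0;

        FilteringCollector(IntPredicate canRead, PageCollector delegate) {
            this.canRead = canRead;
            this.delegate = delegate;
        }

        void collect(int doc) {
            checks++; // one (possibly expensive) permission check per matching document
            if (canRead.test(doc)) {
                delegate.collect(doc);
            }
        }
    }

    /** Returns {permission checks, filtered hit count, documents returned}. */
    static int[] run(int matchingDocs, int start, int rows) {
        PageCollector page = new PageCollector(start, rows);
        // Placeholder rule: even document ids are readable.
        FilteringCollector filter = new FilteringCollector(doc -> doc % 2 == 0, page);
        for (int doc = 0; doc < matchingDocs; doc++) {
            filter.collect(doc);
        }
        return new int[] { filter.checks, page.numFound, page.page.size() };
    }

    public static void main(String[] args) {
        int[] r = run(10_000, 0, 1);
        System.out.println("checks=" + r[0] + " numFound=" + r[1] + " returned=" + r[2]);
    }
}
```

With 10,000 matching documents and rows=1, the sketch performs 10,000
permission checks to return a single document.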

Colin


On 28 February 2013 17:25, Timothy Potter <thelabd...@gmail.com> wrote:
> Hi Colin,
>
> Your query is *:* so that is every document. Try a query that only
> matches a small subset and see if you get different results.
>
> Cheers,
> Tim
>
> On Thu, Feb 28, 2013 at 8:17 AM, Colin Hebert <hebert.co...@gmail.com> wrote:
>> Thank you Timothy,
>>
>> With the pointers you gave me (and the help of this article:
>> http://searchhub.org/2012/02/22/custom-security-filtering-in-solr/ ) I
>> managed to draft my own filter, but it doesn't seem to work quite as
>> I expected.
>>
>> Here is what I've done so far:
>> https://github.com/ColinHebert/Sakai-Solr/tree/permission/permission/solr/src/main/java/org/sakaiproject/search/solr/permission/filter
>>
>> But it seems that the filter is applied to every document matched by
>> the query, rather than only to the range of documents I requested
>> with start and rows.
>>
>> I've done some tests with 10k+ documents and the query
>> /select?q=*%3A*&fq={!sakai%20userId=admin}&tv=false&start=0&rows=1
>> takes ages to execute (and in my application I can see that Solr is
>> trying to apply the filter to absolutely every document).
>>
>> Cheers,
>> Colin
>> Colin Hebert
>>
>>
>> On 26 February 2013 15:30, Timothy Potter <thelabd...@gmail.com> wrote:
>>> Hi Colin,
>>>
>>> I think a filter is definitely the way to go. Moreover, you should
>>> look into Solr's PostFilter concept which is intended to work with
>>> "expensive" filters. Have a look at Yonik's blog post on this topic:
>>> http://yonik.com/posts/advanced-filter-caching-in-solr/
>>>
>>> Cheers,
>>> Tim
>>>
>>> On Tue, Feb 26, 2013 at 7:24 AM, Colin Hebert <hebert.co...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> I'm having trouble figuring out the right approach when it comes to
>>>> filtering results for security reasons.
>>>>
>>>> I work on an application containing documents that are not
>>>> accessible to everyone, so I want to filter the search results based
>>>> on whether the user making the search query has the right to read
>>>> each document.
>>>> To do that, right now, I have a filter on the application side that
>>>> checks, for each document returned by a search query, whether it is
>>>> accessible to the current user, and removes it from the result list
>>>> if it isn't.
>>>>
>>>> That isn't really optimal as you might get a result page with 7
>>>> results instead of 10 because some results were removed (and if you're
>>>> smart enough you can figure out the content of those hidden documents
>>>> by doing many search queries).
>>>>
>>>> I can think of two solutions. The first is to code a paging system in
>>>> my application that takes care of those holes in the result list, but
>>>> that adds quite a lot of work that would be wasted if Solr can take
>>>> care of it.
>>>> The second is to have Solr filter those results before sending them
>>>> back.
>>>>
>>>> The second solution seems cleaner to me, but I'm not sure whether it
>>>> is good practice.
>>>>
>>>> The permission system in the application is a bit 'wild': some
>>>> permissions are based on the day of the week, others on whether
>>>> another document exists, so I can't really get out of this
>>>> situation by storing more information in the index and using standard
>>>> filters.
>>>> If creating a custom filter in Solr isn't too bad, what I have in
>>>> mind would require the Solr server to make a request to the
>>>> application to check whether the user (given as a parameter in the
>>>> query) can access the document, and to do that for each document.
>>>> Note that I will have to do that security check anyway, so the time
>>>> it takes isn't (or at least shouldn't be) relevant when comparing the
>>>> performance of one solution with the other.
>>>> What will have an impact, though, is the fact that the Solr server
>>>> has to make a request to the application (a network connection) for
>>>> each document.
>>>>
>>>> Colin Hebert
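
The first option in the original message (paging on the application
side so that removed results do not leave holes) could look roughly
like the sketch below. It is plain Java under stated assumptions:
`fetch` is a hypothetical stand-in for a search request with start and
rows, `canRead` is a hypothetical stand-in for the application's
permission check, and the fake index in `main` exists only to make the
example runnable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Predicate;

class HoleFreePagingSketch {

    /**
     * Returns up to pageSize readable documents, after skipping the first
     * `offset` readable ones. fetch.apply(start, rows) is a hypothetical
     * stand-in for one search request returning raw (unfiltered) results.
     */
    static <T> List<T> readablePage(BiFunction<Integer, Integer, List<T>> fetch,
                                    Predicate<T> canRead,
                                    int offset, int pageSize, int batchSize) {
        List<T> page = new ArrayList<>();
        int skipped = 0;
        int start = 0;
        while (page.size() < pageSize) {
            List<T> batch = fetch.apply(start, batchSize);
            if (batch.isEmpty()) {
                break; // no more raw results to look at
            }
            for (T doc : batch) {
                if (!canRead.test(doc)) {
                    continue; // hidden document: leaves no hole in the page
                }
                if (skipped < offset) {
                    skipped++; // readable, but belongs to an earlier page
                    continue;
                }
                page.add(doc);
                if (page.size() == pageSize) {
                    break;
                }
            }
            start += batch.size();
        }
        return page;
    }

    public static void main(String[] args) {
        // Fake index of 25 documents; only multiples of 3 are readable.
        List<Integer> index = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            index.add(i);
        }
        BiFunction<Integer, Integer, List<Integer>> fetch = (start, rows) ->
                index.subList(Math.min(start, index.size()),
                              Math.min(start + rows, index.size()));

        System.out.println(readablePage(fetch, d -> d % 3 == 0, 0, 5, 10)); // first page
        System.out.println(readablePage(fetch, d -> d % 3 == 0, 5, 5, 10)); // second page
    }
}
```

Note that this sketch rescans from the first raw result for every page,
so deep pages repeat the permission checks for all earlier documents;
that is part of the extra work the message refers to.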
