I thought of that method. The problem I see is that adding a new
customer could trigger an update of about 2,000,000 records.
Fortunately, this does not happen every day. It also makes indexing a
little more difficult, because I now have to check permissions on each
document.

In this configuration, the acl field could have 900 or so customer IDs.
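
To make the re-indexing cost concrete, here is a minimal sketch of how the per-document acl values could be derived from per-customer access lists, and which documents would need re-indexing when a customer is added. The function names and sample IDs are hypothetical, and the access lists are assumed to be available as a plain mapping from customer ID to product IDs.

```python
from collections import defaultdict


def build_acl_fields(access_lists):
    """Invert {customer_id: product_ids} into {product_id: set of customer_ids}."""
    acl = defaultdict(set)
    for customer_id, product_ids in access_lists.items():
        for product_id in product_ids:
            acl[product_id].add(customer_id)
    return dict(acl)


def add_customer(acl_fields, customer_id, product_ids):
    """Grant access in place and return the documents that must be re-indexed."""
    changed = set()
    for product_id in product_ids:
        customers = acl_fields.setdefault(product_id, set())
        if customer_id not in customers:
            customers.add(customer_id)
            changed.add(product_id)
    return changed


# Hypothetical sample data, not taken from the real index.
acl_fields = build_acl_fields({"c1": {"p1", "p2"}, "c2": {"p2", "p3"}})
to_reindex = add_customer(acl_fields, "c3", {"p1", "p3"})
print(sorted(to_reindex))
```

Only the documents the new customer can see are touched, which is where the 2,000,000 figure comes from for a customer with broad access.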

----- Original Message ----
From: Ken Krugler <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Wednesday, April 30, 2008 8:13:57 PM
Subject: Re: access control list

>I have an index of about 3,000,000 products and about 8500 
>customers. Each customer has access to between about 50 and about 
>500,000 of the products.
>
>Our current method was using a bitset in the filter. So, for each 
>customer, they have a bitset in the cache. For each docId that they 
>have access to, the bit is set. This is probably the best 
>performance-wise for searches, but it consumes a lot of memory, 
>especially because each document that they don't have access to also 
>consumes space (a 0). It also is probably the cause of our problems 
>when either these customer access lists (stored in files) or the 
>index is updated.
>
>Is there a better way to manage access control? I was thinking of 
>storing the user access list as a specific document type in the 
>index. Basically, a single multi-value field. But I'm not quite sure 
>where to go from here.
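
For a sense of scale, the memory cost of the bitset approach described in the quoted message can be estimated with simple arithmetic. This is a rough sketch that assumes one uncompressed bit per document and that every customer's bitset is cached at the same time; the actual cache behaviour may differ.

```python
NUM_DOCS = 3_000_000      # products in the index (from the message above)
NUM_CUSTOMERS = 8_500     # customers (from the message above)


def bitset_bytes(num_docs):
    """Bytes needed for one uncompressed bitset with one bit per document."""
    return (num_docs + 7) // 8


per_customer = bitset_bytes(NUM_DOCS)        # 375,000 bytes
total = per_customer * NUM_CUSTOMERS         # 3,187,500,000 bytes

print(f"per customer: {per_customer:,} bytes")
print(f"all customers: {total:,} bytes (~{total / 2**30:.2f} GiB)")
```

The cost per customer is the same whether that customer can see 50 products or 500,000, which is the waste described above.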

Another approach is to add an additional "acl" field, whose contents 
would be the list of customer IDs with access to that document.

Then your query is an implicit acl:<customer id> AND (actual query).
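
As a rough illustration of the implicit query, the wrapping could look like the sketch below. The helper names are hypothetical; the field name "acl" comes from the suggestion above, and q and fq are the standard Solr request parameters for the main query and a filter query.

```python
def _check_customer_id(customer_id):
    """Reject ids that could alter the query syntax (assumes alphanumeric ids)."""
    text = str(customer_id)
    if not text.isalnum():
        raise ValueError("customer id must be alphanumeric")
    return text


def acl_filtered_query(customer_id, user_query):
    """Wrap the user's query so it only matches documents listing the customer."""
    return f"acl:{_check_customer_id(customer_id)} AND ({user_query})"


def acl_request_params(customer_id, user_query):
    """Alternative: leave the user's query in q and put the restriction in fq."""
    return {"q": user_query, "fq": f"acl:{_check_customer_id(customer_id)}"}


print(acl_filtered_query(42, "title:widget"))
print(acl_request_params(42, "title:widget"))
```

Keeping the restriction in fq leaves the user's query untouched, so it does not affect relevance scoring; whether that is preferable here depends on how the filters end up being cached.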

No idea if that would work for your case, but we use it to control 
access to source code in our Krugle enterprise product, though we're 
using LDAP-provided groups (more in line with Mike's suggestion) 
rather than individual user ids.

-- Ken
-- 
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"If you can't find it, you can't fix it"


