I'm not sure how you COULD do searching without having the permissions in the documents. I mentally use the model of Unix filesystems as a starter. Simple, but powerful. If I needed a separate table or index for permissions, I'd have to do queries with GINORMOUS numbers of OR clauses.
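A minimal sketch of the Unix-style model, assuming each document carries three hypothetical fields (`owner`, `read_group`, `world_readable`) that are not part of any real schema in this thread. The point it illustrates is that the filter grows with the number of groups the user belongs to, not with the number of documents:

```python
from typing import Iterable


def escape_term(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_permission_filter(user: str, groups: Iterable[str]) -> str:
    """Build a filter string over the hypothetical per-document fields.

    A document is visible if it is world readable, owned by the user,
    or readable by one of the user's groups.
    """
    clauses = ["world_readable:true", "owner:" + escape_term(user)]
    group_terms = [escape_term(g) for g in groups]
    if group_terms:
        clauses.append("read_group:(" + " OR ".join(group_terms) + ")")
    return " OR ".join(clauses)


if __name__ == "__main__":
    # One short filter, regardless of how many documents the user can see.
    print(build_permission_filter("alice", ["staff", "dev"]))
```

The resulting string would be passed as a filter alongside the user's query. The trade-off Peter raises still applies: changing a permission means reindexing every affected document.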
I see it flowing like: User U has access to documents DS (40,000,000 out of 100,000,000 of them). Now get these (a list of 40 x 10^6) documents.

How do you see it, Peter?

Dennis Gearon

----- Original Message ----
From: Peter Sturge <peter.stu...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Thu, January 20, 2011 3:16:59 PM
Subject: Re: Document level security

Hi,

One of the things about Document Security is that it never involves just one thing. There are a lot of things to consider, and unfortunately, they're generally non-trivial.

Deciding how to store/hold/retrieve permissions is certainly one of those things, and you're right, you should avoid attaching permissions to document data in the index, because if you want to change permissions (and you will want to change them at some point), it can be a cumbersome job, particularly if it involves millions of documents, replication, shards etc. It's also generally a good idea not to tie your schema to permission fields.

Another big consideration is authentication: how can you be sure the request is coming from the user you think it is? Is there a certificate involved? Has the user authenticated to the container? If so, how do you get to this? And so on...

For permissions storage, there are two realistic approaches to consider:

1. Write a SearchComponent that handles permission requests. This typically involves storing/reading permissions in/from a file, database or separate index (see SOLR-1872)
2. Use an LCF module to retrieve permissions from the original documents themselves (see SOLR-1834)

Hope this helps,
Peter

On Thu, Jan 20, 2011 at 8:44 PM, Rok Rejc <rokrej...@gmail.com> wrote:
> Hi all,
>
> I have an index containing a couple of million documents.
> Documents are grouped into "groups"; each group contains 1000-20000
> documents.
>
> The problem:
> Each group has defined permission settings. It can be viewed by the public,
> viewed by registered users, or viewed by a list of users (each group has its
> own list of users).
> Said differently: I need document security.
>
> From what I read in the other threads, it is not recommended to store
> permissions in the index. I already have all the permissions in the
> database, but I don't "know" how to connect the database and the index.
> I can query the database to get the groups the user is in and after
> that do the OR query, but I am afraid that this list can be too big (100
> ORs could also exceed the maximum HTTP GET query string length).
>
> What are the other options? Should I write a custom collector which will
> query (and cache) the database for permissions?
>
> Any ideas are appreciated...
>
> Many thanks, Rok