: I have an application that manages documents in real-time into
: collections where a given document can live in more than one collection
: and multiple users can create collections on the fly.

: I get from reading that it's better to have a single index over all
: documents than to have one per collection or per user. However, to limit

that's not necessarily true, but it's certainly easier to add docs to an
existing index with a field denoting that they are part of a separate
"collection" than it is to add a whole new index.
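
As a rough illustration of the single-index approach, here is a minimal in-memory sketch in Python. It is a toy model, not Solr or Lucene code: the `SingleIndex` class, its method names, and the `collections` field are all hypothetical. It only shows the shape of the idea: one shared index, a multivalued field per document naming the collections it belongs to, and every search restricted by that field.

```python
from collections import defaultdict


class SingleIndex:
    """Toy model of one shared index with a multivalued 'collections' field.

    Hypothetical illustration only; this is not the Solr or Lucene API.
    """

    def __init__(self):
        self.docs = {}                         # doc_id -> {"text", "collections"}
        self.by_collection = defaultdict(set)  # collection -> set of doc_ids

    def add_to_collection(self, doc_id, text, collection):
        # Adding an existing doc to another collection only updates that one
        # doc's metadata (the equivalent of reindexing a single document).
        doc = self.docs.setdefault(doc_id, {"text": text, "collections": set()})
        doc["text"] = text
        doc["collections"].add(collection)
        self.by_collection[collection].add(doc_id)

    def search(self, term, collection):
        # Restrict to the collection first, then match the term.
        candidates = self.by_collection.get(collection, ())
        return sorted(d for d in candidates
                      if term in self.docs[d]["text"].split())


idx = SingleIndex()
idx.add_to_collection("d1", "lucene index overhead", "alice")
idx.add_to_collection("d1", "lucene index overhead", "bob")
idx.add_to_collection("d2", "solr index", "bob")
print(idx.search("index", "bob"))
```

In Solr the restriction step would typically be expressed as a filter query (the `fq` parameter) on whatever field holds the collection names.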

this isn't a Solr-specific issue -- even when dealing directly with Lucene
the overhead of an index per user/collection may be prohibitive.

: their collections, if that document has been added by other users to
: their collections, I'd have to update the metadata fields in that
: document and reindex it.   This seems like trouble if I scale to
: hundreds of users over thousands of collections over millions of

it depends on your definition of "trouble".  in general there really
aren't any serious downsides to this approach that would be easily solved
by any other approach.

: documents. At a minimum it sounds like indexing could be slow and my
: query cache would never be of much value since I'd be doing a lot of
: reindexing.

"reindexing" isn't specifically going to cause caching problems -- index
changes of any kind at an extremely high rate will cause
caching problems ... it doesn't matter if you are reindexing existing docs
because a new user wants them in their list, or if a whole new doc is added,
the effect is the same: all caches need to be invalidated and new caches
need to be warmed in their place.

the key to getting better cache performance is in making allowances about
how "real time" your real time indexing needs to be ... if you can set a
minimum delay (ie: reopen a new searcher at most once every X seconds)
your cache performance will potentially be X times better than if you
reopen a new searcher every time any document is added or modified.
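
The effect of a minimum delay can be sketched with a small simulation. This is an assumption-laden toy, not Solr code: `ThrottledSearcher`, `count_reopens`, and the use of plain numbers as timestamps are all made up for illustration. Each "reopen" stands for one round of throwing away the caches and warming new ones.

```python
class ThrottledSearcher:
    """Toy model: reopen the searcher at most once per min_interval seconds.

    Hypothetical illustration only; timestamps are plain numbers (seconds).
    """

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last_reopen = None  # time of the most recent reopen, if any
        self.dirty = False       # True when changes are not yet searchable
        self.reopens = 0         # each reopen implies one cache invalidation

    def on_change(self, now):
        # Any add or update marks the index dirty and may trigger a reopen.
        self.dirty = True
        return self.maybe_reopen(now)

    def maybe_reopen(self, now):
        if not self.dirty:
            return False
        too_soon = (self.last_reopen is not None
                    and now - self.last_reopen < self.min_interval)
        if too_soon:
            return False  # keep serving from the old searcher and its caches
        self.last_reopen = now
        self.dirty = False
        self.reopens += 1
        return True


def count_reopens(change_times, min_interval):
    searcher = ThrottledSearcher(min_interval)
    for t in change_times:
        searcher.on_change(t)
    return searcher.reopens


changes = range(60)  # one document change per second for a minute
print(count_reopens(changes, 0), count_reopens(changes, 10))
```

With one change per second, reopening on every change invalidates the caches 60 times in a minute, while a 10 second minimum delay invalidates them 6 times. A real setup would also need a timer to call something like `maybe_reopen` after the last change, so that trailing updates eventually become visible.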


-Hoss
