Thanks for the advice, Yonik.
We have new users at least every few hours so it would be kinda
difficult to maintain the indexes this way. However, we do have a
smaller set of tokens describing the different subscription sets
available (<100). Basically, each folder_id is attached to a certain
number of subscription sets, and these associations don't change
much. With MySQL, using this field would have taken too many joins,
but with Solr this may actually end up being better overall.
The only problem is that the current workflow would require indexing
once to put the images in the system, and then a second time to set
the permissions on them. We'll have to change the order of some
processes, which means retraining, but in the end I think this
is going to be the most workable solution.
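
A rough sketch of why filtering on subscription sets should stay under the clause limit where filtering on folder IDs does not. The field names (folder_id as a Solr field, subscription_set) and the sample values are made up for illustration, and the sketch assumes the tokens are plain alphanumeric strings that need no query escaping. The 1024 figure is the stock maxBooleanClauses setting.

```python
# Hypothetical field names; the actual schema is not given in this thread.
FOLDER_FIELD = "folder_id"
SET_FIELD = "subscription_set"

# Stock limit on clauses in a single boolean query
# (maxBooleanClauses in solrconfig.xml).
DEFAULT_MAX_BOOLEAN_CLAUSES = 1024


def build_filter_query(field, values):
    """Return an fq string that ORs together one term per distinct value.

    Assumes values contain no characters that need escaping in Solr
    query syntax.
    """
    terms = sorted({str(v) for v in values})
    if not terms:
        raise ValueError("need at least one value to filter on")
    return "%s:(%s)" % (field, " OR ".join(terms))


def clause_count(values):
    """Number of boolean clauses the filter for these values would need."""
    return len({str(v) for v in values})


def fits_default_limit(values):
    """True if a filter over these values stays within the stock limit."""
    return clause_count(values) <= DEFAULT_MAX_BOOLEAN_CLAUSES


# A heavy user: 1500 subscribed folders, but only a few subscription sets.
heavy_user_folders = range(1, 1501)
heavy_user_sets = ["set_a", "set_b", "set_c"]

folder_fq = build_filter_query(FOLDER_FIELD, heavy_user_folders)
set_fq = build_filter_query(SET_FIELD, heavy_user_sets)
```

With fewer than 100 subscription sets in total, the second filter can never come close to the limit, whatever the number of folders behind each set.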
I didn't realize there was a regular expression tokenizer, but now I
see there's PatternAnalyzer. I'll give it a shot.
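
To illustrate what regex-based tokenizing would do to a field holding several subscription-set tokens, here is a small Python stand-in. It is not Lucene's PatternAnalyzer API, only the same splitting idea; the delimiter pattern and the sample value are assumptions.

```python
import re


def pattern_tokenize(text, pattern=r"[,\s]+"):
    """Split text on every match of pattern and drop empty tokens.

    The default pattern (commas and/or whitespace) is an assumption
    about how the tokens might be delimited in the field.
    """
    return [tok for tok in re.split(pattern, text) if tok]


# Hypothetical field value listing the sets a folder belongs to.
tokens = pattern_tokenize("set_a, set_b  set_c")
```

Each resulting token would then be indexed as its own term, so a filter can match on any single subscription set.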
Regards,
Steve
On Jun 10, 2008, at 5:18 PM, Yonik Seeley wrote:
> On Mon, Jun 9, 2008 at 7:44 PM, Stephen Weiss <[EMAIL PROTECTED]> wrote:
>> However, in the plain text search, the user automatically searches through
>> *all* of the folders to which they have subscribed. This means, for (good!)
>> users who have subscribed to a large (1000+) number of folders, the filter
>> query would be quite long,
>
> This is not a well-solved problem in Lucene & Solr in general.
>
>> and would exceed the default number of boolean parameters allowed.
...