Thanks for the advice, Yonik.

We get new users at least every few hours, so it would be kinda difficult to maintain the indexes this way. However, we do have a much smaller set of tokens (<100) describing the different subscription sets available. Basically, each folder_id is attached to a certain number of subscription sets, and these associations don't change much. With MySQL, using this field would have taken too many joins, but with Solr this may actually end up being better overall.
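
A rough sketch of what filtering on subscription sets could look like, assuming each document carries a multi-valued field listing its subscription-set tokens. The field name `subscription_set`, the token values, and the helper `build_filter_query` are all made up for illustration; nothing here is taken from our actual schema.

```python
def build_filter_query(field, tokens):
    """Build a Solr-style filter query string matching any of the given tokens.

    `field` and `tokens` are hypothetical; substitute whatever the schema uses.
    """
    if not tokens:
        raise ValueError("at least one token is required")
    # De-duplicate and sort so the same user always yields the same string,
    # which should help the filter be reused from cache.
    unique = sorted(set(tokens))
    # Quote each token and escape backslashes and quotes inside it.
    quoted = ['"%s"' % t.replace("\\", "\\\\").replace('"', '\\"') for t in unique]
    return "%s:(%s)" % (field, " OR ".join(quoted))


# Hypothetical subscription sets for one user (well under 100 in total).
user_sets = ["set_12", "set_7", "set_12"]
fq = build_filter_query("subscription_set", user_sets)
print(fq)
```

With fewer than 100 possible tokens, a filter built this way stays far shorter than one listing 1000+ individual folder IDs.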

The only problem is that the current workflow would require indexing twice: once to put the images in the system, and a second time to set the permissions on them. We'll have to change the order of some processes, which means retraining, but in the end I think this is going to be the most workable solution.

I didn't realize there was a regular expression tokenizer, but now I see there's PatternAnalyzer. I'll give it a shot.

Regards,

Steve


On Jun 10, 2008, at 5:18 PM, Yonik Seeley wrote:

On Mon, Jun 9, 2008 at 7:44 PM, Stephen Weiss <[EMAIL PROTECTED]> wrote:
However, in the plain text search, the user automatically searches through *all* of the folders to which they have subscribed. This means, for (good!) users who have subscribed to a large (1000+) number of folders, the filter query would be quite long,

This is not a well-solved problem in Lucene & Solr in general.

and would exceed the default number of boolean parameters allowed.
...
