Hey Hamish,

You might want to check out LUCENE-5402. I added support for
index-time pruning for suggesters that consume from the index itself.
I plan to add this support to file-based suggesters as well.
In order to use this functionality from Solr, more changes are required. I
am planning to support this in the new SuggesterComponent (SOLR-5378) in
Solr.

Hope that helps!

Areek


On Wed, Jan 15, 2014 at 6:10 PM, Hamish Campbell <
hamish.campb...@koordinates.com> wrote:

> Thanks Tomás, I'll take a look.
>
> Still interested to hear from anyone about using queries to populate the
> list - I'm willing to give up a bit of performance for the flexibility it
> would provide.
>
>
> On Thu, Jan 16, 2014 at 1:06 PM, Tomás Fernández Löbbe <
> tomasflo...@gmail.com> wrote:
>
> > I think your use case is the one described in LUCENE-5350; you may want
> > to take a look at the patch and comments there.
> >
> > Tomás
> >
> >
> > On Wed, Jan 15, 2014 at 12:58 PM, Hamish Campbell <
> > hamish.campb...@koordinates.com> wrote:
> >
> > > Hi all,
> > >
> > > I'm looking into options for filtering the search suggestions
> > > dictionary.
> > >
> > > Using Solr 4.6.0 with the Suggester component and
> > > fst.FuzzyLookupFactory over a field-based dictionary, we're indexing
> > > records for a multi-tenanted SaaS platform. SearchHandler records are
> > > always filtered by the particular client warehouse (e.g. by domain);
> > > however, we need a way to apply a similar filter to the spell check
> > > dictionary to prevent leaking terms between clients. In other words:
> > > when client A searches for a document title, they should not receive
> > > spelling suggestions for client B's document titles.
> > >
> > > This has been asked a couple of times, on the mailing list and on
> > > StackOverflow. Some of the suggested approaches:
> > >
> > > 1. Use dynamic fields to create dictionaries per-warehouse (mentioned
> > > here:
> > > http://lucene.472066.n3.nabble.com/Filtering-down-terms-in-suggest-tt4069627.html)
> > >
> > > That might be a reasonable option for us (we already considered a
> > > similar approach), but at what point does this stop scaling
> > > efficiently? How many dynamic fields are too many?
> > >
> > > 2. Run a query to populate the suggestion list (also mentioned in that
> > > thread)
> > >
> > > If I understand this correctly, this would give us a lot of flexibility
> > > and power: for example, to give a more nuanced result set using the
> > > user's permissions to expose private documents in their spelling
> > > suggestions.
> > >
> > > I expect this would be a slow query, but our total document count is
> > > currently relatively small (on the order of 10^3 objects) and I imagine
> > > you could create a specific word index with the appropriate fields to
> > > keep this in check. Is this a feasible approach, and if so, how do you
> > > build a dynamic suggestion list?
> > >
> > > 3. Other options:
> > >
> > > It seems like this is a common problem, and we could throw some
> > > resources at building an extension to provide some limited suggestion
> > > dictionary filtering. Is anyone already doing something similar, or has
> > > anyone found a clever hack around this, or can anyone suggest a
> > > starting point?
> > >
> > > Thanks everyone!
> > >
> > > --
> > > Hamish Campbell
> > > Koordinates Ltd <http://koordinates.com/?_bzhc=esig>
> > > PH   +64 9 966 0433
> > > FAX +64 9 966 0045
> > >
> >
>
>
>
> --
> Hamish Campbell
> Koordinates Ltd <http://koordinates.com/?_bzhc=esig>
> PH   +64 9 966 0433
> FAX +64 9 966 0045
>
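
For what it's worth, the idea behind option 2 in the original message
(building the suggestion list from a tenant-filtered set of records) can be
sketched outside Solr in a few lines. This is only a conceptual
illustration, not Solr or Lucene API: the records, the tenant names, and the
build_dictionaries and suggest helpers are all made up for the example, and
it does exact prefix matching rather than the fuzzy matching that
fst.FuzzyLookupFactory provides.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical records as (tenant, title) pairs. In a real deployment these
# would come from a query filtered on the tenant field; here they are
# hard-coded so the sketch is self-contained.
RECORDS = [
    ("client-a", "Auckland Road Centrelines"),
    ("client-a", "Auckland Parcels"),
    ("client-b", "Aurora Flood Zones"),
]


def build_dictionaries(records):
    """Group lower-cased title terms by tenant, sorted for prefix lookup."""
    terms = defaultdict(set)
    for tenant, title in records:
        terms[tenant].update(title.lower().split())
    return {tenant: sorted(words) for tenant, words in terms.items()}


def suggest(dictionaries, tenant, prefix, limit=5):
    """Return up to `limit` terms owned by `tenant` that start with `prefix`."""
    words = dictionaries.get(tenant, [])
    prefix = prefix.lower()
    # Binary search for the first term that is >= the prefix.
    start = bisect_left(words, prefix)
    matches = []
    for word in words[start:]:
        if not word.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(word)
    return matches


DICTS = build_dictionaries(RECORDS)
print(suggest(DICTS, "client-a", "au"))
print(suggest(DICTS, "client-b", "au"))
```

Because each tenant only ever sees its own sorted term list, client A's
lookup cannot return a term that occurs only in client B's titles. The
trade-off is one dictionary per tenant, which is the same scaling question
raised for option 1.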
