Arnon,

Use "spellcheck.collate=true" with "spellcheck.maxCollationTries" set to a 
non-zero value.  This will give you re-written queries that are guaranteed to 
return hits, given the original query and filters.  If you are using an "mm" 
value other than 100%, you will also want to specify 
"spellcheck.collateParam.mm=100%" (or, if using "q.op=OR", 
"spellcheck.collateParam.q.op=AND").

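As a rough illustration of the parameters above, here is a minimal Python 
sketch that builds the query string. The function name, the "acl:org42" 
filter, and the max_tries default are made up for the example; only the 
"spellcheck.*" parameter names come from the advice above.

```python
from urllib.parse import urlencode


def build_spellcheck_params(user_query, acl_filters, max_tries=5):
    """Build a Solr query string that asks for verified collations.

    user_query  -- the raw query typed by the user
    acl_filters -- filter queries (fq) that restrict results to what this
                   user may see; the field names are hypothetical
    max_tries   -- value for spellcheck.maxCollationTries (must be > 0)
    """
    if max_tries <= 0:
        raise ValueError("max_tries must be non-zero so collations are tested")

    params = [
        ("q", user_query),
        ("spellcheck", "true"),
        ("spellcheck.collate", "true"),
        # Non-zero: Solr tests each candidate collation against the index,
        # applying the same filters as the original request.
        ("spellcheck.maxCollationTries", str(max_tries)),
        # Require every term to match when testing a collation.
        ("spellcheck.collateParam.mm", "100%"),
    ]
    # The user's ACL filters ride along as ordinary fq parameters.
    params += [("fq", fq) for fq in acl_filters]
    return urlencode(params)
```

For example, build_spellcheck_params("helo wrld", ["acl:org42"]) returns a 
string you can append to your select handler URL.
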
Of course, the first section of the spellcheck result will still show every 
possible suggestion, so your client needs to discard these and not divulge them 
to the user.  If you need to know word-by-word how the collations were 
constructed, then specify "spellcheck.collateExtendedResults=true".  Use the 
extended collation results for this information and not the first section of 
the spellcheck results.
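
On the client side, the filtering might look something like the sketch 
below. It assumes a JSON response in which "collations" sits directly under 
"spellcheck" as a flat list alternating the label "collation" with an object 
holding "collationQuery", "hits" and "misspellingsAndCorrections"; the exact 
layout differs between Solr versions and response-writer settings, so check 
it against your own output. The function name is made up for the example.

```python
def safe_collations(response):
    """Return only verified collations, ignoring the raw suggestion list.

    `response` is the parsed JSON body of a Solr response requested with
    spellcheck.collateExtendedResults=true (assumed shape, see above).
    """
    spellcheck = response.get("spellcheck", {})
    raw = spellcheck.get("collations", [])

    results = []
    # Assumed flat layout: ["collation", {...}, "collation", {...}, ...]
    for entry in raw[1::2]:
        if not isinstance(entry, dict):
            continue
        hits = entry.get("hits", 0)
        if hits <= 0:
            continue  # never show a rewrite that returns nothing
        # Assumed flat layout: [misspelling, correction, misspelling, ...]
        pairs = entry.get("misspellingsAndCorrections", [])
        results.append({
            "query": entry["collationQuery"],
            "hits": hits,
            "corrections": list(zip(pairs[0::2], pairs[1::2])),
        })
    return results
```

Note that the "suggestions" part of the response is never read here, so 
nothing from it can reach the user.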

This is all fairly well documented on the old Solr wiki:  
https://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate

James Dyer
Ingram Content Group

-----Original Message-----
From: Arnon Yogev [mailto:arn...@il.ibm.com] 
Sent: Monday, October 12, 2015 2:33 AM
To: solr-user@lucene.apache.org
Subject: Spell Check and Privacy

Hi,

Our system supports many users from different organizations and with 
different ACLs. 
We are considering adding spell check ("did you mean") functionality using 
DirectSolrSpellChecker. However, a privacy concern was raised, as this 
might lead to private information being revealed between users via the 
suggested terms. Using the FileBasedSpellChecker is another option, but 
naturally a static list of terms is not optimal.

Is there a best practice or a suggested method for these kinds of cases?

Thanks,
Arnon
