Just to clarify this line of code:
String[] suggestions = spellChecker.suggestSimilar(termText, numSug,
req.getSearcher().getReader(), restrictToField, true);
Because the last argument is true, I only return suggestions if they are
more popular than termText. You
probably need to use the code in Scott's patch to make this behaviour
configurable.
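
To illustrate what that last argument controls, here is a simplified model of the filter. It is not Lucene's actual implementation. It assumes a candidate is dropped when it occurs in fewer documents than the original term. The class name, the docFreq map and the onlyMorePopular parameter are hypothetical names used only for illustration. Passing the flag in as a parameter, rather than hard-coding true, is what making the behaviour configurable amounts to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of a "more popular" filter; not Lucene code.
final class MorePopularSketch {

    // Returns the candidates that differ from the term. When onlyMorePopular
    // is true, candidates occurring in fewer documents than the term are dropped.
    static List<String> filter(String term, List<String> candidates,
                               Map<String, Integer> docFreq, boolean onlyMorePopular) {
        int termFreq = docFreq.getOrDefault(term, 0);
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            if (candidate.equals(term)) {
                continue; // never suggest the term itself
            }
            int freq = docFreq.getOrDefault(candidate, 0);
            if (onlyMorePopular && freq < termFreq) {
                continue; // less frequent than the original term
            }
            kept.add(candidate);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up document frequencies for the demo.
        Map<String, Integer> docFreq =
            Map.of("calculatar", 2, "calculator", 40, "calculate", 1);
        List<String> candidates = List.of("calculator", "calculate");

        System.out.println(filter("calculatar", candidates, docFreq, true));
        System.out.println(filter("calculatar", candidates, docFreq, false));
    }
}
```

With the flag set, only "calculator" survives in this toy data; with it cleared, both candidates are returned in their original order.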
On 10/11/07, climbingrose <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> I've been so busy the last few days that I haven't replied to this email. I
> modified SpellCheckerRequestHandler a while ago to include support for
> multiword queries. To be honest, I didn't have time to write unit tests for
> the code. However, I deployed it in a production environment and it has been
> working for me so far. My version, however, makes two assumptions:
>
> 1) I assume that when a user enters a misspelled multiword query, we should
> only correct the words that are actually misspelled. For example, if the user
> enters "life expectancy calculatar", which has "calculator" misspelled, we
> should only spellcheck "calculatar".
> 2) I only return the best string for a misspelled query.
>
> I guess I can just paste the code directly here so that others can adapt it
> for their own purposes. If you have any questions, just send me an email.
> I'll be happy to help you.
>
> StringBuffer buf = null;
> if (null != words && !"".equals(words.trim())) {
>     Analyzer analyzer =
>         req.getSchema().getField(field).getType().getAnalyzer();
>
>     TokenStream source = analyzer.tokenStream(field, new StringReader(words));
>     Token t;
>     boolean hasSuggestion = false;
>     boolean termExists = false;
>     while (true) {
>         try {
>             t = source.next();
>         } catch (IOException e) {
>             t = null; // treat a read error as end of stream
>         }
>         if (t == null)
>             break;
>
>         String termText = t.termText();
>         // last argument true: only suggestions more popular than termText
>         String[] suggestions = spellChecker.suggestSimilar(termText,
>             numSug, req.getSearcher().getReader(), restrictToField, true);
>         if (suggestions != null && suggestions.length > 0) {
>             // use the best suggestion for this term
>             if (!suggestions[0].equals(termText)) {
>                 hasSuggestion = true;
>             }
>             if (buf == null) {
>                 buf = new StringBuffer(suggestions[0]);
>             } else {
>                 buf.append(" ").append(suggestions[0]);
>             }
>         } else if (spellChecker.exist(termText)) {
>             // no suggestion, but the term is known: keep it unchanged
>             termExists = true;
>             if (buf == null) {
>                 buf = new StringBuffer(termText);
>             } else {
>                 buf.append(" ").append(termText);
>             }
>         } else {
>             // unknown term with no suggestion: give up on the whole query
>             hasSuggestion = false;
>             termExists = false;
>             break;
>         }
>     }
>     try {
>         source.close();
>     } catch (IOException e) {
>         // ignore
>     }
>     // String[] suggestions = spellChecker.suggestSimilar(words, numSug,
>     //     nullReader, restrictToField, onlyMorePopular);
>     if (hasSuggestion || termExists)
>         rsp.add("suggestions", buf.toString());
>     else
>         rsp.add("suggestions", null);
> }
>
>
>
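
For anyone who wants to try the per-term logic above without a Solr or Lucene setup, here is a self-contained sketch under stated assumptions. The Dictionary interface and the toyDictionary helper are hypothetical stand-ins for the spellChecker.suggestSimilar and spellChecker.exist calls, and the input is assumed to be already tokenized. It keeps the two assumptions from the message: only terms that have a suggestion are replaced, and only the single best string is returned.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical, dependency-free sketch of the multiword correction loop.
final class MultiwordSpellcheckSketch {

    /** Hypothetical stand-in for the spell checker used in the handler. */
    interface Dictionary {
        String[] suggest(String term); // best first; empty if nothing to suggest
        boolean exists(String term);   // true if the term is a known word
    }

    /** Returns the corrected query, or null if it cannot be corrected. */
    static String correct(String[] tokens, Dictionary dict) {
        StringBuilder buf = new StringBuilder();
        boolean hasSuggestion = false;
        boolean termExists = false;
        for (String term : tokens) {
            String[] suggestions = dict.suggest(term);
            String chosen;
            if (suggestions != null && suggestions.length > 0) {
                chosen = suggestions[0];          // best suggestion wins
                if (!chosen.equals(term)) {
                    hasSuggestion = true;
                }
            } else if (dict.exists(term)) {
                chosen = term;                    // known word, keep as-is
                termExists = true;
            } else {
                return null;                      // unknown and no suggestion
            }
            if (buf.length() > 0) {
                buf.append(' ');
            }
            buf.append(chosen);
        }
        return (hasSuggestion || termExists) ? buf.toString() : null;
    }

    /** Toy dictionary built from a set of known words and a fix table. */
    static Dictionary toyDictionary(Set<String> known, Map<String, String> fixes) {
        return new Dictionary() {
            public String[] suggest(String term) {
                String fix = fixes.get(term);
                return fix == null ? new String[0] : new String[] { fix };
            }
            public boolean exists(String term) {
                return known.contains(term);
            }
        };
    }

    public static void main(String[] args) {
        Dictionary dict = toyDictionary(
            Set.of("life", "expectancy", "calculator"),
            Map.of("calculatar", "calculator"));
        // prints "life expectancy calculator"
        System.out.println(correct("life expectancy calculatar".split(" "), dict));
    }
}
```

A query containing a term that is neither known nor correctable yields null, which corresponds to the handler adding a null "suggestions" entry.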
> On 10/11/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> >
> > Hoss,
> >
> > I had a feeling someone would be quoting Yonik's Law of Patches! ;-)
> >
> > For now, this is done.
> >
> > I created the changes, created JavaDoc comments on the various settings
> > and their expected output, created a JUnit test for the
> > SpellCheckerRequestHandler which tests various components of the handler,
> > and I also created the supporting configuration files for the JUnit tests
> > (schema and solrconfig files).
> >
> > I attached the patch to the JIRA issue so now we just have to wait until
> > it gets added back into the main code stream.
> >
> > For anyone who is interested, here is a link to the JIRA:
> > https://issues.apache.org/jira/browse/SOLR-375
> >
> > Could someone please drop me a hint on how to update the wiki or any other
> > documentation that could benefit from being updated? I'd like to help out
> > as much as possible, but first I need to know "how". ;-)
> >
> > When these changes do get committed back in to the daily build, please
> > review the generated JavaDoc for information on how to utilize these new
> > features.
> > If anyone has any questions, or comments, please do not hesitate to ask.
> >
> >
> > As a general self-critique of these changes, I am not 100% sure of the way
> > I implemented the "nested" structure when the "multiWords" parameter is
> > used. My interest is that it should work smoothly with other technology
> > such as Prototype using the JSON output type. Unfortunately, I will not
> > get a chance to start on that coding until next week, so it is up in the
> > air as to whether this structure will be conducive or not. I am planning
> > on providing more details in the documentation on how to utilize these
> > modifications in Prototype and Ajax when I get a chance (even provide
> > links to a production site so you can see it in action and view the source
> > if interested). So stay tuned...
> >
> > Thanks for everyone's time,
> > Scott Tabar
> >
> > ---- Chris Hostetter <[EMAIL PROTECTED]> wrote:
> >
> > : If you like, I can post the source code changes that I made to the
> > : SpellCheckerRequestHandler, but at this time I am not ready to open a
> > : JIRA issue and submit the changes back through Subversion. I will
> > : need to do a little more testing, documentation, and create some unit
> > : tests to cover all of these changes, but from what I have been able to
> > : test so far, it is working very well.
> >
> > Keep in mind "Yonik's Law Of Patches" ...
> >
> > "A half-baked patch in Jira, with no documentation, no tests
> > and no backwards compatibility is better than no patch at all."
> > http://wiki.apache.org/solr/HowToContribute
> >
> > ...even if you don't think the code is "solid" yet, if you want to
> > eventually make it available to people, making a "rough" version
> > available to people early gives other people the opportunity to help you
> > make it solid (by writing unit tests, fixing bugs, and adding
> > documentation).
> >
> >
> > -Hoss
> >
> >
> >
>
>
> --
> Regards,
>
> Cuong Hoang
--
Regards,
Cuong Hoang