If you want a spell checker, don’t use a search engine. Use a spell checker. Something like aspell (http://aspell.net/) will be faster and better than Solr.
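
For the use case quoted below (flagging the misspelled words in a few paragraphs of user text), the core check is a plain dictionary lookup, and that needs no index at all. Here is a minimal sketch in Java, assuming the word list is already loaded into a lowercased set. The class and method names are made up for illustration, and this is not how aspell works internally (aspell also ranks suggestions, which this sketch does not attempt).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of aspell, Lucene, or Solr.
class WordListCheck {
    // A word is a run of letters, optionally with internal apostrophes (e.g. "don't").
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:'\\p{L}+)*");

    // Returns the words of `text` that are absent from `dictionary`, in order of
    // appearance. Assumes every dictionary entry is stored in lower case.
    static List<String> misspelled(String text, Set<String> dictionary) {
        List<String> unknown = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            String word = m.group();
            if (!dictionary.contains(word.toLowerCase(Locale.ROOT))) {
                unknown.add(word);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        // Tiny stand-in dictionary; a real one would be read from a word-list file.
        Set<String> dictionary = new HashSet<>(
                Arrays.asList("note", "this", "is", "purely", "as", "an", "example"));
        System.out.println(misspelled("NOTE: This is purely as an exampel.", dictionary));
    }
}
```

With that stand-in dictionary, only "exampel" is reported; "NOTE", "purely" and the other correctly spelled words pass. Suggestions for a flagged word are a separate problem, which is the part a dedicated spell checker handles well.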

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Oct 1, 2015, at 1:06 PM, Mark Fenbers <mark.fenb...@noaa.gov> wrote:
>
> This is with Solr. The Lucene approach (assuming that is what is in my Java
> code, shared previously) works flawlessly, albeit with fewer options, AFAIK.
>
> I'm not sure what you mean by "business case"... I'm wanting to spell-check
> user-supplied text in my Java app. The end-user then activates the
> spell-checker on the entire text (presumably, a few paragraphs or less). I
> can use StyledText's capabilities to highlight the misspelled words, and when
> the user clicks the highlighted word, a menu will appear where he can select
> a suggested spelling.
>
> But so far, I've had trouble:
>
> * determining which words are misspelled (because Solr often returns
>   suggestions for correctly spelled words).
> * getting coherent suggestions (regardless of whether the query word is
>   misspelled or not).
>
> It's been a bit puzzling (and frustrating)!! It only took me 10 minutes to
> get the Lucene spell checker working, but I agree that Solr would be the
> better way to go, if I can ever get it configured properly...
>
> Mark
>
>
> On 10/1/2015 12:50 PM, Alexandre Rafalovitch wrote:
>> Is that with Lucene or with Solr? Because Solr has several different
>> spell-checker modules you can configure. I would recommend trying
>> them first.
>>
>> And, frankly, I still don't know what your business case is.
>>
>> Regards,
>> Alex.
>> ----
>> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
>> http://www.solr-start.com/
>>
>>
>> On 1 October 2015 at 12:38, Mark Fenbers <mark.fenb...@noaa.gov> wrote:
>>> Yes, and I've spent numerous hours configuring and reconfiguring, and
>>> eventually even starting over, but still have not gotten it to work right.
>>> Even now, I'm getting bizarre results. For example, I query "NOTE: This
>>> is purely as an example." and I get back really bizarre suggestions, like
>>> "n ot e" and "n o te" and "n o t e" for the first word, which isn't even
>>> misspelled! The same goes for "purely" and "example" also! Moreover, I get
>>> extended results showing the frequencies of these suggestions being over
>>> 2600 occurrences, when I'm not even using an indexed spell checker. I'm
>>> only using a file-based spell checker (/usr/share/dict/words), and the
>>> wordbreak checker.
>>>
>>> At this point, I can't even figure out how to narrow down my confusion so
>>> that I can post concise questions to the group. But I'll get there
>>> eventually, starting with removing the wordbreak checker for the time being.
>>> Your response was encouraging, at least.
>>>
>>> Mark
>>>
>>>
>>>
>>> On 10/1/2015 9:45 AM, Alexandre Rafalovitch wrote:
>>>> Hi Mark,
>>>>
>>>> Have you gone through a Solr tutorial yet? If/when you do, you will
>>>> see you don't need to code any of this. It is configured as part of
>>>> the web-facing total offering, which is tweaked by XML configuration
>>>> files (or REST API calls). And most of the standard pipelines are
>>>> already pre-configured, so you don't need to invent them from scratch.
>>>>
>>>> On your specific question, it would be better to ask what _business_
>>>> level functionality you are trying to achieve and see if Solr can help
>>>> with that. Starting from Lucene code is less useful :-)
>>>>
>>>> Regards,
>>>> Alex.
>>>> ----
>>>> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
>>>> http://www.solr-start.com/
>>>>
>>>>
>>>> On 1 October 2015 at 07:48, Mark Fenbers <mark.fenb...@noaa.gov> wrote: