Re: Spellchecking and frequency

2010-07-28 Thread Jonathan Rochkind
I therefore wrote an implementation of SolrSpellChecker that wraps jazzy, the Java aspell library. I also extended the SpellCheckComponent to take the matrix of suggested words and query the corpus to find the first combination of suggestions that returned a match. This works well for my use ca
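
For readers following the thread, the corpus check described in this message can be sketched roughly as below. This is only an illustration under stated assumptions, not Mark's code and not the Solr API: `first_matching_combination` and `has_hits` are hypothetical names, and the toy `corpus` stands in for a real query against the index.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple


def first_matching_combination(
    suggestions: Sequence[Sequence[str]],
    has_hits: Callable[[Tuple[str, ...]], bool],
) -> Optional[Tuple[str, ...]]:
    """Return the first combination of per-term suggestions that matches.

    `suggestions[i]` holds the candidate corrections for the i-th query term,
    best candidate first. `has_hits` is a hypothetical callback that reports
    whether a combination of terms matches at least one document.
    """
    if not suggestions:
        return None
    # product() varies the last term fastest, so combinations that keep the
    # best candidates for the earlier terms are tried first.
    for combo in product(*suggestions):
        if has_hits(combo):
            return combo
    return None


# Toy stand-in for the index: each document is a set of terms.
corpus = [{"harry", "potter"}, {"hairy", "otter"}]


def toy_has_hits(terms: Tuple[str, ...]) -> bool:
    # A combination "matches" if one document contains every term.
    return any(all(t in doc for t in terms) for doc in corpus)


suggestions = [["hairy", "harry"], ["potter", "otter"]]
result = first_matching_combination(suggestions, toy_has_hits)
```

The number of combinations grows multiplicatively with the number of terms, so a real implementation would probably cap how many candidates per term it tries.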

Re: Spellchecking and frequency

2010-07-28 Thread dan sutton
Hi Mark, Thanks for that info; it looks very interesting, and it would be great to see your code. Out of interest, did you use the dictionary and the phonetic file? Did you see better results with both? In regard to the secondary part, checking the corpus for matching suggestions, would another way to do thi

Re: Spellchecking and frequency

2010-07-27 Thread Erick Erickson
"Yonik's Law of Patches" reads: "A half-baked patch in Jira, with no documentation, no tests and no backwards compatibilty is better than no patch at all." It'd be perfectly appropriate, IMO, for you to post an outline of what your enhancements do over on the SOLR dev list and get a reaction from

RE: Spellchecking and frequency

2010-07-27 Thread Dyer, James
ike to see yours to compare. James Dyer E-Commerce Systems Ingram Book Company (615) 213-4311 -Original Message- From: Mark Holland [mailto:mark.holl...@zoopla.co.uk] Sent: Tuesday, July 27, 2010 1:04 PM To: solr-user@lucene.apache.org Subject: Re: Spellchecking and frequency Hi, I

Re: Spellchecking and frequency

2010-07-27 Thread Mark Holland
Hi, I found the suggestions returned from the standard Solr spellcheck not to be that relevant. By contrast, aspell, given the same dictionary and misspelled words, gives much more accurate suggestions. I therefore wrote an implementation of SolrSpellChecker that wraps jazzy, the Java aspell libra

Spellchecking and frequency

2010-07-27 Thread dan sutton
Hi, I've recently been looking into spellchecking in Solr, and was struck by how limited the usefulness of the tool was. Like most corpora, ours contains lots of different spelling mistakes for the same word, so 'spellcheck.onlyMorePopular' is not really that useful unless you click on it nu