Re: Spellchecking and frequency

2010-07-27 Thread Mark Holland
Hi, I found the suggestions returned by the standard Solr spellchecker not to be very relevant. By contrast, aspell, given the same dictionary and misspelled words, gives much more accurate suggestions. I therefore wrote an implementation of SolrSpellChecker that wraps Jazzy, the Java aspell libra

Indexing slowdowns

2010-07-08 Thread Mark Holland
Since I began using the 2010-05-18 nightly build, I've been experiencing indexing slowdowns that I didn't see with solr-1.4. Indexing slows down roughly every 7 million records, and I'm indexing about 28 million in total. These records are batched into CSV files of 1 million rows each, which are loaded with stream.file. Solr happil

Determining matched tokens in original query

2010-07-08 Thread Mark Holland
Hi, I'm trying to find out which tokens in a user's query matched each result. I've been trying to use the highlight component for this, but it doesn't quite fit the bill. I'm using edismax with mm set to 50%, and for each matching doc I want to extract which tokens /didn't/ match (I