edge_ngram and short words containing digits

2014-05-28 Thread Kevin Murphy
Hi,

I’m using Django Haystack 2.1.0 with Solr 4.8.1 in an auto-complete 
application.  I’ve noticed that words containing digits are not being matched.  
Examples are ‘B2B’, ‘PSG4’, and ‘5S_rRNA’.  The words match up to the 
occurrence of the digit and fail starting with the digit.

Below is what I believe to be the relevant chunk from the Haystack-generated 
Solr schema.xml.  If I need to include more, let me know.

COPB2

Can I get this to work by tweaking the WordDelimiterFilterFactory attributes 
somehow, or do I need to do something else?

Thanks,
Kevin



Re: edge_ngram and short words containing digits

2014-05-28 Thread Kevin Murphy
On May 28, 2014, at 6:19 PM, Kevin Murphy wrote:
> I’m using Django Haystack 2.1.0 with Solr 4.8.1 in an auto-complete 
> application.  I’ve noticed that words containing digits are not being 
> matched.  Examples are ‘B2B’, ‘PSG4’, and ‘5S_rRNA’.  The words match up to 
> the occurrence of the digit and fail starting with the digit.

I solved the problem by adding `splitOnNumerics="0"` to the 
solr.WordDelimiterFilterFactory filter for both the index and query analyzers.  
I don’t know if there is a potential downside to this.
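
For anyone finding this later, here is a minimal sketch of what the change 
might look like in schema.xml.  Only `splitOnNumerics="0"` comes from the fix 
described above; the field type name, tokenizer, filter order, and all other 
attribute values are assumptions based on a typical Haystack-generated 
edge_ngram type and may differ from your own schema.

```xml
<!-- Sketch only: everything except splitOnNumerics="0" is an assumed,
     typical Haystack-style definition, not the schema from this thread. -->
<fieldType name="edge_ngram" class="solr.TextField" positionIncrementGap="1">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- splitOnNumerics="0" keeps tokens such as "b2b" or "psg4" whole
         instead of splitting them at letter/digit boundaries -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="1" splitOnNumerics="0" />
    <!-- assumed gram sizes; parts shorter than minGramSize yield no grams -->
    <filter class="solr.EdgeNGramFilterFactory"
            minGramSize="2" maxGramSize="15" side="front" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <!-- same setting on the query side so both analyzers tokenize alike -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0"
            splitOnCaseChange="1" splitOnNumerics="0" />
  </analyzer>
</fieldType>
```

With the default `splitOnNumerics="1"`, the filter splits a token at every 
letter/digit transition, so ‘B2B’ becomes three one-character parts.  One 
likely trade-off of turning it off is that a query for only the numeric part 
of a word will no longer match, since that part is no longer indexed as a 
separate token.  A full reindex is needed after changing the index analyzer.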

Regards,
Kevin