Re: Solr, ICUTokenizer with Latin-break-only-on-whitespace

2013-06-20 Thread Jonathan Rochkind
Thank you... I started out writing an email with screenshots proving that it wasn't working for me in 4.3.0... and of course, having to confirm every single detail in order to say I confirmed it... I realized it was a mistake on my part, not testing what I thought I was testing. Does indeed ap

Re: Solr, ICUTokenizer with Latin-break-only-on-whitespace

2013-06-20 Thread Shawn Heisey
On 6/20/2013 1:26 PM, Jonathan Rochkind wrote: I want, for instance, "C++ Language" to be tokenized into "C++", "Language". But the ICUTokenizer, even with the rulefiles="Latn:Latin-break-only-on-whitespace.rbbi", with the rbbi file from the Solr 4.3 source [1].
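
For readers skimming the thread, the tokenization being asked for can be sketched in a few lines of Python. This is only an illustration of the desired output, not ICU and not Solr: `whitespace_tokens` and `word_char_tokens` are hypothetical helper names, and the regex version is a rough stand-in for a tokenizer that discards punctuation, assumed here purely for contrast.

```python
import re


def whitespace_tokens(text):
    """Split only on runs of whitespace, so punctuation stays attached."""
    return text.split()


def word_char_tokens(text):
    """Rough stand-in (an assumption, not ICU's rules): keep only runs of
    letters, digits, and underscores, dropping punctuation such as '+'."""
    return re.findall(r"\w+", text)


sample = "C++ Language"

# Desired behaviour from the thread: the "++" survives as part of the token.
print(whitespace_tokens(sample))

# Contrast: a punctuation-dropping tokenizer loses the "++".
print(word_char_tokens(sample))
```

The first call keeps "C++" as one token, which is the behaviour the break-only-on-whitespace rule file is meant to give for Latin-script text; the second shows the kind of result the poster wanted to avoid.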