You are tokenising … "<tokenizer class="solr.WhitespaceTokenizerFactory"/>"

Be careful about declaring the lowercase token filter before the tokenizer, as in your index analyzer. The best practice is to apply char filters first, then the tokenizer, and finally the chain of token filters.
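As a rough, untested sketch of that ordering, here is your index analyzer with the same tokenizer and filters, only rearranged so the tokenizer comes first. Nothing else is changed, and I am assuming you want to keep the RefinedSoundex settings as they are:

```xml
<analyzer type="index">
    <!-- Any charFilter elements would go here, before the tokenizer. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Token filters follow the tokenizer, in the order they should run. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="RefinedSoundex" inject="true"/>
</analyzer>
```

This mirrors the order you already use in the query analyzer, which makes the two chains easier to compare.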
Cheers

2015-06-26 13:27 GMT+01:00 Mike Thomsen <mikerthom...@gmail.com>:

> I tried creating a simplified new text field type that only did lower
> casing and exact phrasing worked this time. I'm not sure what the problem
> was. Perhaps it was a case of copypasta gone bad because I could have sworn
> that I tried exact phrase matching against a simple text field with bad
> results. Thanks for the help. In case anyone sees this and wonders what the
> field I created looks like here it is (with phonetic matching)
>
> <fieldType name="phonetics" class="solr.TextField"
>            positionIncrementGap="100" multiValued="true">
>     <analyzer type="index">
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.PhoneticFilterFactory" encoder="RefinedSoundex"
>                 inject="true"/>
>     </analyzer>
>     <analyzer type="query">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.PhoneticFilterFactory" encoder="RefinedSoundex"
>                 inject="true"/>
>         <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>     </analyzer>
> </fieldType>
>
> On Fri, Jun 26, 2015 at 7:24 AM, Jack Krupansky <jack.krupan...@gmail.com>
> wrote:
>
> > Lucene, the underlying search engine library, imposes this 32K limit for
> > individual terms. Use tokenized text instead.
> >
> > -- Jack Krupansky
> >
> > On Thu, Jun 25, 2015 at 8:36 PM, Mike Thomsen <mikerthom...@gmail.com>
> > wrote:
> >
> > > I need to be able to do exact phrase searching on some documents that
> > > are a few hundred kb when treated as a single block of text. I'm on
> > > 4.10.4 and it complains when I try to put something larger than 32kb
> > > in using a textfield with the keyword tokenizer as the tokenizer. Is
> > > there any way I can index say a 500kb block of text like this?
> > >
> > > Thanks,
> > >
> > > Mike

--
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England