You are tokenising …
""
Be careful about where the lowercase token filter runs.
It's best practice to apply the char filters first, then the tokenizer, and
finally the chain of token filters.
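For example, a minimal field type in schema.xml that follows this order
(the type name and the particular factories are just illustrative):

  <fieldType name="text_example" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <!-- char filters run first, on the raw character stream -->
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <!-- the tokenizer then splits the stream into tokens -->
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <!-- token filters such as lowercasing run last, token by token -->
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>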
Cheers
2015-06-26 13:27 GMT+01:00 Mike Thomsen :
I tried creating a simplified new text field type that only did lower
casing, and exact phrasing worked this time. I'm not sure what the problem
was. Perhaps it was a case of copypasta gone bad, because I could have sworn
that I tried exact phrase matching against a simple text field with bad
results.
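For reference, a sketch of what such a lowercase-only field type might look
like in schema.xml (the tokenizer choice and the type name here are
assumptions, since the actual definition wasn't posted):

  <fieldType name="text_lower" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <!-- the only token filter: lowercase every token -->
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>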
Lucene, the underlying search engine library, imposes this 32K limit for
individual terms. Use tokenized text instead.
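To illustrate the difference (type names are illustrative): a
KeywordTokenizer-based type emits the entire field value as a single term,
so a few-hundred-KB value blows past the 32K per-term limit, while a
tokenized type splits the value into many small terms that never come close
to it:

  <!-- whole value becomes ONE term; values over ~32K bytes are rejected -->
  <fieldType name="text_exact" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
    </analyzer>
  </fieldType>

  <!-- value is split into many small terms; no per-term limit problems -->
  <fieldType name="text_tokenized" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>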
-- Jack Krupansky
On Thu, Jun 25, 2015 at 8:36 PM, Mike Thomsen wrote:
> I need to be able to do exact phrase searching on some documents that are a
> few hundred kb when treat
I agree with Updaya;
furthermore, it doesn't make any sense to try to solve a "phrase search"
problem while not tokenising the text at all …
It's not going to work, and it is fundamentally wrong not to tokenise long
textual fields if you want to do free-text search in them.
Could you explain your
Why do you want to use the KeywordTokenizer? Why not use a text field,
and use Solr's phrase search features?
q="some phrase" will match those terms next to each other, and should be
fine with a large block of text.
Combine that with hit highlighting, and it'll return a snippet of that
block of t
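For example, a single request can do both the phrase match and the
highlighting (the collection name and field name below are placeholders,
and the query should be URL-encoded in practice):

  http://localhost:8983/solr/mycollection/select?q=body:"some phrase"&hl=true&hl.fl=body

Here hl=true turns highlighting on and hl.fl names the field(s) to build
the snippets from.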