Erick,
What you are saying of course makes perfect sense.
But in our particular situation there is a high probability that an
essential part of the query will match a meaningful phrase or a business
name in a short description indexed as shingles.
A match on a shingle is also more precise than a broad match on
individual terms.
Besides I
This sounds like an attempt to identify stable multi-word units, a
notion sometimes used in Natural Language Processing.
In that case, a shingle filter factory plus using the field as a facet
might do the trick.
The shingle filter will generate a token such as "this kind of winter",
and faceting on the field will give back a count for it.
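
To make the shingle-plus-facet idea concrete, here is a rough sketch in plain Python. It is not the search engine's own implementation: the function names, the shingle sizes (2 to 4 words) and the sample documents are all made up for illustration, and the "facet" is approximated as the number of documents containing each shingle.

```python
from collections import Counter


def shingles(text, min_size=2, max_size=4):
    """Return every run of min_size..max_size adjacent words as one token."""
    words = text.lower().split()
    out = []
    for size in range(min_size, max_size + 1):
        for start in range(len(words) - size + 1):
            out.append(" ".join(words[start:start + size]))
    return out


def shingle_counts(docs, min_size=2, max_size=4):
    """Facet-like count: in how many documents does each shingle occur?"""
    counts = Counter()
    for doc in docs:
        # set() so a shingle repeated inside one document counts once
        counts.update(set(shingles(doc, min_size, max_size)))
    return counts


# Hypothetical sample documents, invented for this sketch.
docs = [
    "Hi John likes this kind of winter weather",
    "nobody enjoys this kind of winter",
    "this kind of winter is rare",
]

counts = shingle_counts(docs)
print(counts["this kind of winter"])  # appears in all three documents
print(counts["likes this"])           # appears in only one
```

Shingles that keep a high count across many documents are candidates for stable multi-word units; one-off combinations such as "likes this" fall to the bottom of the list.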
Tokenizers, filters and the like have no real way to
figure out that some words in the query are to be
ignored. In your example, how would one algorithmically
determine that "this kind of winter" is important and that
"Hi", "likes" and "weather" aren't? What's different
about like/likes that indica