Re: search with special chars like € @ % §

Erick Erickson Sat, 31 Jul 2010 05:09:37 -0700

Could you provide some more details on your use case? This sounds like an XY
problem (see http://people.apache.org/~hossman/#xyproblem). The reason I
say this is that you're probably going to shoot yourself in the foot if you
require such symbols, leading to an "interesting" user experience.

That said, you can pre-process your data for both indexing and searching
by, say, applying a regex that strategically removes the spaces you care
about and then using, say, the WhitespaceTokenizer. I'd also apply a
LowerCaseFilter.
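
To make the idea concrete, here is a rough sketch of that pre-processing
step in Python. It is only an illustration, not Solr configuration: the
symbol set and the function names normalize() and tokenize() are my own
assumptions, and the split-on-whitespace plus lowercase steps merely stand
in for what a whitespace tokenizer and a lowercase filter would do. The
point is that the same function runs on both the indexed text and the
query, so "50 %" and "50%" end up as the same token.

```python
import re

# Assumed symbol set, taken from the examples in this thread; extend as needed.
# Symbol, then spaces, then a digit: "§ 235" -> "§235"
_SYMBOL_BEFORE_NUMBER = re.compile(r"([§€%])\s+(?=\d)")
# Digit, then spaces, then a symbol: "50 %" -> "50%"
_SYMBOL_AFTER_NUMBER = re.compile(r"(?<=\d)\s+([§€%])")


def normalize(text):
    """Drop the optional space between a symbol and its number, then lowercase."""
    text = _SYMBOL_BEFORE_NUMBER.sub(r"\1", text)
    text = _SYMBOL_AFTER_NUMBER.sub(r"\1", text)
    return text.lower()


def tokenize(text):
    """Split on whitespace only, so '50%' and '§235' survive as single tokens."""
    return normalize(text).split()


if __name__ == "__main__":
    # Apply the same function at index time and at query time.
    print(tokenize("Rate is 50 % per § 235"))
    print(tokenize("50% §235"))
```

Note that a symbol sitting between two numbers (e.g. "100 € 50") is
ambiguous, and this sketch does not try to resolve it.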

HTH
Erick

On Thu, Jul 29, 2010 at 10:25 AM, <markus.rietz...@rzf.fin-nrw.de> wrote:

> hi,
> what is the best way to deal with searches with special chars like §
> (paragraph), € (euro), @ (at in emails), % and so forth.
> i think that the WordDelimiterFilter is working on such chars (at
> index-time and at query-time).
>
> the greatest problem i see is, that there can be an optional space between
> those chars and numbers, like 50% or 50 %, or §235 or § 235 and so on.
> so even if i get the WordDelimiter (or any other filter) right and working
> with those chars i think there is no way to deal with the optional spaces.
>
> does anyone have a solution for this?
>
>
>
> markus
>
