Hi everyone,

How can I support an 'implicit trailing wildcard *' in Solr? For example,
on Google a search for 'umoun' matches 'umount', and a search for 'mounta'
matches 'mountain'.

From my point of view, there are several ways, each with disadvantages:

1) Use EdgeNGramFilterFactory, so that 'umount' is indexed as 'u', 'um',
'umo', 'umou', 'umoun', 'umount'. The disadvantages are: a) the index size
increases dramatically, b) it matches terms that are unrelated to what the
user meant, e.g. a search for 'mount' will also match 'mountain'.

2) Use two-pass searching: the first pass looks up the given keyword in the
term dictionary through TermsComponent, and the second pass searches again
using the first matching term. For example, when the user enters 'umoun',
TermsComponent will match 'umount', and 'umount' is then used for the actual
search. The disadvantages are: a) I need to parse the query string in order
to recognize operators such as 'AND', 'OR', '+', '-', '"' (this is more
complex because I am using a PHP client), b) the returned hit count is not
for the original search string, which will affect other components, such as
an auto-suggest component based on user search history and hit counts.

3) Write a custom SearchComponent, but I have no idea where or how to start.
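
To make options 1 and 2 concrete, here is a rough sketch in plain Python
rather than Solr or PHP code. It only imitates the behaviour described
above: the function names (edge_ngrams, first_term_with_prefix,
expand_last_term) and the sample term list are made up for illustration,
and the query handling is deliberately naive (it assumes lowercased index
terms, rewrites only the last whitespace-separated token, and leaves quoted
input alone).

```python
import bisect

def edge_ngrams(term, min_gram=1, max_gram=None):
    """Option 1: leading-edge n-grams of a term, shortest first."""
    max_gram = len(term) if max_gram is None else min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, max_gram + 1)]

def first_term_with_prefix(sorted_terms, prefix):
    """Option 2, first pass: first term in a sorted list that starts with prefix."""
    i = bisect.bisect_left(sorted_terms, prefix)
    if i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        return sorted_terms[i]
    return None

# Tokens that must be passed through unchanged (assumed operator set).
OPERATORS = {"AND", "OR", "NOT"}

def expand_last_term(query, sorted_terms):
    """Option 2, rewriting: replace the last bare word with its first match.

    Whitespace between tokens is normalized to single spaces.
    """
    tokens = query.split()
    if not tokens:
        return query
    last = tokens[-1]
    if last in OPERATORS or '"' in last:
        return query  # operator or part of a phrase: do not touch
    sign = last[0] if last[0] in "+-" else ""
    word = last[len(sign):]
    if not word:
        return query
    match = first_term_with_prefix(sorted_terms, word.lower())
    if match is None:
        return query
    return " ".join(tokens[:-1] + [sign + match])

if __name__ == "__main__":
    # Hypothetical term dictionary, kept sorted like a real one would be.
    terms = sorted(["mount", "mountain", "umount"])

    print(edge_ngrams("umount"))               # six terms indexed for one word
    print("mount" in edge_ngrams("mountain"))  # disadvantage 1b
    print(expand_last_term("linux AND +umoun", terms))
```

Even this toy version shows disadvantage 2a: every operator, sign, and quote
needs its own special case before the last term can be rewritten safely.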

Is there any other way to do this in Solr? Any feedback or suggestions are
welcome!

Thanks very much in advance!
