Sorry, I missed this. We have the same problem. None of our customers use query syntax, so I have considered writing a full-text query parser: run the whole user query through the analyzer chain, convert the resulting tokens into one big OR query, then pass that to the rest of DisMax. Shingles and synonyms should work at query time with that approach.
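To make the idea concrete, here is a rough sketch in Python. It is not Solr or Lucene code. The names `analyze` and `build_or_query` are made up for illustration, the toy analyzer (lowercase, split on non-alphanumerics, concatenated shingles) only stands in for a real ShingleFilter/WordDelimiterFilter chain, and the output string just imitates the parsed-query form quoted below.

```python
import re
from typing import List


def analyze(text: str, max_shingle: int = 2) -> List[str]:
    """Toy stand-in for an analyzer chain (hypothetical, not a Lucene API).

    Sees the WHOLE query string, splits it on non-alphanumerics, and emits
    single words plus shingles with the delimiter removed.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    terms: List[str] = []
    for size in range(1, max_shingle + 1):
        for start in range(len(words) - size + 1):
            term = "".join(words[start:start + size])
            if term not in terms:  # keep first occurrence, preserve order
                terms.append(term)
    return terms


def build_or_query(user_input: str, fields: List[str], tie: float = 0.1) -> str:
    """Turn every analyzed term into an optional per-field clause (one big OR)."""
    clauses = []
    for term in analyze(user_input):
        per_field = " | ".join(f"{field}:{term}" for field in fields)
        clauses.append(f"({per_field})~{tie}")
    return "(" + " ".join(clauses) + ")"


if __name__ == "__main__":
    print(build_or_query("blue tooth", ["category", "name"]))
```

Because the analyzer sees "blue tooth" as one string, the shingle "bluetooth" becomes an extra optional clause next to "blue" and "tooth", which is the piece the whitespace-split query is missing.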
This question should probably go to a Lucene list, too.

wunder

On 3/11/09 2:54 AM, "Tobias Dittrich" <dittr...@wave-computer.de> wrote:

> Hmmm was my mail so weird or my question so stupid ... or is
> there simply no one with an answer? Not even a hint? :(
>
> Tobias Dittrich wrote:
>> Hi all,
>>
>> I know there are a lot of topics about compound word search already but
>> I haven't found anything for my specific problem yet. So if this is
>> already answered (which would be nice :)) then any hints or search
>> phrases for the mail archive would be appreciated.
>>
>> Basically I want users to be able to search my index for compound words
>> that are not really compounds but merely terms that can be written in
>> several ways.
>>
>> For example I have the categories "usb" and "cable" in my index and I
>> want the user to be able to search for "usbcable" or "usb-cable" etc.
>> Also there is "bluetooth" in the index and I want the search for "blue
>> tooth" to return the corresponding documents.
>>
>> My approach is to use ShingleFilterFactory followed by
>> WordDelimiterFilterFactory to index all possible combinations of words
>> and get rid of intra-word delimiters. This nicely covers the first part
>> of my requirements since the terms "usb" and "cable" somewhere along the
>> process get concatenated and "usbcable" is in the index.
>>
>> Now I also want to use this on the query side, so the user input "blue
>> tooth" (not as phrase) would become "bluetooth" for this field and
>> produce a hit. But this never happens since with the DisMax Searcher the
>> parser produces a query like this:
>>
>> ((category:blue | name:blue)~0.1 (category:tooth | name:tooth)~0.1)
>>
>> And the filters and analysers for this field never get to see the whole
>> user query and cannot perform their shingle and delimiter tasks :(
>>
>> So my question now is: how can I get this working? Is there a preferable
>> way to deal with this compound word problem? Is there another query
>> parser that already does the trick?
>>
>> Or would it make sense to write my own query parser that passes the user
>> query "as is" to the several fields?
>>
>> Any hints on this are welcome.
>>
>> Thanks in advance
>> Tobias
>>