Hi Jan,

Thanks for this suggestion. If we choose parsing, then why don't we do it on 
the indexing side instead of the querying side, where it might slow down the 
search process? I.e., if a document has "is_man=true" and "is_single=true", then 
we populate a text field with the words "man" and "single". Then, during the 
search, we compare the user query with the text field. There's no "intelligent" 
query in my application, i.e., users would not ask for "not smoking". If they 
mention a word, it means that the boolean value is true.
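
To make the idea concrete, here is a rough sketch of the index-side step in 
Python. It is only an illustration under my own assumptions: the TAGS table, 
the helper names, and the "tags_en" / "tags_fr" field names are hypothetical, 
and the resulting dict would still have to be sent to the index by whatever 
client we use.

```python
# Hypothetical mapping from boolean field names to the word indexed for them,
# one table per language (the field names come from my data, the rest is assumed).
TAGS = {
    "en": {"is_man": "man", "is_single": "single", "has_job": "job"},
    "fr": {"is_man": "homme", "is_single": "célibataire", "has_job": "emploi"},
}


def build_tag_text(doc, lang):
    """Return the space-separated tags for every boolean field that is true."""
    tags = TAGS[lang]
    return " ".join(tag for field, tag in tags.items() if doc.get(field) is True)


def add_tag_fields(doc):
    """Return a copy of doc with one free-text tag field per language.

    The boolean fields are left untouched so they can still be used as filters.
    """
    enriched = dict(doc)
    for lang in TAGS:
        enriched["tags_" + lang] = build_tag_text(doc, lang)
    return enriched


if __name__ == "__main__":
    doc = {"id": "1", "is_man": True, "is_single": True, "has_job": False}
    print(add_tag_fields(doc))
```

A false or missing boolean simply contributes no word, which matches the 
"if they mention a word, the value is true" rule above.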

I don't have many fields, so populating a text field will not dramatically 
increase the size of my index.

What do you think?

-Saïd

On Jul 3, 2010, at 12:36 AM, Jan Høydahl / Cominvent wrote:

> Hi,
> 
> I would rather go for the boolean variant and spend some time writing a query 
> parser which tries to understand all kinds of input people may make, mapping 
> it into boolean filters. In this way you can support both navigation and 
> search and keep both in sync whatever people prefer to start with. I'm not 
> saying it is easy to write such a parser, but you know the domain and the 
> users...
> 
> Another reason for doing it this way is that if you have a field 
> does_smoke=true, you still want to match if someone writes "not smoking". 
> Your parser would have to understand negations, e.g. through a set of regex 
> ((not|non|no) (smoker|smoking|smoke))...
> 
> You could always do a mix also - to keep a free-text field as well, and any 
> words that your parser does not understand can be passed through to the 
> free-text as a "should" term with a boost.
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Training in Europe - www.solrtraining.com
> 
> On 2. juli 2010, at 18.36, Saïd Radhouani wrote:
> 
>> Hi,
>> 
>> I have the following kind of data to index in a multilingual context: 
>> is_man, is_single, has_job, etc.
>> 
>> Logically, the underlying fields have a value of "yes" or "no." That's why 
>> the boolean type would be appropriate. But my problem is that, in addition 
>> to being able to filter on these fields, I would like to give my users the 
>> possibility to search against these fields using free text, e.g., a query 
>> might be "single man having job." Therefore, I think that the boolean type 
>> is not appropriate anymore. Instead, I'm thinking of using the string type, 
>> and each field will be either empty (the "no" case) or populated by its own 
>> tag, e.g., if we are dealing with a man, the field is_man will contain the 
>> string "man." Then, I copy all these fields into a text field that I can use 
>> for free text search.
>> 
>> Does that make sense?
>> 
>> Does that make sense in a multilingual context, i.e., field tags can be 
>> different in each language (EN => man, single, job; FR => homme, 
>> célibataire, emploi; etc.)?
>> 
>> Thanks!
>> 
>> -Saïd
> 
