On Jan 10, 2011, at 12:42 PM, lee carroll wrote:

> Hi
> 
> I'm indexing a set of documents that have a conversational writing style.
> In particular, the authors are very fond of listing facts in a variety of
> ways (this is to keep a human reader interested), but it's causing my index
> trouble.
> 
> For example, instead of listing facts like: the house is white, the castle is
> pretty.
> 
> We get: the house is the complete opposite of black and the castle is not
> ugly.
> 
> What are the best approaches to resolve these sorts of issues? Even if it's
> just handling "not" correctly, that would be a good start.
> 

Hmm, good problem.  I'd start by stepping back and asking what problem you are
trying to solve.  You've stated, I think, one half of it, namely that your
authors have a conversational style, but you haven't stated what your users
expect to do with this information.  Is this a pure search app?  Or is it
something else that is just backed by Solr, where the user would never run a
search?

Do you have a relevance problem?  Also, what is your notion of handling "not" 
correctly?  In other words, more details are welcome!

-Grant

--------------------------
Grant Ingersoll
http://www.lucidimagination.com
