Hi Grant,

It's a search relevancy problem. For example:

A document about London reads like:

London is not very good for a peaceful break.

We analyse this at the (I can't remember the technical term) is it lexical
level? (Bloody hell, I think you may have written the book!) Anyway, that
produces tokens in our index of, say:

"London good peaceful holiday"

Users search for cities which would be nice for them to take a holiday in.
Say the search is
"good for a peaceful break"

and bang, London is top. Talk about a relevancy problem :-)
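
To make the failure concrete, here is a minimal sketch in plain Python of the kind of analysis described above. It is not our actual Solr config: the stopword list and the break -> holiday synonym are assumptions, chosen only so that the output matches the tokens quoted above.

```python
# Toy analysis chain: lowercase, strip punctuation, drop stopwords, map synonyms.
# STOPWORDS and SYNONYMS are assumed values for illustration only.
STOPWORDS = {"is", "not", "very", "for", "a"}
SYNONYMS = {"break": "holiday"}


def analyze(text):
    """Return the list of index tokens for a piece of text."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,!?\"'")
        if not word or word in STOPWORDS:
            continue  # "not" disappears here, taking the negation with it
        tokens.append(SYNONYMS.get(word, word))
    return tokens


doc_tokens = analyze("London is not very good for a peaceful break.")
query_tokens = analyze("good for a peaceful break")

# Every query token is found in the document, so London scores well.
matched = [t for t in query_tokens if t in doc_tokens]

print(doc_tokens)
print(query_tokens)
print(matched)
```

Once "not" has been dropped as a stopword, nothing in the index distinguishes "not very good" from "good", which is why the document matches every term of the query.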

Now, I was thinking of using phrase matches in the synonyms file, but is that
the best approach, or could NLP help here?
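
One other option I can imagine, sketched here in plain Python rather than as a real Solr filter, is to keep "not" out of the stopword list and mark the tokens that follow it, so that "good" and "not_good" become different index terms. The negator list, the stopword list, the "not_" prefix and the window of three tokens are all assumptions for illustration, and the scoping is deliberately crude.

```python
import re

# Assumed values for illustration; a real setup would need tuning.
NEGATORS = {"not", "never", "no"}
STOPWORDS = {"is", "very", "for", "a", "the"}
NEGATION_WINDOW = 3  # number of content tokens marked after a negator


def analyze_with_negation(text, window=NEGATION_WINDOW):
    """Tokenize text, prefixing tokens that follow a negator with 'not_'."""
    tokens = []
    remaining = 0  # how many upcoming content tokens are still negated
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in NEGATORS:
            remaining = window
            continue
        if word in STOPWORDS:
            continue
        if remaining > 0:
            tokens.append("not_" + word)
            remaining -= 1
        else:
            tokens.append(word)
    return tokens


doc_tokens = analyze_with_negation("London is not very good for a peaceful break.")
query_tokens = analyze_with_negation("good for a peaceful break")
overlap = [t for t in query_tokens if t in doc_tokens]

print(doc_tokens)
print(query_tokens)
print(overlap)
```

With these assumed settings the negated document no longer shares any term with the query. A smaller window would mark only "good", leaving "peaceful" and "break" still matching, so the choice of scope matters a lot.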

Cheers,
Lee




On 10 January 2011 18:21, Grant Ingersoll <gsing...@apache.org> wrote:

>
> On Jan 10, 2011, at 12:42 PM, lee carroll wrote:
>
> > Hi
> >
> > I'm indexing a set of documents which have a conversational writing style.
> > In particular the authors are very fond
> > of listing facts in a variety of ways (this is to keep a human reader
> > interested) but it's causing my index trouble.
> >
> > For example, instead of listing facts like: the house is white, the castle is
> > pretty.
> >
> > We get: the house is the complete opposite of black and the castle is not
> > ugly.
> >
> > What are the best approaches to resolve these sorts of issues? Even if it's
> > just handling "not" correctly, that would be a good start.
> >
>
> Hmm, good problem.  I guess I'd start by stepping back and ask what is the
> problem you are trying to solve?  You've stated, I think, one half of the
> problem, namely that your authors have a conversational style, but you
> haven't stated what your users are expecting to do with this information?
>  Is this a pure search app?  Is it something else that is just backed by
> Solr but the user would never do a search?
>
> Do you have a relevance problem?  Also, what is your notion of handling
> "not" correctly?  In other words, more details are welcome!
>
> -Grant
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>
>
