On Nov 26, 2007, at 6:06 PM, Eswar K wrote:
> We essentially are looking at having an implementation for doing search
> which can return documents having conceptually similar words without
> necessarily having the original word searched for.

Very challenging. Say someone searches for "LSA" and hits an
archived version of the mail you sent to this list. "LSA" is a
reasonably discriminating term. But so is "Eswar".
If you knew that the original term was "LSA", then you might look for
documents near it in term vector space. But if you don't know the
original term, only the content of the document, how do you know
whether you should look for docs near "lsa" or "eswar"?
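
To make the seed-term problem concrete, here is a rough sketch. It is not LSA proper: it skips the SVD step and uses raw co-occurrence counts as a stand-in for a term vector space. The toy corpus, the document names, and the helper names (`cooccurrence_vectors`, `doc_vector`, `docs_near`) are all made up for illustration. The only point is that the neighbours you get back depend on which term you pick as the seed.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus; names and texts are invented for this sketch.
DOCS = {
    "mail": "eswar asks about lsa for concept search",
    "paper": "lsa uses svd for concept search",
    "svd_notes": "svd factors a term document matrix",
    "profile": "eswar works on java code",
}


def tokenize(text):
    """Lowercase and split on whitespace (no stemming, no stop words)."""
    return text.lower().split()


def cooccurrence_vectors(docs):
    """Map each term to counts of the terms it shares a document with."""
    vectors = defaultdict(Counter)
    for text in docs.values():
        terms = set(tokenize(text))
        for t in terms:
            for u in terms:
                vectors[t][u] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def doc_vector(text, term_vectors):
    """Represent a document as the sum of its terms' co-occurrence vectors."""
    total = Counter()
    for t in set(tokenize(text)):
        total.update(term_vectors[t])
    return total


def docs_near(term, docs):
    """Rank document names by similarity to the seed term, best first."""
    term_vectors = cooccurrence_vectors(docs)
    seed = term_vectors[term]
    scored = [
        (cosine(seed, doc_vector(text, term_vectors)), name)
        for name, text in docs.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


near_lsa = docs_near("lsa", DOCS)
near_eswar = docs_near("eswar", DOCS)
print("seed 'lsa':  ", near_lsa)
print("seed 'eswar':", near_eswar)
```

In this toy data the "paper" document never mentions "eswar", yet it still scores above "svd_notes" for that seed, which is the "similar without the original word" behaviour being asked for. The two seeds produce different orderings, though, and nothing in the "mail" document itself says which seed is the right one.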
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/