Sentence level searching

Michael Imbeault Sun, 12 Nov 2006 15:53:57 -0800

Hello everyone,

I'm trying to do some sentence-level searching with Solr; basically, I want to find out whether two words occur in the same sentence. As I read on the Lucene mailing list, there are many ways to do this, including but not limited to:

- inserting special boundary terms to denote the start and end of a sentence. It is unclear to me what kind of query should be used to fetch results from within one sentence (something like: start_sentence_token word1 word2 end_sentence_token)?
- increase token position at a sentence boundary by a large factor (1000?) so that "x y"~500 (or more) won't match across sentence boundaries.
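
To make the first option concrete, here is a rough sketch in plain Python (not Solr or Lucene code) of the behaviour I am after: a boundary term is inserted after every sentence, and two words only count as a match when no boundary term lies between them. The marker name `_sent_`, the function names and the naive sentence splitter are all assumptions for illustration. Presumably a span-style query that excludes the boundary term (Lucene has a SpanNotQuery class) could express this, though that is a guess.

```python
import re

BOUNDARY = "_sent_"  # hypothetical marker term, not a Solr built-in


def tokenize_with_boundaries(text):
    """Return lowercase word tokens with BOUNDARY appended after each sentence."""
    tokens = []
    # Naive split after '.', '!' or '?' followed by whitespace;
    # a real NLP sentence splitter would do better.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"\w+", sentence.lower())
        if words:
            tokens.extend(words)
            tokens.append(BOUNDARY)
    return tokens


def same_sentence(tokens, word1, word2):
    """True if word1 and word2 occur with no BOUNDARY token between them."""
    segment = set()
    for tok in tokens:
        if tok == BOUNDARY:
            if word1 in segment and word2 in segment:
                return True
            segment = set()  # start collecting the next sentence
        else:
            segment.add(tok)
    # Handle a trailing segment that was not closed by a boundary term.
    return word1 in segment and word2 in segment


tokens = tokenize_with_boundaries("Solr is built on Lucene. Sentences need care.")
print(same_sentence(tokens, "solr", "lucene"))       # same sentence
print(same_sentence(tokens, "lucene", "sentences"))  # different sentences
```

The open question is which Solr query reproduces what `same_sentence` does here.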

Is there an existing filter class that I could use to do this, or should I first parse my text fields with PHP and some NLP tool, and index the result (for the first case)? For the second case (increment token position), how should I do this within Solr?
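
For the second case, this is roughly the effect I would like a filter to have, again sketched in plain Python rather than as an actual Solr filter. The gap of 1000, the function names and the sentence splitter are assumptions, and `within_slop` is only a simplified stand-in for a sloppy phrase query like "x y"~500 (it compares raw position distance, which is not exactly how Lucene computes slop).

```python
import re

SENTENCE_GAP = 1000  # assumed value for the "large factor" between sentences


def positions_with_gap(text, gap=SENTENCE_GAP):
    """Map each term to its token positions, jumping by `gap` after each sentence."""
    positions = {}
    pos = 0
    # Naive sentence split; a real NLP tool would be more accurate.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"\w+", sentence.lower())
        if not words:
            continue
        for word in words:
            positions.setdefault(word, []).append(pos)
            pos += 1
        pos += gap  # push the next sentence far away in position space
    return positions


def within_slop(positions, word1, word2, slop):
    """True if some occurrence of word1 is at most `slop` positions from word2."""
    return any(
        abs(p1 - p2) <= slop
        for p1 in positions.get(word1, [])
        for p2 in positions.get(word2, [])
    )


pos = positions_with_gap("Solr is built on Lucene. Sentences need care.")
print(within_slop(pos, "solr", "lucene", 500))       # same sentence
print(within_slop(pos, "lucene", "sentences", 500))  # separated by the gap
```

One consequence of this scheme is that the slop has to stay below the gap, and a sentence longer than the slop could miss matches between its first and last words.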


Are there any plans to implement such functionality as standard?

Thanks for the help,

--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212
