Check out solr.EdgeNGramFilterFactory:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory
I don't know whether it works with phrases, though.
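
To make the idea concrete, here is a minimal Python sketch of what front-edge n-grams do to the sample records from the original question. This is only a simulation, not Solr code: the helper names (edge_ngrams, index_terms, matches) and the gram sizes 1 to 15 are assumptions made for illustration, and the sketch ignores the ~10 proximity part of the query, which is exactly the part that may or may not work with phrases.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the front-edge n-grams of a token, e.g. 'man' -> m, ma, man."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(value, min_gram=1, max_gram=15):
    """Lowercase, split on whitespace, and expand each word into edge n-grams."""
    terms = set()
    for word in value.lower().split():
        terms.update(edge_ngrams(word, min_gram, max_gram))
    return terms


def matches(value, query):
    """True if every query word is a prefix of some word in the same value."""
    terms = index_terms(value)
    return all(word in terms for word in query.lower().split())


# Sample values of the multiValued common_names field from the question.
record1 = ["companion to mankind", "pooch"]
record2 = ["companion to womankind", "man's worst enemy"]

print(any(matches(v, "companion man") for v in record1))  # True
print(any(matches(v, "companion man") for v in record2))  # False
```

Because only prefixes are indexed, "man" matches "mankind" but not "womankind", and "kind" matches neither record. In this sketch the query words are left whole and only the indexed values are expanded, which assumes the n-gram filter is applied at index time only.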

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 31 March 2011, at 16:49, Brian Lamb wrote:

> No, I don't really want to break down the words into subwords. In the
> example I provided, I would not want "kind" to match either record because
> it is not at the beginning of the word even though "kind" appears in both
> records as part of a word.
> 
> On Wed, Mar 30, 2011 at 4:42 PM, lboutros <boutr...@gmail.com> wrote:
> 
>> Do you want to tokenize subwords based on dictionaries? A bit like
>> decompounding German words?
>> 
>> If so, something like this could help: DictionaryCompoundWordTokenFilter
>> 
>> http://search.lucidimagination.com/search/document/CDRG_ch05_5.8.8
>> 
>> Ludovic
>> 
>> 
>> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/analysis/compound/DictionaryCompoundWordTokenFilter.html
>> 
>> 2011/3/30 Brian Lamb [via Lucene] <
>> ml-node+2754668-300063934-383...@n3.nabble.com>
>> 
>>> Hi all,
>>> 
>>> I have a field set up like this:
>>> 
>>> <field name="common_names" multiValued="true" type="text" indexed="true"
>>> stored="true" required="false" />
>>> 
>>> And I have some records:
>>> 
>>> RECORD1
>>> <arr name="common_names">
>>> <str>companion to mankind</str>
>>> <str>pooch</str>
>>> </arr>
>>> 
>>> RECORD2
>>> <arr name="common_names">
>>> <str>companion to womankind</str>
>>> <str>man's worst enemy</str>
>>> </arr>
>>> 
>>> I would like to write a query that will match the beginning of a word within
>>> the term. Here is the query I would use as it exists now:
>>> 
>>> 
>>> http://localhost:8983/solr/search/?q=*:*&fq={!q.op=AND%20df=common_names}"companion man"~10
>>> 
>>> In the above example, I would want to return only RECORD1.
>>> 
>>> The query as it exists right now is designed to only match records where
>>> both words are present in the same term. So if I changed man to mankind in
>>> the query, RECORD1 would be returned.
>>> 
>>> Even though the phrases companion and man exist in the same term in RECORD2,
>>> I do not want RECORD2 to be returned because 'man' is not at the beginning
>>> of the word.
>>> 
>>> How can I achieve this?
>>> 
>>> Thanks,
>>> 
>>> Brian Lamb
>>> 
>>> 
>>> 
>> 
>> 
>> -----
>> Jouve
>> France.
