Re: Re[2]: startsWith?

Mike Klaas Mon, 05 May 2008 19:10:52 -0700


On 3-May-08, at 10:44 PM, JLIST wrote:

Hello Otis,

Do you mean that if I index the URL as a "text" field, I'll
be able to do * for a given prefix because the text will be
tokenized at the "/" and should suffice for my need?


I'm not sure what your needs are, but I use the following to index urls:

    <fieldType name="reverse_domain" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.PatternTokenizerFactory" pattern="\."/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

(This field stores the _reversed domain_, e.g. "com.example.www" for www.example.com.)
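
Reversing the host before indexing only takes a few lines on the client side. Here is a minimal sketch in Python, assuming the input is a bare hostname (no scheme, port, or path). The function name is my own for illustration and is not part of Solr:

```python
def reverse_domain(host: str) -> str:
    """Reverse the dot-separated labels of a hostname."""
    # Lowercase and drop a trailing dot so "WWW.Example.com." normalizes cleanly.
    labels = host.strip().lower().rstrip(".").split(".")
    # Join the labels back together in reverse order.
    return ".".join(reversed(labels))


print(reverse_domain("www.example.com"))
```

The result is what you would send to the reverse_domain field at index time.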

I also store the url as a textTight field (see the example schema). If you want to do prefix matching on the url, I recommend storing it untokenized in another field (or with minimal tokenization, like lowercasing).

If, like me, you want to restrict documents to a certain domain and its subdomains, you have to be careful with your query:


reverse_domain:com.example reverse_domain:com.example.*

If you just do reverse_domain:com.example*, you will also match domains like www.example-foo.com (reversed: com.example-foo.www), which you don't want.
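
To see the difference without a Solr instance, here is a rough Python sketch that mimics the two query shapes with plain string matching. It assumes the match runs against the whole reversed-domain string, which ignores how the analyzer actually tokenizes the field, and the helper names are hypothetical:

```python
def naive_prefix(reversed_host: str, base: str) -> bool:
    # Mimics reverse_domain:com.example* (bare prefix, no label boundary).
    return reversed_host.startswith(base)


def domain_or_subdomain(reversed_host: str, base: str) -> bool:
    # Mimics reverse_domain:com.example reverse_domain:com.example.*
    # Either the domain itself, or the domain followed by a "." and more labels.
    return reversed_host == base or reversed_host.startswith(base + ".")


for host in ["com.example", "com.example.www", "com.example-foo.www"]:
    print(host, naive_prefix(host, "com.example"), domain_or_subdomain(host, "com.example"))
```

The bare prefix accepts all three hosts, while the two-clause form rejects the unrelated com.example-foo.www because it requires the match to end at a label boundary.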


-Mike
