Re: Case sensitivity on hostnames and email addresses

2006-12-13 Thread Yonik Seeley
Oh, and yet another way to get around it (with its own trade-offs) is to use something like fieldtype textTight in the example schema.xml, which catenates all word parts in both the index analyzer and query analyzer. This would index as "upanddownmysitecom" and allow the following queries to mat
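A rough sketch of the catenation idea, in Python rather than Solr's own analyzer chain. This is not the textTight implementation: the function name `catenate_word_parts`, the input address, and the sample queries are all made up for illustration, and the whole string is treated as one token, as if whitespace tokenization had already happened.

```python
import re


def catenate_word_parts(text: str) -> str:
    """Join every alphanumeric run in `text` into one lower-cased token."""
    # Punctuation such as '.', '@' and '-' acts as a delimiter and is dropped.
    parts = re.findall(r"[A-Za-z0-9]+", text)
    return "".join(parts).lower()


# Hypothetical input, chosen so that it yields the token quoted above.
indexed = catenate_word_parts("Up.And.Down@MySite.com")

# Hypothetical queries; each goes through the same function as the indexed text.
queries = ["upanddown@mysite.com", "UpAndDown.MySite.com", "up-and-down.mysite.com"]
matches = [q for q in queries if catenate_word_parts(q) == indexed]
```

Because the same function runs on both sides, any query that differs from the indexed text only in case or punctuation reduces to the same token. The trade-off is that the individual word parts can no longer be searched on their own.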

Re: Case sensitivity on hostnames and email addresses

2006-12-13 Thread Yonik Seeley
On 12/13/06, Wade Leftwich <[EMAIL PROTECTED]> wrote:
> I've run into some unexpected case sensitivity on searches, at least unexpected by me. If you index a text field containing this sentence:
> A sentence containing CamelCase words by [EMAIL PROTECTED] is found at StudlyCaps.org
> The document wi

Re: Case sensitivity on hostnames and email addresses

2006-12-13 Thread Walter Underwood
Also, avoid stemming URLs. I used a stemmer that turned my "best.com" URL into "good.com". The Lucene StandardAnalyzer works pretty hard to avoid that.
--wunder
On 12/13/06 9:33 PM, "Otis Gospodnetic" <[EMAIL PROTECTED]> wrote:
> When indexing (and searching), make sure you are using an Analyzer

Re: Case sensitivity on hostnames and email addresses

2006-12-13 Thread Otis Gospodnetic
When indexing (and searching), make sure you are using an Analyzer that lower-cases (or upper-cases) tokens. These are from Lucene, so Solr has them, too:
./src/java/org/apache/lucene/analysis/LowerCaseTokenizer.java
./src/java/org/apache/lucene/analysis/LowerCaseFilter.java
Otis - Origi
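To illustrate why the same lower-casing has to happen at both index time and query time, here is a toy inverted index in Python. It is only a sketch of the principle, not Lucene's code: `analyze` and `TinyIndex` are hypothetical names, the ASCII-only letter matching is a simplification, and the sample text is the sentence from earlier in the thread with the redacted address left out.

```python
import re
from collections import defaultdict


def analyze(text):
    """Split on non-letters and lower-case each token (ASCII-only simplification)."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


class TinyIndex:
    """Toy inverted index that applies analyze() to documents and queries alike."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for token in analyze(text):
            self.postings[token].add(doc_id)

    def search(self, query):
        tokens = analyze(query)
        if not tokens:
            return set()
        # Require every query token to be present (AND semantics).
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return result


index = TinyIndex()
index.add(1, "A sentence containing CamelCase words is found at StudlyCaps.org")

hits_lower = index.search("studlycaps")
hits_upper = index.search("CAMELCASE")
hits_none = index.search("missing")
```

Since both the stored text and the query pass through `analyze`, the case of the query no longer matters. If only one side were lower-cased, a query for "StudlyCaps" would look up a token that was never stored.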