Hi,

Are you sure you are not looking at the original field values? (Which schema browser are you referring to?)

Yes, the tokenizer and filters are applied in the order in which they are defined, so the order is important. For example, you typically want to lower-case tokens before removing stop words because, presumably, your stop words are all lower-case.
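
To illustrate the ordering point, here is a minimal sketch in plain Python. It is a toy model of an analysis chain, not the Lucene or Solr API: the function names, the stop word set, and the sample text are all made up for illustration.

```python
# Toy model of an analysis chain. This is NOT the Lucene/Solr API;
# every name below is hypothetical and exists only for illustration.
from typing import Callable, List

TokenFilter = Callable[[List[str]], List[str]]

# Assumed stop word list, all lower-case (as is typical).
STOP_WORDS = {"the", "a", "of"}


def whitespace_tokenize(text: str) -> List[str]:
    """Split the input on whitespace."""
    return text.split()


def lowercase(tokens: List[str]) -> List[str]:
    """Lower-case every token."""
    return [t.lower() for t in tokens]


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop tokens that exactly match an entry in STOP_WORDS."""
    return [t for t in tokens if t not in STOP_WORDS]


def analyze(text: str, filters: List[TokenFilter]) -> List[str]:
    """Tokenize, then apply each filter in the order given."""
    tokens = whitespace_tokenize(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


text = "The Company of Milan"

# Lower-casing first: "The" becomes "the" and is then removed.
print(analyze(text, [lowercase, remove_stop_words]))  # ['company', 'milan']

# Stop word removal first: "The" does not match "the", so it survives.
print(analyze(text, [remove_stop_words, lowercase]))  # ['the', 'company', 'milan']
```

The same two filters give different results depending on their order, which is why the order of definition matters.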

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Luca Molteni <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, September 22, 2008 4:43:43 AM
> Subject: Standard analyzer and acronyms
>
> Hello, list.
>
> I found some strange results using the standard analyzer.
>
> I've put it in at both query and index time, but when I use the schema browser
> to see the common values for the field, I find:
>
> spa 1558
> s.p.a. 833
>
> Which is pretty strange, since I've used the analyzer to remove the dots
> from the acronyms.
>
> My hypothesis is that the StandardAnalyzer removes dots only from
> uppercase acronyms.
>
> Can anyone confirm this for me?
>
> Regarding this, I was wondering if the filters and the tokenizers are applied
> sequentially, in the order in which they are written.
> For example, if I use the StandardAnalyzer, the StopFilter for the word
> "IBM" and the whitespace tokenizer
>
> "I.B.M Company"
>
> 1. The standard analyzer removes the dots
>
> "IBM Company"
>
> 2. The StopFilter removes the word "IBM"
>
> "Company"
>
> 3. The analyzer returns only one token
>
> "Company".
>
> I know, this is not a great example, but I think that not all analyzers
> are commutative, so there should be an order in which they are applied.
>
> Thank you very much.
>
> L.M.