Just to add to Jack's points, you can also use the term query parser to avoid all the escaping for special characters, e.g.
fq={!term f=some_field}<crazy&term#value%>

See Erik's preso from Apache Eurocon 2012 around 25:50 - http://vimeopro.com/user11514798/apache-lucene-eurocon-2012/video/55822628

On Tue, Mar 12, 2013 at 12:33 PM, Jack Krupansky <j...@basetechnology.com> wrote:

> Use the white space tokenizer and be sure to escape a lot of them in
> queries since a number of them have meaning to the query parser. Or,
> enclose query terms in quotes.
>
> -- Jack Krupansky
>
> -----Original Message----- From: vsl
> Sent: Tuesday, March 12, 2013 11:16 AM
> To: solr-user@lucene.apache.org
> Subject: Special characters not indexed
>
> Hi,
> I am trying to index special characters and make them searchable.
>
> User Story:
> 1. Index document with content: §$ %&/( )=? +*#'-<>
> 2. Find indexed document using search term: &
>
> Additionally I have several other fields that are copied to textAll Field.
> The search is performed on this field.
>
> Does anybody know how to deal with such cases?
>
> Field definition:
>
> <fieldType name="text_general" class="solr.TextField"
>     positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>         words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory"
>         language="English"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
>         catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
>         preserveOriginal="1" types="characters.txt" />
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>         words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>         ignoreCase="true" expand="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory"
>         language="English"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>         generateWordParts="1" generateNumberParts="1" catenateWords="1"
>         catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"
>         preserveOriginal="1" types="characters.txt" />
>   </analyzer>
> </fieldType>
>
> where: characters.txt
>
> § => ALPHA
> $ => ALPHA
> % => ALPHA
> & => ALPHA
> / => ALPHA
> ( => ALPHA
> ) => ALPHA
> = => ALPHA
> ? => ALPHA
> + => ALPHA
> * => ALPHA
> # => ALPHA
> ' => ALPHA
> - => ALPHA
> < => ALPHA
> > => ALPHA
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Special-characters-not-indexed-tp4046630.html
> Sent from the Solr - User mailing list archive at Nabble.com.
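
P.S. One practical note on the term query parser suggestion at the top of this message: the parser spares you the query-syntax escaping, but the value still has to be URL-encoded when it goes over HTTP, since characters like & # and % have their own meaning in a URL. Below is a minimal Python sketch of that, not a definitive client. The helper names (term_filter, build_select_query) and the q=*:* default are my own assumptions for illustration; only the {!term f=...} syntax and the field and value come from the example above.

```python
from urllib.parse import urlencode, parse_qs


def term_filter(field, value):
    """Build a {!term f=<field>}<value> filter string.

    Everything after the closing brace is handed to the term query
    parser as one raw term, so no query-syntax escaping is applied here.
    """
    return "{!term f=" + field + "}" + value


def build_select_query(field, value, q="*:*"):
    """Return a URL-encoded query string with the term filter as fq.

    urlencode percent-encodes &, #, % and friends so the raw value
    survives the trip through the URL intact.
    """
    return urlencode({"q": q, "fq": term_filter(field, value)})


# Hypothetical usage with the field and value from the example above.
query_string = build_select_query("some_field", "<crazy&term#value%>")
print(query_string)

# Decoding the query string gives back the unescaped filter.
decoded_fq = parse_qs(query_string)["fq"][0]
print(decoded_fq)
```

The resulting string would be appended to whatever select URL your core uses. Keep in mind that, as far as I know, the term parser does not run the field's analysis chain for text fields, so the value has to match the term as it was actually indexed.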