Francis, if you can wait another month or so, you could skip 1.3.0 and jump to 1.4, which will be released soon.
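Since autocomplete is what motivates the upgrade: in 1.3 this is commonly built with the new EdgeNGram analysis components rather than anything in solrconfig.xml. A minimal sketch, assuming an EdgeNGram approach -- the type name, gram sizes, and tokenizer choice here are illustrative, not taken from your schema:

```xml
<!-- Hypothetical autocomplete field type for Solr 1.3 (names and gram
     sizes are illustrative). At index time each term is expanded into
     its prefixes ("b", "be", "bea", ...), so a raw query prefix can
     match the start of a term; the query analyzer deliberately does
     NOT n-gram, so only the typed prefix is looked up. -->
<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

You would still need a `<field>` declaration using this type (and a `copyField` from, e.g., `track` or `artist`) to query against; those are omitted here.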
Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

> From: Francis Yakin <fya...@liquid.com>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Wednesday, June 10, 2009 1:17:25 AM
> Subject: Upgrading 1.2.0 to 1.3.0 solr
>
> I am in the process of upgrading our Solr 1.2.0 to Solr 1.3.0.
>
> Our Solr 1.2.0 is working fine; we just want to upgrade because we have
> an application that requires some functionality from 1.3.0 (we call it
> autocomplete).
>
> Currently our config files on 1.2.0 are as follows:
>
> solrconfig.xml
> schema.xml (we wrote this in house)
> index_synonyms.txt (we also modified and wrote this in house)
> scripts.conf
> protwords.txt
> stopwords.txt
> synonyms.txt
>
> I understand that 1.3.0 has a new solrconfig.xml.
>
> My questions are:
>
> 1) Which config files can I reuse from 1.2.0 in 1.3.0? Can I use the
>    same schema.xml?
> 2) solrconfig.xml: can I use the 1.2.0 version, or do I have to stick
>    with the 1.3.0 one? If the latter, what do I need to change?
>
> Right now I am testing it in my sandbox, and it doesn't work.
>
> Please advise; if you have any docs for upgrading 1.2.0 to 1.3.0, let me know.
>
> Thanks in advance
>
> Francis
>
> Note: I attached my solrconfig.xml and schema.xml in this email
>
> -----Inline Attachment Follows-----
>
> <?xml version="1.0" encoding="UTF-8" ?>
> <!--
> Licensed to the Apache Software Foundation (ASF) under one or more
> contributor license agreements. See the NOTICE file distributed with
> this work for additional information regarding copyright ownership.
> The ASF licenses this file to You under the Apache License, Version 2.0
> (the "License"); you may not use this file except in compliance with
> the License.
You may obtain a copy of the License at > > http://www.apache.org/licenses/LICENSE-2.0 > > >Unless required by applicable law or agreed to in writing, software >distributed under the License is distributed on an "AS IS" BASIS, >WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. >See the License for the specific language governing permissions and >limitations under the License. >--> > ><!-- >This is the Solr schema file. This file should be named "schema.xml" and >should be in the conf directory under the solr home >(i.e. ./solr/conf/schema.xml by default) >or located where the classloader for the Solr webapp can find it. > >This example schema is the recommended starting point for users. >It should be kept correct and concise, usable out-of-the-box. > >For more information, on how to customize this file, please see >http://wiki.apache.org/solr/SchemaXml > >--> > ><schema name="example" version="1.1"> > <!-- attribute "name" is the name of this schema and is only used for > display purposes. > Applications should change this to reflect the nature of the search > collection. > version="1.1" is Solr's version number for the schema syntax and > semantics. It should > not normally be changed by applications. > 1.0: multiValued attribute did not exist, all fields are multiValued by > nature > 1.1: multiValued attribute introduced, false by default --> > > <types> > <!-- field type definitions. The "name" attribute is > just a label to be used by field definitions. The "class" > attribute and any other attributes determine the real > behavior of the fieldType. > Class names starting with "solr" refer to java classes in the > org.apache.solr.analysis package. > --> > > <!-- The StrField type is not analyzed, but indexed/stored verbatim. > - StrField and TextField support an optional compressThreshold which > limits compression (if enabled in the derived fields) to values which > exceed a certain size (in characters). 
> --> > <fieldType name="string" class="solr.StrField" sortMissingLast="true" > omitNorms="true"/> > > <!-- boolean type: "true" or "false" --> > <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" > omitNorms="true"/> > > <!-- The optional sortMissingLast and sortMissingFirst attributes are > currently supported on types that are sorted internally as strings. > - If sortMissingLast="true", then a sort on this field will cause > documents > without the field to come after documents with the field, > regardless of the requested sort order (asc or desc). > - If sortMissingFirst="true", then a sort on this field will cause > documents > without the field to come before documents with the field, > regardless of the requested sort order. > - If sortMissingLast="false" and sortMissingFirst="false" (the default), > then default lucene sorting will be used which places docs without the > field first in an ascending sort and last in a descending sort. > --> > > > <!-- numeric field types that store and index the text > value verbatim (and hence don't support range queries, since the > lexicographic ordering isn't equal to the numeric ordering) --> > <fieldType name="integer" class="solr.IntField" omitNorms="true"/> > <fieldType name="long" class="solr.LongField" omitNorms="true"/> > <fieldType name="float" class="solr.FloatField" omitNorms="true"/> > <fieldType name="double" class="solr.DoubleField" omitNorms="true"/> > > > <!-- Numeric field types that manipulate the value into > a string value that isn't human-readable in its internal form, > but with a lexicographic ordering the same as the numeric ordering, > so that range queries work correctly. 
--> > <fieldType name="sint" class="solr.SortableIntField" > sortMissingLast="true" omitNorms="true"/> > <fieldType name="slong" class="solr.SortableLongField" > sortMissingLast="true" omitNorms="true"/> > <fieldType name="sfloat" class="solr.SortableFloatField" > sortMissingLast="true" omitNorms="true"/> > <fieldType name="sdouble" class="solr.SortableDoubleField" > sortMissingLast="true" omitNorms="true"/> > > > <!-- The format for this date field is of the form 1995-12-31T23:59:59Z, > and > is a more restricted form of the canonical representation of dateTime > http://www.w3.org/TR/xmlschema-2/#dateTime > The trailing "Z" designates UTC time and is mandatory. > Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z > All other components are mandatory. > > Expressions can also be used to denote calculations that should be > performed relative to "NOW" to determine the value, ie... > > NOW/HOUR > ... Round to the start of the current hour > NOW-1DAY > ... Exactly 1 day prior to now > NOW/DAY+6MONTHS+3DAYS > ... 6 months and 3 days in the future from the start of > the current day > > Consult the DateField javadocs for more information. > --> > <fieldType name="date" class="solr.DateField" sortMissingLast="true" > omitNorms="true"/> > > <!-- solr.TextField allows the specification of custom text analyzers > specified as a tokenizer and a list of token filters. Different > analyzers may be specified for indexing and querying. > > The optional positionIncrementGap puts space between multiple fields > of > this type on the same document, with the purpose of preventing false > phrase > matching across fields. 
> > For more info on customizing your analyzer chain, please see > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters > > --> > > <!-- One can also specify an existing Analyzer class that has a > default constructor via the class attribute on the analyzer element > <fieldType name="text_greek" class="solr.TextField"> > <analyzer class="org.apache.lucene.analysis.el.GreekAnalyzer"/> > </fieldType> > --> > > <!-- A text field that only splits on whitespace for exact matching of > words --> > <fieldType name="text_ws" class="solr.TextField" > positionIncrementGap="100"> > <analyzer> > <tokenizer class="solr.WhitespaceTokenizerFactory"/> > </analyzer> > </fieldType> > > <!-- A text field that uses WordDelimiterFilter to enable splitting and > matching of > words on case-change, alpha numeric boundaries, and non-alphanumeric > chars, > so that a query of "wifi" or "wi fi" could match a document containing > "Wi-Fi". > Synonyms and stopwords are customized by external files, and stemming > is enabled. > Duplicate tokens at the same position (which may result from Stemmed > Synonyms or > WordDelim parts) are removed. 
> --> > <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> > <analyzer type="index"> > <tokenizer class="solr.WhitespaceTokenizerFactory"/> > <filter class="solr.SynonymFilterFactory" > synonyms="index_synonyms.txt" ignoreCase="true" expand="true"/> > <filter class="solr.StopFilterFactory" ignoreCase="true" > words="stopwords.txt"/> > <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" > generateNumberParts="1" catenateWords="1" catenateNumbers="1" > catenateAll="0"/> > <filter class="solr.LowerCaseFilterFactory"/> > <filter class="solr.EnglishPorterFilterFactory" > protected="protwords.txt"/> > <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> > </analyzer> > <analyzer type="query"> > <tokenizer class="solr.WhitespaceTokenizerFactory"/> > <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" > ignoreCase="true" expand="true"/> > <filter class="solr.StopFilterFactory" ignoreCase="true" > words="stopwords.txt"/> > <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" > generateNumberParts="1" catenateWords="0" catenateNumbers="0" > catenateAll="0"/> > <filter class="solr.LowerCaseFilterFactory"/> > <filter class="solr.EnglishPorterFilterFactory" > protected="protwords.txt"/> > <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> > </analyzer> > </fieldType> > > > <!-- Less flexible matching, but less false matches. Probably not ideal > for product names, > but may be good for SKUs. Can insert dashes in the wrong place and > still match. 
--> <fieldType name="textTight" class="solr.TextField" > positionIncrementGap="100" > > <analyzer> > <tokenizer class="solr.WhitespaceTokenizerFactory"/> > <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" > ignoreCase="true" expand="false"/> > <filter class="solr.StopFilterFactory" ignoreCase="true" > words="stopwords.txt"/> > <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" > generateNumberParts="0" catenateWords="1" catenateNumbers="1" > catenateAll="0"/> > <filter class="solr.LowerCaseFilterFactory"/> > <filter class="solr.EnglishPorterFilterFactory" > protected="protwords.txt"/> > <filter class="solr.RemoveDuplicatesTokenFilterFactory"/> > </analyzer> > </fieldType> > > <!-- This is an example of using the KeywordTokenizer along > with various TokenFilterFactories to produce a sortable field > that does not include some properties of the source text > --> > <fieldType name="alphaOnlySort" class="solr.TextField" > sortMissingLast="true" omitNorms="true"> > <analyzer> > > <!-- KeywordTokenizer does no actual tokenizing, so the entire > input string is preserved as a single token > --> > <tokenizer class="solr.KeywordTokenizerFactory"/> > <!-- The LowerCase TokenFilter does what you expect, which can be useful > when you want your sorting to be case insensitive > --> > <filter class="solr.LowerCaseFilterFactory" /> > <!-- The TrimFilter removes any leading or trailing whitespace --> > <filter class="solr.TrimFilterFactory" /> > <!-- The PatternReplaceFilter gives you the flexibility to use > Java Regular expressions to replace any sequence of characters > matching a pattern with an arbitrary replacement string, > which may include back references to portions of the original > string matched by the pattern. > > See the Java Regular Expression documentation for more > information on pattern and replacement string syntax. 
> > > http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/package-summary.html > > <filter class="solr.PatternReplaceFilterFactory" > pattern="([^a-z])" replacement="" replace="all" > /> > --> > </analyzer> > </fieldType> > > <!-- since fields of this type are by default not stored or indexed, any > data added to > them will be ignored outright > --> > <fieldtype name="ignored" stored="false" indexed="false" > class="solr.StrField" /> > ></types> > > ><fields> > <!-- Valid attributes for fields: > name: mandatory - the name for the field > type: mandatory - the name of a previously defined type from the <types> > section > indexed: true if this field should be indexed (searchable or sortable) > stored: true if this field should be retrievable > compressed: [false] if this field should be stored using gzip compression > (this will only apply if the field type is compressable; among > the standard field types, only TextField and StrField are) > multiValued: true if this field may contain multiple values per document > omitNorms: (expert) set to true to omit the norms associated with > this field (this disables length normalization and index-time > boosting for the field, and saves some memory). Only full-text > fields or fields that need an index-time boost need norms. 
> --> > > <field name="id" type="string" indexed="true" stored="true" required="true" > /> > <field name="track" type="text" indexed="true" stored="true" /> > <field name="trackId" type="sint" indexed="true" stored="true"/> > <field name="type" type="text" indexed="true" stored="true"/> > <field name="price" type="sfloat" indexed="true" stored="true"/> > <field name="graphic" type="text" indexed="false" stored="true"/> > <field name="clip" type="text" indexed="false" stored="true"/> > <field name="releaseDate" type="date" indexed="true" stored="true" > multiValued="false"/> > <field name="liveDate" type="date" indexed="true" stored="true" > multiValued="false"/> > <field name="endDate" type="date" indexed="true" stored="true" > multiValued="false"/> > <field name="genre" type="text" indexed="true" stored="true"/> > <field name="genreId" type="sint" indexed="true" stored="true"/> > <field name="subgenre" type="text" indexed="true" stored="true"/> > <field name="subgenreId" type="sint" indexed="true" stored="true"/> > <field name="tags" type="text" indexed="true" stored="true"/> > <field name="artist" type="text" indexed="true" stored="true"/> > <field name="artistId" type="sint" indexed="true" stored="true"/> > <field name="album" type="text" indexed="true" stored="true"/> > <field name="albumId" type="sint" indexed="true" stored="true"/> > <field name="duration" type="sint" indexed="true" stored="true"/> > <field name="explodedItemSequence" type="sint" indexed="true" > stored="true"/> > <field name="bitrate" type="sint" indexed="false" stored="true"/> > <field name="salesRank" type="sint" indexed="true" stored="true"/> > > <!-- > Sort artist name used by mp3 store to sort artist title for search > --> > <field name="sortArtistName" type="text" indexed="true" stored="true"/> > <field name="availableOnAlbum" type="boolean" indexed="true" > stored="true"/> > > <field name="sku" type="textTight" indexed="true" stored="true" > omitNorms="true"/> > <field name="text" 
type="text" indexed="true" stored="false" > multiValued="true"/> > > <field name="trackSort" type="string" indexed="true" stored="false"/> > <field name="alphaTrackSort" type="alphaOnlySort" indexed="true" > stored="true"/> > > <field name="albumSort" type="string" indexed="true" stored="false"/> > <field name="alphaAlbumSort" type="alphaOnlySort" indexed="true" > stored="false"/> > > <field name="artistSort" type="string" indexed="true" stored="false"/> > <field name="alphaArtistSort" type="alphaOnlySort" indexed="true" > stored="false"/> > > <field name="genreSort" type="string" indexed="true" stored="false"/> > <field name="genreArtistSort" type="alphaOnlySort" indexed="true" > stored="false"/> > > > <!-- Here, default is used to create a "timestamp" field indicating > when each document was indexed. > --> > <field name="timestamp" type="date" indexed="true" stored="true" > default="NOW" multiValued="false"/> > > > <!-- Dynamic field definitions. If a field name is not found, dynamicFields > will be used if the name matches any of the patterns. > RESTRICTION: the glob-like pattern in the name attribute must have > a "*" only at the start or the end. > EXAMPLE: name="*_i" will match any field ending in _i (like myid_i, > z_i) > Longer patterns will be matched first. If equal size patterns > both match, the first appearing in the schema will be used. 
--> > <dynamicField name="*_i" type="sint" indexed="true" stored="true"/> > <dynamicField name="*_s" type="string" indexed="true" stored="true"/> > <dynamicField name="*_l" type="slong" indexed="true" stored="true"/> > <dynamicField name="*_t" type="text" indexed="true" stored="true"/> > <dynamicField name="*_b" type="boolean" indexed="true" stored="true"/> > <dynamicField name="*_f" type="sfloat" indexed="true" stored="true"/> > <dynamicField name="*_d" type="sdouble" indexed="true" stored="true"/> > <dynamicField name="*_dt" type="date" indexed="true" stored="true"/> > > <!-- uncomment the following to ignore any fields that don't already match > an existing > field name or dynamic field, rather than reporting them as an error. > alternately, change the type="ignored" to some other type e.g. "text" > if you want > unknown fields indexed and/or stored by default --> > <!--dynamicField name="*" type="ignored" /--> > ></fields> > ><!-- Field to use to determine and enforce document uniqueness. > Unless this field is marked with required="false", it will be a required > field > --> ><uniqueKey>id</uniqueKey> > ><!-- field for the QueryParser to use when an explicit fieldname is absent --> ><defaultSearchField>text</defaultSearchField> > ><!-- SolrQueryParser configuration: defaultOperator="AND|OR" --> ><solrQueryParser defaultOperator="AND"/> > > <!-- copyField commands copy one field to another at the time a document > is added to the index. It's used either to index the same field > differently, > or to add multiple fields to the same field for easier/faster > searching. 
--> > <copyField source="id" dest="sku"/> > > <copyField source="track" dest="text"/> > <copyField source="track" dest="trackSort"/> > <copyField source="track" dest="alphaTrackSort"/> > > > <copyField source="type" dest="text"/> > > <copyField source="album" dest="text"/> > <copyField source="album" dest="albumSort"/> > <copyField source="album" dest="alphaAlbumSort"/> > > <copyField source="artist" dest="text"/> > <copyField source="artist" dest="artistSort"/> > <copyField source="artist" dest="alphaArtistSort"/> > > <copyField source="genre" dest="text"/> > <copyField source="genre" dest="genreSort"/> > <copyField source="genre" dest="genreArtistSort"/> > ><!-- Similarity is the scoring routine for each document vs. a query. > A custom similarity may be specified here, but the default is fine > for most applications. --> ><!-- <similarity class="org.apache.lucene.search.DefaultSimilarity"/> --> > ></schema> > > > >-----Inline Attachment Follows----- > ><?xml version="1.0" encoding="UTF-8" ?> ><!-- >Licensed to the Apache Software Foundation (ASF) under one or more >contributor license agreements. See the NOTICE file distributed with >this work for additional information regarding copyright ownership. >The ASF licenses this file to You under the Apache License, Version 2.0 >(the "License"); you may not use this file except in compliance with >the License. You may obtain a copy of the License at > > http://www.apache.org/licenses/LICENSE-2.0 > > >Unless required by applicable law or agreed to in writing, software >distributed under the License is distributed on an "AS IS" BASIS, >WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. >See the License for the specific language governing permissions and >limitations under the License. >--> > ><config> > <!-- Set this to 'false' if you want solr to continue working after it has > encountered a severe configuration error. 
In a production > environment, > you may want solr to keep working even if one handler is mis-configured. > > You may also set this to false by setting the system property: > -Dsolr.abortOnConfigurationError=false > --> > > <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError> > > <!-- Used to specify an alternate directory to hold all index data > other than the default ./data under the Solr home. > If replication is in use, this should match the replication > configuration. --> > <!-- > <dataDir>${solr.data.dir:./solr/data}</dataDir> > --> > > > <indexDefaults> > <!-- Values here affect all index writers and act as a default unless > overridden. --> > <useCompoundFile>false</useCompoundFile> > <mergeFactor>10</mergeFactor> > <maxBufferedDocs>1000</maxBufferedDocs> > > <!-- > If both ramBufferSizeMB and maxBufferedDocs are set, then Lucene will > flush based on whichever limit is hit first. > > --> > <!-- Tell Lucene when to flush documents to disk. > Giving Lucene more memory for indexing means faster indexing at the cost > of more RAM > > If both ramBufferSizeMB and maxBufferedDocs are set, then Lucene will flush > based on whichever limit is hit first. > > --> > <ramBufferSizeMB>32</ramBufferSizeMB> > <maxMergeDocs>2147483647</maxMergeDocs> > <maxFieldLength>10000</maxFieldLength> > <writeLockTimeout>1000</writeLockTimeout> > <commitLockTimeout>10000</commitLockTimeout> > > <!-- > Expert: Turn on Lucene's auto commit capability. > This causes intermediate segment flushes to write a new lucene > index descriptor, enabling it to be opened by an external > IndexReader. > NOTE: Despite the name, this value does not have any relation to Solr's > autoCommit functionality > --> > <!--<luceneAutoCommit>false</luceneAutoCommit>--> > <!-- > Expert: > The Merge Policy in Lucene controls how merging is handled by Lucene. 
> The default in 2.3 is the LogByteSizeMergePolicy, previous > versions used LogDocMergePolicy. > > LogByteSizeMergePolicy chooses segments to merge based on their size. > The Lucene 2.2 default, LogDocMergePolicy chose when > to merge based on number of documents > > Other implementations of MergePolicy must have a no-argument constructor > --> > > <!--<mergePolicy>org.apache.lucene.index.LogByteSizeMergePolicy</mergePolicy>--> > > <!-- > Expert: > The Merge Scheduler in Lucene controls how merges are performed. The > ConcurrentMergeScheduler (Lucene 2.3 default) > can perform merges in the background using separate threads. The > SerialMergeScheduler (Lucene 2.2 default) does not. > --> > > <!--<mergeScheduler>org.apache.lucene.index.ConcurrentMergeScheduler</mergeScheduler>--> > > <!-- > This option specifies which Lucene LockFactory implementation to use. > > single = SingleInstanceLockFactory - suggested for a read-only index > or when there is no possibility of another process trying > to modify the index. > native = NativeFSLockFactory > simple = SimpleFSLockFactory > > (For backwards compatibility with Solr 1.2, 'simple' is the default > if not specified.) > --> > <lockType>single</lockType> > </indexDefaults> > > <mainIndex> > <!-- options specific to the main on-disk lucene index --> > <useCompoundFile>false</useCompoundFile> > <ramBufferSizeMB>32</ramBufferSizeMB> > <mergeFactor>10</mergeFactor> > <!-- Deprecated --> > <!--<maxBufferedDocs>1000</maxBufferedDocs>--> > <maxMergeDocs>2147483647</maxMergeDocs> > <maxFieldLength>10000</maxFieldLength> > > <!-- If true, unlock any held write or commit locks on startup. > This defeats the locking mechanism that allows multiple > processes to safely access a lucene index, and should be > used with care. 
> > This is not needed if lock type is 'none' or 'single' > --> > <unlockOnStartup>true</unlockOnStartup> > </mainIndex> > > <!-- Enables JMX if and only if an existing MBeanServer is found, use > this if you want to configure JMX through JVM parameters. Remove > this to disable exposing Solr configuration and statistics to JMX. > > If you want to connect to a particular server, specify the agentId > e.g. <jmx agentId="myAgent" /> > > If you want to start a new MBeanServer, specify the serviceUrl > e.g. <jmx > serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr" /> > > For more details see http://wiki.apache.org/solr/SolrJmx > > --> > <jmx /> > > <!-- the default high-performance update handler --> > <updateHandler class="solr.DirectUpdateHandler2"> > > <!-- A prefix of "solr." for class names is an alias that > causes solr to search appropriate packages, including > org.apache.solr.(search|update|request|core|analysis) > --> > > <!-- Perform a <commit/> automatically under certain conditions: > maxDocs - number of updates since last commit is greater than this > maxTime - oldest uncommitted update (in ms) is this long ago > <autoCommit> > <maxDocs>10000</maxDocs> > <maxTime>1000</maxTime> > </autoCommit> > --> > > <!-- The RunExecutableListener executes an external command. > exe - the name of the executable to run > dir - dir to use as the current working directory. default="." > wait - the calling thread waits until the executable returns. > default="true" > args - the arguments to pass to the program. default=nothing > env - environment variables to set. 
default=nothing > --> > <!-- A postCommit event is fired after every commit or optimize command > <listener event="postCommit" class="solr.RunExecutableListener"> > <str name="exe">solr/bin/snapshooter</str> > <str name="dir">.</str> > <bool name="wait">true</bool> > <arr name="args"> <str>arg1</str> <str>arg2</str> </arr> > <arr name="env"> <str>MYVAR=val1</str> </arr> > </listener> > --> > <!-- A postOptimize event is fired only after every optimize command, > useful > in conjunction with index distribution to only distribute optimized > indices > <listener event="postOptimize" class="solr.RunExecutableListener"> > <str name="exe">snapshooter</str> > <str name="dir">solr/bin</str> > <bool name="wait">true</bool> > </listener> > --> > > </updateHandler> > > > <query> > <!-- Maximum number of clauses in a boolean query... can affect > range or prefix queries that expand to big boolean > queries. An exception is thrown if exceeded. --> > <maxBooleanClauses>1024</maxBooleanClauses> > > > <!-- Cache used by SolrIndexSearcher for filters (DocSets), > unordered sets of *all* documents that match a query. > When a new searcher is opened, its caches may be prepopulated > or "autowarmed" using data from caches in the old searcher. > autowarmCount is the number of items to prepopulate. For LRUCache, > the autowarmed items will be the most recently accessed items. > Parameters: > class - the SolrCache implementation (currently only LRUCache) > size - the maximum number of entries in the cache > initialSize - the initial capacity (number of entries) of > the cache. (see java.util.HashMap) > autowarmCount - the number of entries to prepopulate from > an old cache. > --> > <filterCache > class="solr.LRUCache" > size="512" > initialSize="512" > autowarmCount="128"/> > > <!-- queryResultCache caches results of searches - ordered lists of > document ids (DocList) based on a query, a sort, and the range > of documents requested. 
--> > <queryResultCache > class="solr.LRUCache" > size="512" > initialSize="512" > autowarmCount="32"/> > > <!-- documentCache caches Lucene Document objects (the stored fields for > each document). > Since Lucene internal document ids are transient, this cache will not > be autowarmed. --> > <documentCache > class="solr.LRUCache" > size="512" > initialSize="512" > autowarmCount="0"/> > > <!-- If true, stored fields that are not requested will be loaded lazily. > > This can result in a significant speed improvement if the usual case is to > not load all stored fields, especially if the skipped fields are large > compressed > text fields. > --> > <enableLazyFieldLoading>true</enableLazyFieldLoading> > > <!-- Example of a generic cache. These caches may be accessed by name > through SolrIndexSearcher.getCache(),cacheLookup(), and cacheInsert(). > The purpose is to enable easy caching of user/application level data. > The regenerator argument should be specified as an implementation > of solr.search.CacheRegenerator if autowarming is desired. --> > <!-- > <cache name="myUserCache" > class="solr.LRUCache" > size="4096" > initialSize="1024" > autowarmCount="1024" > regenerator="org.mycompany.mypackage.MyRegenerator" > /> > --> > > <!-- An optimization that attempts to use a filter to satisfy a search. > If the requested sort does not include score, then the filterCache > will be checked for a filter matching the query. If found, the filter > will be used as the source of document ids, and then the sort will be > applied to that. > <useFilterForSortedQuery>true</useFilterForSortedQuery> > --> > > <!-- An optimization for use with the queryResultCache. When a search > is requested, a superset of the requested number of document ids > are collected. For example, if a search for a particular query > requests matching documents 10 through 19, and queryWindowSize is 50, > then documents 0 through 49 will be collected and cached. 
Any further > requests in that range can be satisfied via the cache. --> > <queryResultWindowSize>50</queryResultWindowSize> > > <!-- Maximum number of documents to cache for any entry in the > queryResultCache. --> > <queryResultMaxDocsCached>200</queryResultMaxDocsCached> > > <!-- This entry enables an int hash representation for filters (DocSets) > when the number of items in the set is less than maxSize. For smaller > sets, this representation is more memory efficient, more efficient to > iterate over, and faster to take intersections. --> > <HashDocSet maxSize="3000" loadFactor="0.75"/> > > <!-- a newSearcher event is fired whenever a new searcher is being prepared > and there is a current searcher handling requests (aka registered). > --> > <!-- QuerySenderListener takes an array of NamedList and executes a > local query request for each NamedList in sequence. --> > <listener event="newSearcher" class="solr.QuerySenderListener"> > <arr name="queries"> > <lst> <str name="q">solr</str> <str name="start">0</str> <str > name="rows">10</str> </lst> > <lst> <str name="q">rocks</str> <str name="start">0</str> <str > name="rows">10</str> </lst> > </arr> > </listener> > > <!-- a firstSearcher event is fired whenever a new searcher is being > prepared but there is no current registered searcher to handle > requests or to gain autowarming data from. --> > <listener event="firstSearcher" class="solr.QuerySenderListener"> > <arr name="queries"> > <lst> <str name="q">fast_warm</str> <str name="start">0</str> <str > name="rows">10</str> </lst> > </arr> > </listener> > > <!-- If a search request comes in and there is no current registered > searcher, > then immediately register the still warming searcher and use it. If > "false" then all requests will block until the first searcher is done > warming. --> > <useColdSearcher>false</useColdSearcher> > > <!-- Maximum number of searchers that may be warming in the background > concurrently. 
An error is returned if this limit is exceeded. Recommend > 1-2 for read-only slaves, higher for masters w/o cache warming. --> > <maxWarmingSearchers>4</maxWarmingSearchers> > > </query> > > <!-- > Let the dispatch filter handler /select?qt=XXX > handleSelect=true will use consistent error handling for /select and > /update > handleSelect=false will use solr1.1 style error formatting > --> > <requestDispatcher handleSelect="true" > > <!--Make sure your system has some authentication before enabling remote > streaming! --> > <requestParsers enableRemoteStreaming="false" > multipartUploadLimitInKB="2048" /> > > <!-- Set HTTP caching related parameters (for proxy caches and clients). > > To get the behaviour of Solr 1.2 (ie: no caching related headers) > use the never304="true" option and do not specify a value for > <cacheControl> > --> > <!-- <httpCaching never304="true"> --> > <httpCaching lastModifiedFrom="openTime" > etagSeed="Solr"> > <!-- lastModFrom="openTime" is the default, the Last-Modified value > (and validation against If-Modified-Since requests) will all be > relative to when the current Searcher was opened. > You can change it to lastModFrom="dirLastMod" if you want the > value to exactly correspond to when the physical index was last > modified. > > etagSeed="..." is an option you can change to force the ETag > header (and validation against If-None-Match requests) to be > different even if the index has not changed (ie: when making > significant changes to your config file) > > lastModifiedFrom and etagSeed are both ignored if you use the > never304="true" option. > --> > <!-- If you include a <cacheControl> directive, it will be used to > generate a Cache-Control header, as well as an Expires header > if the value contains "max-age=" > > By default, no Cache-Control header is generated. 
> > You can use the <cacheControl> option even if you have set > never304="true" > --> > <!-- <cacheControl>max-age=30, public</cacheControl> --> > </httpCaching> > </requestDispatcher> > > > <!-- requestHandler plugins... incoming queries will be dispatched to the > correct handler based on the path or the qt (query type) param. > Names starting with a '/' are accessed with a path equal to the > registered name. Names without a leading '/' are accessed with: > http://host/app/select?qt=name > > If no qt is defined, the requestHandler that declares default="true" > will be used. > --> > <requestHandler name="standard" class="solr.SearchHandler" default="true"> > <!-- default values for query parameters --> > <lst name="defaults"> > <str name="echoParams">explicit</str> > <!-- > <int name="rows">10</int> > <str name="fl">*</str> > <str name="version">2.1</str> > --> > </lst> > </requestHandler> > > > <!-- DisMaxRequestHandler allows easy searching across multiple fields > for simple user-entered phrases. Its implementation is now > just the standard SearchHandler with a default query type > of "dismax". 
> see http://wiki.apache.org/solr/DisMaxRequestHandler > > --> > <requestHandler name="dismax" class="solr.SearchHandler" > > <lst name="defaults"> > <str name="defType">dismax</str> > <str name="echoParams">explicit</str> > <float name="tie">0.01</float> > <str name="qf"> > text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 > </str> > <str name="pf"> > text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9 > </str> > <str name="bf"> > ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3 > </str> > <str name="fl"> > id,name,price,score > </str> > <str name="mm"> > 2<-1 5<-2 6<90% > </str> > <int name="ps">100</int> > <str name="q.alt">*:*</str> > <!-- example highlighter config, enable per-query with hl=true --> > <str name="hl.fl">text features name</str> > <!-- for this field, we want no fragmenting, just highlighting --> > <str name="f.name.hl.fragsize">0</str> > <!-- instructs Solr to return the field itself if no query terms are > found --> > <str name="f.name.hl.alternateField">name</str> > <str name="f.text.hl.fragmenter">regex</str> <!-- defined below --> > </lst> > </requestHandler> > > <!-- Note how you can register the same handler multiple times with > different names (and different init parameters) > --> > <requestHandler name="partitioned" class="solr.SearchHandler" > > <lst name="defaults"> > <str name="defType">dismax</str> > <str name="echoParams">explicit</str> > <str name="qf">text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0</str> > <str name="mm">2<-1 5<-2 6<90%</str> > <!-- This is an example of using Date Math to specify a constantly > moving date range in a config... > --> > <str name="bq">incubationdate_dt:[* TO NOW/DAY-1MONTH]^2.2</str> > </lst> > <!-- In addition to defaults, "appends" params can be specified > to identify values which should be appended to the list of > multi-val params from the query (or the existing "defaults"). 
> > In this example, the param "fq=inStock:true" will be appended to > any query time fq params the user may specify, as a mechanism for > partitioning the index, independent of any user selected filtering > that may also be desired (perhaps as a result of faceted searching). > > NOTE: there is *absolutely* nothing a client can do to prevent these > "appends" values from being used, so don't use this mechanism > unless you are sure you always want it. > --> > <lst name="appends"> > <str name="fq">inStock:true</str> > </lst> > <!-- "invariants" are a way of letting the Solr maintainer lock down > the options available to Solr clients. Any param values > specified here are used regardless of what values may be specified > in either the query, the "defaults", or the "appends" params. > > In this example, the facet.field and facet.query params are fixed, > limiting the facets clients can use. Faceting is not turned on by > default - but if the client does specify facet=true in the request, > these are the only facets they will be able to see counts for; > regardless of what other facet.field or facet.query params they > may specify. > > NOTE: there is *absolutely* nothing a client can do to prevent these > "invariants" values from being used, so don't use this mechanism > unless you are sure you always want it. 
> --> > <lst name="invariants"> > <str name="facet.field">cat</str> > <str name="facet.field">manu_exact</str> > <str name="facet.query">price:[* TO 500]</str> > <str name="facet.query">price:[500 TO *]</str> > </lst> > </requestHandler> > > > <!-- > Search components are registered to SolrCore and used by Search Handlers > > By default, the following components are available: > > <searchComponent name="query" > class="org.apache.solr.handler.component.QueryComponent" /> > <searchComponent name="facet" > class="org.apache.solr.handler.component.FacetComponent" /> > <searchComponent name="mlt" > class="org.apache.solr.handler.component.MoreLikeThisComponent" /> > <searchComponent name="highlight" > class="org.apache.solr.handler.component.HighlightComponent" /> > <searchComponent name="debug" > class="org.apache.solr.handler.component.DebugComponent" /> > > Default configuration in a requestHandler would look like: > <arr name="components"> > <str>query</str> > <str>facet</str> > <str>mlt</str> > <str>highlight</str> > <str>debug</str> > </arr> > > If you register a searchComponent to one of the standard names, that will > be used instead. > To insert handlers before or after the 'standard' components, use: > > <arr name="first-components"> > <str>myFirstComponentName</str> > </arr> > > <arr name="last-components"> > <str>myLastComponentName</str> > </arr> > --> > > <!-- The spell check component can return a list of alternative spelling > suggestions. 
--> > <searchComponent name="spellcheck" class="solr.SpellCheckComponent"> > > <str name="queryAnalyzerFieldType">textSpell</str> > > <lst name="spellchecker"> > <str name="name">default</str> > <str name="field">spell</str> > <str name="spellcheckIndexDir">./spellchecker1</str> > > </lst> > <lst name="spellchecker"> > <str name="name">jarowinkler</str> > <str name="field">spell</str> > <!-- Use a different Distance Measure --> > <str > name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str> > <str name="spellcheckIndexDir">./spellchecker2</str> > > </lst> > > <lst name="spellchecker"> > <str name="classname">solr.FileBasedSpellChecker</str> > <str name="name">file</str> > <str name="sourceLocation">spellings.txt</str> > <str name="characterEncoding">UTF-8</str> > <str name="spellcheckIndexDir">./spellcheckerFile</str> > </lst> > </searchComponent> > > <!-- a request handler utilizing the spellcheck component --> > <requestHandler name="/spellCheckCompRH" class="solr.SearchHandler"> > <lst name="defaults"> > <!-- omp = Only More Popular --> > <str name="spellcheck.onlyMorePopular">false</str> > <!-- exr = Extended Results --> > <str name="spellcheck.extendedResults">false</str> > <!-- The number of suggestions to return --> > <str name="spellcheck.count">1</str> > </lst> > <arr name="last-components"> > <str>spellcheck</str> > </arr> > </requestHandler> > > <!-- a search component that enables you to configure the top results for > a given query regardless of the normal lucene scoring.--> > <searchComponent name="elevator" class="solr.QueryElevationComponent" > > <!-- pick a fieldType to analyze queries --> > <str name="queryFieldType">string</str> > <str name="config-file">elevate.xml</str> > </searchComponent> > > <!-- a request handler utilizing the elevator component --> > <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy"> > <lst name="defaults"> > <str name="echoParams">explicit</str> > </lst> > <arr 
name="last-components"> > <str>elevator</str> > </arr> > </requestHandler> > > > <!-- Update request handler. > > Note: Since solr1.1, requestHandlers require a valid content type > header if posted in > the body. For example, curl now requires: -H 'Content-type:text/xml; > charset=utf-8' > The response format differs from solr1.1 formatting and returns a > standard error code. > > To enable solr1.1 behavior, remove the /update handler or change its > path > --> > <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" /> > > <!-- > Analysis request handler. Since Solr 1.3. Used to return how a document is > analyzed. Useful > for debugging and as a token server for other types of applications > --> > <requestHandler name="/analysis" class="solr.AnalysisRequestHandler" /> > > > <!-- CSV update handler, loaded on demand --> > <requestHandler name="/update/csv" class="solr.CSVRequestHandler" > startup="lazy" /> > > > <!-- > Admin Handlers - This will register all the standard admin RequestHandlers. 
> Adding > this single handler is equivalent to registering: > > <requestHandler name="/admin/luke" > class="org.apache.solr.handler.admin.LukeRequestHandler" /> > <requestHandler name="/admin/system" > class="org.apache.solr.handler.admin.SystemInfoHandler" /> > <requestHandler name="/admin/plugins" > class="org.apache.solr.handler.admin.PluginInfoHandler" /> > <requestHandler name="/admin/threads" > class="org.apache.solr.handler.admin.ThreadDumpHandler" /> > <requestHandler name="/admin/properties" > class="org.apache.solr.handler.admin.PropertiesRequestHandler" /> > <requestHandler name="/admin/file" > class="org.apache.solr.handler.admin.ShowFileRequestHandler" > > > If you wish to hide files under ${solr.home}/conf, explicitly register the > ShowFileRequestHandler using: > <requestHandler name="/admin/file" > class="org.apache.solr.handler.admin.ShowFileRequestHandler" > > <lst name="invariants"> > <str name="hidden">synonyms.txt</str> > <str name="hidden">anotherfile.txt</str> > </lst> > </requestHandler> > --> > <requestHandler name="/admin/" > class="org.apache.solr.handler.admin.AdminHandlers" /> > > <!-- ping/healthcheck --> > <requestHandler name="/admin/ping" class="PingRequestHandler"> > <lst name="defaults"> > <str name="qt">standard</str> > <str name="q">solrpingquery</str> > <str name="echoParams">all</str> > </lst> > </requestHandler> > > <!-- Echo the request contents back to the client --> > <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" > > <lst name="defaults"> > <str name="echoParams">explicit</str> <!-- for all params (including the > default etc) use: 'all' --> > <str name="echoHandler">true</str> > </lst> > </requestHandler> > > <highlighting> > <!-- Configure the standard fragmenter --> > <!-- This could most likely be commented out in the "default" case --> > <fragmenter name="gap" class="org.apache.solr.highlight.GapFragmenter" > default="true"> > <lst name="defaults"> > <int name="hl.fragsize">100</int> > </lst> > 
</fragmenter> > > <!-- A regular-expression-based fragmenter (f.i., for sentence extraction) > --> > <fragmenter name="regex" class="org.apache.solr.highlight.RegexFragmenter"> > <lst name="defaults"> > <!-- slightly smaller fragsizes work better because of slop --> > <int name="hl.fragsize">70</int> > <!-- allow 50% slop on fragment sizes --> > <float name="hl.regex.slop">0.5</float> > <!-- a basic sentence pattern --> > <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str> > </lst> > </fragmenter> > > <!-- Configure the standard formatter --> > <formatter name="html" class="org.apache.solr.highlight.HtmlFormatter" > default="true"> > <lst name="defaults"> > <str name="hl.simple.pre"><![CDATA[<em>]]></str> > <str name="hl.simple.post"><![CDATA[</em>]]></str> > </lst> > </formatter> > </highlighting> > > > <!-- queryResponseWriter plugins... query responses will be written using the > writer specified by the 'wt' request parameter matching the name of a > registered > writer. > The "default" writer is the default and will be used if 'wt' is not > specified > in the request. XMLResponseWriter will be used if nothing is specified > here. > The json, python, and ruby writers are also available by default. > > <queryResponseWriter name="xml" > class="org.apache.solr.request.XMLResponseWriter" default="true"/> > <queryResponseWriter name="json" > class="org.apache.solr.request.JSONResponseWriter"/> > <queryResponseWriter name="python" > class="org.apache.solr.request.PythonResponseWriter"/> > <queryResponseWriter name="ruby" > class="org.apache.solr.request.RubyResponseWriter"/> > <queryResponseWriter name="php" > class="org.apache.solr.request.PHPResponseWriter"/> > <queryResponseWriter name="phps" > class="org.apache.solr.request.PHPSerializedResponseWriter"/> > > <queryResponseWriter name="custom" class="com.example.MyResponseWriter"/> > --> > > <!-- XSLT response writer transforms the XML output by any xslt file found > in Solr's conf/xslt directory. 
Changes to xslt files are checked for > every xsltCacheLifetimeSeconds. > --> > <queryResponseWriter name="xslt" > class="org.apache.solr.request.XSLTResponseWriter"> > <int name="xsltCacheLifetimeSeconds">5</int> > </queryResponseWriter> > > > <!-- example of registering a query parser > <queryParser name="lucene" > class="org.apache.solr.search.LuceneQParserPlugin"/> > --> > > <!-- example of registering a custom function parser > <valueSourceParser name="myfunc" class="com.mycompany.MyValueSourceParser" /> > --> > > <!-- config for the admin interface --> > <admin> > <defaultQuery>solr</defaultQuery> > <gettableFiles>solrconfig.xml schema.xml admin-extra.html</gettableFiles> > <!-- pingQuery should be "URLish" ... > & separated key=val pairs ... but there shouldn't be any > URL escaping of the values --> > <pingQuery> > qt=standard&q=solrpingquery > </pingQuery> > > <!-- configure a healthcheck file for servers behind a loadbalancer > <healthcheck type="file">server-enabled</healthcheck> > --> > <healthcheck type="file">server-enabled</healthcheck> > </admin> > ></config> >
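A note on the config above, relevant to the 1.2-to-1.3 migration question in this thread: the main solrconfig.xml change to watch for is the handler consolidation. As the comments in the quoted file say, DisMaxRequestHandler is now just solr.SearchHandler with defType=dismax. A minimal sketch of the before/after (field boosts shown are illustrative, not prescriptive):

```xml
<!-- Solr 1.2 style: a dedicated dismax handler class (deprecated in 1.3) -->
<!--
<requestHandler name="dismax" class="solr.DisMaxRequestHandler">
  <lst name="defaults">
    <str name="qf">text^0.5 name^1.2</str>
  </lst>
</requestHandler>
-->

<!-- Solr 1.3 style: the same behavior via SearchHandler plus defType -->
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">text^0.5 name^1.2</str>
  </lst>
</requestHandler>
```

In practice this means a 1.2 solrconfig.xml should not be dropped into 1.3 unchanged; starting from the 1.3 example solrconfig.xml and porting over custom handler defaults (qf, pf, mm, etc.) tends to be safer than the reverse.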