For my use case I tried document-centric versioning, as described here:
<https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents#UpdatingPartsofDocuments-DocumentCentricVersioningConstraints>

However, it is not working in my case: documents with an older version are overwriting newer ones. I have attached my solrconfig.xml. I have also added my version field to schema.xml, as shown below:

    <field name="doc_version" type="long" indexed="true" stored="true"/>

I am updating the document with SolrJ as follows:

    solrClient.add(doc);
    solrClient.commit();

I am using Solr 5.2.1. Can someone let me know what I am doing wrong?

On Tue, Dec 22, 2015 at 9:29 PM, Debraj Manna <subharaj.ma...@gmail.com> wrote:

> Hi Alex,
>
> Can you let us know what you mean by
>
> *"timestamps" are truly atomic and not local clock-based."?*
>
> Thanks,
>
> On Mon, Dec 14, 2015 at 10:53 PM, Alexandre Rafalovitch <arafa...@gmail.com> wrote:
>
>> At first glance, this sounds like a perfect match for
>>
>> https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents#UpdatingPartsofDocuments-DocumentCentricVersioningConstraints
>>
>> Just make sure your "timestamps" are truly atomic and not local
>> clock-based. The drift could cause interesting problems.
>>
>> Regards,
>> Alex.
>> ----
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>> On 14 December 2015 at 12:17, Debraj Manna <subharaj.ma...@gmail.com> wrote:
>> > We have a use case in which multiple clients write concurrently to
>> > Solr. Each doc has a 'timestamp' field which indicates when the doc
>> > was generated.
>> >
>> > We also have to ensure that an old doc never overwrites a newer doc
>> > in Solr. To achieve this, we were wondering whether we can make use of
>> > the _version_ field in the Solr doc and set it equal to the
>> > 'timestamp' field that is present in each doc.
>> >
>> > Can someone let me know whether this approach is feasible? If not,
>> > can someone suggest another way of achieving the same with a minimum
>> > number of calls to Solr?

<?xml version="1.0" encoding="UTF-8" ?> <config> <luceneMatchVersion>5.0.0</luceneMatchVersion> <lib dir="./lib" regex=".*\.jar"/> <lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" regex=".*\.jar" /> <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-cell-\d.*\.jar" /> <lib dir="${solr.install.dir:../../../..}/contrib/clustering/lib/" regex=".*\.jar" /> <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-clustering-\d.*\.jar" /> <lib dir="${solr.install.dir:../../../..}/contrib/langid/lib/" regex=".*\.jar" /> <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-langid-\d.*\.jar" /> <lib dir="${solr.install.dir:../../../..}/contrib/velocity/lib" regex=".*\.jar" /> <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-velocity-\d.*\.jar" /> <dataDir>${solr.data.dir:}</dataDir> <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/> <codecFactory class="solr.SchemaCodecFactory"/> <schemaFactory class="ClassicIndexSchemaFactory"/> <indexConfig> <lockType>${solr.lock.type:native}</lockType> <writeLockTimeout>30000</writeLockTimeout> <commitLockTimeout>10000</commitLockTimeout> <infoStream>true</infoStream> </indexConfig> <jmx /> <updateHandler class="solr.DirectUpdateHandler2"> <updateLog> <str name="dir">${solr.ulog.dir:}</str> </updateLog> <autoCommit> <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> <openSearcher>false</openSearcher> </autoCommit> </updateHandler> <query> <maxBooleanClauses>1024</maxBooleanClauses> <slowQueryThresholdMillis>-1</slowQueryThresholdMillis> <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/> <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> <cache name="perSegFilter" class="solr.search.LRUCache" size="10" initialSize="0" autowarmCount="10" regenerator="solr.NoOpRegenerator" 
/> <enableLazyFieldLoading>true</enableLazyFieldLoading> <queryResultWindowSize>20</queryResultWindowSize> <queryResultMaxDocsCached>200</queryResultMaxDocsCached> <listener event="newSearcher" class="solr.QuerySenderListener"> <arr name="queries"> </arr> </listener> <listener event="firstSearcher" class="solr.QuerySenderListener"> <arr name="queries"> </arr> </listener> <useColdSearcher>false</useColdSearcher> <maxWarmingSearchers>2</maxWarmingSearchers> </query> <requestDispatcher handleSelect="true" > <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000" formdataUploadLimitInKB="2048" addHttpRequestToContext="false"/> <httpCaching never304="true" /> </requestDispatcher> <requestHandler name="/select" class="solr.SearchHandler"> <lst name="defaults"> <str name="echoParams">explicit</str> <int name="rows">10</int> <bool name="preferLocalShards">false</bool> </lst> </requestHandler> <requestHandler name="/standard" class="solr.SearchHandler" default="true"> <lst name="defaults"> <str name="fl">sku</str> <str name="echoParams">none</str> <str name="facet">true</str> <str name="facet.limit">400</str> <str name="facet.mincount">1</str> <str name="wt">json</str> <str name="json.nl">map</str> <float name="tie">0.01</float> </lst> </requestHandler> <!-- fulltext search --> <requestHandler name="/edismax" class="solr.SearchHandler" > <lst name="defaults"> <str name="fl">sku</str> <!--<str name="defType">edismax</str>--> <str name="defType">autoPhrasingParser</str> <str name="echoParams">none</str> <str name="facet">true</str> <str name="facet.mincount">1</str> <str name="wt">json</str> <str name="json.nl">map</str> <float name="tie">0.01</float> <str name="qf">name^2.0 brand^2.0 category^2.0 fulltext^1.0</str> <str name="pf">name^2.5 fulltext^1.5</str> <str name="mm">100%</str> <int name="ps">0</int> </lst> <arr name="last-components"> <str>elevator</str> </arr> </requestHandler> <queryParser name="autoPhrasingParser" 
class="com.jabong.plugin.JBPhraseQParserPlugin" > <str name="phrases">autophrases.txt</str> <str name="replaceWhitespaceWith">_</str> <str name="defType">edismax</str> </queryParser> <searchComponent name="suggester" class="solr.SpellCheckComponent"> <lst name="spellchecker"> <str name="name">suggester</str> <str name="classname">org.apache.solr.spelling.suggest.Suggester</str> <str name="lookupImpl">com.jabong.plugin.JBAnalyzingInfixLookupFactory</str> <str name="suggestAnalyzerFieldType">suggestion_terms</str> <str name="sourceLocation">suggestions.txt</str> <str name="indexPath">${solr.suggestions.dir:solr/discovery/suggestions}</str> <str name="fieldType">string</str> <str name="highlight">false</str> </lst> <lst name="spellchecker"> <str name="name">suggesterB</str> <str name="classname">org.apache.solr.spelling.suggest.Suggester</str> <str name="lookupImpl">com.jabong.plugin.JBAnalyzingInfixLookupFactory</str> <str name="suggestAnalyzerFieldType">suggestion_terms</str> <str name="sourceLocation">suggestions_b.txt</str> <str name="indexPath">${solr.suggestions.dir:solr/discovery/suggestionsB}</str> <str name="fieldType">string</str> <str name="highlight">false</str> </lst> </searchComponent> <requestHandler name="/suggester" class="solr.SearchHandler"> <lst name="defaults"> <str name="spellcheck">true</str> <str name="spellcheck.dictionary">suggester</str> <str name="spellcheck.count">50</str> <str name="spellcheck.onlyMorePopular">true</str> </lst> <arr name="components"> <str>suggester</str> </arr> </requestHandler> <requestHandler name="/suggesterB" class="solr.SearchHandler"> <lst name="defaults"> <str name="spellcheck">true</str> <str name="spellcheck.dictionary">suggesterB</str> <str name="spellcheck.count">50</str> <str name="spellcheck.onlyMorePopular">true</str> </lst> <arr name="components"> <str>suggester</str> </arr> </requestHandler> <!-- A request handler that returns indented JSON by default --> <requestHandler name="/query" 
class="solr.SearchHandler"> <lst name="defaults"> <str name="echoParams">explicit</str> <str name="wt">json</str> <str name="indent">true</str> </lst> </requestHandler> <requestHandler name="/browse" class="solr.SearchHandler" useParams="query,facets,velocity,browse"> <lst name="defaults"> <str name="echoParams">explicit</str> </lst> </requestHandler> <initParams path="/update/**,/query,/tvrh,/elevate,/browse,/edismax,/standard"> <lst name="defaults"> <str name="df">_text_</str> </lst> </initParams> <initParams path="/update/json/docs"> <lst name="defaults"> <!--this ensures that the entire json doc will be stored verbatim into one field--> <str name="srcField">_src_</str> <!--This means a the uniqueKeyField will be extracted from the fields and all fields go into the 'df' field. In this config df is already configured to be 'text' --> <str name="mapUniqueKeyOnly">true</str> </lst> </initParams> <requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler" > <lst name="defaults"> <str name="lowernames">true</str> <str name="fmap.meta">ignored_</str> <str name="fmap.content">_text_</str> </lst> </requestHandler> <requestHandler name="/export" class="solr.SearchHandler"> <lst name="invariants"> <str name="rq">{!xport}</str> <str name="wt">xsort</str> <str name="distrib">false</str> </lst> <arr name="components"> <str>query</str> </arr> </requestHandler> <requestHandler name="/stream" class="solr.StreamHandler"> <lst name="invariants"> <str name="wt">json</str> <str name="distrib">false</str> </lst> </requestHandler> <requestHandler name="/analysis/field" startup="lazy" class="solr.FieldAnalysisRequestHandler" /> <requestHandler name="/analysis/document" class="solr.DocumentAnalysisRequestHandler" startup="lazy" /> <requestHandler name="/debug/dump" class="solr.DumpRequestHandler" > <lst name="defaults"> <str name="echoParams">explicit</str> <str name="echoHandler">true</str> </lst> </requestHandler> <requestHandler 
name="/suggest" class="solr.SearchHandler"> <lst name="defaults"> <str name="echoParams">none</str> <str name="facet">true</str> <str name="facet.mincount">1</str> <str name="wt">json</str> <str name="json.nl">map</str> <str name="q">*:*</str> <str name="rows">0</str> <str name="facet.sort">count</str> <str name="facet.limit">-1</str> <str name="facet.field">suggestions</str> </lst> </requestHandler> <!-- Suggestions --> <requestHandler name="/suggestion" class="solr.SearchHandler"> <lst name="defaults"> <str name="echoParams">none</str> <str name="facet">true</str> <str name="facet.mincount">1</str> <str name="wt">json</str> <str name="json.nl">map</str> <str name="q">*:*</str> <str name="rows">0</str> <str name="facet.sort">count</str> <str name="facet.limit">-1</str> <str name="facet.field">suggestion_segment</str> <str name="facet.field">suggestion_category</str> <str name="facet.field">suggestion_brand</str> <str name="facet.field">suggestion_ty</str> <str name="facet.field">suggestion_catsegment</str> <str name="facet.field">suggestion_brandsegment</str> <str name="facet.field">suggestion_tysegment</str> <str name="facet.field">suggestion_catbrand</str> <str name="facet.field">suggestion_brandty</str> <str name="facet.field">suggestion_seccatbrand</str> <str name="facet.field">suggestion_seccatsegment</str> <str name="facet.field">suggestion_seccategory</str> <str name="facet.field">suggestion_additionalcat</str> <str name="facet.field">suggestions</str> </lst> </requestHandler> <!-- spellchecker --> <requestHandler name="/spell" class="solr.SearchHandler"> <lst name="defaults"> <str name="defType">edismax</str> <str name="wt">phps</str> <float name="tie">0.01</float> <str name="qf">name^2.0 brand^2.0 category^2.0</str> <str name="group">false</str> <str name="spellcheck">true</str> <str name="spellcheck.dictionary">default</str> <str name="spellcheck.count">40</str> <str name="spellcheck.collate">true</str> <str 
name="spellcheck.collateExtendedResults">true</str> <str name="spellcheck.maxCollations">35</str> <str name="spellcheck.maxCollationTries">30</str> </lst> <arr name="last-components"> <str>spellcheck</str> </arr> </requestHandler> <searchComponent name="spellcheck" class="solr.SpellCheckComponent"> <lst name="spellchecker"> <str name="name">default</str> <str name="field">textSpell</str> <str name="classname">solr.DirectSolrSpellChecker</str> <!-- the spellcheck distance measure used, the default is the internal levenshtein --> <str name="distanceMeasure">internal</str> <!-- minimum accuracy needed to be considered a valid spellcheck suggestion --> <float name="accuracy">0.65</float> <!-- the maximum #edits we consider when enumerating terms: can be 1 or 2 --> <int name="maxEdits">2</int> <!-- the minimum shared prefix when enumerating terms --> <int name="minPrefix">1</int> <!-- maximum number of inspections per result. --> <int name="maxInspections">5</int> <!-- minimum length of a query term to be considered for correction --> <int name="minQueryLength">3</int> <!-- maximum threshold of documents a query term can appear to be considered for correction --> <float name="maxQueryFrequency">0.01</float> <!-- uncomment this to require suggestions to occur in 1% of the documents <float name="thresholdTokenFrequency">.01</float> --> </lst> <str name="queryAnalyzerFieldType">textSpell</str> <lst name="spellchecker"> <str name="name">wordbreak</str> <str name="classname">solr.WordBreakSolrSpellChecker</str> <str name="field">textSpell</str> <str name="combineWords">true</str> <str name="breakWords">true</str> <int name="maxChanges">10</int> </lst> </searchComponent> <searchComponent name="tvComponent" class="solr.TermVectorComponent"/> <requestHandler name="/tvrh" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <bool name="tv">true</bool> </lst> <arr name="last-components"> <str>tvComponent</str> </arr> </requestHandler> <searchComponent name="terms" 
class="solr.TermsComponent"/> <!-- A request handler for demonstrating the terms component --> <requestHandler name="/terms" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <bool name="terms">true</bool> <bool name="distrib">false</bool> </lst> <arr name="components"> <str>terms</str> </arr> </requestHandler> <searchComponent name="elevator" class="solr.QueryElevationComponent" > <!-- pick a fieldType to analyze queries --> <str name="queryFieldType">string</str> <str name="config-file">elevate.xml</str> </searchComponent> <!-- A request handler for demonstrating the elevator component --> <requestHandler name="/elevate" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <str name="echoParams">explicit</str> </lst> <arr name="last-components"> <str>elevator</str> </arr> </requestHandler> <!-- Highlighting Component http://wiki.apache.org/solr/HighlightingParameters --> <searchComponent class="solr.HighlightComponent" name="highlight"> <highlighting> <!-- Configure the standard fragmenter --> <!-- This could most likely be commented out in the "default" case --> <fragmenter name="gap" default="true" class="solr.highlight.GapFragmenter"> <lst name="defaults"> <int name="hl.fragsize">100</int> </lst> </fragmenter> <!-- A regular-expression-based fragmenter (for sentence extraction) --> <fragmenter name="regex" class="solr.highlight.RegexFragmenter"> <lst name="defaults"> <!-- slightly smaller fragsizes work better because of slop --> <int name="hl.fragsize">70</int> <!-- allow 50% slop on fragment sizes --> <float name="hl.regex.slop">0.5</float> <!-- a basic sentence pattern --> <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str> </lst> </fragmenter> <!-- Configure the standard formatter --> <formatter name="html" default="true" class="solr.highlight.HtmlFormatter"> <lst name="defaults"> <str name="hl.simple.pre"><![CDATA[<em>]]></str> <str name="hl.simple.post"><![CDATA[</em>]]></str> </lst> </formatter> <!-- Configure the 
standard encoder --> <encoder name="html" class="solr.highlight.HtmlEncoder" /> <!-- Configure the standard fragListBuilder --> <fragListBuilder name="simple" class="solr.highlight.SimpleFragListBuilder"/> <!-- Configure the single fragListBuilder --> <fragListBuilder name="single" class="solr.highlight.SingleFragListBuilder"/> <!-- Configure the weighted fragListBuilder --> <fragListBuilder name="weighted" default="true" class="solr.highlight.WeightedFragListBuilder"/> <!-- default tag FragmentsBuilder --> <fragmentsBuilder name="default" default="true" class="solr.highlight.ScoreOrderFragmentsBuilder"> <!-- <lst name="defaults"> <str name="hl.multiValuedSeparatorChar">/</str> </lst> --> </fragmentsBuilder> <!-- multi-colored tag FragmentsBuilder --> <fragmentsBuilder name="colored" class="solr.highlight.ScoreOrderFragmentsBuilder"> <lst name="defaults"> <str name="hl.tag.pre"><![CDATA[ <b style="background:yellow">,<b style="background:lawgreen">, <b style="background:aquamarine">,<b style="background:magenta">, <b style="background:palegreen">,<b style="background:coral">, <b style="background:wheat">,<b style="background:khaki">, <b style="background:lime">,<b style="background:deepskyblue">]]></str> <str name="hl.tag.post"><![CDATA[</b>]]></str> </lst> </fragmentsBuilder> <boundaryScanner name="default" default="true" class="solr.highlight.SimpleBoundaryScanner"> <lst name="defaults"> <str name="hl.bs.maxScan">10</str> <str name="hl.bs.chars">.,!? 	 </str> </lst> </boundaryScanner> <boundaryScanner name="breakIterator" class="solr.highlight.BreakIteratorBoundaryScanner"> <lst name="defaults"> <!-- type should be one of CHARACTER, WORD(default), LINE and SENTENCE --> <str name="hl.bs.type">WORD</str> <!-- language and country are used when constructing Locale object. 
--> <!-- And the Locale object will be used when getting instance of BreakIterator --> <str name="hl.bs.language">en</str> <str name="hl.bs.country">US</str> </lst> </boundaryScanner> </highlighting> </searchComponent> <!-- Update Processors Chains of Update Processor Factories for dealing with Update Requests can be declared, and then used by name in Update Request Processors http://wiki.apache.org/solr/UpdateRequestProcessor --> <!-- Add unknown fields to the schema An example field type guessing update processor that will attempt to parse string-typed field values as Booleans, Longs, Doubles, or Dates, and then add schema fields with the guessed field types. This requires that the schema is both managed and mutable, by declaring schemaFactory as ManagedIndexSchemaFactory, with mutable specified as true. See http://wiki.apache.org/solr/GuessingFieldTypes --> <updateRequestProcessorChain name="add-unknown-fields-to-the-schema"> <processor class="solr.DocBasedVersionConstraintsProcessorFactory"> <str name="versionField">doc_version</str> <bool name="ignoreOldUpdates">false</bool> </processor> <!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document --> <processor class="solr.UUIDUpdateProcessorFactory" /> <processor class="solr.LogUpdateProcessorFactory"/> <processor class="solr.DistributedUpdateProcessorFactory"/> <processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/> <processor class="solr.FieldNameMutatingUpdateProcessorFactory"> <str name="pattern">[^\w-\.]</str> <str name="replacement">_</str> </processor> <processor class="solr.ParseBooleanFieldUpdateProcessorFactory"/> <processor class="solr.ParseLongFieldUpdateProcessorFactory"/> <processor class="solr.ParseDoubleFieldUpdateProcessorFactory"/> <processor class="solr.ParseDateFieldUpdateProcessorFactory"> <arr name="format"> <str>yyyy-MM-dd'T'HH:mm:ss.SSSZ</str> <str>yyyy-MM-dd'T'HH:mm:ss,SSSZ</str> <str>yyyy-MM-dd'T'HH:mm:ss.SSS</str> 
<str>yyyy-MM-dd'T'HH:mm:ss,SSS</str> <str>yyyy-MM-dd'T'HH:mm:ssZ</str> <str>yyyy-MM-dd'T'HH:mm:ss</str> <str>yyyy-MM-dd'T'HH:mmZ</str> <str>yyyy-MM-dd'T'HH:mm</str> <str>yyyy-MM-dd HH:mm:ss.SSSZ</str> <str>yyyy-MM-dd HH:mm:ss,SSSZ</str> <str>yyyy-MM-dd HH:mm:ss.SSS</str> <str>yyyy-MM-dd HH:mm:ss,SSS</str> <str>yyyy-MM-dd HH:mm:ssZ</str> <str>yyyy-MM-dd HH:mm:ss</str> <str>yyyy-MM-dd HH:mmZ</str> <str>yyyy-MM-dd HH:mm</str> <str>yyyy-MM-dd</str> </arr> </processor> <processor class="solr.AddSchemaFieldsUpdateProcessorFactory"> <str name="defaultFieldType">strings</str> <lst name="typeMapping"> <str name="valueClass">java.lang.Boolean</str> <str name="fieldType">booleans</str> </lst> <lst name="typeMapping"> <str name="valueClass">java.util.Date</str> <str name="fieldType">tdates</str> </lst> <lst name="typeMapping"> <str name="valueClass">java.lang.Long</str> <str name="valueClass">java.lang.Integer</str> <str name="fieldType">tlongs</str> </lst> <lst name="typeMapping"> <str name="valueClass">java.lang.Number</str> <str name="fieldType">tdoubles</str> </lst> </processor> <processor class="solr.RunUpdateProcessorFactory"/> </updateRequestProcessorChain> <queryResponseWriter name="json" class="solr.JSONResponseWriter"> <!-- For the purposes of the tutorial, JSON responses are written as plain text so that they are easy to read in *any* browser. If you expect a MIME type of "application/json" just remove this override. --> <str name="content-type">text/plain; charset=UTF-8</str> </queryResponseWriter> <!-- Custom response writers can be declared as needed... --> <queryResponseWriter name="velocity" class="solr.VelocityResponseWriter" startup="lazy"> <str name="template.base.dir">${velocity.template.base.dir:}</str> </queryResponseWriter> <!-- XSLT response writer transforms the XML output by any xslt file found in Solr's conf/xslt directory. Changes to xslt files are checked for every xsltCacheLifetimeSeconds. 
--> <queryResponseWriter name="xslt" class="solr.XSLTResponseWriter"> <int name="xsltCacheLifetimeSeconds">5</int> </queryResponseWriter> <admin> <defaultQuery>*:*</defaultQuery> </admin> <!-- Solr Master/Slave Server configuration --> <requestHandler name="/replication" class="solr.ReplicationHandler" enable="${enable.replication:false}"> <lst name="master"> <str name="enable">${enable.master:false}</str> <str name="replicateAfter">startup</str> <str name="replicateAfter">commit</str> <str name="replicateAfter">optimize</str> <str name="confFiles">schema.xml,synonyms.txt,stopwords.txt,autophrases.txt,brandstopwords.txt,elevate.xml,suggestions.txt</str> </lst> <lst name="slave"> <str name="enable">${enable.slave:false}</str> <str name="masterUrl">${rep.master.url:false}</str> <str name="pollInterval">00:00:60</str> <str name="compression">internal</str> <str name="httpConnTimeout">5000</str> <str name="httpReadTimeout">10000</str> </lst> </requestHandler> </config>
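
For comparison, below is a minimal sketch of a dedicated update chain for the version check. It is an illustration under stated assumptions, not a confirmed fix: the chain name `doc-version-constraints` is made up, and `default="true"` is assumed to be acceptable so that plain `solrClient.add(doc)` calls pass through the chain without an explicit `update.chain` request parameter. In the attached config, the chain that holds the version processor (`add-unknown-fields-to-the-schema`) does not appear to be marked as default or referenced through `update.chain`, which may be why the constraint is not being applied.

```xml
<!-- Hypothetical chain name; default="true" makes it apply to updates
     that do not name a chain explicitly (an assumption about the setup). -->
<updateRequestProcessorChain name="doc-version-constraints" default="true">
  <processor class="solr.DocBasedVersionConstraintsProcessorFactory">
    <!-- Same field as declared in schema.xml -->
    <str name="versionField">doc_version</str>
    <!-- false: an update carrying an older doc_version is rejected with an error;
         true: such an update is silently dropped instead -->
    <bool name="ignoreOldUpdates">false</bool>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

If a default chain is not wanted, the same chain could instead be selected per request by passing `update.chain=doc-version-constraints` with the update.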