I am not super familiar with what happens with DIH in the SolrCloud
and I was talking about HTTP access.log (which admittedly needs custom
configuration). However, I do see this in the logs:

2017-03-09 13:40:00.356 INFO  (qtp2080166188-41928) [c:collectionXXX
s:shard1 r:core_node1 x:collectionXXX_shard1_replica2]
o.a.s.c.S.Request [collectionXXX_shard1_replica2]  webapp=/solr
path=/dataimport
params={optimize=true&commit=true&clean=true&command=full-import}
status=0 QTime=3
....
2017-03-09 13:40:26.115 INFO  (qtp2080166188-39334) [c:collectionXXX
s:shard1 r:core_node1 x:collectionXXX_shard1_replica2]
o.a.s.u.p.LogUpdateProcessorFactory [collectionXXX_shard1_replica2]
webapp=/solr path=/update
params={update.distrib=TOLEADER&distrib.from=http://XXX.XXX.XXX.189:8080/solr/collectionXXX_shard1_replica1/&wt=javabin&version=2}{}
0 0
....
2017-03-09 13:40:26.131 INFO  (qtp2080166188-34463) [c:collectionXXX
s:shard1 r:core_node1 x:collectionXXX_shard1_replica2]
o.a.s.u.DirectUpdateHandler2 start
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

It seems like there is a DIH start at 13:40:00, but then at 13:40:26
there is an update hit to another shard which propogates to this one
and causes the commit making DIH changes at that point visible.
Interestingly, there seem to be an error in that distributed update as
I see "missing required field: adtype", which may mean the actual
incoming records from other shards did not get into the index.

So, to me, my theory stands so far and I would look for something
sending - invalid - documents to shard1_replica1 at the same time as
you are running DIH on shard1_replica2.

Regards,
  Alex.
----
http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 9 March 2017 at 10:13, AJ Lemke <aj.le...@securitylabs.com> wrote:
> Thanks for the tip Alex.
>
> I forgot to mention that we are using Solr in cloud mode.
>
> I have checked the logs and I am seeing the following lines (this is an 
> abridged snippet):
>
> 2017-03-09 13:40:00.356 INFO  (qtp2080166188-41928) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.DataImporter Loading 
> DIH Configuration: data-config.xml
> 2017-03-09 13:40:00.356 INFO  (qtp2080166188-41928) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.DataImporter Data 
> Configuration loaded successfully
> 2017-03-09 13:40:00.356 INFO  (Thread-17768) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.DataImporter Starting 
> Full Import
> 2017-03-09 13:40:00.356 INFO  (qtp2080166188-41928) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.c.S.Request 
> [collectionXXX_shard1_replica2]  webapp=/solr path=/dataimport 
> params={optimize=true&commit=true&clean=true&command=full-import} status=0 
> QTime=3
> 2017-03-09 13:40:00.450 INFO  (Thread-17768) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.s.SolrIndexSearcher 
> Opening [Searcher@7c2e9318[collectionXXX_shard1_replica2] realtime]
> 2017-03-09 13:40:00.560 INFO  (Thread-17768) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.JdbcDataSource 
> Creating a connection for entity item with URL: 
> jdbc:sqlserver://sqlX:XXX;databaseName=XXXXXX;responseBuffering=adaptive;
> 2017-03-09 13:40:00.576 INFO  (Thread-17768) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.d.JdbcDataSource Time 
> taken for getConnection(): 13
> 2017-03-09 13:40:26.115 INFO  (qtp2080166188-39334) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] 
> o.a.s.u.p.LogUpdateProcessorFactory [collectionXXX_shard1_replica2]  
> webapp=/solr path=/update 
> params={update.distrib=TOLEADER&distrib.from=http://XXX.XXX.XXX.189:8080/solr/collectionXXX_shard1_replica1/&wt=javabin&version=2}{}
>  0 0
> 2017-03-09 13:40:26.115 ERROR (qtp2080166188-39334) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.RequestHandlerBase 
> org.apache.solr.common.SolrException: [doc=11326865] missing required field: 
> adtype
> 2017-03-09 13:40:26.131 INFO  (qtp2080166188-34463) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.DirectUpdateHandler2 
> start 
> commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> 2017-03-09 13:40:26.131 INFO  (qtp2080166188-34463) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.SolrIndexWriter Calling 
> setCommitData with IW:org.apache.solr.update.SolrIndexWriter@61db76c2
> 2017-03-09 13:40:28.350 INFO  (qtp2080166188-34463) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.s.SolrIndexSearcher 
> Opening [Searcher@a4ed679[collectionXXX_shard1_replica2] main]
> 2017-03-09 13:40:28.350 INFO  
> (searcherExecutor-50-thread-1-processing-n:XXX.XXX.XXX.188:8080_solr 
> x:collectionXXX_shard1_replica2 s:shard1 c:collectionXXX r:core_node1) 
> [c:collectionXXX s:shard1 r:core_node1 x:collectionXXX_shard1_replica2] 
> o.a.s.c.QuerySenderListener QuerySenderListener sending requests to 
> Searcher@a4ed679[collectionXXX_shard1_replica2] 
> main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_oa1(6.3.0):C31847)
>  Uninverting(_oa2(6.3.0):C2520)))}
> 2017-03-09 13:40:28.350 INFO  
> (searcherExecutor-50-thread-1-processing-n:XXX.XXX.XXX.188:8080_solr 
> x:collectionXXX_shard1_replica2 s:shard1 c:collectionXXX r:core_node1) 
> [c:collectionXXX s:shard1 r:core_node1 x:collectionXXX_shard1_replica2] 
> o.a.s.c.QuerySenderListener QuerySenderListener done.
> 2017-03-09 13:40:28.350 INFO  
> (searcherExecutor-50-thread-1-processing-n:XXX.XXX.XXX.188:8080_solr 
> x:collectionXXX_shard1_replica2 s:shard1 c:collectionXXX r:core_node1) 
> [c:collectionXXX s:shard1 r:core_node1 x:collectionXXX_shard1_replica2] 
> o.a.s.c.SolrCore [collectionXXX_shard1_replica2] Registered new searcher 
> Searcher@a4ed679[collectionXXX_shard1_replica2] 
> main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_oa1(6.3.0):C31847)
>  Uninverting(_oa2(6.3.0):C2520)))}
> 2017-03-09 13:40:28.365 INFO  (qtp2080166188-34463) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.u.DirectUpdateHandler2 
> end_commit_flush
> 2017-03-09 13:40:28.444 INFO  (qtp2080166188-34463) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] 
> o.a.s.u.p.LogUpdateProcessorFactory [collectionXXX_shard1_replica2]  
> webapp=/solr path=/update 
> params={update.distrib=FROMLEADER&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://XXX.XXX.XXX.189:8080/solr/collectionXXX_shard1_replica1/&commit_end_point=true&wt=javabin&version=2&expungeDeletes=false}{commit=}
>  0 2309
> 2017-03-09 13:40:56.459 INFO  (qtp2080166188-40636) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] 
> o.a.s.u.p.LogUpdateProcessorFactory [collectionXXX_shard1_replica2]  
> webapp=/solr path=/update/json params={commit=true&wt=json}{} 0 0
> 2017-03-09 13:40:56.459 ERROR (qtp2080166188-40636) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.h.RequestHandlerBase 
> org.apache.solr.common.SolrException: [doc=553094612] missing required field: 
> adtype
> 2017-03-09 13:41:00.053 INFO  (qtp2080166188-41928) [c:collectionXXX s:shard1 
> r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.c.S.Request 
> [collectionXXX_shard1_replica2]  webapp=/solr path=/schema params={wt=json} 
> status=0 QTime=0
>
>
>
> AJ
>
> -----Original Message-----
> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
> Sent: Wednesday, March 8, 2017 9:33 AM
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: DIH Full Index Issue
>
> Are you perhaps indexing at the same time from the source other than DIH? 
> Because the commit is global and all the changes from all the sources will 
> become visible.
>
> Check the access logs perhaps to see the requests to /update handler or 
> similar.
>
> Regards,
>    Alex.
>
>
> ----
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 8 March 2017 at 09:27, AJ Lemke <aj.le...@securitylabs.com> wrote:
>> Good Morning List!
>>
>> I have an issue where my DIH full index is committed after a minute of 
>> indexing.
>> My counts will fall from around 400K to 85K until the import is finished, 
>> usually about four (4) minutes later.
>>
>> This is problematic for us as there are 315K missing items in our searches.
>>
>> Versioning Info:
>> solr-spec - 6.3.0
>> solr-impl - 6.3.0 a66a44513ee8191e25b477372094bfa846450316 - shalin -
>> 2016-11-02 19:52:42 lucene-spec - 6.3.0 lucene-impl - 6.3.0
>> a66a44513ee8191e25b477372094bfa846450316 - shalin - 2016-11-02
>> 19:47:11
>>
>> solrconfig.xml snippet
>> <!-- The default high-performance update handler --> <updateHandler
>> class="solr.DirectUpdateHandler2">
>>     <updateLog>
>>         <str name="dir">${solr.ulog.dir:}</str>
>>     </updateLog>
>>     <autoCommit>
>>         <maxTime>-1</maxTime>
>>         <openSearcher>false</openSearcher>
>>     </autoCommit>
>>     <autoSoftCommit>
>>         <maxTime>-1</maxTime>
>>     </autoSoftCommit>
>> </updateHandler>
>>
>>
>> Any insights would be greatly appreciated.
>> Let me know if more information is required.
>>
>> AJ

Reply via email to