Hi Team,

Thank you for your contribution.

I recently joined a project that uses Solr for searching. It's a webapp in which 
reindexing happens once a day: all necessary data from a database is saved to 
the index in Solr 9.6.0 with the help of SolrTemplate.saveBeans() from the Solr 
client for Java. Before that, all previous data is deleted from the index.

But we are facing an issue - not all data can be found after reindexing.

Let's say we have 14 relevant entities (or rows in the database), but each time 
after reindexing the number of those entities in Solr can be different. It can 
be 12, then after another reindexing it can be 8, then after another reindexing 
the number can change again. Sometimes it returns all 14 entities after a new 
reindexing. So, it is not consistent. According to the logs, all is well - no 
exceptions or errors are thrown from Solr to the webapp during reindexing.
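
To pin down which entities go missing in a given run (and whether it is always 
the same ones), the IDs from the database could be compared with the IDs 
returned by a match-all query after reindexing. Below is a minimal sketch of 
that comparison only. The class name, method name, and sample IDs are 
hypothetical and not taken from our code; fetching the two ID lists is left out.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ReindexCheck {

    // Returns the IDs that exist in the database but were not found in the index.
    // A TreeSet keeps the result sorted so runs are easy to compare by eye.
    public static Set<String> missingIds(List<String> databaseIds, List<String> indexedIds) {
        Set<String> missing = new TreeSet<>(databaseIds);
        missing.removeAll(indexedIds);
        return missing;
    }

    public static void main(String[] args) {
        // Made-up IDs for illustration only.
        List<String> databaseIds = List.of("e1", "e2", "e3", "e4");
        List<String> indexedIds = List.of("e1", "e3");
        System.out.println("Missing after reindexing: " + missingIds(databaseIds, indexedIds));
    }
}
```

Logging this set after every run would show whether the missing entities are 
random or follow a pattern.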

The strangest thing is that the Solr log level was initially INFO, but when 
we switched it to DEBUG (to track down why not all entities are searchable), 
it turned out that after reindexing with the DEBUG log level all 14 entities 
were searchable, even after a number of reindexing runs (with the DEBUG log 
level). When we switched it back to INFO, searches after reindexing started 
giving different results again...

We tried to change some parameters in configuration to resolve the issue, for 
example:

  *   softCommit true/false
  *   various values for UpdateHandler maxTime for both autoCommit and 
autoSoftCommit
  *   query.enableLazyFieldLoading true/false

We checked updateHandler.docsPending after reindexing and it is 0, so we 
assume that all data is committed.

On the webapp side we also tried different things, such as:

  *   explicitly adding SolrTemplate.commit() for the collection
  *   using different timeouts for the commitWithin parameter in 
SolrTemplate.saveBeans()

But nothing affected the inconsistent search results except switching the 
Solr log level to DEBUG...

Has anyone faced something similar, or does anyone have an idea why this is 
happening?
And how can we resolve this issue so that all entities are searchable after 
each reindexing, without switching to DEBUG?
(With DEBUG it takes more time to reindex all data.)

Thank you in advance!
Olga
