Hello,
Apologies in advance for the lengthy email; I want to provide enough detail 
to be useful.
We currently have 2 SolrCloud deployments that we are hoping to upgrade to 
Solr 8.x, but we are running into severe performance problems with Solr 
8.1.1.  I am hoping for some guidance in troubleshooting and overcoming this 
problem.

Current setup

Backend email processing.
Used for predefined queries that produce email results for our clients.  
Approximately 35000 emails distributed over different times of the day for our 
clients based on their preferences.
solr-spec 4.10.4
lucene-spec 4.10.4
Runtime Oracle Corporation OpenJDK 64-Bit Server VM (1.8.0_222 25.222-b10)
1 collection, 6 shards, 5 replicas per shard, 17,919,889 current documents (35 
days' worth of documents) - indexing new documents regularly throughout the 
day, deleting aged-out documents nightly.

Frontend for website.
Used for customer searches, sometimes runs same query as is defined for email 
processing.
solr-spec 6.5.1
lucene-spec 6.5.1
Runtime Oracle Corporation OpenJDK 64-Bit Server VM 1.8.0_222 25.222-b10
1 collection, 6 shards, 3 replicas per shard, 50,821,086 current documents (213 
days (7 months) worth of documents) - indexing new documents regularly 
throughout the day, deleting aged-out documents nightly.

Planned replacement for the Solr 4 backend, and hopefully the frontend as well.
solr-spec 8.1.1
lucene-spec 8.1.1
Runtime Oracle Corporation OpenJDK 64-Bit Server VM 12 12+33
1 collection, 6 shards, 5 replicas per shard, 17,919,889 current documents (35 
days' worth of documents) - indexing new documents regularly throughout the 
day, deleting aged-out documents nightly.

We are trying to solve several issues with this Solr upgrade.

1. Using 2 different Solr clouds with different versions causes different 
results to come back for our clients in their emails and when they search on 
the front end.
2. When a previous person attempted to build out Solr 6.5.1 for the backend, it 
would crash in the middle of running through the searches that create the 
content for our client emails.
3. We want to bring both backend and frontend up to a current Solr version and, 
if possible, run both off of a single Solr cloud instead of 2 with the same 
content indexed to them.

Problem 1

When I run the backend process in a test with all 35000 email queries dumped 
into a queue on the current Solr 4 cloud deployment, it takes approximately 7-8 
hours to complete. (This is the minimum performance target for the new Solr 
cloud deployment.)
When I run the same backend process in a test with all 35000 email queries 
dumped into a queue on the new Solr 8 cloud deployment, it takes more than 24 
hours to complete. (It must take less than 8 hours for email deliveries to be 
timely.)
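
As a rough sanity check on the figures above, the run times can be turned into 
an average wall-clock budget per query. This is only a sketch: it divides the 
total run time by the query count and assumes nothing about how many queries 
the queue runs concurrently. The function name is made up for illustration.

```python
def seconds_per_query(total_queries: int, hours: float) -> float:
    """Average wall-clock seconds available per query over a whole run."""
    return hours * 3600 / total_queries


# 35000 email queries, taken from the test runs described above.
TOTAL_QUERIES = 35000

# Target: the whole queue must finish in under 8 hours.
budget_s = seconds_per_query(TOTAL_QUERIES, 8)

# Observed on Solr 8: more than 24 hours, so this is a lower bound.
observed_s = seconds_per_query(TOTAL_QUERIES, 24)

print(f"budget per query:   {budget_s:.2f} s")
print(f"observed per query: {observed_s:.2f} s or more")
print(f"slowdown factor:    at least {observed_s / budget_s:.1f}x")
```

In other words, the Solr 8 run needs to get at least 3 times faster overall to 
meet the 8 hour target.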

Problem 2 (likely same core issue as Problem 1, but much easier to work with)

When I run one of our normal queries against the Solr 6 cloud deployment, the 
results return in less than 1/2 second.
When I run the same query against the Solr 8 cloud deployment, the results 
return in more than 16 seconds.
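
For anyone who wants to reproduce the comparison, here is a minimal sketch of 
how a single query could be timed against each deployment. It uses the 
standard /select handler with debug=timing so the response includes 
per-component timings alongside QTime. The helper names are made up, and any 
host, port, and collection name passed in are placeholders, not our real ones.

```python
import json
import time
import urllib.parse
import urllib.request


def select_url(base_url, collection, query, extra=None):
    """Build a /select URL; debug=timing asks Solr for per-component timings."""
    params = {"q": query, "wt": "json", "debug": "timing"}
    params.update(extra or {})
    query_string = urllib.parse.urlencode(params)
    return f"{base_url.rstrip('/')}/{collection}/select?{query_string}"


def summarize(body, wall_seconds):
    """Extract server-side QTime (ms), wall-clock time (ms), and debug timings."""
    data = json.loads(body)
    return {
        "qtime_ms": data["responseHeader"]["QTime"],
        "wall_ms": round(wall_seconds * 1000),
        "timing": data.get("debug", {}).get("timing", {}),
    }


def timed_query(url, timeout=120):
    """Run one query and report both server-side and wall-clock time."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    return summarize(body, time.perf_counter() - start)
```

Calling timed_query(select_url("http://solr8-host:8983/solr", "mycollection", 
"*:*")) against each cloud (with the placeholder host and collection swapped 
for real ones) gives numbers that can be compared side by side. A large gap 
between qtime_ms and wall_ms would point at transfer or client overhead rather 
than query execution.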

Link to Dropbox folder containing the files listed below:
https://www.dropbox.com/sh/2x2k5c9db7d4pt9/AADnHwuJc7a9Fh4KmUD15rS0a?dl=0

"one of our normal queries"

Solr 6 query results
Solr 8 query results

Solr 4 solrconfig.xml
Solr 4 schema.xml
Solr 4 solr.in.sh

Solr 6 solrconfig.xml
Solr 6 schema.xml
Solr 6 solr.in.sh

Solr 8 solrconfig.xml
Solr 8 schema.xml
Solr 8 solr.in.sh

Thank you in advance for any guidance and advice that you can give me,

Russell Bahr
Lead Infrastructure Engineer

Manzama
a MODERN GOVERNANCE company
