Help with complex boolean search queries
Hi,

I am new to the Solr community and have a weird problem with my search results. Here is what's going on.

I have a log file that is indexed into Solr with the following config:

    <tokenizer class="solr.StandardTokenizerFactory"/>

Here is a sample for demonstration purposes. Assume the following log file (text) is indexed into Solr in the field "log":

    AppleCare+ extends the basic warranty that covers non-accidental iPhone mishaps -- such as battery issues or a faulty headphone jack -- from one year to two.
    The iPhone X was unveiled to much fanfare last month. It boasts a radical update to the iPhone models of years past, with an all-glass display and an option to unlock with facial recognition.
    It also has an all-glass back, so owners run the risk of cracking either side of the phone.
    However, Apple has claimed the glass on the iPhone 8 and iPhone X is much stronger than earlier models, so it could be harder to break.
    Pre-orders for the phone began online Friday, and units were selling out quickly. The U.S. Apple Store site said it would take

The query that I run is as follows:

    q=("warranty that covers non-accidental") OR ("risking it all" AND "harder to break")
    hl.q=("warranty that covers non-accidental") OR ("risking it all" AND "harder to break")
    hl=true
    hl.fl=log
    hl.usePhraseHighlighter=true
    hl.fragsize=2000
    hl.maxAnalyzedChars=2097152
    indent=on

or, as a URL:

    http://localhost:8983/solr/mycore/select?hl.usePhraseHighlighter=true&hl.fl=log&hl=true&hl.fragsize=2000&indent=on&wt=json&hl.q=(%22warranty%20that%20covers%20non-accidental%22)%20OR%20(%22risking%20it%20all%22%20AND%20%22harder%20to%20break%22)&q=(%22warranty%20that%20covers%20non-accidental%22)%20OR%20(%22risking%20it%20all%22%20AND%20%22harder%20to%20break%22)

The response is as follows:

    {
      "responseHeader":{
        "status":0,
        "QTime":24,
        "params":{
          "q":"(\"warranty that covers non-accidental\") OR (\"risking it all\" AND \"harder to break\")",
          "hl":"true",
          "indent":"on",
          "hl.q":"(\"warranty that covers non-accidental\") OR (\"risking it all\" AND \"harder to break\")",
          "hl.usePhraseHighlighter":"true",
          "hl.fragsize":"2000",
          "hl.fl":"log",
          "wt":"json"}},
      "response":{"numFound":1,"start":0,"docs":[
          {
            "logid":5487941,
            "log":"AppleCare+ extends the basic warranty that covers non-accidental iPhone mishaps -- such as battery issues or a faulty headphone jack -- from one year to two.\nThe iPhone X was unveiled to much fanfare last month. It boasts a radical update to the iPhone models of years past, with an all-glass display and an option to unlock with facial recognition.\nIt also has an all-glass back, so owners run the risk of cracking either side of the phone.\nHowever, Apple has claimed the glass on the iPhone 8 and iPhone X is much stronger than earlier models, so it could be harder to break.\nPre-orders for the phone began online Friday, and units were selling out quickly. The U.S. Apple Store site said it would take \n",
            "_version_":1582439847966015488}]
      },
      "highlighting":{
        "5487941":{
          "log":["AppleCare+ extends the basic *warranty that covers non-accidental* iPhone mishaps -- such as battery issues or a faulty headphone jack -- from one year to two.\nThe iPhone X was unveiled to much fanfare last month. It boasts a radical update to the iPhone models of years past, with an all-glass display and an option to unlock with facial recognition.\nIt also has an all-glass back, so owners run the risk of cracking either side of the phone.\nHowever, Apple has claimed the glass on the iPhone 8 and iPhone X is much stronger than earlier models, so it could be *harder to break*.\nPre-orders for the phone began online Friday, and units were selling out quickly. The U.S. Apple Store site said it would take \n"]
        }
      }
    }

I get the correct document as a hit, but the highlighted text is wrong. The query is straightforward: match either condition 1 or condition 2, where

    condition 1 = "warranty that covers non-accidental"
    condition 2 = "risking it all" AND "harder to break"

The hit is correct because condition 1 matched, but why does the highlight indicate that part of condition 2 also matched? Why is it ignoring the AND operator for condition 2? Am I missing something here?

I am hoping to get this resolved, as I am planning to use even more complex queries like

    (condition 1) OR (condition 2) OR (condition 3) OR (condition 4 AND condition 5) OR (condition 6 AND condition 5 NOT condition 7)

I am looking for some pointers.

Thanks,
Ankit
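
For anyone who wants to reproduce this, here is a minimal Python sketch of the request and of the two conditions. The host, core name, and parameters are copied from the post. The `phrase_present` helper is a hypothetical toy check (a plain case-insensitive substring match) and does not reproduce Solr's analysis chain or its highlighter. It only shows which quoted phrases occur in the text on their own, so that this can be compared with what the boolean query as a whole requires.

```python
from urllib.parse import urlencode

# Host and core name are taken from the URL in the post; adjust for your setup.
BASE_URL = "http://localhost:8983/solr/mycore/select"

QUERY = ('("warranty that covers non-accidental") OR '
         '("risking it all" AND "harder to break")')

# Two sentences copied from the sample log text (a shortened excerpt).
SAMPLE = (
    "AppleCare+ extends the basic warranty that covers non-accidental iPhone "
    "mishaps -- such as battery issues or a faulty headphone jack -- from one "
    "year to two.\n"
    "However, Apple has claimed the glass on the iPhone 8 and iPhone X is much "
    "stronger than earlier models, so it could be harder to break.\n"
)

PHRASES = [
    "warranty that covers non-accidental",
    "risking it all",
    "harder to break",
]


def build_select_url(query, field="log"):
    """Build the /select URL with the parameters from the post, URL-encoded."""
    params = {
        "q": query,
        "hl": "true",
        "hl.q": query,
        "hl.fl": field,
        "hl.usePhraseHighlighter": "true",
        "hl.fragsize": "2000",
        "indent": "on",
        "wt": "json",
    }
    return BASE_URL + "?" + urlencode(params)


def phrase_present(text, phrase):
    """Toy check: case-insensitive substring match, NOT Solr's analysis chain."""
    return phrase.lower() in text.lower()


def evaluate(text):
    """Return (document_matches, phrases_found) for the query in the post."""
    cond1 = phrase_present(text, PHRASES[0])
    cond2 = phrase_present(text, PHRASES[1]) and phrase_present(text, PHRASES[2])
    found = [p for p in PHRASES if phrase_present(text, p)]
    return cond1 or cond2, found


if __name__ == "__main__":
    print(build_select_url(QUERY))
    matches, found = evaluate(SAMPLE)
    print("document matches:", matches)
    print("phrases present in the text:", found)
```

On this excerpt the document matches only through condition 1, yet "harder to break" is still present in the text by itself. Those are exactly the two phrases marked in the highlighting section of the response. That is consistent with the highlighter marking each query phrase it finds in the field rather than re-checking the AND, but this is an assumption based on the output in the post, not something confirmed here.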
Query performance degrades when TLOG replica
We have the following setup: Solr 7.7.2 with 1 TLOG leader and 1 TLOG replica on a single shard. We have about 34.5 million documents with an approximate index size of 600GB.

I have noticed degraded query performance whenever the replica is trying to (guessing here) sync or perform actual replication. To test this, I fire a very basic query using the SolrJ client. The query normally comes back right away, but whenever the replica checks how far behind it is by comparing generation IDs, the same query takes longer. In production we do not make these simple queries, but rather complex queries with filter queries and sorting. Those queries take too long compared with our previous setup (standalone Solr 6.1.0).

Any help here is appreciated.

    20-09-02 16:35:30 INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458909 status=0 QTime=0
    2020-09-02 16:35:30 INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458909 status=0 QTime=0
    2020-09-02 16:36:00 INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458909 status=0 QTime=0
    2020-09-02 16:36:00 INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458909 status=0 QTime=0
    2020-09-02 16:36:30 INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458909 status=0 QTime=0
    2020-09-02 16:36:30 INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458909 status=0 QTime=0
    *2020-09-02 16:37:01* INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458909 status=0 QTime=*1011*
    *2020-09-02 16:37:01* INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458909 status=0 QTime=*758*
    *2020-09-02 16:37:32* INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458957 status=0 QTime=*1077*
    *2020-09-02 16:37:32* INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458957 status=0 QTime=*1081*
    2020-09-02 16:38:02 INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458957 status=0 QTime=*668*
    2020-09-02 16:38:03 INFO [db_shard1_replica_t3] webapp=/solr path=/select params={q=*:*&fl=id&sort=id+desc&rows=1&wt=xml&version=2.2} hits=34458957 status=0 QTime=*1001*

    *2020-09-02 16:37:01* INFO Master's generation: 263116
    *2020-09-02 16:37:01* INFO Master's version: 1599064577322
    *2020-09-02 16:37:01* INFO Slave's generation: 263116
    *2020-09-02 16:37:01* INFO Slave's version: 1599064577322
    *2020-09-02 16:37:01* INFO Slave in sync with master.
    2020-09-02 16:37:02 INFO Master's generation: 104189
    2020-09-02 16:37:02 INFO Master's version: 1599064620532
    2020-09-02 16:37:02 INFO Slave's generation: 104188
    2020-09-02 16:37:02 INFO Slave's version: 1599064560341
    2020-09-02 16:37:02 INFO Starting replication process
    2020-09-02 16:37:02 INFO Number of files in latest index in master: 1010
    2020-09-02 16:37:02 INFO Starting download (fullCopy=false) to NRTCachingDirectory(MMapDirectory@/opt/solr-7.7.2/server/solr/test_shard1_replica_t3/data/index.20200902163702345 lockFactory=org.apache.lucene.store.NativeFSLockFactory@77247ee; maxCacheMB=48.0 maxMergeSizeMB=4.0)
    2020-09-02 16:37:02 INFO Bytes downloaded: 837587, Bytes skipped downloading: 0
    2020-09-02 16:37:02 INFO Total time taken for download (fullCopy=false,bytesDownloaded=837587) : 0 secs (null bytes/sec) to NRTCachingDirectory(MMapDirectory@/opt/solr-7.7.2/server/solr/test_shard1_replica_t3/data/index.20200902163702345 lockFactory=org.apache.lucene.store.NativeFSLockFactory@77247ee; maxCacheMB=48.0 maxMergeSizeMB=4.0)
    2020-09-02 16:37:03 INFO New IndexWriter is ready to be used.
    2020-09-02 16:37:03 INFO Master's generation: 124002
    2020-09-02 16:37:03 INFO Master's version: 1599064617242
    2020-09-02 16:37:03 INFO Slave's generation: 124000
    2020-09-02 16:37:03 INFO Slave's version: 1599064492914
    2020-09-02 16:37:03 INFO Starting replication process
    2020-09-02 16:37:04 INFO [db_shard1_replica_t3] webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://178.33.234.1:8983/solr/db_shard1_replica_t25/&wt=javabin&version=2}{add=[11907382419 (1676740784884285440), 11907383701 (1676740784889528320), 11907383253 (1676740784900014080), 11907379290 (1676740785002774528), 11907382623 (1676740785005920256), 11907378461 (1676740785011163136), 11907382429 (1676740785012211712), 11907380739 (1676740785023746048), 11907381184 (1676740785038426112), 11907380614 (1