Another round of tests this morning. Ten rounds of imports all done on the non-leader node: 902294 900089 899267 898127 901945 901055 899638 899392 899880 901812
The expected number of records is 903990 I am getting this error: org.apache.solr.common.SolrException: Bad Request request: http://192.168.20.51:8983/solr/Inventory_shard1_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.20.51%3A7574%2Fsolr%2FInventory_shard1_replica2%2F&wt=javabin&version=2 at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:241) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) And I am getting this warning: org.apache.solr.common.SolrException: Bad Request request: http://192.168.20.51:8983/solr/Inventory_shard1_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.20.51%3A7574%2Fsolr%2FInventory_shard1_replica2%2F&wt=javabin&version=2 at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:241) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) These are both from the admin logging section. I have retained the logs files if it would help AJ -----Original Message----- From: AJ Lemke [mailto:aj.le...@securitylabs.com] Sent: Monday, November 3, 2014 5:31 PM To: solr-user@lucene.apache.org Subject: RE: Missing Records So I jumped back on this. I have not been using the optimize option on this new set of tests. If I run the full index on the leader I seem to get all of the items in the database minus 3 that have a missing field. Indexing completed. Added/Updated: 903,990 documents. Deleted 0 documents. (Duration: 25m 11s) Requests: 1 (0/s), Fetched: 903,993 (598/s), Skipped: 0, Processed: 903,990 Last Modified:2 minutes ago Num Docs:903990 Max Doc:903990 Heap Memory Usage:2625744 Deleted Docs:0 Version:3249 Segment Count:7 Optimized: Current: If I run it on the other node I get: Indexing completed. Added/Updated: 903,993 documents. Deleted 0 documents. (Duration: 27m 08s) Requests: 1 (0/s), Fetched: 903,993 (555/s), Skipped: 0, Processed: 903,993 (555/s) Last Modified:about a minute ago Num Docs:897791 Max Doc:897791 Heap Memory Usage:2621072 Deleted Docs:0 Version:3285 Segment Count:7 Optimized: Current: Any ideas? If there is any more info that is needed let me know. AJ -----Original Message----- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Friday, October 31, 2014 1:44 PM To: solr-user@lucene.apache.org Subject: Re: Missing Records Sorry to say this, but I don't think the numDocs/maxDoc numbers are telling you anything..... because it looks like you've optimized which purges any data associated with deleted docs, including the internal IDs which are the numDocs/maxDocs figures. So if there were deletions, we can't see any evidence of same. Siiggggh. On Fri, Oct 31, 2014 at 9:56 AM, AJ Lemke <aj.le...@securitylabs.com> wrote: > I have run some more tests so the numbers have changed a bit. > > Index Results done on Node 1: > Indexing completed. Added/Updated: 903,993 documents. Deleted 0 > documents. (Duration: 31m 47s) > Requests: 1 (0/s), Fetched: 903,993 (474/s), Skipped: 0, Processed: > 903,993 > > Node 1: > Last Modified: 44 minutes ago > Num Docs: 824216 > Max Doc: 824216 > Heap Memory Usage: -1 > Deleted Docs: 0 > Version: 1051 > Segment Count: 1 > Optimized: checked > Current: checked > > Node 2: > Last Modified: 44 minutes ago > Num Docs: 824216 > Max Doc: 824216 > Heap Memory Usage: -1 > Deleted Docs: 0 > Version: 1051 > Segment Count: 1 > Optimized: checked > Current: checked > > Search results are the same as the doc numbers above. > > Logs only have one instance of an error: > > ERROR - 2014-10-31 10:47:12.867; > org.apache.solr.update.StreamingSolrServers$1; error > org.apache.solr.common.SolrException: Bad Request > > > > request: > http://192.168.20.57:7574/solr/inventory_shard1_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.20.57%3A8983%2Fsolr%2Finventory_shard1_replica2%2F&wt=javabin&version=2 > at > org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:241) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > > Some info that may be of help > This is on my local vm using jetty with the embedded zookeeper. > Commands to start cloud: > > java -DzkRun -jar start.jar > java -Djetty.port=7574 -DzkRun -DzkHost=localhost:9983 -jar start.jar > > sh zkcli.sh -zkhost localhost:9983 -cmd upconfig -confdir > ~/development/configs/inventory/ -confname config_ inventory sh > zkcli.sh -zkhost localhost:9983 -cmd linkconfig -collection inventory > -confname config_ inventory > > curl > "http://localhost:8983/solr/admin/collections?action=CREATE&name=inventory&numShards=1&replicationFactor=2&maxShardsPerNode=4" > curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name= > inventory " > > AJ > > > -----Original Message----- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Friday, October 31, 2014 9:49 AM > To: solr-user@lucene.apache.org > Subject: Re: Missing Records > > OK, that is puzzling. > > bq: If there were duplicates only one of the duplicates should be removed and > I still should be able to search for the ID and find one correct? > > Correct. > > Your bad request error is puzzling, you may be on to something there. > What it looks like is that somehow some of the documents you're > sending to Solr aren't getting indexed, either being dropped through > the network or perhaps have invalid fields, field formats (i.e. a date > in the wrong format, > whatever) or some such. When you complete the run, what are the maxDoc and > numDocs numbers on one of the nodes? > > What else do you see in the logs? They're pretty big after that many adds, > but maybe you can grep for ERROR and see something interesting like stack > traces. Or even "org.apache.solr". This latter will give you some false hits, > but at least it's better than paging through a huge log file.... > > Personally, in this kind of situation I sometimes use SolrJ to do my indexing > rather than DIH, I find it easier to debug so that's another possibility. In > the worst case with SolrJ, you can send the docs one at a time.... > > Best, > Erick > > On Fri, Oct 31, 2014 at 7:37 AM, AJ Lemke <aj.le...@securitylabs.com> wrote: >> Hi Erick: >> >> All of the records are coming out of an auto numbered field so the ID's will >> all be unique. >> >> Here is the the test I ran this morning: >> >> Indexing completed. Added/Updated: 903,993 documents. Deleted 0 >> documents. (Duration: 28m) >> Requests: 1 (0/s), Fetched: 903,993 (538/s), Skipped: 0, Processed: >> 903,993 (538/s) >> Started: 33 minutes ago >> >> Last Modified:4 minutes ago >> Num Docs:903829 >> Max Doc:903829 >> Heap Memory Usage:-1 >> Deleted Docs:0 >> Version:1517 >> Segment Count:16 >> Optimized: checked >> Current: checked >> >> If there were duplicates only one of the duplicates should be removed and I >> still should be able to search for the ID and find one correct? >> As it is right now I am missing records that should be in the collection. >> >> I also noticed this: >> >> org.apache.solr.common.SolrException: Bad Request >> >> >> >> request: >> http://192.168.20.57:7574/solr/inventory_shard1_replica1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.20.57%3A8983%2Fsolr%2Finventory_shard1_replica2%2F&wt=javabin&version=2 >> at >> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:241) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:745) >> >> AJ >> >> >> -----Original Message----- >> From: Erick Erickson [mailto:erickerick...@gmail.com] >> Sent: Thursday, October 30, 2014 7:08 PM >> To: solr-user@lucene.apache.org >> Subject: Re: Missing Records >> >> First question: Is there any possibility that some of the docs have >> duplicate IDs (<uniqueKey>s)? If so, then some of the docs will be replaced, >> which will lower your returns. >> One way to figuring this out is to go to the admin screen and if numDocs < >> maxDoc, then documents have been replaced. >> >> Also, if numDocs is smaller than 903,993 then you probably have some docs >> being replaced. One warning, however. Even if docs are deleted, then this >> could still be the case because when segments are merged the deleted docs >> are purged. >> >> Best, >> Erick >> >> On Thu, Oct 30, 2014 at 3:12 PM, S.L <simpleliving...@gmail.com> wrote: >>> I am curious , how many shards do you have and whats the replication >>> factor you are using ? >>> >>> On Thu, Oct 30, 2014 at 5:27 PM, AJ Lemke <aj.le...@securitylabs.com> wrote: >>> >>>> Hi All, >>>> >>>> We have a SOLR cloud instance that has been humming along nicely >>>> for months. >>>> Last week we started experiencing missing records. >>>> >>>> Admin DIH Example: >>>> Fetched: 903,993 (736/s), Skipped: 0, Processed: 903,993 (736/s) A >>>> *:* search claims that there are only 903,902 this is the first >>>> full index. >>>> Subsequent full indexes give the following counts for the *:* >>>> search >>>> 903,805 >>>> 903,665 >>>> 826,357 >>>> >>>> All the while the admin returns: Fetched: 903,993 (x/s), Skipped: >>>> 0, >>>> Processed: 903,993 (x/s) every time. ---records per second is >>>> variable >>>> >>>> >>>> I found an item that should be in the index but is not found in a search. >>>> >>>> Here are the referenced lines of the log file. >>>> >>>> DEBUG - 2014-10-30 15:10:51.160; >>>> org.apache.solr.update.processor.LogUpdateProcessor; PRE_UPDATE >>>> add{,id=750041421} >>>> {{params(debug=false&optimize=true&indent=true&commit=true&clean=tr >>>> u >>>> e >>>> &wt=json&command=full-import&entity=ads&verbose=false),defaults(con >>>> f >>>> i >>>> g=data-config.xml)}} >>>> DEBUG - 2014-10-30 15:10:51.160; >>>> org.apache.solr.update.SolrCmdDistributor; sending update to >>>> http://192.168.20.57:7574/solr/inventory_shard1_replica2/ retry:0 >>>> add{,id=750041421} >>>> params:update.distrib=TOLEADER&distrib.from=http%3A%2F%2F192.168.20. >>>> 5 7%3A8983%2Fsolr%2Finventory_shard1_replica1%2F >>>> >>>> --- there are 746 lines of log between entries --- >>>> >>>> DEBUG - 2014-10-30 15:10:51.340; org.apache.http.impl.conn.Wire; >>>> >> >>>> "[0x2][0xc3][0xe0]¶ms[0xa2][0xe0].update.distrib(TOLEADER[0xe0] >>>> , >>>> d >>>> istrib.from?[0x17] >>>> http://192.168.20.57:8983/solr/inventory_shard1_replica1/[0xe0]&del >>>> B >>>> y >>>> Q[0x0][0xe0]'docsMap[0xe][0x13][0x10]8[0x8]?[0x80][0x0][0x0][0xe0]# >>>> Z >>>> i >>>> p%51106[0xe0]-IsReelCentric[0x2][0xe0](HasPrice[0x1][0xe0]*Make_Low >>>> e r 'ski-doo[0xe0])StateName$Iowa[0xe0]-OriginalModel/Summit >>>> Highmark[0xe0]/VerticalSiteIDs!2[0xe0]-ClassBinaryIDp@[0xe0]#lat(42. >>>> 4 >>>> 8929[0xe0]-SubClassFacet01704|Snowmobiles[0xe0](FuelType%Other[0xe0 >>>> ] >>>> 2 >>>> DivisionName_Lower,recreational[0xe0]&latlon042.4893,-96.3693[0xe0] >>>> * P hotoCount!8[0xe0](HasVideo[0x2][0xe0]"ID)750041421[0xe0]&Engine >>>> [0xe0]*ClassFacet.12|Snowmobiles[0xe0]$Make'Ski-Doo[0xe0]$City*Siou >>>> x >>>> City[0xe0]#lng*-96.369302[0xe0]-Certification!N[0xe0]0EmotionalTagline0162" >>>> Long Track >>>> [0xe0]*IsEnhanced[0x1][0xe0]*SubClassID$1704[0xe0](NetPrice$4500[0x >>>> e >>>> 0 >>>> ]1IsInternetSpecial[0x2][0xe0](HasPhoto[0x1][0xe0]/DealerSortOrder! >>>> 2 [ 0xe0]+Description?VThis Bad boy will pull you through the >>>> deepest snow!With the 162" track and 1000cc of power you can fly up >>>> any hill!![0xe0],DealerRadius+8046.720000[0xe0],Transmission >>>> [0xe0]*ModelFacet7Ski-Doo|Summit >>>> Highmark[0xe0]/DealerNameFacet9Certified >>>> Auto, >>>> Inc.|4150[0xe0])StateAbbr"IA[0xe0])ClassName+Snowmobiles[0xe0](Deal >>>> e r ID$4150[0xe0]&AdCode$DX1Q[0xe0]*DealerName4Certified >>>> Auto, >>>> Inc.[0xe0])Condition$Used[0xe0]/Condition_Lower$used[0xe0]-Exterior >>>> C >>>> o >>>> lor+Blue/Yellow[0xe0],DivisionName,Recreational[0xe0]$Trim(1000 >>>> SDI[0xe0](SourceID!1[0xe0]0HasAdEnhancement!0[0xe0]'ClassID"12[0xe0]. >>>> FuelType_Lower%other[0xe0]$Year$2005[0xe0]+DealerFacet?[0x8]4150|Ce >>>> r >>>> t >>>> ified Auto, Inc.|Sioux >>>> City|IA[0xe0],SubClassName+Snowmobiles[0xe0]%Model/Summit >>>> Highmark[0xe0])EntryDate42011-11-17T10:46:00Z[0xe0]+StockNumber&000 >>>> 1 >>>> 0 5[0xe0]+PriceRebate!0[0xe0]+Model_Lower/summit >>>> highmark[\n]" >>>> What could be the issue and how does one fix this issue? >>>> >>>> Thanks so much and if more information is needed I have preserved >>>> the log files. >>>> >>>> AJ >>>>