Hi Erick,

You're right - I/O was extraordinarily high. But something odd happened. To actually establish a correlation, I tried different heap sizes with the default solrconfig.xml values, as you recommended:

1. Increased the heap to 4G: speed 8500k.
2. Decreased to 2G: back to the old 65k.
3. Increased back to 4G: speed 50k.
4. Decreased to 3G: speed 50k.
5. Increased to 10G: speed 8500k.

The speeds are the 1-minute average after indexing starts. With the last 10G run, as (maybe) expected, I got a java.lang.NullPointerException at org.apache.solr.handler.component.RealTimeGetComponent.getInputDocument before committing. I'm not getting the faster speeds with any of the heap sizes now. I will continue digging deeper and, in the meantime, I will be getting the 24G RAM upgrade. Currently I'm giving Solr a 6G heap (speed is 55k - too low). After the earlier progress this may be a step backward, but I do believe I will take two steps forward soon. All credit to you.

Getting into GC logs now. I'm a newbie here - I know the GC theory but have never analyzed the logs themselves. What tool do you prefer? I'm planning to upload Solr's current GC log to GCeasy.
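For reference, this is roughly what I'm doing between runs - a minimal sketch assuming the stock bin/solr scripts and default log locations (exact paths and file names may differ on other setups):

bin/solr stop -all
bin/solr start -m 4g          # -m sets min/max heap; setting SOLR_HEAP in solr.in.sh would be equivalent
ls server/logs/solr_gc.log*   # GC logging is on by default with the stock scripts; this is the log I plan to upload to GCeasy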

On Wed, 11 Dec 2019 at 18:21, Erick Erickson <erickerick...@gmail.com> wrote:
> I doubt GC alone would make nearly that difference. More likely
> it’s I/O interacting with MMapDirectory. Lucene uses OS memory
> space for much of its index, i.e. the RAM left over
> after that used for the running Solr process (and any other
> processes of course). See:
>
> https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> So if you, you don’t leave much OS memory space for Lucene’s
> use via MMap, that can lead to swapping. My bet is that was
> what was happening, and your CPU utilization was low; Lucene and
> thus Solr was spending all its time waiting around for I/O. If that theory
> is true, your disk I/O should have been much higher before you reduced
> your heap.
>
> IOW, I claim if you left the java heap at 12G and increased the physical
> memory to 24G you’d see an identical (or nearly) speedup. GC for a 12G
> heap is rarely a bottleneck. That said you want to use as little heap for
> your Java process as possible, but if you reduce it too much you wind up
> with other problems. OOM for one, and I’ve also seen GC take an inordinate
> amount of time when it’s _barely_ enough to run. You hit a GC that
> recovers,
> say, 10M of heap which is barely enough to continue for a few milliseconds
> and hits another GC….. As you can tell, “this is more art than science”…
>
> Glad to hear you’re making progress!
> Erick
>
> > On Dec 11, 2019, at 5:06 AM, Paras Lehana <paras.leh...@indiamart.com>
> wrote:
> >
> > Just to update, I kept the defaults. The indexing got only a little boost
> > though I have decided to continue with the defaults and do incremental
> > experiments only. To my surprise, our development server had only 12GB
> RAM,
> > of which 8G was allocated to Java. Because I could not increase the RAM,
> I
> > tried decreasing it to 4G and guess what! My indexing speed got a boost
> of
> > over *50x*. Erick, thanks for helping. I think I should do more homework
> > about GCs also. Your GC guess seems to be valid. I have raised the
> request
> > to increase RAM on the development to 24GB.
> >
> > On Mon, 9 Dec 2019 at 20:23, Erick Erickson <erickerick...@gmail.com>
> wrote:
> >
> >> Note that that article is from 2011. That was in the Solr 3x days when
> >> many, many, many things were different. There was no SolrCloud for
> >> instance. Plus Tom’s problem space is indexing _books_. Whole, complete,
> >> books. Which is, actually, not “normal” indexing at all as most Solr
> >> indexes are much smaller documents.
Books are a perfectly reasonable > >> use-case of course, but have a whole bunch of special requirements. > >> > >> get-by-id should be very efficient, _except_ that the longer you spend > >> before opening a new searcher, the larger the internal data buffers > >> supporting get-by-id need to be. > >> > >> Anyway, best of luck > >> Erick > >> > >>> On Dec 9, 2019, at 1:05 AM, Paras Lehana <paras.leh...@indiamart.com> > >> wrote: > >>> > >>> Hi Erick, > >>> > >>> I have reverted back to original values and yes, I did see > improvement. I > >>> will collect more stats. *Thank you for helping. :)* > >>> > >>> Also, here is the reference article that I had referred for changing > >>> values: > >>> > >> > https://www.hathitrust.org/blogs/large-scale-search/forty-days-and-forty-nights-re-indexing-7-million-books-part-1 > >>> > >>> The article was perhaps for normal indexing and thus, suggested > >> increasing > >>> mergeFactor and then finally optimizing. In my case, a large number of > >>> segments could have impacted get-by-id of atomic updates? Just being > >>> curious. > >>> > >>> On Fri, 6 Dec 2019 at 19:02, Paras Lehana <paras.leh...@indiamart.com> > >>> wrote: > >>> > >>>> Hey Erick, > >>>> > >>>> We have just upgraded to 8.3 before starting the indexing. We were on > >> 6.6 > >>>> before that. > >>>> > >>>> Thank you for your continued support and resources. Again, I have > >> already > >>>> taken your suggestion to start afresh and that's what I'm going to do. > >>>> Don't get me wrong but I have been just asking doubts. I will surely > get > >>>> back with my experience after performing the full indexing. > >>>> > >>>> Thanks again! :) > >>>> > >>>> On Fri, 6 Dec 2019 at 18:48, Erick Erickson <erickerick...@gmail.com> > >>>> wrote: > >>>> > >>>>> Nothing implicitly handles optimization, you must continue to do that > >>>>> externally. > >>>>> > >>>>> Until you get to the bottom of your indexing slowdown, I wouldn’t > >> bother > >>>>> with it at all, trying to do all these things at once is what lead to > >> your > >>>>> problem in the first place, please change one thing at a time. You > say: > >>>>> > >>>>> “For a full indexing, optimizations occurred 30 times between > batches”. > >>>>> > >>>>> This is horrible. I’m not sure what version of Solr you’re using. If > >> it’s > >>>>> 7.4 or earlier, this means the the entire index was rewritten 30 > times. > >>>>> The first time it would condense all segments into a single segment, > or > >>>>> 1/30 of the total. The second time it would rewrite all that, 2/30 of > >> the > >>>>> index into a new segment. The third time 3/30. And so on. > >>>>> > >>>>> If Solr 7.5 or later, it wouldn’t be as bad, assuming your index was > >> over > >>>>> 5G. But still. > >>>>> > >>>>> See: > >>>>> > >> > https://lucidworks.com/post/segment-merging-deleted-documents-optimize-may-bad/ > >>>>> for 7.4 and earlier, > >>>>> https://lucidworks.com/post/solr-and-optimizing-your-index-take-ii/ > >> for > >>>>> 7.5 and later > >>>>> > >>>>> Eventually you can optimize by sending in an http or curl request > like > >>>>> this: > >>>>> ../solr/collection/update?optimize=true > >>>>> > >>>>> You also changed to using StandardDirectory. The default has > heuristics > >>>>> built in > >>>>> to choose the best directory implementation. > >>>>> > >>>>> I can’t emphasize enough that you’re changing lots of things at one > >> time. 
> >>>>> I > >>>>> _strongly_ urge you to go back to the standard setup, make _no_ > >>>>> modifications > >>>>> and change things one at a time. Some very bright people have done a > >> lot > >>>>> of work to try to make Lucene/Solr work well. > >>>>> > >>>>> Make one change at a time. Measure. If that change isn’t helpful, > undo > >> it > >>>>> and > >>>>> move to the next one. You’re trying to second-guess the Lucene/Solr > >>>>> developers who have years of understanding how this all works. Assume > >> they > >>>>> picked reasonable options for defaults and that Lucene/Solr performs > >>>>> reasonably > >>>>> well. When I get unexplainably poor results, I usually assume it was > >> the > >>>>> last > >>>>> thing I changed…. > >>>>> > >>>>> Best, > >>>>> Erick > >>>>> > >>>>> > >>>>> > >>>>> > >>>>>> On Dec 6, 2019, at 1:31 AM, Paras Lehana < > paras.leh...@indiamart.com> > >>>>> wrote: > >>>>>> > >>>>>> Hi Erick, > >>>>>> > >>>>>> I believed optimizing explicitly merges segments and that's why I > was > >>>>>> expecting it to give performance boost. I know that optimizations > >> should > >>>>>> not be done very frequently. For a full indexing, optimizations > >>>>> occurred 30 > >>>>>> times between batches. I take your suggestion to undo all the > changes > >>>>> and > >>>>>> that's what I'm going to do. I mentioned about the optimizations > >> giving > >>>>> an > >>>>>> indexing boost (for sometime) only to support your point of my > >>>>> mergePolicy > >>>>>> backfiring. I will certainly read again about the merge process. > >>>>>> > >>>>>> Taking your suggestions - so, commits would be handled by > autoCommit. > >>>>> What > >>>>>> implicitly handles optimizations? I think the merge policy or is > there > >>>>> any > >>>>>> other setting I'm missing? > >>>>>> > >>>>>> I'm indexing via Curl API on the same server. The Current Speed of > >> curl > >>>>> is > >>>>>> only 50k (down from 1300k in the first batch). I think - as the curl > >> is > >>>>>> transmitting the XML, the documents are getting indexing. Because > then > >>>>> only > >>>>>> would speed be so low. I don't think that the whole XML is taking > the > >>>>>> memory - I remember I had to change the curl options to get rid of > the > >>>>>> transmission error for large files. > >>>>>> > >>>>>> This is my curl request: > >>>>>> > >>>>>> curl 'http://localhost:$port/solr/product/update?commit=true' -T > >>>>>> batch1.xml -X POST -H 'Content-type:text/xml > >>>>>> > >>>>>> Although, we had been doing this since ages - I think I should now > >>>>> consider > >>>>>> using the solr post service (since the indexing files stays on the > >> same > >>>>>> server) or using Solarium (we use PHP to make XMLs). > >>>>>> > >>>>>> On Thu, 5 Dec 2019 at 20:00, Erick Erickson < > erickerick...@gmail.com> > >>>>> wrote: > >>>>>> > >>>>>>>> I think I should have also done optimize between batches, no? > >>>>>>> > >>>>>>> No, no, no, no. Absolutely not. Never. Never, never, never between > >>>>> batches. > >>>>>>> I don’t recommend optimizing at _all_ unless there are > demonstrable > >>>>>>> improvements. > >>>>>>> > >>>>>>> Please don’t take this the wrong way, the whole merge process is > >> really > >>>>>>> hard to get your head around. But the very fact that you’d suggest > >>>>>>> optimizing between batches shows that the entire merge process is > >>>>>>> opaque to you. I’ve seen many people just start changing things and > >>>>>>> get themselves into a bad place, then try to change more things to > >> get > >>>>>>> out of that hole. 
Rinse. Repeat. > >>>>>>> > >>>>>>> I _strongly_ recommend that you undo all your changes. Neither > >>>>>>> commit nor optimize from outside Solr. Set your autocommit > >>>>>>> settings to something like 5 minutes with openSearcher=true. > >>>>>>> Set all autowarm counts in your caches in solrconfig.xml to 0, > >>>>>>> especially filterCache and queryResultCache. > >>>>>>> > >>>>>>> Do not set soft commit at all, leave it at -1. > >>>>>>> > >>>>>>> Repeat do _not_ commit or optimize from the client! Just let your > >>>>>>> autocommit settings do the commits. > >>>>>>> > >>>>>>> It’s also pushing things to send 5M docs in a single XML packet. > >>>>>>> That all has to be held in memory and then indexed, adding to > >>>>>>> pressure on the heap. I usually index from SolrJ in batches > >>>>>>> of 1,000. See: > >>>>>>> https://lucidworks.com/post/indexing-with-solrj/ > >>>>>>> > >>>>>>> Simply put, your slowdown should not be happening. I strongly > >>>>>>> believe that it’s something in your environment, most likely > >>>>>>> 1> your changes eventually shoot you in the foot OR > >>>>>>> 2> you are running in too little memory and eventually GC is > killing > >>>>> you. > >>>>>>> Really, analyze your GC logs. OR > >>>>>>> 3> you are running on underpowered hardware which just can’t take > the > >>>>> load > >>>>>>> OR > >>>>>>> 4> something else in your environment > >>>>>>> > >>>>>>> I’ve never heard of a Solr installation with such a massive > slowdown > >>>>> during > >>>>>>> indexing that was fixed by tweaking things like the merge policy > etc. > >>>>>>> > >>>>>>> Best, > >>>>>>> Erick > >>>>>>> > >>>>>>> > >>>>>>>> On Dec 5, 2019, at 12:57 AM, Paras Lehana < > >> paras.leh...@indiamart.com > >>>>>> > >>>>>>> wrote: > >>>>>>>> > >>>>>>>> Hey Erick, > >>>>>>>> > >>>>>>>> This is a huge red flag to me: "(but I could only test for the > first > >>>>> few > >>>>>>>>> thousand documents”. > >>>>>>>> > >>>>>>>> > >>>>>>>> Yup, that's probably where the culprit lies. I could only test for > >> the > >>>>>>>> starting batch because I had to wait for a day to actually > compare. > >> I > >>>>>>>> tweaked the merge values and kept whatever gave a speed boost. My > >>>>> first > >>>>>>>> batch of 5 million docs took only 40 minutes (atomic updates > >> included) > >>>>>>> and > >>>>>>>> the last batch of 5 million took more than 18 hours. If this is an > >>>>> issue > >>>>>>> of > >>>>>>>> mergePolicy, I think I should have also done optimize between > >> batches, > >>>>>>> no? > >>>>>>>> I remember, when I indexed a single XML of 80 million after > >> optimizing > >>>>>>> the > >>>>>>>> core already indexed with 30 XMLs of 5 million each, I could post > 80 > >>>>>>>> million in a day only. > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>>> The indexing rate you’re seeing is abysmal unless these are > _huge_ > >>>>>>>>> documents > >>>>>>>> > >>>>>>>> > >>>>>>>> Documents only contain the suggestion name, possible titles, > >>>>>>>> phonetics/spellcheck/synonym fields and numerical fields for > >> boosting. > >>>>>>> They > >>>>>>>> are far smaller than what a Search Document would contain. > >>>>> Auto-Suggest > >>>>>>> is > >>>>>>>> only concerned about suggestions so you can guess how simple the > >>>>>>> documents > >>>>>>>> would be. > >>>>>>>> > >>>>>>>> > >>>>>>>> Some data is held on the heap and some in the OS RAM due to > >>>>> MMapDirectory > >>>>>>>> > >>>>>>>> > >>>>>>>> I'm using StandardDirectory (which will make Solr choose the right > >>>>>>>> implementation). 
Also, planning to read more about these (looking > >>>>> forward > >>>>>>>> to use MMap). Thanks for the article! > >>>>>>>> > >>>>>>>> > >>>>>>>> You're right. I should change one thing at a time. Let me > experiment > >>>>> and > >>>>>>>> then I will summarize here what I tried. Thank you for your > >>>>> responses. :) > >>>>>>>> > >>>>>>>> On Wed, 4 Dec 2019 at 20:31, Erick Erickson < > >> erickerick...@gmail.com> > >>>>>>> wrote: > >>>>>>>> > >>>>>>>>> This is a huge red flag to me: "(but I could only test for the > >> first > >>>>> few > >>>>>>>>> thousand documents” > >>>>>>>>> > >>>>>>>>> You’re probably right that that would speed things up, but pretty > >>>>> soon > >>>>>>>>> when you’re indexing > >>>>>>>>> your entire corpus there are lots of other considerations. > >>>>>>>>> > >>>>>>>>> The indexing rate you’re seeing is abysmal unless these are > _huge_ > >>>>>>>>> documents, but you > >>>>>>>>> indicate that at the start you’re getting 1,400 docs/second so I > >>>>> don’t > >>>>>>>>> think the complexity > >>>>>>>>> of the docs is the issue here. > >>>>>>>>> > >>>>>>>>> Do note that when we’re throwing RAM figures out, we need to > draw a > >>>>>>> sharp > >>>>>>>>> distinction > >>>>>>>>> between Java heap and total RAM. Some data is held on the heap > and > >>>>> some > >>>>>>> in > >>>>>>>>> the OS > >>>>>>>>> RAM due to MMapDirectory, see Uwe’s excellent article: > >>>>>>>>> > >>>>>>> > >>>>> > >> > https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html > >>>>>>>>> > >>>>>>>>> Uwe recommends about 25% of your available physical RAM be > >> allocated > >>>>> to > >>>>>>>>> Java as > >>>>>>>>> a starting point. Your particular Solr installation may need a > >> larger > >>>>>>>>> percent, IDK. > >>>>>>>>> > >>>>>>>>> But basically I’d go back to all default settings and change one > >>>>> thing > >>>>>>> at > >>>>>>>>> a time. > >>>>>>>>> First, I’d look at GC performance. Is it taking all your CPU? In > >>>>> which > >>>>>>>>> case you probably need to > >>>>>>>>> increase your heap. I pick this first because it’s very common > that > >>>>> this > >>>>>>>>> is a root cause. > >>>>>>>>> > >>>>>>>>> Next, I’d put a profiler on it to see exactly where I’m spending > >>>>> time. > >>>>>>>>> Otherwise you wind > >>>>>>>>> up making random changes and hoping one of them works. > >>>>>>>>> > >>>>>>>>> Best, > >>>>>>>>> Erick > >>>>>>>>> > >>>>>>>>>> On Dec 4, 2019, at 3:21 AM, Paras Lehana < > >>>>> paras.leh...@indiamart.com> > >>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>> (but I could only test for the first few > >>>>>>>>>> thousand documents > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> -- > >>>>>>>> -- > >>>>>>>> Regards, > >>>>>>>> > >>>>>>>> *Paras Lehana* [65871] > >>>>>>>> Development Engineer, Auto-Suggest, > >>>>>>>> IndiaMART Intermesh Ltd. > >>>>>>>> > >>>>>>>> 8th Floor, Tower A, Advant-Navis Business Park, Sector 142, > >>>>>>>> Noida, UP, IN - 201303 > >>>>>>>> > >>>>>>>> Mob.: +91-9560911996 > >>>>>>>> Work: 01203916600 | Extn: *8173* > >>>>>>>> > >>>>>>>> -- > >>>>>>>> * > >>>>>>>> * > >>>>>>>> > >>>>>>>> <https://www.facebook.com/IndiaMART/videos/578196442936091/> > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> -- > >>>>>> -- > >>>>>> Regards, > >>>>>> > >>>>>> *Paras Lehana* [65871] > >>>>>> Development Engineer, Auto-Suggest, > >>>>>> IndiaMART Intermesh Ltd. 
> >>>>>>
> >>>>>> 8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
> >>>>>> Noida, UP, IN - 201303
> >>>>>>
> >>>>>> Mob.: +91-9560911996
> >>>>>> Work: 01203916600 | Extn: *8173*

--
--
Regards,

*Paras Lehana* [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.

8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303

Mob.: +91-9560911996
Work: 01203916600 | Extn: *8173*