Could the ulimit settings <https://lucene.apache.org/solr/guide/8_3/taking-solr-to-production.html#ulimit-settings-nix-operating-systems> be affecting this? It would be worth reviewing them once.
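
In case it helps, here is a minimal sketch for checking the two limits that guide talks about (open files and max processes). The 65000 threshold is what I recall the guide recommending, so treat it as an assumption, and `check_limit` / `report` are just illustrative helper names. The script reports the limits of the user and shell it runs under, so it needs to be run as the same user that starts Solr to be meaningful.

```python
import resource

# Assumed recommended minimum, taken from the Solr guide linked above.
RECOMMENDED = 65000


def check_limit(name, soft, recommended=RECOMMENDED):
    """Return a one-line verdict for a soft resource limit."""
    if soft == resource.RLIM_INFINITY:
        return f"{name}: unlimited (ok)"
    status = "ok" if soft >= recommended else "LOW"
    return f"{name}: soft={soft} recommended>={recommended} ({status})"


def report():
    """Check the soft limits of the current process on a *nix system."""
    checks = [("open files", "RLIMIT_NOFILE"), ("max processes", "RLIMIT_NPROC")]
    lines = []
    for name, attr in checks:
        res = getattr(resource, attr, None)
        if res is None:
            # Not every platform exposes every limit; skip what is missing.
            continue
        soft, _hard = resource.getrlimit(res)
        lines.append(check_limit(name, soft))
    return lines


if __name__ == "__main__":
    for line in report():
        print(line)
```

If either line comes back as LOW, raising the limit for the Solr user and restarting would be the first thing I would try before looking at GC.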

On Thu, 5 Dec 2019 at 23:31, Shawn Heisey <apa...@elyograg.org> wrote:

> On 12/5/2019 10:28 AM, Rahul Goswami wrote:
> > We have a Solr 7.2.1 Solr Cloud setup where the client is indexing in 5
> > parallel threads with 5000 docs per batch. This is a test setup and all
> > documents are indexed on the same node. We are seeing connection timeout
> > issues thereafter some time into indexing. I am yet to analyze GC pauses
> > and other possibilities, but as a guideline just wanted to know what
> > indexing rate might be "too high" for Solr so as to consider throttling ?
> > The documents are mostly metadata with about 25 odd fields, so not very
> > heavy.
> > Would be nice to know a baseline performance expectation for better
> > application design considerations.
>
> It's not really possible to give you a number here. It depends on a lot
> of things, and every install is going to be different.
>
> On a setup that I once dealt with, where there was only a single thread
> doing the indexing, indexing on each core happened at about 1000 docs
> per second. I've heard people mention rates beyond 50000 docs per
> second. I've also heard people talk about rates of indexing far lower
> than what I was seeing.
>
> When you say "connection timeout" issues ... that could mean a couple of
> different things. It could mean that the connection never gets
> established because it times out while trying, or it could mean that the
> connection gets established, and then times out after that. Which are
> you seeing? Usually dealing with that involves changing timeout
> settings on the client application. Figuring out what's causing the
> delays that lead to the timeouts might be harder. GC pauses are a
> primary candidate.
>
> There are typically two bottlenecks possible when indexing. One is that
> the source system cannot supply the documents fast enough. The other is
> that the Solr server is sitting mostly idle while the indexing program
> waits for an opportunity to send more documents. The first is not
> something we can help you with. The second is dealt with by making the
> indexing application multi-threaded or multi-process, or adding more
> threads/processes.
>
> Thanks,
> Shawn

--
Regards,
Paras Lehana [65871]
Development Engineer, Auto-Suggest,
IndiaMART Intermesh Ltd.
8th Floor, Tower A, Advant-Navis Business Park, Sector 142,
Noida, UP, IN - 201303
Mob.: +91-9560911996
Work: 01203916600 | Extn: 8173