On 5/8/2013 8:12 AM, marotosg wrote:
> Hi,
>
> I have 4 different cores in same machine.
> Person core -> 3 million docs -> 20 GB size
> Company Core -> 1 million docs -> 2GB size
> Documents Core -> 5 million docs -> 5GB size
> Emails Core -> 50,000 thousand -> 200 Mb
>
> While I am indexing data performance in server is almost the same if I am
> indexing only one core or all
> cores at the same time.
>
> I thought having different cores allow you to get different threads in
> parallel gaining some performance.
> Am I right?. My server is never reaching 100% CPU use. It always about 50%
> or even less.
> I had a look to I/O and it is not a problem.

You say that I/O performance appears to be good, but I/O is still likely the bottleneck here. When you index the cores sequentially, each one has access to the full I/O resources, so each one goes at top speed. If you index them all at the same time, they compete for I/O resources, so one makes progress while the others wait until the I/O scheduler can service their requests.

In most cases, Solr is I/O bound, and the fact that indexing takes the same amount of time either way is additional support for the idea that you are limited by I/O resources, not CPU resources.

Your I/O system is keeping up, which is good. If it weren't keeping up, parallel indexing would actually take even longer.

Thanks,
Shawn
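
P.S. To put rough numbers on the reasoning above, here is a minimal sketch of a toy model in Python. It assumes the only limit is a single shared I/O channel that is split evenly among whichever cores are still indexing. The core sizes come from your message (reading "200 Mb" as 0.2 GB); the bandwidth figure and the function names are made up for illustration, and final index size is only a stand-in for the I/O that indexing really generates (segment merges add more).

```python
# Toy model only: final index size in GB stands in for the total I/O each
# core needs. Sizes are from the original message; "200 Mb" is read as 0.2 GB.
CORE_SIZES_GB = {"person": 20.0, "company": 2.0, "documents": 5.0, "emails": 0.2}

# Assumed figure, not a measurement of any real server.
ASSUMED_BANDWIDTH_GB_PER_MIN = 0.5


def sequential_minutes(sizes, bandwidth):
    """Index one core at a time; each core gets the full bandwidth."""
    return sum(size / bandwidth for size in sizes.values())


def parallel_minutes(sizes, bandwidth):
    """Index all cores at once; active cores split the bandwidth evenly."""
    elapsed = 0.0
    written = 0.0  # GB already written by every core that is still active
    pending = sorted(sizes.values())
    active = len(pending)
    for size in pending:  # the smallest remaining core finishes next
        # Every active core runs at bandwidth / active until this one is done.
        elapsed += (size - written) * active / bandwidth
        written = size
        active -= 1
    return elapsed


seq = sequential_minutes(CORE_SIZES_GB, ASSUMED_BANDWIDTH_GB_PER_MIN)
par = parallel_minutes(CORE_SIZES_GB, ASSUMED_BANDWIDTH_GB_PER_MIN)
print(f"sequential: {seq:.1f} min, parallel: {par:.1f} min")
```

In this model the two totals match, because the channel is busy the whole time either way and the total amount of data is the same. If the work were CPU bound on a multi-core machine instead, running the cores in parallel should finish noticeably sooner, which is not what you are seeing.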