On Fri, Mar 16, 2018 at 6:03 PM, Shawn Heisey <elyog...@elyograg.org> wrote:
> On 3/15/2018 6:34 AM, BlackIce wrote:
>
>> However the main app that will be running is more or less a single
>> threaded app which takes advantage when run under several instances,
>> i.e. parallelism, so I thought, since I'm at it, I may give Solr a few
>> instances as well
>

***Deepak***
I did a performance study of Solr a while back, and I found that it does not scale beyond a particular point on a single machine (this could be due to the way it's coded). Hence multiple instances might make sense.

https://docs.google.com/document/d/1kUqEcZl3NhOo6SLklo5Icg3fMnn9OtLY_lwnc6wbXus/edit?usp=sharing
***Deepak***

> Solr is a fully threaded app, capable of doing LOTS of things at the same
> time, without multiple instances.
>
>> Thnx for the Heap pointer.. I've read, from some Professor.. that Solr
>> actually is more efficient with a very small Heap and to have everything
>> mapped to virtual memory... Which brings me to the next question.. is the
>> Virtual memory mapping done by the OS or Solr? Does the Virtual memory
>> reside on the OS HDD? Or on the Solr HDD?.. and if the Virtual memory
>> mapping is done on the OS HDD, wouldn't it be beneficial to run the OS off
>> a SSD?
>

***Deepak***
If you have a small amount of RAM (I am assuming that is what you mean by a small heap), then the OS will resort to swapping or demand paging to manage your memory requirements. An SSD will help. However, it might be better to have more RAM than to rely on an SSD.
***Deepak***

> There appears to be some confusion here.
>
> The virtual memory doesn't reside on ANY hard drive, unless you've REALLY
> configured the system badly and the system starts using swap space. If the
> system starts using swap, performance is going to be terrible, no matter
> how fast the disk where swap resides is.
>
> The "mapping to virtual memory" feature is something the operating system
> does. Lucene/Solr utilizes MMAP code in Java, which then turns around and
> uses MMAP functionality provided by the OS.
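
To make the "MMAP code in Java" point concrete, here is a minimal sketch of memory-mapping a file with the standard java.nio API. This is not Lucene's or Solr's actual code; the class name MmapSketch, the method readViaMmap, and the temp file are made up for illustration. It only shows that Java hands the mapping request to the OS, and that the mapped file is then read like a block of memory.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not taken from Lucene or Solr.
public class MmapSketch {

    // Maps the whole file read-only and decodes its bytes as UTF-8.
    public static String readViaMmap(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Ask the OS to map the file into the process's virtual address space.
            // Creating the mapping does not, by itself, load the file contents.
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // Reading the buffer is what makes the OS fetch pages, from its
            // disk cache when they are already there, otherwise from the disk.
            return StandardCharsets.UTF_8.decode(mapped).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mmap-sketch", ".txt");
        file.toFile().deleteOnExit(); // clean up the temp file when the JVM exits
        Files.write(file, "hello index".getBytes(StandardCharsets.UTF_8));
        System.out.println(readViaMmap(file));
    }
}
```

Note that the mapped data lives outside the Java heap, which is why a small heap plus plenty of free RAM for the OS disk cache is the usual advice.
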
>
> At that point, that file can be accessed by the application as if it were
> a very large block of memory. Mapping the file doesn't immediately use any
> memory at all. The OS manages the access to the file. If the part of the
> file that is being accessed has not been accessed before, then the OS will
> read the data off the disk, place it into the OS disk cache, and provide it
> to whatever requested it. If it has been accessed before and is still in
> the disk cache, then it won't read the disk, it will just provide the data
> from the cache. Getting most data from cache is *required* for good Solr
> performance.
>
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> Running with your indexes on SSD might indeed help performance, and
> regardless of anything that's going on, WILL help performance in the short
> term, when you first turn the machine on. But if it also helps with
> long-term query performance, then chances are that the machine doesn't have
> enough memory. When Solr servers are sized correctly, running on SSD is
> typically not going to make a big difference, unless the machine does a lot
> more indexing than querying.
>
>> For now.. my FEELING is to run one Solr instance on this particular
>> machine.. by the time the RAM is outgrown add another machine and so
>> forth...
>
> Any plans you have for a growth strategy with multiple Solr instances are
> extremely likely to still be possible with only one instance, with very
> little change.
>
> Thanks,
> Shawn

Deepak
"Please stop cruelty to Animals, help by becoming a Vegan"
+91 73500 12833
deic...@gmail.com

Facebook: https://www.facebook.com/deicool
LinkedIn: www.linkedin.com/in/deicool

"Plant a Tree, Go Green"