That sounds like SAN vendor BS to me. Breaking up 300GB into smaller chunks would only be relevant if they were caching entire files rather than blocks, and I find that hard to believe. I'd be interested to know more about the specifics of the problem as the vendor sees it.
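If you did want to test their theory, the closest Solr-side knob I'm aware of is the merge policy's maximum segment size, combined with not optimizing down to a single segment (the optimize is what produces the 300GB file in the first place). A rough sketch for solrconfig.xml, assuming the Solr 6.x mergePolicyFactory syntax and an illustrative 10GB cap, inside <indexConfig>:

    <!-- Cap ordinary merges at ~10GB per segment so background merging
         never builds one huge segment. Note: an explicit optimize /
         forceMerge down to a single segment ignores this cap. -->
    <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
      <double name="maxMergedSegmentMB">10240.0</double>
    </mergePolicyFactory>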
As Shawn said, local attached storage (preferably SSD) is the way to go. In addition, using MMapDirectory with lots of RAM will give the best performance. My rule of thumb is to keep at most a 4:1 ratio between index size and the amount of RAM on a box.
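With a roughly 300GB index that rule of thumb works out to at least ~75GB of RAM on the box, most of it left free for the OS page cache rather than handed to the Java heap. For what it's worth, on 64-bit systems the stock NRTCachingDirectoryFactory already sits on top of MMapDirectory, but it can be pinned explicitly in solrconfig.xml; a minimal sketch:

    <!-- solrconfig.xml: memory-mapped index access. On a 64-bit JVM the
         default NRTCachingDirectoryFactory already wraps MMapDirectory,
         so setting this explicitly is usually optional. -->
    <directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory"/>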
steve

On Wed, Sep 21, 2016 at 8:08 PM Shawn Heisey <apa...@elyograg.org> wrote:

> On 9/21/2016 7:52 AM, Kyle Daving wrote:
> > We are currently running solr 5.2.1 and attempted to upgrade to 6.2.1.
> > We attempted this last week but ran into disk access latency problems
> > so reverted back to 5.2.1. We found that after upgrading we overran
> > the NVRAM on our SAN and caused a fairly large queue depth for disk
> > access (we did not have this problem in 5.2.1). We reached out to our
> > SAN vendor and they said that is was due to the size of our optimized
> > indexes. It is not uncommon for us to have roughly 300GB single file
> > optimized indexes. Our SAN vendor advised that splitting the index
> > into smaller fragmented chunks would alleviate the NVRAM/queue depth
> > problem.
>
> How is this filesystem presented to the server? Is it a block device
> using a protocol like iSCSI, or is it a network filesystem, like NFS or
> SMB? Block filesystems will appear to the OS as if they are a
> completely local filesystem, and local machine memory will be used to
> cache data. Network filesystems will usually require memory on the
> storage device for caching, and typically those machines do not have a
> lot of memory compared to the amount of storage space they have.
>
> > Why do we not see this problem with the same size index in 5.2.1? Did
> > solr change the way it accesses disk in v5 vs v6?
>
> It's hard to say why you didn't have the problem with the earlier version.
>
> All the index disk access is handled by Lucene, and from Solr's point of
> view, it's a black box, with only minimal configuration available.
> Lucene is constantly being improved, but those improvements assume the
> general best-case installation -- a machine with a local filesystem and
> plenty of spare memory to effectively cache the data that filesystem
> contains.
>
> > Is there a configuration file we should be looking at making
> > adjustments in?
>
> Unless we can figure out why there's a problem, this question cannot be
> answered.
>
> > Since everything worked fine in 5.2.1 there has to be something we are
> > overlooking when trying to use 6.2.1. Any comments and thoughts are
> > appreciated.
>
> Best guess (which could be wrong): There's not enough memory to
> effectively cache the data in the Lucene indexes. A newer version of
> Solr generally has *better* performance characteristics than an earlier
> version, but *ONLY* if there's enough memory available to effectively
> cache the index data, which assures that data can be accessed very
> quickly. When the actual disk must be read, access speed will be slow
> ... and the problem may get worse with a different version.
>
> How much memory is in your Solr server, and how much is assigned to the
> Java heap for Solr? Are you running more than one Solr instance per
> server?
>
> When you're dealing with a remote filesystem on a SAN, exactly where to
> add memory to boost performance will depend on how the filesystem is
> being presented.
>
> I strongly recommend against using a network filesystem like NFS or SMB
> to hold a Solr index. Solr works best when the filesystem is local to
> the server and there's plenty of extra memory for caching. The amount
> of memory required for good performance with a 300GB index will be
> substantial.
>
> Thanks,
> Shawn