Thank you Shawn. Sounds like increasing the autoSoftCommit maxTime would be a good idea. I assume this would go along with also increasing autoCommit? All of our collections (just 2 at the moment) have the same settings. The data directory is in HDFS and is the same data directory for every shard. The two cores have different directories.

----------------
[root@hades logs]# hadoop fs -ls /solr5.2
Found 2 items
drwxr-xr-x   - solr hadoop          0 2015-10-05 12:54 /solr5.2/IMAGEDATA
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:54 /solr5.2/DOCUMENTS
[root@hades logs]# hadoop fs -ls /solr5.2/DOCUMENTS
Found 27 items
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:08 /solr5.2/DOCUMENTS/core_node1
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:35 /solr5.2/DOCUMENTS/core_node10
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node11
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node12
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node13
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node14
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node15
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node16
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node17
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node18
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node19
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:08 /solr5.2/DOCUMENTS/core_node2
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node20
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node21
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node22
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node23
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node24
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:36 /solr5.2/DOCUMENTS/core_node25
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:44 /solr5.2/DOCUMENTS/core_node26
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:54 /solr5.2/DOCUMENTS/core_node27
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:08 /solr5.2/DOCUMENTS/core_node3
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:21 /solr5.2/DOCUMENTS/core_node4
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:34 /solr5.2/DOCUMENTS/core_node5
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:34 /solr5.2/DOCUMENTS/core_node6
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:35 /solr5.2/DOCUMENTS/core_node7
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:35 /solr5.2/DOCUMENTS/core_node8
drwxr-xr-x   - solr hadoop          0 2015-06-09 15:35 /solr5.2/DOCUMENTS/core_node9
-----------------

Right now we are not running any replicas.

-Joe

On Fri, Feb 5, 2016 at 10:43 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 2/5/2016 8:11 AM, Joseph Obernberger wrote:
> > Thank you for the reply Scott - we have the commit settings as:
> > <autoCommit>
> >   <maxTime>60000</maxTime>
> >   <openSearcher>false</openSearcher>
> > </autoCommit>
> > <autoSoftCommit>
> >   <maxTime>15000</maxTime>
> > </autoSoftCommit>
> >
> > Is that 50% disk space rule across the entire HDFS cluster or on an
> > individual spindle?
>
> That autoSoftCommit maxTime is pretty small. Frequent commits can be a
> source of problems, if the actual commits take anywhere near (or longer
> than) the maxTime value to complete. If your commits are taking
> significantly less than 15 seconds to complete, then it probably isn't
> anything to worry about.
>
> The rule with disk space and Solr/Lucene is that you must have enough
> free disk space for your largest index to triple in size temporarily,
> and it's actually recommended to have three times the disk space of
> *all* your indexes, not just the largest. Most of the time the largest
> merge you'll see will double the disk space, but in some unusual edge
> cases, it can triple.
>
> I have no idea how disk space works with HDFS when individual data nodes
> become full. Someone else will have to tackle that question, and it
> might need to be answered by the Hadoop project rather than here.
>
> With autoCommit at 60 seconds, your transaction logs should remain small
> and there shouldn't be very many of them, so I really have no idea what
> might be happening with those. Do you have this same
> autoCommit/autoSoftCommit config on every Solr collection?
>
> Erick's note about AlreadyBeingCreatedException may be relevant. Are
> you possibly sharing a data directory between two or more Solr cores?
> This can't normally be done, and even if you configure the locking
> mechanism to allow it, it's NOT recommended, especially with SolrCloud.
> In SolrCloud, all replicas will write to the index. If two replicas try
> to write to the same index, then that index will become corrupted and
> unusable.
>
> Thanks,
> Shawn
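
For anyone who wants to turn the disk space rule quoted above into a quick check, here is a minimal Python sketch. It rests on one reading of the rule: "triple in size" is taken to mean that the largest index may temporarily occupy three times its current size while the other indexes stay as they are. The function names and the example sizes are hypothetical; nothing here comes from Solr itself or from this cluster, and it does not model how HDFS spreads data across data nodes.

```python
def minimum_capacity(index_sizes):
    """Space needed so the largest index can temporarily triple.

    Assumption: the other indexes keep their current size, so the
    extra room required is twice the size of the largest index.
    """
    if not index_sizes:
        return 0
    return sum(index_sizes) + 2 * max(index_sizes)


def recommended_capacity(index_sizes):
    """Space needed if every index is allowed to triple at once."""
    return 3 * sum(index_sizes)


def has_minimum_headroom(index_sizes, capacity):
    """True if the capacity covers at least the minimum rule."""
    return capacity >= minimum_capacity(index_sizes)


if __name__ == "__main__":
    # Hypothetical index sizes in GB, for illustration only.
    sizes_gb = [100, 40]
    print("minimum GB:", minimum_capacity(sizes_gb))
    print("recommended GB:", recommended_capacity(sizes_gb))
    print("300 GB is enough:", has_minimum_headroom(sizes_gb, 300))
```

The two functions are kept separate because the quoted advice distinguishes a hard minimum (largest index only) from a recommendation (all indexes).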