On 1/4/2017 1:43 PM, Chetas Joshi wrote:
> while creating a new collection, it fails to spin up solr cores on some
> nodes due to "insufficient direct memory".
>
> Here is the error:
>
>    - *3044_01_17_shard42_replica1:*
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>    The max direct memory is likely too low. Either increase it (by adding
>    -XX:MaxDirectMemorySize=<size>g -XX:+UseLargePages to your containers
>    startup args) or disable direct allocation using
>    solr.hdfs.blockcache.direct.memory.allocation=false in solrconfig.xml. If
>    you are putting the block cache on the heap, your java heap size might not
>    be large enough. Failed allocating ~2684.35456 MB.
>
> The error is self explanatory.
>
> My question is: why does it require around 2.7 GB of off-heap memory to
> spin up a single core??
This message comes from the HdfsDirectoryFactory class.  This is the
calculation of the total amount of memory needed:

    long totalMemory = (long) bankCount * (long) numberOfBlocksPerBank
        * (long) blockSize;

The numberOfBlocksPerBank variable can come from the configuration; the
code defaults it to 16384.  The blockSize variable gets assigned by a
convoluted method involving bit shifts, and defaults to 8192.  The
bankCount variable seems to come from solr.hdfs.blockcache.slab.count,
and apparently defaults to 1.  It looks like it's been set to 20 in your
config.  If we assume the other two are at their defaults and you have
20 for the slab count, then 20 * 16384 * 8192 = 2684354560 bytes, which
matches the ~2684.35456 MB reported in the error message when the memory
allocation fails.
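
To make the arithmetic concrete, here is a small standalone sketch of the
same multiplication.  It is not the Solr code: the class, method, and
constant names are made up for illustration, and the values are the
defaults described above plus an assumed slab count of 20.

```java
public class BlockCacheMemory {
    // Defaults as described in this message; the constant names are
    // illustrative and do not come from the Solr source.
    public static final int DEFAULT_BLOCKS_PER_BANK = 16384;
    public static final int DEFAULT_BLOCK_SIZE = 8192;

    // Same multiplication as in HdfsDirectoryFactory: every operand is
    // widened to long first so the product cannot overflow an int.
    public static long totalBytes(int bankCount, int numberOfBlocksPerBank, int blockSize) {
        return (long) bankCount * (long) numberOfBlocksPerBank * (long) blockSize;
    }

    public static void main(String[] args) {
        // One slab at the default settings.
        long oneSlab = totalBytes(1, DEFAULT_BLOCKS_PER_BANK, DEFAULT_BLOCK_SIZE);
        // Assumed slab count of 20, as the error message suggests.
        long twentySlabs = totalBytes(20, DEFAULT_BLOCKS_PER_BANK, DEFAULT_BLOCK_SIZE);

        System.out.println("1 slab:   " + oneSlab + " bytes");
        System.out.println("20 slabs: " + twentySlabs + " bytes");
    }
}
```

With those inputs one slab works out to 134217728 bytes (128MB) and 20
slabs to 2684354560 bytes, so the total grows linearly with the slab
count.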

I know very little about HDFS or how the HDFS directory works, but
apparently it needs a lot of memory if you want good performance.
Since the total is a straight multiple of the slab count, reducing
solr.hdfs.blockcache.slab.count should reduce the amount of memory
required.

You might want to review this page for info about how to set up HDFS,
where it says that each slab requires 128MB of memory:

https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS

The default settings for the HDFS directory cause the block cache to be
global, so all cores use it, instead of spinning up another cache for
every additional core.

What I've seen sounds like one of these two problems:  1) You've turned
off the global cache option.  2) This node doesn't yet have any HDFS
cores, so your collection create is trying to create the first core
using HDFS.  That action is trying to allocate the global cache, which
has been sized at 20 slabs.

Thanks,
Shawn
