Hello,

We are currently running into a situation where Solr (version 7.4) is
slowly using up all of the memory allocated to the heap and then
eventually hitting an OutOfMemory error. We have tried increasing the heap
size and tuning the GC settings, but neither seems to solve the issue.
What we see is a slow increase in G1 Old Gen heap utilization until
it eventually takes all of the heap space and causes instances to crash.
Previously we tried running each instance with 10GB of heap space
allocated. We then tried running with 20GB of heap space, and we ran into
the same issue. I have attached a histogram of the heap captured from an
instance using nearly all the available heap when allocated 10GB. What I’m
trying to determine is (1) how much heap this setup needs before it
stabilizes and stops crashing with OOM errors, (2) whether this requirement
can somehow be reduced so that we can use less memory, and (3) what is
actually using the memory according to the heap histogram (lots of
primitive-type arrays and data structures, but which part of Solr is using
them?).

I am aware that distributing the index would reduce the requirements for
each shard, but we’d like to avoid that for as long as possible because of
the operational difficulties involved. As far as I can tell, very few of the
conditions listed in the Java Heap section of
https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
actually apply to our instance. We don’t have a very large index, we never
update in production (only query), the documents don’t seem very large
(~4KB each), we don’t use faceting, the caches are reasonably small (~3GB
max), ramBufferSizeMB is 100MB, we don’t use RAMDirectoryFactory (as far as
I can tell), and we don’t use sort parameters. The Solr instance is used for
a full-text complete-as-you-type use case. The typical query looks something
like the following (field names anonymized):

?q=(single_value_f1:"baril" OR multivalue_f1:"baril")^=1 (single_value_f2:(baril) OR multivalue_f2:(baril))^=0.5
&fl=score,myfield1,myfield2,myfield3:myfield3.ar
&bf=product(def(myfield3.ar,0),1)
&rows=200
&df=dummy
&spellcheck=on
&spellcheck.dictionary=spellchecker.es
&spellcheck.dictionary=spellchecker.und
&spellcheck.q=baril
&spellcheck.accuracy=0.5
&spellcheck.count=1
&fq=+myfield1:(100 OR 200 OR 500)
&fl=score
&fl=myfield1
&fl=myfield2
&fl=myfield3:myfield3.ar
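
For anyone who wants to replay this query against a test instance, here is a rough Python sketch of how the parameters above could be assembled. The base URL (host, port, core name, handler path) and the helper name `build_query_url` are placeholders I made up for illustration, not our real setup. The leading `+` on the `fq` value is dropped here on the assumption that it decodes to a space in a raw URL.

```python
from urllib.parse import urlencode

# Placeholder endpoint: host, port, core name, and handler path are assumptions.
BASE_URL = "http://localhost:8983/solr/mycore/select"


def build_query_url(term: str) -> str:
    """Assemble the anonymized complete-as-you-type query for one search term."""
    # A list of pairs (not a dict) keeps repeated keys such as
    # spellcheck.dictionary and fl, which Solr accepts multiple times.
    params = [
        ("q", f'(single_value_f1:"{term}" OR multivalue_f1:"{term}")^=1 '
              f"(single_value_f2:({term}) OR multivalue_f2:({term}))^=0.5"),
        ("fl", "score,myfield1,myfield2,myfield3:myfield3.ar"),
        ("bf", "product(def(myfield3.ar,0),1)"),
        ("rows", "200"),
        ("df", "dummy"),
        ("spellcheck", "on"),
        ("spellcheck.dictionary", "spellchecker.es"),
        ("spellcheck.dictionary", "spellchecker.und"),
        ("spellcheck.q", term),
        ("spellcheck.accuracy", "0.5"),
        ("spellcheck.count", "1"),
        # Leading "+" from the original query string omitted (assumption, see above).
        ("fq", "myfield1:(100 OR 200 OR 500)"),
        ("fl", "score"),
        ("fl", "myfield1"),
        ("fl", "myfield2"),
        ("fl", "myfield3:myfield3.ar"),
    ]
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_query_url("baril"))
```

Note that the sketch does not escape the search term for the Solr query syntax, so it is only suitable for simple terms like the one shown.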

I have attached screenshots with details from top on a running Solr
instance, the GC logs, solr-config.xml, and a heap histogram sampled with
Java Mission Control. Additional details about how the instances are set up
and configured follow below.

Operational summary:
We run multiple Solr instances, each acting as a completely independent
node. They do not form a cluster and are not set up using SolrCloud. Each
replica contains the entire index. These replicas run in Kubernetes on GCP.

GC Settings:
-XX:+UnlockExperimentalVMOptions -Xlog:gc*,heap=info
-XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=50
-XX:InitiatingHeapOccupancyPercent=40 -XX:-G1UseAdaptiveIHOP

Index summary:
* ~2,100,000 documents
* Total size: 9.09 GB
* Average document size = 9.09 GB / 2,100,000 docs = 4.32 KB/doc
* 215 fields per document
    * 77 are stored
    * 137 are multivalued
* Makes use of many spell checkers for different languages (see
solr-config.xml)
* Most fields include some sort of tokenization and analysis. Example
config:

  <fieldType name="myfield" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z0-9])" replacement="" replace="all"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="40"/>
    </analyzer>

    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z0-9])" replacement="" replace="all"/>
    </analyzer>
  </fieldType>
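
To illustrate what this analysis chain does to a field value, here is a rough Python approximation. It is not Solr code: `fold_ascii`, `index_terms`, and `query_term` are hypothetical helpers, and the accent stripping only loosely imitates ASCIIFoldingFilterFactory. The point is that at index time each value expands into up to 40 prefix terms, while the query side keeps the typed text as a single term that matches one of those prefixes.

```python
import re
import unicodedata


def fold_ascii(text: str) -> str:
    """Rough stand-in for ASCIIFoldingFilterFactory: strip combining accents."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(value: str) -> str:
    """Steps shared by both analyzers: keep the whole value as one token
    (KeywordTokenizer), fold accents, lowercase, drop non-alphanumerics."""
    token = fold_ascii(value).lower()
    return re.sub(r"[^a-z0-9]", "", token)


def index_terms(value: str, min_gram: int = 1, max_gram: int = 40) -> list:
    """Index-time chain: normalize, then emit edge n-grams (prefixes)."""
    token = normalize(value)
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def query_term(value: str) -> str:
    """Query-time chain: normalize only, no n-grams."""
    return normalize(value)


if __name__ == "__main__":
    terms = index_terms("Café-Baril")
    print(terms)
    # A partial query matches because its single term is one of the prefixes.
    print(query_term("Cafe B") in terms)
```

With 137 multivalued fields and most fields analyzed in some way, this kind of expansion means the number of indexed terms can be many times the number of raw values.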

Please let me know if there is any additional information required.
<?xml version="1.0" encoding="UTF-8" ?>

<config>

    <luceneMatchVersion>${solr.indexVersion:7.4.0}</luceneMatchVersion>

    <dataDir>${solr.data.dir}/${solr.indexVersion:7.4.0}/${solr.core.name}/data</dataDir>

    <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>

    <codecFactory class="solr.SchemaCodecFactory"/>

    <schemaFactory class="ClassicIndexSchemaFactory"/>

    <lib path="shared-libs/*.jar"/>
    
    <indexConfig>
        <ramBufferSizeMB>100</ramBufferSizeMB>
        <lockType>${solr.lock.type:native}</lockType>

        <deletionPolicy class="solr.SolrDeletionPolicy">
            <str name="maxCommitsToKeep">0</str>
            <str name="maxOptimizedCommitsToKeep">0</str>
        </deletionPolicy>

    </indexConfig>

    <jmx />

    <updateHandler class="solr.DirectUpdateHandler2"/>

    <query>
        <maxBooleanClauses>1024</maxBooleanClauses>

        <queryResultCache
            class="solr.FastLRUCache"
            maxRamMB="3072"
            autowarmCount="0"/>

        <filterCache
            class="solr.FastLRUCache"
            maxRamMB="128"
            initialSize="128"
            autowarmCount="100%"/>

        <documentCache
            class="solr.FastLRUCache"
            size="2048"
            initialSize="2048"
            autowarmCount="0"/>

        <enableLazyFieldLoading>true</enableLazyFieldLoading>
        <queryResultWindowSize>200</queryResultWindowSize>
        <queryResultMaxDocsCached>200</queryResultMaxDocsCached>

        <!-- A newSearcher event is fired whenever a new searcher is being prepared and there is a current searcher handling requests. -->
        <listener event="newSearcher" class="solr.QuerySenderListener">
        </listener>

        <!-- A firstSearcher event occurs when a new searcher is being prepared but there is no current registered searcher to handle requests or to gain auto-warming data from (i.e., on Solr startup). -->
        <listener event="firstSearcher" class="solr.QuerySenderListener">
        </listener>

        <useColdSearcher>false</useColdSearcher>
        <maxWarmingSearchers>2</maxWarmingSearchers>
    </query>

    <requestDispatcher handleSelect="false" >
        <requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048000" formdataUploadLimitInKB="2048"/>

        <httpCaching lastModFrom="dirLastMod"
                     etagSeed="2f6137a6-b580-11e6-80f5-76304dec7eb7">
            <cacheControl>max-age=604800, public</cacheControl>
        </httpCaching>
    </requestDispatcher>

    <requestHandler name="/select" class="solr.SearchHandler">
        <lst name="defaults">
            <int name="rows">10</int>
            <str name="df">key</str>
            <str name="q.op">OR</str>
        </lst>
        <arr name="last-components">
            <str>spellcheck</str>
        </arr>
    </requestHandler>

    <requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
        <lst name="defaults">
            <str name="defType">edismax</str>
            <int name="rows">5</int>
            <str name="q.op">OR</str>

            <str name="spellcheck">on</str>
            <str name="spellcheck.dictionary">spellchecker.en</str>
            <str name="spellcheck.count">20</str>
            <bool name="spellcheck.collate">true</bool>
            <str name="wt">f</str>
        </lst>

        <arr name="last-components">
            <str>spellcheck</str>
        </arr>
    </requestHandler>

    <requestHandler name="/update" class="${solr.requestHandler.update:solr.UpdateRequestHandler}"/>
    <requestHandler name="/analysis/field" startup="lazy" class="solr.FieldAnalysisRequestHandler" />
    <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
        <lst name="invariants">
            <str name="df">identifier_id</str>
            <str name="q">ping</str>
        </lst>
    </requestHandler>

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
        <lst name="defaults">
            <str name="queryAnalyzerFieldType">spellchecker_analyzer</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
            <str name="distanceMeasure">internal</str> <!-- Levenshtein -->
            <int name="maxEdits">2</int> <!-- defines the number of changes to the term to allow -->
            <int name="minPrefix">1</int> <!-- defines the minimum number of characters the terms should share -->
            <int name="maxInspections">10</int> <!-- defines the maximum number of possible matches to review before returning results -->
            <int name="minQueryLength">3</int> <!-- defines how many characters must be in the query before suggestions are provided -->
            <float name="accuracy">0.2</float> <!-- defines the threshold for a valid suggestion -->
            <float name="maxQueryFrequency">0.01</float>
            <float name="thresholdTokenFrequency">.01</float>
        </lst>

        <lst name="spellchecker">
            <str name="name">spellchecker.und</str>
            <str name="field">spellchecker.und</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.en</str>
            <str name="field">spellchecker.en</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.de</str>
            <str name="field">spellchecker.de</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.ar</str>
            <str name="field">spellchecker.ar</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.bg</str>
            <str name="field">spellchecker.bg</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.cs</str>
            <str name="field">spellchecker.cs</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.da</str>
            <str name="field">spellchecker.da</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.es</str>
            <str name="field">spellchecker.es</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.fi</str>
            <str name="field">spellchecker.fi</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.fr</str>
            <str name="field">spellchecker.fr</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.el</str>
            <str name="field">spellchecker.el</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.he</str>
            <str name="field">spellchecker.he</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.hr</str>
            <str name="field">spellchecker.hr</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.hu</str>
            <str name="field">spellchecker.hu</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.id</str>
            <str name="field">spellchecker.id</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.it</str>
            <str name="field">spellchecker.it</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.ja</str>
            <str name="field">spellchecker.ja</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.ko</str>
            <str name="field">spellchecker.ko</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.ms</str>
            <str name="field">spellchecker.ms</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.nl</str>
            <str name="field">spellchecker.nl</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.nb</str>
            <str name="field">spellchecker.nb</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.pl</str>
            <str name="field">spellchecker.pl</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.pt</str>
            <str name="field">spellchecker.pt</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.ro</str>
            <str name="field">spellchecker.ro</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.sr</str>
            <str name="field">spellchecker.sr</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.ru</str>
            <str name="field">spellchecker.ru</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.sv</str>
            <str name="field">spellchecker.sv</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.sl</str>
            <str name="field">spellchecker.sl</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.sk</str>
            <str name="field">spellchecker.sk</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.th</str>
            <str name="field">spellchecker.th</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.tr</str>
            <str name="field">spellchecker.tr</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="name">spellchecker.vi</str>
            <str name="field">spellchecker.vi</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="queryAnalyzerFieldType">spellchecker_cjk_analyzer</str>
            <str name="name">spellchecker.zh-hans</str>
            <str name="field">spellchecker.zh-hans</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="queryAnalyzerFieldType">spellchecker_cjk_analyzer</str>
            <str name="name">spellchecker.zh-hant</str>
            <str name="field">spellchecker.zh-hant</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="queryAnalyzerFieldType">spellchecker_cjk_analyzer</str>
            <str name="name">spellchecker.zh_hans</str>
            <str name="field">spellchecker.zh_hans</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="queryAnalyzerFieldType">spellchecker_cjk_analyzer</str>
            <str name="name">spellchecker.zh_hant</str>
            <str name="field">spellchecker.zh_hant</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>
        <lst name="spellchecker">
            <str name="queryAnalyzerFieldType">spellchecker_cjk_analyzer</str>
            <str name="name">spellchecker.zh</str>
            <str name="field">spellchecker.zh</str>
            <str name="classname">solr.DirectSolrSpellChecker</str>
        </lst>

    </searchComponent>
    <admin>
        <defaultQuery>translations.name.en.text:*</defaultQuery>
    </admin>
</config>
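
As a cross-check on the "~3GB max" cache figure mentioned earlier, the configured limits in the `<query>` section can be totaled with a small script. This is only a sketch: the cache elements are copied from the config above into a string, and `cache_ram_limits` is a hypothetical helper. It reads the configured `maxRamMB` values and says nothing about how much heap the caches actually occupy at runtime.

```python
import xml.etree.ElementTree as ET

# Cache definitions copied from the <query> section of the config above.
QUERY_SECTION = """
<query>
  <queryResultCache class="solr.FastLRUCache" maxRamMB="3072" autowarmCount="0"/>
  <filterCache class="solr.FastLRUCache" maxRamMB="128" initialSize="128" autowarmCount="100%"/>
  <documentCache class="solr.FastLRUCache" size="2048" initialSize="2048" autowarmCount="0"/>
</query>
"""


def cache_ram_limits(xml_text: str) -> dict:
    """Map each *Cache element to its maxRamMB, or None if only an entry count is set."""
    root = ET.fromstring(xml_text)
    limits = {}
    for elem in root:
        if elem.tag.endswith("Cache"):
            ram = elem.get("maxRamMB")
            limits[elem.tag] = int(ram) if ram is not None else None
    return limits


limits = cache_ram_limits(QUERY_SECTION)
bounded_mb = sum(mb for mb in limits.values() if mb is not None)
count_only = [name for name, mb in limits.items() if mb is None]

print(f"Total of configured maxRamMB limits: {bounded_mb} MB")
print(f"Caches limited by entry count only: {count_only}")
```

The queryResultCache and filterCache limits add up to 3200 MB, while the documentCache is limited to 2048 entries rather than by memory.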
