Hi, I am using a custom field. Below is the field definition. I am using this because I don't want stemming.

<fieldType name="text_no_stem2" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" protected="protwords.txt" generateWordParts="0" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <!--ORIGINAL generateNumberParts="1"-->
    <filter class="solr.WordDelimiterFilterFactory" protected="protwords.txt" generateWordParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- ORIGINAL filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/-->
    <!-- Webel: switch off Porter-stemmer algorithm to enforce whole word match -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Regards,
Wael

On Mon, Nov 6, 2017 at 10:29 AM, Emir Arnautović <emir.arnauto...@sematext.com> wrote:

> Hi Wael,
> Can you provide your field definition and sample query.
>
> Thanks,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 6 Nov 2017, at 08:30, Wael Kader <w...@softech-lb.com> wrote:
> >
> > Hello,
> >
> > I have an index with around 100 million documents.
> > I have a multivalued column that I am saving big chunks of text data in.
> > It has around 20 GB of RAM and 4 CPUs.
> >
> > I was doing faceting on it to get a word cloud, and it was taking around
> > 1 second to retrieve when the data was 5-10 million documents.
> > Now I have more data and it's taking minutes to get the results (that is,
> > if it gets them and Solr doesn't crash). What's the best way to make it
> > run, or maybe it's not scalable to make it run on my current schema and
> > design with news articles?
> >
> > I am looking to find the best solution for this. Maybe create another
> > index to split the data while inserting it, or maybe if I change some
> > settings in SolrConfig or add some RAM, it would perform better.
> >
> > --
> > Regards,
> > Wael
>

--
Regards,
Wael
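
The thread does not include the sample query that was asked for. As a rough illustration only, a word-cloud style facet request of the kind described in the quoted message could be built as in the sketch below. The host, collection name (`news`), and field name (`content_no_stem`) are assumptions and do not come from the thread; `facet`, `facet.field`, `facet.limit`, and `facet.mincount` are standard Solr faceting parameters.

```python
from urllib.parse import urlencode

# Hypothetical values: the thread gives neither the Solr host, the collection,
# nor the name of the field that uses the text_no_stem2 type.
SOLR_SELECT_URL = "http://localhost:8983/solr/news/select"
TEXT_FIELD = "content_no_stem"


def build_word_cloud_params(field, top_n=100):
    """Return Solr parameters for a term facet over a single field."""
    return {
        "q": "*:*",           # match every document
        "rows": 0,            # only facet counts are needed, not documents
        "facet": "true",
        "facet.field": field,
        "facet.limit": top_n,  # keep only the top_n most frequent terms
        "facet.mincount": 1,   # skip terms that match no documents
        "wt": "json",
    }


def build_word_cloud_url(base_url, field, top_n=100):
    """Return the full request URL for the facet query."""
    return base_url + "?" + urlencode(build_word_cloud_params(field, top_n))


if __name__ == "__main__":
    print(build_word_cloud_url(SOLR_SELECT_URL, TEXT_FIELD))
```

Because the request asks for `rows=0`, the response carries only the facet counts, which is all a word cloud needs.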