For a start, you don't have to store it. Also, is a 10-word shingle really needed?
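
To put rough numbers on it: with maxShingleSize="10" and outputUnigrams="true", every token position emits up to 10 terms (the unigram plus shingles of 2 to 10 words), and most of the long shingles are unique, so the term dictionary and postings grow quickly. A minimal sketch of a leaner variant of the schema quoted below is shown here. The value maxShingleSize="3" is only an assumed example; it should match the longest phrase you actually need to count. The stored="false" setting assumes you do not need to return the original text from this field.

```xml
<!-- Sketch only: stored="false" assumes the raw text is not needed in results -->
<field name="data_text" type="texto_indexado" indexed="true" stored="false"
       multiValued="false"/>

<fieldType name="texto_indexado" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- maxShingleSize="3" is an assumed value: use the longest phrase length you need -->
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3"
            outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A schema change like this only takes effect for documents indexed after the change, so the existing records would need to be reindexed before the disk usage drops.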
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 24 Feb 2018, at 16:58, aneeshkappu <happyaneesh...@gmail.com> wrote:
>
> Hi All, I want to get the count of a phrase from a document .
> Currently im using Shingle Filter factory but it consuming a large disk
> space. Any alternate ways or any way to optimize this.
> currently it consuming 40GB for just 46K records
>
> my schema setting is given below
>
> <field name="data_text" type="texto_indexado" indexed="true" stored="true"
> multiValued="false"/>
>
>
> <fieldType name="texto_indexado" class="solr.TextField" omitNorms="false">
>
> <analyzer type="index">
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> <filter class="solr.ShingleFilterFactory" maxShingleSize="10"
> outputUnigrams="true"/>
> </analyzer>
> <analyzer type="query">
> <tokenizer class="solr.StandardTokenizerFactory"/>
> <filter class="solr.LowerCaseFilterFactory"/>
> </analyzer>
>
> </fieldType>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html