Hi, my SOLR 5 solrconfig.xml file contains the following lines:
<!-- Faceting defaults -->
<str name="facet">on</str>
<str name="facet.field">text</str>
<str name="facet.mincount">100</str>

where the 'text' field contains thousands of words.

When I start SOLR, the search engine takes several minutes to index the words in the 'text' field (although loading the browse template later only takes a few seconds because the 'text' field has already been indexed).

Here are my questions:

- should I increase SOLR's JVM memory to make initial indexing faster?
  e.g., SOLR_JAVA_MEM="-Xms1024m -Xmx204800m" in solr.in.sh
- how can I cull facet words according to certain criteria (length, case, etc.)?

For instance, my facets are the following:

application (22427)
inytapdf0 (22427)
pdf (22427)
the (22334)
new (22131)
herald (21983)
york (21975)
paris (21780)
a (21692)
and (21298)
of (21288)
i (21247)
in (21062)
to (20918)
on (20899)
m (20857)
by (20733)
de (20664)
for (20580)
at (20417)
with (20371)
...

Obviously, words such as "the", "i", "to", "m", etc. should not be indexed. Furthermore, I don't care about "nouns". I am only interested in people and location names.

Many thanks.

Philippe