On 4/22/2015 6:33 AM, Steven White wrote:
> Is there anything I should be taking into consideration if I have a large
> number of fields in my Solr schema.xml file?
> 
> I will be indexing records into Solr and as I create documents, each
> document will have between 20 and 200 fields.  However, due to the nature
> of my data source, the combined flattened list of fields that I need to
> include in schema.xml will be upward of 2000 and may reach 3000.
> 
> My questions are as follows, comparing a schema with 300 fields vs. one
> with 3000:
> 
> 1) Will indexing be slower?  Require more memory?  CPU?
> 2) Will the index size be larger?  If so any idea by what factor?
> 3) Will searches be slower?  Require more memory?  CPU?
> 4) Will the field "type" (float, boolean, date, string, etc.) be a
> factor?
> 5) Anything else I should know that I didn't ask?
> 
> I should make it clear that only about 5 fields will be "stored" while
> everything else will be indexed only.

The number of fields in your schema is likely not a significant
contributor to performance.  I'm sure it can have an impact because
there is code that validates everything against the schema, but even
with a few thousand entries, that code should execute quickly.  The
amount of data you are actually indexing is MUCH more relevant.
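For a sense of scale, a schema like yours would mostly be a long, flat
list of definitions along these lines.  The field names here are made
up, and it assumes fieldTypes named string/float/boolean/date as in the
example schema, matching your "5 stored, rest indexed-only" setup:

  <field name="id"      type="string"  indexed="true" stored="true"/>
  <field name="title"   type="string"  indexed="true" stored="true"/>
  <field name="price"   type="float"   indexed="true" stored="false"/>
  <field name="active"  type="boolean" indexed="true" stored="false"/>
  <field name="created" type="date"    indexed="true" stored="false"/>
  <!-- ... a few thousand more indexed-only fields ... -->

Solr parses this once when the core loads, and as far as I know the
per-document checks against it are map lookups, so the length of the
list shouldn't matter much while indexing.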

The Lucene index itself is only aware of the fields that actually
contain data.  The entire Solr schema is not used or recorded by Lucene
code at all.  It is only used within code specific to Solr.
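To illustrate: if you send Solr a document like this (made-up field
names again), the Lucene document that actually gets indexed contains
exactly these three fields, no matter how many others schema.xml
declares -- assuming no copyField rules or default values add more:

  <add>
    <doc>
      <field name="id">doc-1</field>
      <field name="title">An example record</field>
      <field name="price">9.99</field>
    </doc>
  </add>

The remaining few thousand definitions cost nothing at the Lucene level
for this document.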

Thanks,
Shawn
