Thanks Shawn. This is good to know.

Steve
On Wed, Apr 22, 2015 at 9:00 AM, Shawn Heisey <elyog...@elyograg.org> wrote:
> On 4/22/2015 6:33 AM, Steven White wrote:
> > Is there anything I should be taking into consideration if I have a
> > large number of fields in my Solr's schema.xml file?
> >
> > I will be indexing records into Solr and as I create documents, each
> > document will have between 20-200 fields. However, due to the nature
> > of my data source, the combined flattened list of fields that I need
> > to include in schema.xml will be upward of 2000 and may reach 3000.
> >
> > My questions are as follows, comparing a schema with 300 fields vs. 3000:
> >
> > 1) Will indexing be slower? Require more memory? CPU?
> > 2) Will the index size be larger? If so, any idea by what factor?
> > 3) Will searches be slower? Require more memory? CPU?
> > 4) Will the field "type" (float, boolean, date, string, etc.) have
> > any factor?
> > 5) Anything else I should know that I didn't ask?
> >
> > I should make it clear that only about 5 fields will be of type
> > "stored" while everything else will be "indexed".
>
> The number of fields in your schema is likely not a significant
> contributor to performance. I'm sure it can have an impact because
> there is code that validates everything against the schema, but even
> with a few thousand entries, that code should execute quickly. The
> amount of data you are actually indexing is MUCH more relevant.
>
> The Lucene index itself is only aware of the fields that actually
> contain data. The entire Solr schema is not used or recorded by Lucene
> code at all. It is only used within code specific to Solr.
>
> Thanks,
> Shawn
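[For archive readers: a minimal sketch of the kind of field definitions discussed above. The field names here are hypothetical, not from the original thread; the syntax is standard Solr schema.xml. It shows a handful of stored fields alongside indexed-only fields, plus a dynamicField rule, which is Solr's usual way to avoid declaring thousands of individual fields by hand.]

```xml
<!-- Hypothetical schema.xml excerpt: a few stored fields, the rest indexed only -->
<field name="id"    type="string"       indexed="true" stored="true" required="true"/>
<field name="title" type="text_general" indexed="true" stored="true"/>

<!-- The bulk of the ~2000-3000 fields: searchable but not stored -->
<field name="attr_color"  type="string" indexed="true" stored="false"/>
<field name="attr_weight" type="float"  indexed="true" stored="false"/>

<!-- Alternative: one dynamicField pattern can cover a whole family of
     attribute fields instead of declaring each one explicitly -->
<dynamicField name="attr_*" type="string" indexed="true" stored="false"/>
```

As Shawn notes, Lucene only records fields that actually contain data, so declaring many fields in schema.xml costs little by itself; the indexed data volume is what matters.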