Thanks Shawn.  This is good to know.

Steve

On Wed, Apr 22, 2015 at 9:00 AM, Shawn Heisey <elyog...@elyograg.org> wrote:

> On 4/22/2015 6:33 AM, Steven White wrote:
> > Is there anything I should be taking into consideration if I have a large
> > number of fields in my Solr's schema.xml file?
> >
> > I will be indexing records into Solr and as I create documents, each
> > document will have between 20-200 fields.  However, due to the nature of
> > my data source, the combined flattened list of fields that I need to
> > include in schema.xml will be upwards of 2000 and may reach 3000.
> >
> > My questions are as follows, comparing a schema with 300 fields vs. 3000:
> >
> > 1) Will indexing be slower?  Require more memory?  CPU?
> > 2) Will the index size be larger?  If so any idea by what factor?
> > 3) Will searches be slower?  Require more memory?  CPU?
> > 4) Will the field "type" (float, boolean, date, string, etc.) have any
> > factor?
> > 5) Anything else I should know that I didn't ask?
> >
> > I should make it clear that only about 5 fields will be "stored" while
> > everything else will be indexed-only.
>
> The number of fields in your schema is likely not a significant
> contributor to performance.  I'm sure it can have an impact because
> there is code that validates everything against the schema, but even
> with a few thousand entries, that code should execute quickly.  The
> amount of data you are actually indexing is MUCH more relevant.
>
> The Lucene index itself is only aware of the fields that actually
> contain data.  The entire Solr schema is not used or recorded by Lucene
> code at all.  It is only used within code specific to Solr.
>
> Thanks,
> Shawn
>
>
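For anyone following this thread: a minimal sketch of what such declarations might look like in schema.xml, where only the handful of stored fields keep stored="true". The field names here are hypothetical, and the dynamicField line is one way to avoid declaring every one of the 2000-3000 fields individually:

```xml
<!-- Hypothetical examples: most fields are indexed but not stored -->
<field name="part_number" type="string"  indexed="true" stored="true"/>
<field name="weight_kg"   type="float"   indexed="true" stored="false"/>
<field name="active"      type="boolean" indexed="true" stored="false"/>

<!-- A dynamic field matches many similarly named fields with one rule,
     so the schema stays small even if documents carry thousands of them -->
<dynamicField name="attr_*" type="string" indexed="true" stored="false"/>
```

As Shawn notes, Lucene only records fields that actually contain data, so declaring many fields that a given document does not use costs little at the Lucene level.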
