Thanks Otis. Will try out using a single index.
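
For the archives, here is a rough sketch of how the three kinds of queries from my original mail might look against a single index, using 'fq' on the "type" field as Otis suggests. The base URL, the type value "book", and the facet field names are assumptions for illustration only; "name" and "type" are the fields described in the thread. This only builds the request URLs and does not talk to Solr.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port and path for your own Solr setup.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_query(name_term, doc_type=None, facet_fields=()):
    """Build a Solr select URL that searches the shared 'name' field.

    doc_type     -- if given, restrict results with a filter query (fq) on
                    the 'type' field; if None, search across all types.
    facet_fields -- field names to facet on (hypothetical names in the
                    usage below).
    """
    params = [("q", "name:" + name_term), ("wt", "json")]
    if doc_type is not None:
        # fq narrows the result set without affecting scoring, and Solr
        # can cache the filter separately from the main query.
        params.append(("fq", "type:" + doc_type))
    if facet_fields:
        params.append(("facet", "true"))
        for field in facet_fields:
            params.append(("facet.field", field))
    return SOLR_SELECT + "?" + urlencode(params)


# 1. search by name within a type
within_type = build_query("foo", doc_type="book")
# 2. faceted filtering on fields within a type
faceted = build_query("foo", doc_type="book", facet_fields=["author", "year"])
# 3. search by name across all types
across_types = build_query("foo")
```

Note that the sketch does not escape Solr query syntax characters in the search term, so real user input would need escaping before being placed in q or fq.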

karthik c
http://cantspellathing.blogspot.com


On Thu, Mar 19, 2009 at 11:24 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

>
> You can really go either way.  Empty fields are OK.  Having lots of cores
> seems harder to maintain.  Searching against a small core will be faster
> than searching against a single core/index with all data, but you can use
> 'fq' to make things really fast.  The numbers you quote are not really big.
>  If you need to search by name across types, I would go with a single index.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
> > From: karthik c <karthik...@gmail.com>
> > To: solr-user@lucene.apache.org
> > Sent: Thursday, March 19, 2009 7:14:08 AM
> > Subject: large number of cores
> >
> > Hi guys,
> >
> > We need to index data of a large number of types. I was wondering if it is
> > better to create separate cores for each type or add everything to one core
> > with a "type" field?
> >
> > Here are some more details:
> > The database: Currently we have around 200 types of data. The data for each
> > type is stored in a separate MySQL table. Each type has its own set of
> > fields, though they all share a name field and a globally unique id field.
> > The volume of data under each type varies from around 30 records to around
> > 1.5 million records.
> >
> > The queries: We will need to support the following kinds of queries:
> >   1. search by name within a type
> >   2. perform faceted filtering on all fields within a type
> >   3. search by name across all types
> >
> > We have currently created separate cores for each type. We also wrote a
> > small tool to create cores for each type and trigger a full-import for each
> > of them. I am not sure if this is the right approach though. Also, the number
> > of types may increase by quite a bit in the future.
> >
> > My concerns with having such a large number of cores are:
> > 1. Does Solr support such a large number of cores ?
> > 2. Will searching across all cores be fast/effective with such a large
> > number of cores ?
> > 3. We ran into an issue where there were too many open file handles and had
> > to increase the file open limit in the OS.
> > 4. Triggering the full-import for a lot of cores at once results in some
> > cores not being indexed fully. Manually re-triggering the import for these
> > cores seems to fix the problem though.
> >
> > My concerns about using a single core are:
> > 1. The schema will now contain fields for all types. So most fields will be
> > empty in most documents.
> > 2. Will searching within a type be slower when compared to having the type
> > in a separate core?
> >
> > Thanks,
> > karthik c
> > http://cantspellathing.blogspot.com
>
>
