Hi,
I am wondering whether it is more common to have just one very large
index, or multiple smaller indexes specialized for different content
types.

We currently have multiple smaller indexes, although one of them is
much larger than the others. We are considering merging them, for the
convenience of searching across multiple types at once and getting
the results back in one list. The largest of the current indexes holds
a couple of types that belong together. It has just one text field,
which is usually quite short and similar to product names (words like
"The" matter). Another index I would merge with this one has multiple
text fields (also quite short).

We would of course still like to be able to retrieve specific types.
Is filtering on just one type a big performance hit compared to
querying that type from its own index? Bear in mind that all these
indexes run on the same machine (we replicate them all to three
machines and do load balancing).

There are a number of considerations. From an application standpoint,
when querying across all types we may split the results out into the
separate types anyway once we have the list back. If we always do
this, is it silly to have them in one index rather than query
multiple indexes at once? Are multiple HTTP requests less significant
than the time to split the results afterwards?
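
To make the post-split concrete, here is a rough sketch of what I
mean. It assumes each hit comes back as a dict carrying a "type"
field; that field name, the function name, and the sample data are
all made up for illustration and are not our real schema.

```python
from collections import defaultdict


def split_by_type(hits, type_field="type"):
    """Group a merged hit list into per-type lists.

    The original order is preserved inside each group, so if the input
    is sorted by relevance, each per-type list is too.
    """
    groups = defaultdict(list)
    for hit in hits:
        # Hits without the field are grouped under None, not dropped.
        groups[hit.get(type_field)].append(hit)
    return dict(groups)


# Made-up sample data standing in for one merged result list.
hits = [
    {"id": 1, "type": "product", "name": "The Widget"},
    {"id": 2, "type": "article", "name": "Widget care"},
    {"id": 3, "type": "product", "name": "Widget Pro"},
]
by_type = split_by_type(hits)
```

The split is a single pass over the returned list, so my guess is
that it is cheap next to an extra HTTP round trip, but I have not
measured either.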

In some ways it is easier to maintain a single index, although it has
felt easier to optimize the results for each type of content when they
are in separate indexes. My main concern with putting it all in one
index is that we'll make it harder to work with. We will definitely
want to filter on types sometimes, and if we go with a mashed-up
index I'd prefer not to maintain separate specialized indexes as well.

Any thoughts?

~Tim.
