Re: SOLR indexing strategy

Jack Krupansky Sat, 21 Mar 2015 04:59:07 -0700

Don't you have a number of "types" of transactions, where some fields may
be common to all transactions, but with plenty of fields that are not
common to all transactions? The point is that if the number of fields that
need to be populated for each document type is relatively low, it becomes
much more practical. But if all 1000 fields must always be populated...
that's much, much harder.

Default values? Try as hard as you can not to store default values in the
index - they take up space and transfer time. Lucene is much more efficient
when a field is simply left empty (absent) in a document.
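
As a rough sketch of what I mean, you could drop default-valued fields on the
client side before the document is ever sent to Solr. The field names and
default values below are made up for illustration, not taken from your schema:

```python
# Sentinel used to tell "no default defined" apart from a default of None.
_MISSING = object()

# Hypothetical defaults for a few of the ~1000 fields (illustrative only).
DEFAULTS = {"currency": "USD", "fee": 0, "memo": ""}


def strip_defaults(record, defaults=DEFAULTS):
    """Return a copy of record without None values or default-valued fields."""
    return {
        name: value
        for name, value in record.items()
        if value is not None and defaults.get(name, _MISSING) != value
    }


record = {
    "id": "txn-1",
    "customer_id": "c42",
    "currency": "USD",   # equals the default, so it is dropped
    "fee": 0,            # equals the default, so it is dropped
    "amount": 125.5,
    "memo": None,        # empty, so it is dropped
}
sparse_doc = strip_defaults(record)
```

The catch is that whatever reads the data back has to treat a missing field as
"default", so this only helps if your application can apply the defaults on
its side.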

If you are only indexing 10-15 fields, that's a very good thing, but not
enough by itself.

An alternate model: use Solr to index your 10-15 fields and only store the
native key for each record in Solr. That will keep your Solr index much
smaller. Then, you perform your query in Solr and get back only the native
keys for the matching records, and then you would do a database lookup in
your bulk storage engine directly by those keys to fetch just the records
that match the query results.
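
A minimal sketch of that two-step lookup, assuming a hypothetical schema with
"id", "customer_id" and "txn_date" fields in Solr, and using an SQLite table
named "transactions" purely as a stand-in for your bulk storage engine. The
request parameters (q, fq, fl, rows, wt) are the standard Solr ones; sending
them to the /select handler is left to whatever HTTP client you use:

```python
import sqlite3


def build_key_query(customer_id, start, end, rows=1000):
    """Build Solr /select parameters that return only the native key."""
    return {
        "q": f'customer_id:"{customer_id}"',     # hypothetical field name
        "fq": f"txn_date:[{start} TO {end}]",    # date range as a filter query
        "fl": "id",                              # return only the key field
        "rows": rows,
        "wt": "json",
    }


def keys_from_response(response):
    """Extract the native keys from a parsed Solr JSON response."""
    return [doc["id"] for doc in response["response"]["docs"]]


def fetch_records(conn, keys):
    """Fetch the full records for the given keys from the bulk store."""
    if not keys:
        return []
    placeholders = ",".join("?" * len(keys))
    cursor = conn.execute(
        f"SELECT id, payload FROM transactions "
        f"WHERE id IN ({placeholders}) ORDER BY id",
        keys,
    )
    return cursor.fetchall()
```

Usage would be along the lines of: build the parameters with
build_key_query("c42", "2015-01-01T00:00:00Z", "2015-01-31T23:59:59Z"), send
them to Solr, pass the parsed JSON to keys_from_response(), and hand the keys
to fetch_records(). For very large result sets you would want to page through
the Solr results and batch the key lookups rather than do it in one shot.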

What do your queries tend to look like?

-- Jack Krupansky

On Sat, Mar 21, 2015 at 5:36 AM, varun sharma <mechanism_...@yahoo.co.in>
wrote:

> It's more of a financial message, where for each customer there are various
> fields that specify various aspects of the transaction.
>
>
>      On Friday, 20 March 2015 8:09 PM, Priceputu Cristian <
> priceputu.crist...@gmail.com> wrote:
>
>
>  Why would you need 1000 fields?
> C
>
> On Fri, Mar 20, 2015 at 1:12 PM, varun sharma <mechanism_...@yahoo.co.in>
> wrote:
>
> The requirements of the system that we are trying to build are as follows:
> for each date we need to create a SOLR index containing about 350-500
> million documents, where each document is a single structured record having
> about 1000 fields. We then query it based on index keys and date; for
> instance, we will search for records related to a particular user where the
> date is between Jan-1-2015 and Jan-31-2015. This query should load only the
> indexes within this date range into memory and return the rows matching the
> search pattern. Please suggest how this can be implemented using
> SOLR/Lucene.
>
> Thank you,
> Varun.
>
>
>
>
>
> --
> Regards,
> Cristian.
>
>
>
>
