Re: Compound Primary Keys

Gus Heck Wed, 24 Apr 2019 11:35:37 -0700

Hi Vivek

Solr is not a database, nor should one try to use it as such. You'll need
to adjust your thinking some in order to make good use of Solr. In Solr
there is normally an id field and it should be unique across EVERY document
in the entire collection. Thus there's no concept of a primary key, because
there are no tables. In some situations (streaming expressions for example)
you might want to use collections like tables, creating a collection per
data type, but there's no way to define uniqueness in terms of more than
one field within a collection. If your data comes from a database with
complex keys, concatenating the values to form the single unique ID is a
possibility. If you form keys that way, of course, you will also want to retain
the values as individual fields. This duplication might seem odd from a
database perspective where one often works hard to normalize data, but for
search, denormalization is very common. The focus with search engines is
usually speed of retrieval rather than data correctness. Solr should serve
as an index into some other canonical source of truth for your data, and
that source of truth should be in charge of guaranteeing data correctness.
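
As a rough illustration of the concatenation approach, here is a minimal
Python sketch. The helper name to_solr_doc and the "-" separator are
hypothetical choices of mine, not anything Solr prescribes; pick a separator
that cannot occur in your key values. The db_id field simply keeps the
original database id alongside id_type.

```python
def to_solr_doc(record, key_fields=("id_type", "id"), sep="-"):
    """Return a copy of `record` whose `id` is a concatenated unique key.

    Hypothetical helper: the original key parts are retained as separate
    fields (`db_id` and `id_type`) so they can still be queried individually.
    """
    doc = dict(record)
    # Keep the original database id in its own field.
    doc["db_id"] = record["id"]
    # Build the single unique id from the compound key parts.
    doc["id"] = sep.join(str(record[f]) for f in key_fields)
    return doc


doc = to_solr_doc(
    {"id": "12334", "id_type": "USER_RECORD", "field_1": None, "field_2": None}
)
print(doc["id"])  # USER_RECORD-12334
```

The resulting document still carries id_type and db_id, so nothing is lost by
folding them into the unique id.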

Another alternative is to provide a field that denotes the type (table) for
the document (such as id_type in your example). In that case, all queries
looking for a specific object type as a result should add a filter (fq
parameter) to denote the "table", and you may want to store a db_id field to
correlate the data with a database if that's where it came from. When using
the field/filter strategy you tend to inflate the number of fields in the
index, with some fields being sparsely populated, and this can have some
performance implications. Furthermore, if one "table" gets updated
frequently, you wind up interfering with the caching for all data due to
frequent opening of new searchers. On the plus side, such a strategy makes
it easier to query across multiple types simultaneously. These
considerations should be balanced against your usage patterns, performance
needs, ease of management and ease of programming.
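
To make the filter idea concrete, here is a small Python sketch that builds
the request parameters for a type-restricted query. The function name
typed_query_params and the example query field_1:foo are made up for
illustration, and the sketch assumes the type value contains no double
quotes; q and fq are the standard Solr query and filter-query parameters.

```python
from urllib.parse import urlencode


def typed_query_params(user_query, id_type, type_field="id_type"):
    """Build Solr request parameters restricted to one document type.

    Hypothetical helper: the main query goes in `q`, and the "table"
    restriction goes in a filter query (`fq`) on the type field.
    """
    return {"q": user_query, "fq": f'{type_field}:"{id_type}"'}


params = typed_query_params("field_1:foo", "USER_RECORD")
# URL-encode the parameters for use in a request to a select handler.
query_string = urlencode(params)
print(query_string)
```

Keeping the type restriction in fq rather than in q lets Solr cache the
filter separately from the main query.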

Best,
Gus

On Fri, Apr 19, 2019 at 2:10 PM Vivekanand Sahay
<vsa...@walmartlabs.com.invalid> wrote:

> Hello,
>
> I have a use case like below.
>
> USE CASE
> I have a document with fields like
>
> Id,
> Id_type,
> Field_1,
> Field_2
>
> 2 sample messages will look like
>
> {
>   "id": "12334",
>   "id_type": "USER_RECORD",
>   "field_1": null,
>   "field_2": null
> }
>
>
> {
>   "id": "31321",
>   "id_type": "OWNER_RECORD",
>   "field_1": null,
>   "field_2": null
> }
>
>
> QUESTIONS
>
> I’d like to define the unique key as a compound key from fields id and
> id_type
>
>   1.  Could someone give me an example of how to do this? Or point to the
> relevant section in the docs?
>   2.  Is this the best way to define a compound primary key? Is there a
> more efficient way?
>
> Regards,
> Vivek
>

-- 
http://www.the111shift.com
