Another thing to consider is simply merging the two fields into the id
value:
"id": "USER_RECORD_12334",
since it's a string.
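
A minimal sketch of that merge, done on the client before the document is sent to Solr. The helper names (make_compound_id, to_solr_doc) and the "_" separator are illustrative assumptions, not anything Solr provides; db_id is the field name suggested in the quoted message for keeping the original database id.

```python
def make_compound_id(id_type: str, record_id: str, sep: str = "_") -> str:
    """Join the record type and the original id into one unique key string."""
    return f"{id_type}{sep}{record_id}"


def to_solr_doc(record: dict) -> dict:
    """Return a copy of the record whose "id" is the compound key.

    The original id is kept in "db_id" (hypothetical field name) and the
    type stays in "id_type", so both remain usable as individual fields.
    """
    doc = dict(record)
    doc["db_id"] = record["id"]
    doc["id"] = make_compound_id(record["id_type"], record["id"])
    return doc


doc = to_solr_doc(
    {"id": "12334", "id_type": "USER_RECORD", "field_1": None, "field_2": None}
)
print(doc["id"])
```

Because id_type itself contains underscores, the merged string cannot be split back reliably, which is one more reason to keep the original values as separate fields.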


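For the other approach in the quoted message (a type field plus a filter query), the request parameters might be built roughly like this. This is only a sketch: build_select_params is a made-up helper, db_id is assumed to hold the original database id, and the resulting string would be appended to whatever /select URL your collection uses.

```python
from urllib.parse import parse_qs, urlencode


def build_select_params(id_type: str, db_id: str) -> str:
    """Build a URL-encoded query string that restricts results to one "table"."""
    params = {
        "q": f"db_id:{db_id}",       # main query on the original database id
        "fq": f"id_type:{id_type}",  # filter query selecting the document type
    }
    return urlencode(params)


query_string = build_select_params("USER_RECORD", "12334")
print(query_string)
print(parse_qs(query_string))
```
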

On Wed, Apr 24, 2019 at 2:35 PM Gus Heck <gus.h...@gmail.com> wrote:

> Hi Vivek
>
> Solr is not a database, nor should one try to use it as such. You'll need
> to adjust your thinking some in order to make good use of Solr. In Solr
> there is normally an id field and it should be unique across EVERY document
> in the entire collection. Thus there's no concept of a primary key, because
> there are no tables. In some situations (streaming expressions for example)
> you might want to use collections like tables, creating a collection per
> data type, but there's no way to define uniqueness in terms of more than
> one field within a collection. If your data comes from a database with
> complex keys, concatenating the values to form the single unique ID is a
> possibility. If you form keys that way of course you also want to retain
> the values as individual fields. This duplication might seem odd from a
> database perspective where one often works hard to normalize data, but for
> search, denormalization is very common. The focus with search engines is
> usually speed of retrieval rather than data correctness. Solr should serve
> as an index into some other canonical source of truth for your data, and
> that source of truth should be in charge of guaranteeing data correctness.
>
> Another alternative is to provide a field that denotes the type (table) for
> the document (such as id_type in your example). In that case, all queries
> looking for a specific object type as a result should add a filter (fq
> parameter) to denote the "table" and you may want to store a db_id field to
> correlate the data with a database if that's where it came from. When using
> the field/filter strategy you tend to inflate the number of fields in the
> index with some fields being sparsely populated and this can have some
> performance implications, and furthermore if one "table" gets updated
> frequently you wind up interfering with the caching for all data due to
> frequent opening of new searchers. On the plus side such a strategy makes
> it easier to query across multiple types simultaneously, so these
> considerations should be balanced against your usage patterns, performance
> needs, ease of management and ease of programming.
>
> Best,
> Gus
>
> On Fri, Apr 19, 2019 at 2:10 PM Vivekanand Sahay
> <vsa...@walmartlabs.com.invalid> wrote:
>
> > Hello,
> >
> > I have a use case like below.
> >
> > USE CASE
> > I have a document with fields like
> >
> > id,
> > id_type,
> > field_1,
> > field_2
> >
> > 2 sample messages will look like
> >
> > {
> >   "id": "12334",
> >   "id_type": "USER_RECORD",
> >   "field_1": null,
> >   "field_2": null
> > }
> >
> >
> > {
> >   "id": "31321",
> >   "id_type": "OWNER_RECORD",
> >   "field_1": null,
> >   "field_2": null
> > }
> >
> >
> > QUESTIONS
> >
> > I’d like to define the unique key as a compound key from fields id and
> > id_type
> >
> >   1.  Could someone give me an example of how to do this? Or point to the
> > relevant section in the docs?
> >   2.  Is this the best way to define a compound primary key? Is there a
> > more efficient way?
> >
> > Regards,
> > Vivek
> >
>
>
> --
> http://www.the111shift.com
>
