Thanks a lot. I appreciate the pointers about indexing engine vs. database.
I ended up concatenating the fields to generate a key.
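
A minimal sketch of that concatenation in Python, run on the client side before the document is sent for indexing. The function name with_compound_id, the underscore separator, and the choice to copy the original value into a db_id field are illustrative assumptions, not something Solr prescribes:

```python
def with_compound_id(doc, separator="_"):
    """Return a copy of doc whose "id" is id_type + separator + original id.

    The original id is kept in a hypothetical "db_id" field, and id_type is
    left untouched, so both values remain available as individual fields.
    """
    out = dict(doc)                      # do not mutate the caller's dict
    out["db_id"] = doc["id"]             # assumption: field name is illustrative
    out["id"] = f'{doc["id_type"]}{separator}{doc["id"]}'
    return out


sample = {"id": "12334", "id_type": "USER_RECORD",
          "field_1": None, "field_2": None}
indexed = with_compound_id(sample)
print(indexed["id"])  # USER_RECORD_12334
```

Since id_type values themselves contain underscores, the combined id is not meant to be split back apart; the separate id_type and db_id fields are what queries should rely on.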


On Wed, Apr 24, 2019 at 11:39 AM David Hastings <
hastings.recurs...@gmail.com> wrote:

> Another thing to consider is just merging the two fields into the id
> value:
> "id": "USER_RECORD_12334",
> since it's a string.
>
>
>
> On Wed, Apr 24, 2019 at 2:35 PM Gus Heck <gus.h...@gmail.com> wrote:
>
> > Hi Vivek
> >
> > Solr is not a database, nor should one try to use it as such. You'll need
> > to adjust your thinking some in order to make good use of Solr. In Solr
> > there is normally an id field and it should be unique across EVERY
> > document in the entire collection. Thus there's no concept of a primary
> > key, because there are no tables. In some situations (streaming
> > expressions for example) you might want to use collections like tables,
> > creating a collection per data type, but there's no way to define
> > uniqueness in terms of more than one field within a collection. If your
> > data comes from a database with complex keys, concatenating the values to
> > form the single unique ID is a possibility. If you form keys that way of
> > course you also want to retain the values as individual fields. This
> > duplication might seem odd from a database perspective where one often
> > works hard to normalize data, but for search, denormalization is very
> > common. The focus with search engines is usually speed of retrieval
> > rather than data correctness. Solr should serve as an index into some
> > other canonical source of truth for your data, and that source of truth
> > should be in charge of guaranteeing data correctness.
> >
> > Another alternative is to provide a field that denotes the type (table)
> > for the document (such as id_type in your example). In that case, all
> > queries looking for a specific object type as a result should add a
> > filter (fq parameter) to denote the "table", and you may want to store a
> > db_id field to correlate the data with a database if that's where it came
> > from. When using the field/filter strategy you tend to inflate the number
> > of fields in the index, with some fields being sparsely populated, and
> > this can have some performance implications. Furthermore, if one "table"
> > gets updated frequently, you wind up interfering with the caching for all
> > data due to frequent opening of new searchers. On the plus side, such a
> > strategy makes it easier to query across multiple types simultaneously,
> > so these considerations should be balanced against your usage patterns,
> > performance needs, ease of management and ease of programming.
> >
> > Best,
> > Gus
> >
> > On Fri, Apr 19, 2019 at 2:10 PM Vivekanand Sahay
> > <vsa...@walmartlabs.com.invalid> wrote:
> >
> > > Hello,
> > >
> > > I have a use case like below.
> > >
> > > USE CASE
> > > I have a document with fields like
> > >
> > > Id,
> > > Id_type,
> > > Field_1,
> > > Field_2
> > >
> > > 2 sample messages will look like
> > >
> > > {
> > >   "id": "12334",
> > >   "id_type": "USER_RECORD",
> > >   "field_1": null,
> > >   "field_2": null
> > > }
> > >
> > >
> > > {
> > >   "id": "31321",
> > >   "id_type": "OWNER_RECORD",
> > >   "field_1": null,
> > >   "field_2": null
> > > }
> > >
> > >
> > > QUESTIONS
> > >
> > > I’d like to define the unique key as a compound key from fields id and
> > > id_type
> > >
> > >   1.  Could someone give me an example of how to do this? Or point to
> > > the relevant section in the docs?
> > >   2.  Is this the best way to define a compound primary key? Is there
> > > a more efficient way?
> > >
> > > Regards,
> > > Vivek
> > >
> >
> >
> > --
> > http://www.the111shift.com
> >
>
