Another thing to consider is simply merging the two fields into the id value, e.g. "id": "USER_RECORD_12334", since it's a string.
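A rough sketch of what that could look like on the indexing side, in Python. The helper names (make_solr_id, to_solr_doc) and the "_" separator are my own choices for illustration, not anything Solr provides. The db_id field follows Gus's suggestion below to keep the original values as separate fields.

```python
def make_solr_id(id_type: str, record_id: str, sep: str = "_") -> str:
    """Join the record type and the source id into one string.

    Hypothetical helper: the separator is an assumption, pick whatever
    cannot collide with your real id values.
    """
    if not id_type or not record_id:
        raise ValueError("both id_type and record_id are required")
    return f"{id_type}{sep}{record_id}"


def to_solr_doc(record: dict) -> dict:
    """Return a copy of the record with a composite id.

    The original id is kept in db_id and id_type is left untouched, so the
    individual values remain available as fields.
    """
    doc = dict(record)
    doc["db_id"] = record["id"]
    doc["id"] = make_solr_id(record["id_type"], record["id"])
    return doc


user = {"id": "12334", "id_type": "USER_RECORD", "field_1": None, "field_2": None}
owner = {"id": "31321", "id_type": "OWNER_RECORD", "field_1": None, "field_2": None}

docs = [to_solr_doc(user), to_solr_doc(owner)]
for d in docs:
    print(d["id"], d["id_type"], d["db_id"])
```

Since id_type itself contains underscores, the combined id can't be reliably split back into its parts, which is one more reason to keep id_type and db_id as their own fields.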
On Wed, Apr 24, 2019 at 2:35 PM Gus Heck <gus.h...@gmail.com> wrote:

> Hi Vivek
>
> Solr is not a database, nor should one try to use it as such. You'll need
> to adjust your thinking some in order to make good use of Solr. In Solr
> there is normally an id field and it should be unique across EVERY document
> in the entire collection. Thus there's no concept of a primary key, because
> there are no tables. In some situations (streaming expressions for example)
> you might want to use collections like tables, creating a collection per
> data type, but there's no way to define uniqueness in terms of more than
> one field within a collection. If your data comes from a database with
> complex keys, concatenating the values to form the single unique ID is a
> possibility. If you form keys that way of course you also want to retain
> the values as individual fields. This duplication might seem odd from a
> database perspective where one often works hard to normalize data, but for
> search, denormalization is very common. The focus with search engines is
> usually speed of retrieval rather than data correctness. Solr should serve
> as an index into some other canonical source of truth for your data, and
> that source of truth should be in charge of guaranteeing data correctness.
>
> Another alternative is to provide a field that denotes the type (table) for
> the document (such as id_type in your example). In that case, all queries
> looking for a specific object type as a result should add a filter (fq
> parameter) to denote the "table" and you may want to store a db_id field to
> correlate the data with a database if that's where it came from. When using
> the field/filter strategy you tend to inflate the number of fields in the
> index with some fields being sparsely populated and this can have some
> performance implications, and furthermore if one "table" gets updated
> frequently you wind up interfering with the caching for all data due to
> frequent opening of new searchers. On the plus side such a strategy makes
> it easier to query across multiple types simultaneously, so these
> considerations should be balanced against your usage patterns, performance
> needs, ease of management and ease of programming.
>
> Best,
> Gus
>
> On Fri, Apr 19, 2019 at 2:10 PM Vivekanand Sahay
> <vsa...@walmartlabs.com.invalid> wrote:
>
> > Hello,
> >
> > I have a use case like below.
> >
> > USE CASE
> > I have a document with fields like
> >
> > Id,
> > Id_type,
> > Field_1.
> > Filed_2
> >
> > 2 sample messages will look like
> >
> > {
> >   "id": "12334",
> >   "id_type": "USER_RECORD",
> >   "field_1": null,
> >   "field_2": null
> > }
> >
> >
> > {
> >   "id": "31321",
> >   "id_type": "OWNER_RECORD",
> >   "field_1": null,
> >   "field_2": null
> > }
> >
> >
> > QUESTIONS
> >
> > I’d like to define the unique key as a compound key from fields id and
> > id_type
> >
> > 1. Could someone give me an example of how to do this ? Or point to the
> >    relevant section in the docs?
> > 2. Is this the best way to define a compound primary key ? Is there a
> >    more efficient way ?
> >
> > Regards,
> > Vivek
> >
>
> --
> http://www.the111shift.com
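
For the type-field alternative described in the quoted message, here is a small sketch of building the query string so that every request carries an fq filter on id_type. The function name build_select_params and the example query are made up for illustration; q and fq are the standard Solr request parameters, and the resulting string would be appended to whatever select URL your collection uses.

```python
from urllib.parse import urlencode


def build_select_params(user_query: str, id_type: str) -> str:
    """Build a URL-encoded query string with an fq filter on id_type.

    Hypothetical helper: it only assembles parameters, it does not talk
    to Solr.
    """
    # Escape backslashes and double quotes so the value stays one quoted phrase.
    safe_type = id_type.replace("\\", "\\\\").replace('"', '\\"')
    params = [
        ("q", user_query),
        ("fq", f'id_type:"{safe_type}"'),  # restricts results to one "table"
    ]
    return urlencode(params)


qs = build_select_params("field_1:foo", "USER_RECORD")
print(qs)
```

Keeping the type restriction in fq rather than in q means the user's query and the "table" filter stay separate, which is the arrangement the quoted message recommends.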