The point of <uniqueKey> is to keep multiple copies of the same document from being searchable: adding a document whose key already exists in the index replaces the old one. So the first question I'd ask is whether this is a show-stopper right there. Are your delta queries going to pull copies of the *same* document from the view? If so, does your app really want to see the same document over and over and over again? If not, you need to synthesize a <uniqueKey> somehow so older versions of a document aren't still searchable.
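One way to synthesize such a key, sketched here under assumptions: if some combination of columns in the view identifies a row, DIH's TemplateTransformer can glue them into a single field, which schema.xml then declares as the <uniqueKey>. The entity name, view name, and column names below (customer_id, order_date, title) are made up for illustration, and this only helps if that combination really is unique and stays the same across imports.

```xml
<!-- data-config.xml (sketch): entity, view, and column names are hypothetical -->
<entity name="item" transformer="TemplateTransformer"
        query="SELECT customer_id, order_date, title FROM my_view">
  <!-- Build a synthetic key from columns assumed to identify a row together -->
  <field column="id" template="${item.customer_id}-${item.order_date}"/>
  <field column="title" name="title"/>
</entity>

<!-- schema.xml: declare the synthetic field as the unique key -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>id</uniqueKey>
```

With a key like this, re-importing a row replaces the earlier copy instead of adding another one.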
You might be able to do something with a synthetic <uniqueKey> built from the timestamp or some such, although be sure you don't generate the same timestamp for two successive records. But I'm guessing.

Best
Erick

On Mon, Sep 5, 2011 at 2:16 AM, Kissue Kissue <kissue...@gmail.com> wrote:
> Thanks for replying. Unfortunately the table I need to import from is a view
> and there is no unique key in there I can use as a primary key. How does
> this affect my using DIH? Does it mean I cannot use DIH?
>
> Thanks.
>
> On Sun, Sep 4, 2011 at 8:44 PM, Shawn Heisey <s...@elyograg.org> wrote:
>
>> On 9/4/2011 12:16 PM, Kissue Kissue wrote:
>>
>>> I was reading about DIH on this Wiki link:
>>> http://wiki.apache.org/solr/DataImportHandler#A_shorter_data-config
>>> The following was said about entity primary key: "is *optional* and only
>>> needed when using delta-imports". Does this mean that the primary key is
>>> mandatory for delta imports? I am asking because I am going to be
>>> importing from a view with no primary key.
>>
>> I believe what it means is that you have to specify a field to be the
>> primary key, and that it must exist in all three queries that you defined -
>> query, deltaQuery and deltaImportQuery. In my case, query and
>> deltaImportQuery are identical, and deltaQuery is "SELECT 1 AS did". The
>> only thing this query does is tell the DIH that there is something to do for
>> a delta-import, which it then uses deltaImportQuery to do. I keep track of
>> which documents are new outside of Solr and pass values for the query in via
>> the dataimport URL.
>>
>> As you might surmise, did is the primary key in my dataimport config file.
>> I couldn't say what would happen if your query results have duplicate
>> values in the primary key field. In my case, did actually is the primary
>> key in the database, but I don't think that's required. I use different
>> fields for primary key and uniqueKey. This allows us a little extra
>> flexibility in the index.
>>
>> Hopefully you do still have a field that is unique (even if it's not a
>> primary key) that you can use as the primary key in your config file. It's
>> a good idea to have such a thing available to serve as the uniqueKey in
>> schema.xml, for automatic overwrites (delete and reinsert) of documents that
>> change.
>>
>> Thanks,
>> Shawn
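
For reference, the setup Shawn describes might look roughly like the data-config.xml entity below. This is a sketch, not his actual config: the table name documents, the title column, the entity name, and the request parameter minDid are all assumptions. Only did as the pk, the three query attributes, and the "SELECT 1 AS did" deltaQuery come from his description.

```xml
<!-- Sketch: deltaQuery only signals that a delta-import has work to do -->
<entity name="doc" pk="did"
        query="SELECT did, title FROM documents
               WHERE did &gt; ${dataimporter.request.minDid}"
        deltaQuery="SELECT 1 AS did"
        deltaImportQuery="SELECT did, title FROM documents
                          WHERE did &gt; ${dataimporter.request.minDid}">
  <!-- The pk column (did) is present in the results of all three queries -->
  <field column="did" name="did"/>
  <field column="title" name="title"/>
</entity>
```

The value for the hypothetical minDid parameter would be passed on the import URL, for example /dataimport?command=delta-import&minDid=1000, with whatever tracks new documents outside of Solr deciding the number.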