Okay, thanks for clarifying.
On Wed, Mar 20, 2013 at 12:11 AM, Justin L. <jta...@gmail.com> wrote:

> Shalin,
>
> Thanks for your questions- the mystery is solved this morning. My "unique"
> key was only unique within an entity and not between them. There was only
> one instance of overlap- the no-longer mysterious record and its
> doppelganger.
>
> All the other symptoms were side effects from how I was troubleshooting.
> For example, if I did a full import, the doppelganger record (which I didn't
> know about) would be imported- but my test query was only looking for the
> one that didn't make it in. However, if I imported only that entity, it
> would, as expected, update the index record and things would appear fine to
> me.
>
> So, no bug. Just plain old bad/narrow troubleshooting combined with
> coincidence (only record not getting imported is first row, etc).
>
> -justin
>
>
> On Mon, Mar 18, 2013 at 7:34 PM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
>
> > That does sound perplexing.
> >
> > Justin, can you tell us which field in the query is your record id? What is
> > the record id's type in database and in solr schema? What is your unique
> > key and its type in solr schema?
> >
> >
> > On Tue, Mar 19, 2013 at 5:19 AM, Justin L. <jta...@gmail.com> wrote:
> >
> > > Every time I do an import, DataImportHandler is not importing 1 row from my
> > > database.
> > >
> > > I have 3 entities each defined with a single query. I have confirmed, by
> > > looking at totals from solr as well as comparing a "*:*" query to direct db
> > > queries-- exactly 1 row is missing every time. And it's the same row- the
> > > first row of one of my entities when sorted by primary key. The other two
> > > entities are fully imported without trouble.
> > >
> > > There are no errors in the log- even when DIH logging is turned up to FINE.
> > > When I alter the query to retrieve only the mysterious record, it shows up
> > > as "Fetched: 1 Skipped: 0 Processed: 1". But when I do a query for *:* it
> > > returns 0 documents.
> > >
> > > Ready for a twist? The DIH query for this entity does not have an ORDER BY
> > > clause- when I add one to sort by primary key DESC it imports all of the
> > > rows for that entity, including the mysterious record.
> > >
> > > Ready to have your mind blown? I am using the alternative method for doing
> > > delta imports (see query below). When I make clean=false, and update the
> > > timestamp on the mysterious record- yup- it gets imported properly.
> > >
> > >
> > > Because I have the ORDER BY DESC hack, I can get by and live to fight
> > > another day. But I thought someone might like to know this because I think
> > > I am hitting a bug in DIH- specifically, something after the querying but
> > > before the posting to solr. If someone familiar with DIH innards wants to
> > > suggest where I should look or how to step through it, I'd be willing to
> > > take a look.
> > >
> > > xoxo,
> > > Justin
> > >
> > >
> > > * Fun facts:
> > > Solr 4.0
> > > Oracle 11g
> > > The mysterious record's id is "000001"
> > > I use field elements to rename the columns rather than in-the-sql aliases
> > > because of a problem I had with them earlier. But I will try changing that.
> > >
> > >
> > > * Alternative delta import method:
> > >
> > > http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport
> > >
> > >
> > > * DIH query that should import mysterious record:
> > >
> > > select organization_name, organization_id, address
> > > from organization o
> > > join rolodex r on r.rolodex_id = o.contact_address_id
> > >   and r.sponsor_address_flag = 'N'
> > >   and r.actv_ind = 'Y'
> > > where '${dataimporter.request.clean}' = 'true'
> > >   or to_char(o.update_timestamp,'YYYY-MM-DD HH24:MI:SS') > '${dataimporter.organization.last_index_time
> > >
> >
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
>

--
Regards,
Shalin Shekhar Mangar.
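
For anyone finding this thread in the archive, here is a minimal sketch of the collision Justin describes. It is not DIH or Solr code. It only models the fact that an index keeps one document per unique key, so a later document with the same key replaces the earlier one. Apart from the id "000001" and the organization entity, every name and value below is hypothetical, and prefixing the id with the entity name is just one possible way to make keys unique across entities, not necessarily what Justin did.

```python
def index_documents(docs, key_field="id"):
    """Model an index that keeps one document per unique key (last write wins)."""
    index = {}
    for doc in docs:
        # A document whose key already exists replaces the earlier one.
        index[doc[key_field]] = doc
    return index


def with_entity_prefix(docs):
    """Return copies of the documents whose id is prefixed with the entity name."""
    return [dict(doc, id=f"{doc['entity']}-{doc['id']}") for doc in docs]


# Hypothetical rows: ids are unique within each entity but not between them.
organizations = [
    {"id": "000001", "entity": "organization", "name": "First organization"},
    {"id": "000002", "entity": "organization", "name": "Second organization"},
]
other_entity = [
    {"id": "000001", "entity": "other_entity", "name": "Doppelganger"},
]

all_docs = organizations + other_entity

collided = index_documents(all_docs)
fixed = index_documents(with_entity_prefix(all_docs))

print(len(all_docs), len(collided), len(fixed))
print(collided["000001"]["entity"])
```

In the first index the organization row with id "000001" is silently replaced by the row from the other entity, which matches the "exactly 1 row is missing, no errors in the log" symptom. With entity-prefixed ids all three rows survive.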