That does sound perplexing.

Justin, can you tell us which field in the query is your record id? What is
the record id's type in the database and in the Solr schema? What is your
unique key, and what is its type in the Solr schema?

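To explain why I am asking: Solr replaces any existing document that has the
same uniqueKey value. If an id from one entity collides with an id from
another entity, DIH would still count the row as processed while the document
total ends up one short. I have not seen your schema, so this is only a guess
and it may not explain everything you describe. A toy Python sketch of that
overwrite behavior, with made-up entity names and ids (this is not DIH code):

```python
# Toy model of Solr's uniqueKey behavior: adding a document whose key
# already exists replaces the earlier document instead of adding a new one.
# The entity names and ids below are hypothetical, for illustration only.

def index_documents(rows, unique_key="id"):
    """Return (index, processed): the index is a dict keyed by uniqueKey."""
    index = {}
    processed = 0
    for row in rows:
        index[row[unique_key]] = row  # same key -> earlier document is replaced
        processed += 1
    return index, processed


rows = [
    {"id": "000001", "entity": "organization"},
    {"id": "000002", "entity": "organization"},
    {"id": "000001", "entity": "other_entity"},  # hypothetical id collision
]

index, processed = index_documents(rows)
print(processed, len(index))  # 3 2
```

Three rows are processed but only two documents remain, which is the kind of
silent one-row shortfall a key collision would produce.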

On Tue, Mar 19, 2013 at 5:19 AM, Justin L. <jta...@gmail.com> wrote:

> Every time I do an import, DataImportHandler is not importing 1 row from my
> database.
>
> I have 3 entities, each defined with a single query. I have confirmed, by
> looking at totals from Solr and by comparing a "*:*" query to direct DB
> queries, that exactly 1 row is missing every time. And it's the same row:
> the first row of one of my entities when sorted by primary key. The other
> two entities are fully imported without trouble.
>
> There are no errors in the log, even when DIH logging is turned up to FINE.
> When I alter the query to retrieve only the mysterious record, it shows up
> as "Fetched: 1 Skipped: 0 Processed: 1". But when I do a query for *:* it
> returns 0 documents.
>
> Ready for a twist? The DIH query for this entity does not have an ORDER BY
> clause. When I add one to sort by primary key DESC, it imports all of the
> rows for that entity, including the mysterious record.
>
> Ready to have your mind blown? I am using the alternative method for doing
> delta imports (see query below). When I set clean=false and update the
> timestamp on the mysterious record, yup, it gets imported properly.
>
>
>
> Because I have the ORDER BY DESC hack, I can get by and live to fight
> another day. But I thought someone might like to know this because I think
> I am hitting a bug in DIH, specifically something after the querying but
> before the posting to Solr. If someone familiar with DIH innards wants to
> suggest where I should look or how to step through it, I'd be willing to
> take a look.
>
> xoxo,
> Justin
>
>
> * Fun facts:
> Solr 4.0
> Oracle 11g
> The mysterious record's id is "000001"
> I use field elements to rename the columns rather than in-the-sql aliases
> because of a problem I had with them earlier. But I will try changing that.
>
>
> * Alternative delta import method:
>
> http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport
>
>
> * DIH query that should import mysterious record:
>
> select organization_name, organization_id, address
> from organization o
> join rolodex r on r.rolodex_id = o.contact_address_id
> and r.sponsor_address_flag = 'N'
> and r.actv_ind = 'Y'
> where '${dataimporter.request.clean}' = 'true'
> or to_char(o.update_timestamp,'YYYY-MM-DD HH24:MI:SS') >
> '${dataimporter.organization.last_index_time
>



-- 
Regards,
Shalin Shekhar Mangar.
