It seems that delta import works in two steps: a first query
(deltaQuery) fetches the ids of the modified entries, then a second
query (deltaImportQuery) fetches the actual data.
<entity name="item" pk="ID"
        query="select * from item"
        deltaImportQuery="select * from item where
            ID='${dataimporter.delta.id}'"
        deltaQuery="select id from item where last_modified
            > '${dataimporter.last_index_time}'">
    <entity name="feature" pk="ITEM_ID"
            query="select description as features from feature
                where item_id='${item.ID}'">
    </entity>
    <entity name="item_category" pk="ITEM_ID, CATEGORY_ID"
            query="select CATEGORY_ID from item_category where
                ITEM_ID='${item.ID}'">
        <entity name="category" pk="ID"
                query="select description as cat from category
                    where id = '${item_category.CATEGORY_ID}'">
        </entity>
    </entity>
</entity>
I am aware that there's a workaround:
http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport
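As far as I understand that page, the workaround runs a full-import with clean=false and folds the delta condition into the main query. A rough sketch, adapted to the item entity above (this is my reading of the wiki, not something I have verified, and the child entities are left out):

```xml
<!-- Sketch of the "delta via full-import" workaround for the item entity.
     When the request passes clean=false, only rows modified since the last
     index time match; with any other clean value every row matches. -->
<entity name="item" pk="ID"
        query="select * from item
               where '${dataimporter.request.clean}' != 'false'
                  or last_modified > '${dataimporter.last_index_time}'">
</entity>
```

If I read it correctly, an incremental run would then be triggered with command=full-import&clean=false, and a complete rebuild with command=full-import&clean=true.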
But still, to clarify and make sure I have up-to-date info on how Solr works:
1. Is it possible to fetch the modified data with a single SQL query
using deltaImportQuery, as in:
deltaImportQuery="select * from item where last_modified >
'${dataimporter.last_index_time}'"?
2. If not, what's the reason delta import is implemented the way it is?
Why split it into two queries? I would think a single delta query
that fetches the data would be the "obvious" design, unless
there's something that calls for two separate queries?