I recently upgraded a Solr index from 3.5 to 4.3.0. I'm now having trouble with 
the Data Import Handler (DIH) when using the CachedSqlEntityProcessor.

The first issue I found was that the 'where' option doesn't work anymore. 
Instead I am now using 'cacheKey' and 'cacheLookup'.
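
For reference, a rough sketch of that change. The 3.x form is an assumption: 
my original 3.x file is not shown here, so the 'where' line below follows the 
wiki-style where="childColumn=parent.column" syntax rather than my real config.

```xml
<!-- Solr 3.x style (assumed): join expressed with 'where' -->
<entity name="book_authors"
        processor="CachedSqlEntityProcessor"
        query="SELECT BookID, AuthorID FROM BookAuthors"
        where="BookID=books.ID" />

<!-- Solr 4.x style: the same join expressed with cacheKey / cacheLookup -->
<entity name="book_authors"
        processor="CachedSqlEntityProcessor"
        query="SELECT BookID, AuthorID FROM BookAuthors"
        cacheKey="BookID"
        cacheLookup="books.ID" />
```

In both forms the cached rows are keyed on BookID and looked up with the ID 
of the current row of the parent 'books' entity.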

My next issue is that if any nested entities are used, the delta import does 
not process more than 2 documents.
For example (simplified from my actual import file):
<entity name="books"
        pk="ID"
        query="select * from books"
        deltaImportQuery="select * from books where ID = ${dih.delta.ID}"
        deltaQuery="select ID from Books where UpdateDate &gt; '${dih.last_index_time}'">
    <field column="ID" name="id" />
    <field column="ISBN13" name="isbn13" />
    ...
    <entity name="book_authors"
            processor="CachedSqlEntityProcessor"
            query="
                SELECT BookID, AuthorID
                FROM BookAuthors"
            cacheKey="BookID"
            cacheLookup="books.ID">
        <entity name="authors"
                processor="CachedSqlEntityProcessor"
                query="
                    SELECT ID, Title
                    FROM Authors"
                cacheKey="ID"
                cacheLookup="book_authors.AuthorID">
            <field column="Title" name="author_name" />
            <field column="ID" name="author_id" />
        </entity>
    </entity>
</entity>

Full imports run fine. Delta imports, however, report 2 documents as processed 
and then keep fetching more rows until the import eventually runs out of 
memory. No additional documents are processed. This was working fine in 3.x 
versions of Solr (up to 3.5).

I'm aware that there have been some significant changes to caching in 
SOLR-2382, but I don't think this scenario should be affected. The problem 
seems to occur specifically when an entity using caching contains a 
sub-entity that is also using caching.
