Hi Gora and Arcadius,

Thanks for your help. I'll try to answer both of your questions here.

I am interested in three database tables. "Book" contains information about books, "page" holds the content of each book, page by page, and "chapter" contains the title of each chapter in every book along with the page on which the chapter begins.

It is a bit of a mess: I need the contents of each chapter in every book, but I have to infer which pages belong to each chapter from the page number on which the chapter begins. So the query is quite complex.

There are 8764 rows in the chapter table (so 8764 unique chapter headings) and 6870 books. When I import, I get:

Num Docs: 2784
Max Doc: 9488
Deleted Docs: 6704

Here is the relevant part of the config file:

<entity name="book_chapter" PK="ID" rootEntity="false"
        query="select id as b_id,title,type_id from book">
  <entity name="chapter"
          query="SELECT CONCAT(CAST('${book_chapter.title}' AS CHAR),'-',CAST(chapter AS CHAR)) as solr_id,
                        book_id,'chapter' as entityType,GROUP_CONCAT(content_raw)
                 from (select id as page_id, book_id, page_no, content_raw,
                              (select title from chapter ch
                               where (ch.begin_page_no < p.page_no OR ch.begin_page_no = p.page_no)
                                 and ch.book_id = p.book_id and ch.parent_id = 0
                               order by begin_page_no desc LIMIT 1) as chapter
                       from page p where book_id = '${book_chapter.b_id}') a
                 group by chapter">
    <field column="solr_id" name="id" />
    <field column="title" name="title"/>
    <field column="GROUP_CONCAT(content_raw)" name="pageText"/>
    <field column="entityType" name="entityType"/>
    <entity name="book-type2"
            query="select name,id from book_type where id='${book_chapter.type_id}'">
      <field column="name" name="contentType"/>
    </entity>
  </entity>
</entity>

Thanks,
Csaba

--
View this message in context: http://lucene.472066.n3.nabble.com/DIH-deleting-documents-tp4041811p4041996.html
Sent from the Solr - User mailing list archive at Nabble.com.