> Now my question is: is there a way I can use preImportDeleteQuery to
> delete the documents from Solr for which the data doesn't exist in the
> back-end DB? I don't have anything called delete status in the DB;
> instead I need to get all the UIDs from the Solr documents, compare them
> with all the UIDs in the back end, and delete from Solr the documents
> whose UIDs are not present in the DB.

I've done something like this with raw Lucene, though I'm relatively new to
Solr and not sure how, or whether, you could do it there.

We recorded a timestamp when the import started and stored an update
timestamp field on every document added to the index. After the data import,
we did a delete by query that matched all documents with a timestamp older
than the import start. The assumption is that if a document's timestamp was
not updated during the load, then the record must have been deleted from the
database.

Hope this helps.

Ben

On Wed, Oct 20, 2010 at 8:05 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> <<<We are indexing multiple data by data types, hence can't delete the
> index and do a complete re-indexing each week. Also, we want to delete the
> orphan Solr documents (for which the data is not present in the back-end
> DB) on a daily basis.>>>
>
> Can you make delete by query work? Something like delete all Solr docs of
> a certain type and do a full re-index of just that type?
>
> I have no idea whether this is practical or not....
>
> But your solution also works. There's really no way Solr #can# know about
> deleted database records, especially since the <uniqueKey> field is
> completely arbitrarily defined.
>
> Best
> Erick
>
> On Wed, Oct 20, 2010 at 10:51 AM, bbarani <bbar...@gmail.com> wrote:
>
> > Hi,
> >
> > I have a very common question but couldn't find any post related to it
> > in this forum.
> >
> > I am currently initiating a full import each week, but data that has
> > been deleted in the source is not removed from my index because I am
> > using clean=false.
> >
> > We are indexing multiple data by data types, hence can't delete the
> > index and do a complete re-indexing each week. Also, we want to delete
> > the orphan Solr documents (for which the data is not present in the
> > back-end DB) on a daily basis.
> >
> > Now my question is: is there a way I can use preImportDeleteQuery to
> > delete the documents from Solr for which the data doesn't exist in the
> > back-end DB? I don't have anything called delete status in the DB;
> > instead I need to get all the UIDs from the Solr documents, compare
> > them with all the UIDs in the back end, and delete from Solr the
> > documents whose UIDs are not present in the DB.
> >
> > Any suggestions / ideas would be of great help.
> >
> > Note: Currently I have developed a simple program which fetches the
> > UIDs from the Solr documents, then connects to the back-end DB to find
> > the orphan UIDs, and deletes the corresponding documents from the Solr
> > index. I just don't want to re-invent the wheel if this feature is
> > already present in Solr, as I would need to do more testing of my
> > program in terms of performance / scalability.
> >
> > Thanks,
> > Barani
> >
> > --
> > View this message in context:
> > http://lucene.472066.n3.nabble.com/How-to-delete-a-SOLR-document-if-that-particular-data-doesnt-exist-in-DB-tp1739222p1739222.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
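
For illustration, the timestamp approach from the reply at the top of this message (stamp every document during the import, then delete whatever still carries an older stamp) might look roughly like the sketch below. This is not code from the thread. The field name `last_indexed` and all three function names are hypothetical, the in-memory dict only stands in for a real index, and the range syntax in the delete query is an assumption to check against your Solr version and schema.

```python
from datetime import datetime, timedelta, timezone


def solr_timestamp(moment: datetime) -> str:
    """Format an aware datetime as a UTC ISO-8601 string with a trailing Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def stale_delete_query(import_started: datetime, field: str = "last_indexed") -> str:
    """Build a delete-by-query XML body for documents the import did not touch.

    `field` is a hypothetical date field that the import sets on every
    document it adds or updates.
    """
    # Exclusive range: a document stamped exactly at the import start is kept.
    return "<delete><query>%s:{* TO %s}</query></delete>" % (
        field,
        solr_timestamp(import_started),
    )


def sweep_stale(index: dict, import_started: datetime, field: str = "last_indexed") -> list:
    """In-memory stand-in for the delete: drop documents stamped before the import."""
    stale = sorted(uid for uid, doc in index.items() if doc[field] < import_started)
    for uid in stale:
        del index[uid]
    return stale


if __name__ == "__main__":
    started = datetime(2010, 10, 20, 20, 5, 0, tzinfo=timezone.utc)
    index = {
        # Not re-stamped by this import, so presumed deleted from the database.
        "uid-1": {"last_indexed": started - timedelta(days=7)},
        # Re-stamped during this import, so it stays.
        "uid-2": {"last_indexed": started + timedelta(minutes=3)},
    }
    print(stale_delete_query(started))
    print(sweep_stale(index, started))
```

The sweep only makes sense after an import that completed successfully and touched every live record of the type being swept. Running it after a partial or failed import would delete documents whose database rows still exist.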