Delete by Query with limited number of rows
I have the following problem and can't seem to find a solution:

I'm building up a frequently updated Solr index. In order to deal with limited resources, I would like to limit the total number of documents in the index. In other words, I would like to declare that no more than (for example) 1,000,000 documents should be in the index.

Whenever new documents are added (or better: when newly added documents are committed), I would like to:
- check whether the limit is exceeded
- delete as many of the oldest documents from the index as necessary, so that the limit is no longer exceeded. Similar to a first-in, first-out list.

The problem is: it's easy to check the limit, but how can I delete the oldest documents to get back below the limit? Can I do it with a delete-by-query request? In that case, I would probably have to limit the number of rows, but I can't seem to find a way to do that. Or do you see a different solution (maybe there is a way to configure the Solr core so that it automatically behaves as described)?

I would very much appreciate any help!

Thanks in advance.

Cheers
Michael

--
View this message in context: http://lucene.472066.n3.nabble.com/Delete-by-Query-with-limited-number-of-rows-tp3503094p3503094.html
Sent from the Solr - User mailing list archive at Nabble.com.
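Editorial sketch of one possible way to do the trimming asked about above. Delete-by-query takes no row limit, so this sketch first looks up the timestamp of the newest document that has to go and then deletes everything up to that timestamp. It is not taken from the thread. The base URL and the field name `timestamp` are assumptions, the field is assumed to be stored and sortable, and documents sharing the cutoff timestamp are all removed, so slightly more than the excess may be deleted.

```python
import json
import urllib.parse
import urllib.request

SOLR_URL = "http://localhost:8983/solr"  # assumption: adjust to your core
TS_FIELD = "timestamp"                    # assumption: stored, sortable date field


def excess_count(num_found, limit):
    """Number of documents that must be removed to get back to the limit."""
    return max(0, num_found - limit)


def cutoff_query_params(excess):
    """Query that returns only the newest of the `excess` oldest documents."""
    return {
        "q": "*:*",
        "sort": TS_FIELD + " asc",  # oldest first
        "start": excess - 1,        # skip all but the last document to delete
        "rows": 1,
        "fl": TS_FIELD,
        "wt": "json",
    }


def delete_message(cutoff):
    """XML update message deleting everything up to and including the cutoff."""
    return "<delete><query>%s:[* TO %s]</query></delete>" % (TS_FIELD, cutoff)


def select(params):
    """Run a select request and return the 'response' part of the JSON."""
    url = SOLR_URL + "/select?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["response"]


def trim_index(limit):
    """Check the document count and delete the oldest documents if needed."""
    num_found = select({"q": "*:*", "rows": 0, "wt": "json"})["numFound"]
    excess = excess_count(num_found, limit)
    if excess == 0:
        return 0
    cutoff = select(cutoff_query_params(excess))["docs"][0][TS_FIELD]
    req = urllib.request.Request(
        SOLR_URL + "/update?commit=true",
        data=delete_message(cutoff).encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    urllib.request.urlopen(req).close()
    return excess
```

Calling `trim_index(1000000)` after each commit would be the intended usage. Note that a large `start` offset can be expensive on a big index, so this is probably only reasonable when the excess per commit is small.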
Re: Delete by Query with limited number of rows
Hi Yury,

thank you very much for your quick reply.

Currently I have a timestamp field (solr.DateField), and every time I add a document I use "NOW" for the timestamp field. I only commit documents on the core every four hours. This works fine with the timestamp, since I can use "NOW".

However, I couldn't figure out how to define some kind of auto-increment for a particular field. I don't think I can handle this from outside, since I can have several adds in parallel from different clients.

So I was wondering whether there is a field type that automatically increases its value for each added (committed) document, so that I could use a placeholder, like "NOW" in the case of the DateField, to indicate that I would like to auto-increment the field.

Cheers
Michael
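For readers unfamiliar with the "NOW" placeholder mentioned above, here is an editorial sketch of how such an add message might be built. The field names `id` and `timestamp` are assumptions, and the helper `add_message` is hypothetical. Solr resolves "NOW" when the document is added, not when it is committed.

```python
import xml.etree.ElementTree as ET


def add_message(doc_id, ts_field="timestamp"):
    """Build an XML <add> message whose date field is the placeholder NOW."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # Field names are assumptions; use the ones defined in your schema.
    ET.SubElement(doc, "field", name="id").text = str(doc_id)
    ET.SubElement(doc, "field", name=ts_field).text = "NOW"
    return ET.tostring(add, encoding="unicode")
```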
Re: Delete by Query with limited number of rows
Hi Erick, hi Yury,

thanks to your input I found a perfect solution for my case. Even though this is not a Solr-only solution, I will briefly describe how it works, since it might be of interest to others:

I have set up a MySQL database holding two tables. The first has only a primary key with auto-increment and nothing else. The second has a primary key without auto-increment, plus fields for the content I store in Solr.

Now, before I add something to the Solr core, I add an entry to the first MySQL table. After the insertion, I get the primary key for that entry. I check whether it is above my document limit. If so, I empty the first MySQL table and reset the auto-increment to zero. I then insert an entry into the second table using the primary key taken from the first table (if the primary key already exists, I do not add an entry but update the existing one).

Finally, I have a Solr core which holds my searchable data and has a uniqueKey field. I add a new document to this core, using the primary key from the first MySQL table as the value of the uniqueKey field.

The solution has two main benefits for me:
- I can precisely control the number of documents in my Solr core.
- I now also have a backup of my data in MySQL.

Thank you very much for your help!
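Editorial sketch of the key-recycling scheme described above. To keep it self-contained it uses Python's built-in `sqlite3` as a stand-in for MySQL, so the reset statement differs from what MySQL would need. The table names `seq` and `content` are hypothetical. It also assumes that, after the reset, a fresh insert supplies the key that is actually used, which the message does not spell out.

```python
import sqlite3


def open_db():
    """Create the two tables: a bare key sequence and the content store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seq (id INTEGER PRIMARY KEY AUTOINCREMENT)")
    conn.execute("CREATE TABLE content (id INTEGER PRIMARY KEY, body TEXT)")
    return conn


def next_key(conn, limit):
    """Hand out keys 1..limit and start over at 1 once the limit is passed."""
    key = conn.execute("INSERT INTO seq DEFAULT VALUES").lastrowid
    if key > limit:
        # Empty the sequence table and reset its counter (SQLite-specific).
        conn.execute("DELETE FROM seq")
        conn.execute("DELETE FROM sqlite_sequence WHERE name = 'seq'")
        # Assumption: the key actually used comes from a fresh insert.
        key = conn.execute("INSERT INTO seq DEFAULT VALUES").lastrowid
    return key


def store(conn, key, body):
    """Insert the content row, or overwrite it if the key already exists."""
    conn.execute(
        "INSERT OR REPLACE INTO content (id, body) VALUES (?, ?)", (key, body)
    )
```

Adding a Solr document whose uniqueKey value is the returned key then overwrites whichever document held that key before, which is what keeps the index at or below the limit.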