Delete by Query with limited number of rows

2011-11-12 Thread mikr00
I have the following problem and can't seem to find a solution:

I'm building a frequently updated Solr index. In order to deal with
limited resources I would like to limit the total number of documents in
the index. In other words: I would like to declare that no more than (for
example) 1,000,000 documents should be in the index. Whenever new documents
are added (or better: when newly added documents are being committed), I
would like to:

- check whether the limit is exceeded
- delete as many of the oldest documents from the index as necessary, so
that the limit is no longer exceeded.

Similar to a first-in, first-out list. The problem is: it's easy to check the
limit, but how can I delete the oldest documents to get below the limit
again? Can I do it with a delete-by-query request? In that case, I would
probably have to limit the number of rows, but I can't seem to find a way to
do that. Or do you see a different solution? Maybe there is a way to
configure the Solr core so that it automatically behaves as described?

I would very much appreciate any help!

Thanks in Advance.

Cheers

Michael

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Delete-by-Query-with-limited-number-of-rows-tp3503094p3503094.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Delete by Query with limited number of rows

2011-11-13 Thread mikr00
Hi Yury,

thank you very much for your quick reply. Currently I have a timestamp field
(solr.DateField), and every time I add a document I use "NOW" for the
timestamp field. I only commit documents on the core every four hours. This
works fine with the timestamp since I can use "NOW". However, I couldn't
figure out how to define some kind of auto-increment for a particular
field. I don't think I can handle this from outside, since I can have several
adds in parallel from different clients. So I was wondering whether there
could be a field type that automatically increases its value
for each added (committed) document, so that I could use a placeholder like
"NOW" in the case of the DateField to indicate that I would like to
auto-increment the field.
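
For anyone reading along, here is a rough sketch of how the timestamp field
alone could drive the trimming, without an auto-increment field. It is not
something confirmed in this thread, just one possible approach: first ask
Solr for the document sitting right behind the newest `limit` documents
(sorted by timestamp, newest first), then delete everything up to and
including that document's timestamp with a range query. The helper names are
made up for illustration, and the field name `timestamp` is an assumption
about the schema. The functions only build the request parameters and the
XML body; sending them to the select and update handlers is left out.

```python
from xml.sax.saxutils import escape


def cutoff_query_params(limit, field="timestamp"):
    """Parameters for a select request that returns the newest document
    that no longer fits into the limit (hypothetical helper).

    Sorting newest-first and skipping `limit` rows leaves exactly that
    document in first position; an empty result means nothing to delete.
    """
    return {
        "q": "*:*",
        "sort": f"{field} desc",
        "start": limit,
        "rows": 1,
        "fl": field,
    }


def delete_older_than_xml(cutoff, field="timestamp"):
    """XML body for a delete-by-query that removes every document whose
    timestamp is less than or equal to `cutoff` (hypothetical helper)."""
    query = f"{field}:[* TO {cutoff}]"
    return f"<delete><query>{escape(query)}</query></delete>"


if __name__ == "__main__":
    print(cutoff_query_params(1000000))
    print(delete_older_than_xml("2011-11-13T08:00:00Z"))
```

One caveat with this sketch: documents that share the cutoff timestamp are
all deleted together, so the index can end up slightly below the limit
rather than exactly at it.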

Cheers

Michael



Re: Delete by Query with limited number of rows

2011-11-14 Thread mikr00
Hi Erick, hi Yury,

thanks to your input I found a perfect solution for my case. Even though
this is not a solr-only solution, I will just briefly describe how it works
since it might be of interest to others:

I have set up a MySQL database holding two tables. The first only has a
primary key with auto-increment and nothing else. The second has a primary key
without auto-increment and also fields for the content I store in Solr.

Now, before I add something to the Solr core, I add an entry to the first
MySQL table. After the insertion, I get the primary key for the action. I
check whether it is above my limit of documents. If so, I empty the first
MySQL table and reset the auto-increment to zero. I then insert a MySQL
entry into the second table using the primary key taken from the first table
(if the primary key exists, I do not add an entry but update the existing
one). And finally I have a Solr core which holds my searchable data and has
a uniqueKey field. Into this core I add a new document, using the
primary key from the first MySQL table for the uniqueKey field.
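
The steps above can be sketched roughly as follows. This is only an
illustration under a few assumptions, not the exact code in use: SQLite
(from the Python standard library) stands in for MySQL so that the example
runs on its own, the table names `slots` and `docs` are invented, and it
assumes that after the reset a fresh row is inserted so that the new
document gets key 1. The returned dictionary is what would then be sent to
Solr, where a document with an already existing uniqueKey replaces the old
one.

```python
import sqlite3


def open_db():
    """Create the two tables: a bare auto-increment counter and a content
    table keyed by the same number (table names are illustrative)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE slots (id INTEGER PRIMARY KEY AUTOINCREMENT)")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, content TEXT)")
    return conn


def next_slot(conn, limit):
    """Take the next key from the counter table, wrapping around to 1 once
    the key would exceed `limit`."""
    slot = conn.execute("INSERT INTO slots DEFAULT VALUES").lastrowid
    if slot > limit:
        # Empty the counter table and reset its auto-increment state
        # (in MySQL this would be a TRUNCATE or an ALTER TABLE instead).
        conn.execute("DELETE FROM slots")
        conn.execute("DELETE FROM sqlite_sequence WHERE name = 'slots'")
        # Assumption: insert again so the new document gets key 1.
        slot = conn.execute("INSERT INTO slots DEFAULT VALUES").lastrowid
    return slot


def store(conn, limit, content):
    """Store the content under the next key, overwriting whatever used
    that key before, and return the document to index in Solr."""
    slot = next_slot(conn, limit)
    conn.execute(
        "INSERT OR REPLACE INTO docs (id, content) VALUES (?, ?)",
        (slot, content),
    )
    conn.commit()
    return {"id": slot, "content": content}


if __name__ == "__main__":
    db = open_db()
    for i in range(5):
        print(store(db, 3, f"d{i}"))
```

With several clients adding in parallel, the check-and-reset step would
need to run inside a transaction or under a lock; the sketch does not
handle that.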

The solution has two main benefits for me:

- I can precisely control the number of documents in my Solr core.
- I now also have a backup of my data in MySQL.

Thank you very much for your help!


