The number of documents varies from month to month: sometimes it increases, sometimes it decreases. However, the index size increases monotonically.
I was expecting some gradual growth, as I expect Lucene retains terms that are no longer referenced from any documents, so you'll end up with the superset of all possible terms in the end. However, the index size probably continues to grow at roughly half the speed of its growth during the "filling up" period.

2009/1/26 Ryan McKinley <ryan...@gmail.com>

> On Jan 25, 2009, at 6:06 PM, James Brady wrote:
>
>> Hi,
>> I have a number of indices that are supposed to be maintaining "windows"
>> of indexed content - the last month's worth of data, for example.
>>
>> At the moment, I'm cleaning out old documents with a simple cron job
>> making requests like:
>> <delete><query>date_added:[* TO NOW-30DAYS]</query></delete>
>>
>> I was expecting disk usage to plateau pretty sharply as the number of
>> documents in the index reaches equilibrium. However, the usage keeps on
>> going up after 30 days, albeit not as quickly, even if I optimise the
>> index.
>>
>> Can anyone offer an explanation for this? Should document deletions
>> followed by optimises have as much of an effect on disk usage as I was
>> expecting?
>
> Depends what you are expecting ;)
>
> Are you sure that the number or size of docs from month to month is
> consistent? If you have more docs each month than the previous one, or if
> more data is stored, then a month's data would be bigger too.
>
> ryan
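
For reference, here is a minimal sketch of the kind of cleanup job described in the quoted message. It is not the actual cron script. The endpoint URL, the function names, and the commit/optimize sequence are assumptions for illustration; only the delete query itself comes from the thread.

```python
import urllib.request
from xml.sax.saxutils import escape

# Hypothetical endpoint: adjust host, port, and path for your own setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_delete_xml(field="date_added", days=30):
    """Build a delete-by-query message for documents older than `days` days."""
    query = f"{field}:[* TO NOW-{days}DAYS]"
    # Escape &, < and > so the query is safe inside the XML element.
    return f"<delete><query>{escape(query)}</query></delete>"


def post_update(xml, url=SOLR_UPDATE_URL):
    """POST one XML update message to Solr and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def run_cleanup(days=30, url=SOLR_UPDATE_URL):
    """Delete old documents, commit, then optimise (assumed order of steps)."""
    return [
        post_update(build_delete_xml(days=days), url),
        post_update("<commit/>", url),
        post_update("<optimize/>", url),
    ]
```

Calling run_cleanup() from a cron-driven script would issue the delete, a commit, and an optimise in turn. Comparing the on-disk index size before and after each run is one way to check whether the optimise is reclaiming space as expected.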