On 12/16/2016 11:13 AM, Dorian Hoxha wrote:
> Yep, that's what came up in my search. See how TTL works in hbase/cassandra/
> rocksdb <https://github.com/facebook/rocksdb/wiki/Time-to-Live>. There
> isn't a "delete old docs" query, but old docs are deleted by the
> storage when merging. Looks like this needs to be a lucene-module
> which can then be configured by solr?

No.  Lucene doesn't know about expiration and doesn't need to know about
expiration.

The document expiration happens in Solr.  In the background, Solr
finds/deletes old documents in the Lucene index according to how the
expiration feature is configured.  What happens after that is basic
Lucene operation.  If you index enough new data to trigger a merge (or
if you do an optimize/forceMerge), then Lucene will get rid of deleted
documents in the merged segments.  The contents of the documents in your
index (whether that's a timestamp or something else) are completely
irrelevant for decisions made during Lucene's segment merging.
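
To make the sequence concrete, here is a rough toy model in Python.  It is
only a sketch and does not use Lucene's or Solr's real API: the names
ToyIndex, Segment, expire_sweep and the expire_at field are made up for
illustration.  It assumes the expiration sweep is nothing more than an
ordinary delete, and that a merge copies whatever is not marked deleted.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for an immutable index segment (hypothetical, not Lucene's API)."""
    docs: dict                                  # doc id -> stored fields
    deleted: set = field(default_factory=set)   # ids marked as deleted


class ToyIndex:
    def __init__(self):
        self.segments = []

    def add_segment(self, docs):
        self.segments.append(Segment(dict(docs)))

    def delete_by_query(self, predicate):
        """Mark matching docs as deleted; nothing is physically removed yet."""
        for seg in self.segments:
            for doc_id, fields in seg.docs.items():
                if predicate(fields):
                    seg.deleted.add(doc_id)

    def search(self):
        """Return ids of live docs; docs marked deleted are skipped."""
        return sorted(
            doc_id
            for seg in self.segments
            for doc_id in seg.docs
            if doc_id not in seg.deleted
        )

    def stored_count(self):
        """Count docs still physically present, deleted or not."""
        return sum(len(seg.docs) for seg in self.segments)

    def merge(self):
        """Rewrite all segments into one, copying only docs not marked deleted.

        Note that the merge never looks at the field contents.
        """
        live = {
            doc_id: fields
            for seg in self.segments
            for doc_id, fields in seg.docs.items()
            if doc_id not in seg.deleted
        }
        self.segments = [Segment(live)]


def expire_sweep(index, now):
    """Assumed shape of the background sweep: a plain delete on a timestamp field."""
    index.delete_by_query(lambda fields: fields["expire_at"] <= now)


index = ToyIndex()
index.add_segment({"a": {"expire_at": 10}, "b": {"expire_at": 50}})
index.add_segment({"c": {"expire_at": 20}})

expire_sweep(index, now=30)
visible_after_sweep = index.search()        # expired docs no longer match
stored_after_sweep = index.stored_count()   # but they are still stored
index.merge()
stored_after_merge = index.stored_count()   # the merge dropped them
```

The only place the timestamp is examined is expire_sweep, which stands in
for the Solr side.  The merge method only checks the deleted marker, which
is the point of the paragraph above.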

> Just like in hbase, cassandra, rocksdb, when you "select" a row/document
> that has expired, it exists on the storage, but isn't returned by the
> db, because it checks the timestamp and sees that it's expired. Looks
> like this also needs to be in lucene?

That's pretty much how Lucene (and by extension, Solr) works, except
that it has nothing to do with expiration: it is *deleted* documents
that don't show up in the results.

Thanks,
Shawn
