First, a brief description of how the topic expression works. The topic expression allows you to subscribe to a query. Below is how it works internally.
The topic expression maps to the TopicStream in the Java Streaming API, so I encourage people who are interested to review this code.

Under the covers the TopicStream persists checkpoints to a SolrCloud collection that describe where the topic left off. The TopicStream uses these checkpoints as a filter on the query to return only documents with versions higher than the last persisted checkpoints. After each call to the topic the checkpoints are updated.

What is a checkpoint?

A checkpoint is the highest version number read from each shard in the collection. The topic stream sorts by _version_ asc. As it cycles through the documents from the shards it tracks the highest version number for each shard and persists it.

Why use version numbers?

Version numbers are monotonic longs. Each new document receives a version number which is higher than that of the last document on the shard. So by sorting on _version_ asc you can cycle through all the documents in a shard in batches.

Can a topic miss documents?

Currently the answer is theoretically yes, but in practice I believe it would be very rare. To miss documents the following must occur:

1) Documents must be indexed with out-of-order version numbers. On the leader I believe this is no longer possible, so only the replicas have this issue currently.

2) The out-of-order version numbers must cross commit boundaries. This means that a commit must occur while an out-of-order document is outside the index.

3) The topic must pull the out-of-order committed document before the next commit occurs. Once the out-of-order document is committed, the sort by version number will fix up the out-of-order documents.

Since #1 can be eliminated by only querying the leaders, that is one possible option for dealing with the issue, but it will cut down on scalability.

But in my testing, getting #1, #2 and #3 to actually occur is very hard. This is particularly true if commit windows are short, because that leaves a very short window for #2 and #3 to line up.
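To make the checkpoint mechanism and the three conditions above concrete, here is a minimal toy model in plain Java. It is only a sketch and not the TopicStream code: the class and method names (TopicCheckpointSketch, Shard, Topic, poll) are hypothetical, it models a single shard with an in-memory checkpoint rather than per-shard checkpoints persisted to a collection, and the starting checkpoint of -1 is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TopicCheckpointSketch {

    // Toy stand-in for one shard: versions become searchable only after commit().
    static class Shard {
        private final List<Long> committed = new ArrayList<>();
        private final List<Long> pending = new ArrayList<>();

        void index(long version) { pending.add(version); }

        void commit() {
            committed.addAll(pending);
            pending.clear();
        }

        // In spirit: filter on _version_ > checkpoint, sort by _version_ asc.
        List<Long> newerThan(long checkpoint) {
            List<Long> out = new ArrayList<>();
            for (long v : committed) {
                if (v > checkpoint) out.add(v);
            }
            Collections.sort(out);
            return out;
        }
    }

    // Toy stand-in for a topic over a single shard.
    static class Topic {
        private long checkpoint = -1; // assumed starting point for this sketch

        List<Long> poll(Shard shard) {
            List<Long> batch = shard.newerThan(checkpoint);
            if (!batch.isEmpty()) {
                // Track the highest version read; the sketch keeps it in memory.
                checkpoint = batch.get(batch.size() - 1);
            }
            return batch;
        }
    }

    // Versions arrive in order: every document is delivered.
    static List<Long> inOrder() {
        Shard shard = new Shard();
        Topic topic = new Topic();
        List<Long> delivered = new ArrayList<>();
        shard.index(1); shard.index(2); shard.commit();
        delivered.addAll(topic.poll(shard));
        shard.index(3); shard.commit();
        delivered.addAll(topic.poll(shard));
        return delivered;
    }

    // Version 2 arrives after version 3 and after a commit.
    static List<Long> outOfOrder(boolean pollBetweenCommits) {
        Shard shard = new Shard();
        Topic topic = new Topic();
        List<Long> delivered = new ArrayList<>();
        shard.index(1); shard.index(3); // #1: version 2 is late
        shard.commit();                 // #2: commit while 2 is outside the index
        if (pollBetweenCommits) {
            delivered.addAll(topic.poll(shard)); // #3: checkpoint moves to 3
        }
        shard.index(2); shard.commit();
        delivered.addAll(topic.poll(shard));
        return delivered;
    }

    public static void main(String[] args) {
        System.out.println(inOrder());         // [1, 2, 3]
        System.out.println(outOfOrder(true));  // [1, 3]
        System.out.println(outOfOrder(false)); // [1, 2, 3]
    }
}
```

In this toy model version 2 is lost only when a poll lands between the two commits. If no poll happens there, the sort by version puts 2 back in place and nothing is missed, so the shorter the gap between commits, the less room there is for a miss.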
For example a one second softCommit would allow only a one second window for #2 and #3 to occur at the same time and this would have to coincide with #1. I've spent days attempting to make the TopicStream lose data with different types of stress tests and I've never been able to make it happen.

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Dec 13, 2016 at 11:22 AM, Joel Bernstein <joels...@gmail.com> wrote:

> I plan on using this thread to address questions that were posted to
> SOLR-4587. Below are the questions asked:
>
> 1) You mentioned that "The issue here is that it's possible that an out of
> order version number could persist across commits."
>
> Is the above possible even if I am using optimistic concurrency (
> http://yonik.com/solr/optimistic-concurrency/) to write documents on Solr?
>
> 2) Query subscription is going be critical part of my project and our
> subscribers won't be able to afford loss of alerts. What can I do to make
> sure that there is not loss of alerts. As long as I get error message
> whenever there is failure, I will make sure that my system re-tries/replays
> indexing that specific document.
>
> 3) Do you happen to have any stats about possibility of data loss in Solr.
> How often does that happen? Are there any best practices that we can follow
> to avoid it?
>
> 4) In general, are stream expressions robust enough to be used in
> production?
>
> 5) Is there any more deep dive documentation about topic(). I would love
> to know its stats for query volume as big as ours (9-10 million). Or, I
> would love to know how its working internally.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/