On 1/11/2018 12:05 AM, Bernd Fehling wrote:
> This will never pass a Jepsen test and I call it _NOT_ thread safe.
> I haven't looked into the code yet, to see if the queue is FIFO, otherwise
> this would be stupid.
I was not thinking about order of operations when I said that the client
was thread-safe. I meant that one client object can be used
simultaneously by multiple threads without anything getting
cross-contaminated within the program.
If you are absolutely reliant on operations happening in a precise
order, such that a document could get indexed in one request and then
replaced (or updated) with a later request, you should not use the
concurrent client. You could define it with a single thread, but if you
do that, then the concurrent client doesn't work any faster than the
standard client.
When a concurrent client is built, it creates the specified number of
processing threads. When updates are sent, they are added to an
internal queue. The processing threads will handle requests from the
queue as long as the queue is not empty.
Those threads process their assigned requests simultaneously. Although
I'm sure that each thread pulls requests off the queue in a FIFO manner,
I have a scenario for you to consider. This scenario is not just an
intellectual exercise; it is the kind of thing that can easily happen in
the wild.
Let's say that when document X is initially indexed, it is at position
997 in a batch of 1000 documents. Then two update requests later, the
new version of document X is at position 2 in another batch of 1000
documents.
If there are at least three threads in the concurrent client, those
update requests may begin execution at nearly the same time. In that
situation, Solr is likely to index document X in the request added later
before it indexes document X in the request added earlier, resulting in
outdated information ending up in the index.
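
To make that scenario concrete, here is a rough sketch using only the
Java standard library. It is not SolrJ code: ReorderDemo, Doc, batch()
and finalVersionOfX() are hypothetical names, the "index" is just a map,
and the Phaser forces every worker to index exactly one document per
tick, which is an assumption made so that the outcome is repeatable.
Real threads are not lock-stepped like this, but the effect is the same
whenever the requests run at a similar pace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Phaser;

// Hypothetical simulation of the scenario; nothing here is a SolrJ class.
class ReorderDemo {
    // A document is just an id plus the version of the data it carries.
    static final class Doc {
        final String id;
        final int version;
        Doc(String id, int version) { this.id = id; this.version = version; }
    }

    // Builds a batch of 1000 documents. Document "X" sits at the given
    // 1-based position; position 0 means the batch does not contain X.
    static List<Doc> batch(int batchNo, int xPosition, int xVersion) {
        List<Doc> docs = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            docs.add(i == xPosition ? new Doc("X", xVersion)
                                    : new Doc("batch" + batchNo + "-doc" + i, 1));
        }
        return docs;
    }

    // Queues three batches in order, lets threadCount workers drain the
    // queue, and returns the version of X left in the simulated index.
    static int finalVersionOfX(int threadCount) throws InterruptedException {
        Queue<List<Doc>> queue = new ConcurrentLinkedQueue<>(); // FIFO
        queue.add(batch(1, 997, 1)); // old X near the end of the first request
        queue.add(batch(2, 0, 0));   // unrelated request in between
        queue.add(batch(3, 2, 2));   // new X near the start of the third request

        Map<String, Integer> index = new ConcurrentHashMap<>();
        // Assumption: all workers index at the same speed, one doc per tick.
        Phaser tick = new Phaser(threadCount);

        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread worker = new Thread(() -> {
                List<Doc> current;
                while ((current = queue.poll()) != null) {
                    for (Doc doc : current) {
                        index.put(doc.id, doc.version); // last write wins
                        tick.arriveAndAwaitAdvance();
                    }
                }
                tick.arriveAndDeregister(); // queue is empty, stop ticking
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return index.get("X");
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("1 thread:  X version " + finalVersionOfX(1)); // 2
        System.out.println("3 threads: X version " + finalVersionOfX(3)); // 1
    }
}
```

With one thread the requests run strictly in queue order and the newer
version of X (2) survives. With three threads, every request is taken
off the queue in FIFO order, yet the old X at position 997 is written
long after the new X at position 2, so the stale version (1) is what
remains.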
The same thing can happen even with a non-concurrent client when it is
used in a multi-threaded manner.
Order of operations cannot be guaranteed if there are multiple threads.
It could be possible to add some VERY sophisticated synchronization
capabilities, but writing code to do that would be very difficult, and
it wouldn't be trivial to use either.
Thanks,
Shawn