What version of Solr are you running? That's mostly out of curiosity.

Is the doc that's not returned something you've recently indexed? Here's a possible scenario: you send the doc out to be indexed. The leader (primary) forwards the doc to the followers. Before a follower has had a chance to process the doc (processing is enough for RTG, the doc doesn't need to be committed), you issue an RTG against that doc and it happens to be routed to a node that hasn't received it from the leader yet.

Does this sound plausible in your case?
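If that kind of race is what you're hitting, a short retry on the client side should make the null go away on the second or third attempt. Here's a rough sketch of how you could check that, in Python with only the standard library. The function names, the retry count and the delay are all made up for illustration; the URL shape is the one from your message. Treat it as a diagnostic under those assumptions, not as a fix.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_json(url, timeout=5.0):
    """Fetch a URL and decode the JSON body (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def realtime_get(base_url, collection, doc_id, fl="_source_",
                 retries=3, delay=0.2, fetch=fetch_json, sleep=time.sleep):
    """Call the Realtime Get handler, retrying while it answers {"doc": null}.

    Returns the document dict, or None if every attempt came back null.
    `retries` and `delay` are arbitrary illustrative values.
    """
    query = urllib.parse.urlencode({"id": doc_id, "wt": "json", "fl": fl})
    url = f"{base_url}/solr/{collection}/get?{query}"
    for attempt in range(1, retries + 1):
        doc = fetch(url).get("doc")
        if doc is not None:
            if attempt > 1:
                # A late success is a hint that the first null was a timing issue.
                print(f"doc {doc_id} only appeared on attempt {attempt}")
            return doc
        if attempt < retries:
            sleep(delay)
    return None
```

Calling `realtime_get("http://{server_ip}", "config", some_id)` in place of the direct request would tell you whether the null is transient. If it still comes back null after several attempts, the race described above is probably not the explanation.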
Hmmm, I suppose it's not even a requirement that the request gets sent to a follower; it could just as easily be "in process" on the leader/primary.

Best,
Erick

On Wed, Sep 26, 2018 at 11:55 AM sgaron cse <sgaron....@gmail.com> wrote:
>
> Hey all,
>
> We're trying to use SOLR for our document store and are facing some issues
> with the Realtime Get API. Basically, we're doing an API call from multiple
> endpoints to retrieve configuration data. The document that we are
> retrieving does not change at all, but sometimes the API returns a null
> document ({doc:null}). I'd say 99.99% of the time we can retrieve the
> document fine, but once in a blue moon we get the null document. The problem
> is that for us, if SOLR returns null, that means that the document does not
> exist, but because this is a document that should be there, it causes all
> sorts of problems in our system.
>
> The API I call is the following:
> http://{server_ip}/solr/config/get?id={id}&wt=json&fl=_source_
>
> As far as I understand from reading the documentation, the Realtime Get API
> should get me the document no matter what, even if the document is not yet
> committed to the index.
>
> I see nothing in the SOLR logs that could help me with this
> problem. In fact, there are no errors at all.
>
> As for our setup, because we're still in the testing phase, we only have two
> SOLR instances running on the same box in cloud mode with replication=1,
> which means that the core that we run the Realtime Get on is only present
> in one of the two instances. Our script randomly chooses which instance it
> queries, but as far as I understand, in cloud mode the API call
> should be dispatched automatically to the right instance.
>
> Am I missing anything here? Is it possible that there is a race condition
> in the Realtime Get API that could return null data even if the document
> exists?
>
> Thanks,
> Steve