Hello,

I have a quasi-realtime indexing application where documents are grouped into collections, and documents can be added to or removed from collections. Each document has an id and multiple collection id (collid) fields reflecting the collections that contain that document. The collid field is used in a filter query to limit a search to a given collection. When a document is added, a new version of that document is constructed and indexed. The new version has the collid fields that match the collections containing that document. The id / collid relationship is stored in a database table. The indexing is run from a cron job.

I need to show a user the state (searchable / not indexed yet) of a document he's added to one of his collections. I think a document becomes searchable exactly when the commit finishes. I want to record that document state in the database, so I have to understand how Solr serializes commits, adds and segment merges. I understand that a commit blocks adds. But what if the commit times out?

Suppose I commit a number of adds, but the adds have caused a lengthy segment merge, and the commit blocks and times out. How can I tell when all my adds are searchable? Do I have to query Solr for the given collid field value for each document in the user's document list view?
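To make the question concrete, the per-document check I have in mind would look roughly like the sketch below. It is only an illustration: the endpoint URL, the helper names, and the assumption that ids and collids contain no double quotes are mine, and it assumes the standard select handler with JSON output (wt=json).

```python
from urllib.parse import urlencode

# Assumed endpoint; the real host, port and path depend on the installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_check_url(doc_id, collid, base_url=SOLR_SELECT_URL):
    """Build a select URL asking whether doc_id is searchable within collid.

    Hypothetical helper: assumes the values contain no double quotes.
    """
    params = [
        ("q", f'id:"{doc_id}"'),       # match the one document
        ("fq", f'collid:"{collid}"'),  # restrict to the collection
        ("rows", "0"),                 # only the hit count is needed
        ("wt", "json"),                # ask for a JSON response
    ]
    return base_url + "?" + urlencode(params)


def is_searchable(response):
    """Return True if a parsed JSON select response reports at least one hit."""
    return response.get("response", {}).get("numFound", 0) > 0
```

The cron job would fetch each URL, parse the JSON body, and mark the document as searchable in the database only when is_searchable returns True. That is one query per document per collection, which is the cost I am hoping to avoid.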

In another case, can a commit sneak ahead of a list of adds, or is the commit blocked until the adds complete? If the commit is not blocked, I can't use the commit to mark a document as ready to be searched, because its collid fields may not be indexed yet.

If the commit is blocked on the adds but the adds take a long time to process, the commit could time out. Again, the commit is not a reliable indicator of when the document is ready to be searched.

Is it true that commits, when timeouts enter the picture, will not work to determine state and I need to query instead? Or am I missing something?


Thanks,

Phil
