Hi all, I think I might have discovered a synchronization bug when ingesting a lot of data into Solr, but want to check with the specialists first ;-)
I'm using a small custom-written map/reduce framework that starts 20-something threads to do some heavy data-preparation processing. When this processing is done, the results of these threads are gathered in a reduce step, where they are ingested into an (embedded) Solr instance. To maximize throughput, I'm ingesting the data in parallel, in a couple of threads of their own, and this is where I run into a synchronization error.

As with all synchronization bugs, it happens only "some" of the time and is hard to debug, but I think I managed to put my finger on the root cause (I'm using Solr 8.3): org.apache.lucene.index.CodecReader throws an NPE on line 84:

    getFieldsReader().visitDocument(docID, visitor);

The issue is that the getFieldsReader() getter is backed by a ThreadLocal (more precisely, org.apache.lucene.index.SegmentCoreReaders.fieldsReaderLocal) that seems to be released (set to null) automatically somewhere and then read afterwards, without the two being synchronized.

I don't think I should set any resource locks of my own, since I'm only using the SolrJ API and the /update endpoint.

I know this is quite a low-level question, but could anyone point me in the right direction to investigate this further? I.e., what could be the reason the reader is released out of sync?

best, b.
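
P.S. For illustration only: a minimal plain-Java sketch of the release-then-read pattern described above. This is not Lucene's code. CloseRaceSketch, ReaderHolder, FieldsReaderStub, tryIncRef/decRef and guardedRead are all made-up names, and the reference-counting guard is just one possible way to keep a release from overtaking a read, not necessarily what Lucene or Solr does internally. To stay deterministic, the sketch forces the bad ordering on a single thread instead of racing real threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CloseRaceSketch {

    // Hypothetical stand-in for a stored-fields reader; not a Lucene class.
    static class FieldsReaderStub {
        String visitDocument(int docID) {
            return "doc-" + docID;
        }
    }

    // Hypothetical holder that mimics "released (set to null), then read".
    static class ReaderHolder {
        private volatile ThreadLocal<FieldsReaderStub> local =
                ThreadLocal.withInitial(FieldsReaderStub::new);
        // Starts at 1: the reference owned by whoever created the holder.
        private final AtomicInteger refCount = new AtomicInteger(1);

        // Unguarded getter: returns null once the holder has been released.
        FieldsReaderStub getFieldsReader() {
            ThreadLocal<FieldsReaderStub> l = local;
            return l == null ? null : l.get();
        }

        // Try to take a reference; fails once the count has reached zero.
        boolean tryIncRef() {
            int c;
            while ((c = refCount.get()) > 0) {
                if (refCount.compareAndSet(c, c + 1)) {
                    return true;
                }
            }
            return false;
        }

        // Drop a reference; the last one out releases the thread-local.
        void decRef() {
            if (refCount.decrementAndGet() == 0) {
                local = null;
            }
        }
    }

    // Forces the failing order: release first, unguarded read second.
    static boolean readAfterReleaseThrowsNpe() {
        ReaderHolder holder = new ReaderHolder();
        holder.decRef(); // owner releases; count hits 0, local is nulled
        try {
            holder.getFieldsReader().visitDocument(7);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Guarded read: pin the holder for the duration of the read.
    static String guardedRead(ReaderHolder holder, int docID) {
        if (!holder.tryIncRef()) {
            return null; // already released; caller has to skip or reopen
        }
        try {
            return holder.getFieldsReader().visitDocument(docID);
        } finally {
            holder.decRef();
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterReleaseThrowsNpe());
        ReaderHolder live = new ReaderHolder();
        System.out.println(guardedRead(live, 7));
        live.decRef(); // owner releases
        System.out.println(guardedRead(live, 7));
    }
}
```

Running main prints true, doc-7 and null: the unguarded read after a release hits the NPE, while the guarded read either pins the holder or reports that it is already gone.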