gbbafna commented on issue #15054:
URL: https://github.com/apache/lucene/issues/15054#issuecomment-3194931981

   Thanks @uschindler, @mikemccand. Your input helped us narrow down the 
root cause, described below. 
   
   >It is wrong if you do this on the same IndexInput without really sequential 
access. If you want to slice your IndexInput and use multiple threads then it 
is no longer (strictly speaking) "read once", because the kernel can't do 
read-ahead correctly. IndexInput is also not the correct class to do this. If 
you want to do that really effectively, use asynchronous file channels and so 
on. This is something that Lucene's Directory abstractions are not made for.
   
   Noted. I have created a backlog item to rearchitect our upload flow using 
the right abstractions. 
   
   We were not slicing the IndexInput across multiple threads. Instead, we 
were naively opening a new IndexInput for every part upload, which caused the 
number of maps to explode. We are changing that to use clones of a single 
IndexInput instead of opening a new one for each part. 
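
   To illustrate the difference, here is a minimal sketch in plain java.nio. 
It is not the OpenSearch or Lucene code: `MappedPartReader` and `readParts` 
are hypothetical names, and `ByteBuffer.duplicate()` only stands in for 
cloning an IndexInput, on the assumption that a clone shares the mapping of 
the input it was cloned from. The file is mapped once and every part reads 
through a cheap view of that one mapping, so the map count no longer grows 
with the number of parts.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only. Assumes the file is smaller
// than 2 GiB so that a single MappedByteBuffer can cover it.
final class MappedPartReader {

    /** Reads the file in parts of at most partSize bytes using ONE mapping. */
    static List<byte[]> readParts(Path file, int partSize) throws IOException {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        List<byte[]> parts = new ArrayList<>();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            if (size == 0) {
                return parts; // nothing to map
            }
            // The only map() call: one mapping regardless of the part count.
            MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
            for (long offset = 0; offset < size; offset += partSize) {
                int length = (int) Math.min(partSize, size - offset);
                // duplicate() shares the mapped memory but has its own
                // position and limit, so each part gets an independent cursor.
                ByteBuffer view = mapped.duplicate();
                view.position((int) offset);
                view.limit((int) offset + length);
                byte[] part = new byte[length];
                view.get(part);
                parts.add(part);
            }
        }
        return parts;
    }
}
```

   The anti-pattern we had corresponds to calling `channel.map(...)` inside 
the loop, once per part. 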
   
   PR: https://github.com/opensearch-project/OpenSearch/pull/19072 
   
   >For copying files over the network, MMapDirectory is a bad choice, sorry. 
And read-once only works correctly if you have a single stream and copy it.
   
   Ack. 
   
   >There is a lot of information missing to us. We can't figure out what you 
are doing without a reproducer. Do you have index files open on the same 
directory at the same time?
   
   > If you have heap dumps: Are the shared arenas still alive, is the 
refcounter of the grouping arena 0 or not 0? We need more input. The current 
code works well for Lucene and passes all tests, so there's no refcounting bug 
in how the code is used in Lucene itself.
   
   We noticed a very high number of maps open for the same file, because we 
were creating an IndexInput repeatedly for the multipart upload. 
   
   >If you have a dump of all mapped files /proc/pid/maps can you check if you 
see many *.si files? If this is the case the problem you are seeing here is 
just a side effect because you consume already many mappings during normal 
indexing/searching and the remote directory just consumes the rest and then 
fails. It is still strange why the maxPermits sysprop does not help!
   
   We are seeing files of all types, not specifically `*.si`. That also 
explains why the maxPermits sysprop is not helping in our case. 
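
   For anyone who wants to run the same check, here is a rough sketch of 
counting mappings per backing file from the lines of /proc/pid/maps. 
`MapsCounter` and `countByPath` are hypothetical names, not part of Lucene or 
OpenSearch. It relies only on the standard maps line layout (address, perms, 
offset, dev, inode, optional pathname) and skips anonymous and pseudo entries 
such as `[stack]`.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only.
final class MapsCounter {

    /** Returns the number of mappings per file path, sorted by path. */
    static Map<String, Integer> countByPath(List<String> mapsLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : mapsLines) {
            // Fields: address perms offset dev inode [pathname].
            // Limit the split to 6 so a pathname containing spaces stays whole.
            String[] fields = line.trim().split("\\s+", 6);
            if (fields.length < 6) {
                continue; // anonymous mapping, no pathname
            }
            String path = fields[5];
            if (!path.startsWith("/")) {
                continue; // pseudo entries such as [heap] or [stack]
            }
            counts.merge(path, 1, Integer::sum);
        }
        return counts;
    }
}
```

   On Linux the input can come from 
`Files.readAllLines(Path.of("/proc/self/maps"))`, or from the maps file of 
another pid. A file that shows up with a count close to the number of upload 
parts points at the per-part IndexInput creation described above. 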
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

