mikemccand commented on issue #12543: URL: https://github.com/apache/lucene/issues/12543#issuecomment-1745049853
Copying another comment from #10520:

> Maybe we could first allow FSTCompiler to specify its own DataOutput even when building the tree on-the-fly, instead of always relying on BytesStore? And we are free to choose an appropriate implementation, whether off-heap, or growing byte, etc.

Ahh, yes, exactly! (Sorry, catching up and reading comments out-of-order!).

> As our use case are building and using FST at runtime, there might be (some, I'm not sure) penalty we would have if we are to read/write entirely with filesystem. In some of my previous experience with Java I/O, using FileChannel would be much faster than InputStream/OutputStream when writing to filesystem, but even so they are still slower than main memory.

Well, Lucene's DataOutput abstraction can be backed by heap too, e.g. ByteBuffersDataOutput or ByteArrayDataOutput (there may be others!). An application could even easily make its own custom DataOutput that grows by doubling its byte[].

> I'm curious if Lucene also has a benchmark for comparison of off-heap and heap mode? (Also by off-heap, I assume you mean filesystem, instead of the direct memory, e.g with ByteBuffer.allocateDirect()?)

I think luceneutil benchmarks were run in the original issue that introduced off-heap FST reading, though I'm not certain. I don't know of other benchmarks.
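
For illustration only, here is a minimal sketch of the "double the byte[]" idea. The class name `GrowableByteArrayOutput` and its helper methods are hypothetical and not part of Lucene. The two write methods mirror the abstract `writeByte(byte)` and `writeBytes(byte[], int, int)` methods of `org.apache.lucene.store.DataOutput`; the sketch deliberately does not extend that class so that it compiles without Lucene on the classpath. A real implementation would presumably subclass `DataOutput` instead.

```java
import java.util.Arrays;

/**
 * Hypothetical heap-backed output that grows by doubling its byte[].
 * Assumption: in real use this would extend Lucene's DataOutput; here it
 * only mirrors the shape of that class's two abstract write methods.
 */
final class GrowableByteArrayOutput {
  private byte[] buffer;
  private int size; // number of bytes written so far

  GrowableByteArrayOutput(int initialCapacity) {
    buffer = new byte[Math.max(1, initialCapacity)];
  }

  public void writeByte(byte b) {
    ensureCapacity(Math.addExact(size, 1));
    buffer[size++] = b;
  }

  public void writeBytes(byte[] b, int offset, int length) {
    ensureCapacity(Math.addExact(size, length));
    System.arraycopy(b, offset, buffer, size, length);
    size += length;
  }

  /** Grows the backing array to at least {@code needed} bytes, doubling when possible. */
  private void ensureCapacity(int needed) {
    if (needed <= buffer.length) {
      return;
    }
    // If doubling overflows int, the result is negative and max() falls back to 'needed'.
    int newCapacity = Math.max(needed, buffer.length << 1);
    buffer = Arrays.copyOf(buffer, newCapacity);
  }

  public int size() {
    return size;
  }

  public int capacity() {
    return buffer.length;
  }

  /** Returns a copy of the bytes written so far. */
  public byte[] toArray() {
    return Arrays.copyOf(buffer, size);
  }
}
```

Doubling keeps the amortized cost of each write constant, at the price of up to roughly 2x memory overhead and one full copy on each growth step.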