sam-herman opened a new issue, #15420: URL: https://github.com/apache/lucene/issues/15420
### Description

# [RFC] Add Random Access Write Support to IndexOutput

## Summary

This RFC proposes adding random access write capabilities to Lucene's `IndexOutput` API to enable use cases that require non-sequential writes while maintaining compatibility with Lucene's existing architecture.

## Motivation

Currently, Lucene's `IndexOutput` is designed exclusively for sequential, append-only writes. This design provides important benefits:

- **Immutability**: Index files are written once and never modified
- **Efficient checksumming**: Checksums can be computed incrementally during writes without re-reading data
- **Atomic commits**: Files are finalized atomically, ensuring index consistency
- **Simplicity**: Sequential writes are easier to reason about and optimize

However, there are emerging use cases where random access write capabilities would provide significant benefits:

### Use Case 1: Modern Storage Optimization

Modern storage devices (SSDs, NVMe) can handle random writes efficiently and support concurrent writes to different file regions. Random access writes could enable parallel construction of index structures, which could reduce total write time.

**Reference**: [JVector PR #542](https://github.com/datastax/jvector/pull/542) demonstrates performance improvements from parallel writes during vector index construction.
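To illustrate what "concurrent writes to different file regions" means in practice, here is a minimal sketch using plain `java.nio` positional writes. It is not Lucene API and not the approach taken in the JVector PR. The class name `ParallelRegionWrite`, the fixed region size, and the pool size of 4 are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: each task fills one disjoint, fixed-size region of the file.
final class ParallelRegionWrite {

  /** Writes `regions` regions of `regionSize` bytes; region r is filled with the byte value r. */
  static void writeRegions(Path file, int regions, int regionSize) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is arbitrary here
    try (FileChannel ch =
        FileChannel.open(file, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
      List<Future<?>> pending = new ArrayList<>();
      for (int r = 0; r < regions; r++) {
        final int region = r;
        pending.add(
            pool.submit(
                () -> {
                  ByteBuffer buf = ByteBuffer.allocate(regionSize);
                  while (buf.hasRemaining()) {
                    buf.put((byte) region);
                  }
                  buf.flip();
                  // Positional write: does not touch the channel's own position,
                  // so tasks writing to disjoint regions do not interfere.
                  long pos = (long) region * regionSize;
                  while (buf.hasRemaining()) {
                    pos += ch.write(buf, pos);
                  }
                  return null;
                }));
      }
      for (Future<?> f : pending) {
        f.get(); // propagate any write failure before the channel is closed
      }
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    Path tmp = Files.createTempFile("regions", ".bin");
    writeRegions(tmp, 8, 1024);
    System.out.println("wrote " + Files.size(tmp) + " bytes");
  }
}
```

Note that a sequential, append-only output cannot express this pattern, because every writer would have to wait for the bytes before its region to be written first.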
### Use Case 2: Mutable Index Structures

Some applications require the ability to update index structures in-place without full rewrites:

- Dynamic graph updates for vector search indices
- In-place modifications during index optimization
- Incremental updates to reduce I/O overhead

**References**:

- [OpenSearch JVector Issue #169](https://github.com/opensearch-project/opensearch-jvector/issues/169) discusses mutable index requirements
- [OpenSearch k-NN Issue #2715](https://github.com/opensearch-project/k-NN/issues/2715) discusses reducing I/O during frequent merges

### Use Case 3: Graph Index Construction Optimization

When building graph-based indices (HNSW, Vamana) with inlined vectors:

- Construction currently requires maintaining separate flat vector files for scoring
- Random access writes would allow reusing the partially-written graph index file
- This would eliminate redundant storage and I/O operations

**Reference**: _[To be added]_

## Proposed Solution

### Option 1: New Class Extending IndexOutput

Introduce a new `RandomAccessIndexOutput` abstract class that extends `IndexOutput`:

```java
import java.io.IOException;
import org.apache.lucene.store.IndexOutput;

/**
 * An IndexOutput that supports random access writes in addition to sequential writes.
 *
 * <p>This class allows seeking to arbitrary positions within the file and writing
 * data at those positions. Implementations must handle checksum computation appropriately
 * when random access writes are performed.
 *
 * <p>Instances of this class are <b>not</b> thread-safe.
 *
 * @lucene.experimental
 */
public abstract class RandomAccessIndexOutput extends IndexOutput {

  protected RandomAccessIndexOutput(String resourceDescription, String name) {
    super(resourceDescription, name);
  }

  /**
   * Sets the file pointer to the specified position for the next write.
   *
   * @param pos the position in bytes from the beginning of the file
   * @throws IOException if an I/O error occurs
   * @throws IllegalArgumentException if pos is negative or beyond the current file length
   */
  public abstract void seek(long pos) throws IOException;

  /**
   * Forces any buffered output to be written to the underlying storage.
   *
   * @throws IOException if an I/O error occurs
   */
  public abstract void flush() throws IOException;

  /**
   * Returns the checksum of bytes in the specified range.
   * This allows computing checksums for specific regions when random access writes are used.
   *
   * @param startOffset the starting position (inclusive)
   * @param endOffset the ending position (exclusive)
   * @return the checksum value for the specified range
   * @throws IOException if an I/O error occurs
   * @throws IllegalArgumentException if the range is invalid
   */
  public abstract long getChecksum(long startOffset, long endOffset) throws IOException;
}
```

### Option 2: Capability-Based Approach

Alternatively, add optional methods to `IndexOutput` with default implementations that throw `UnsupportedOperationException`, similar to how `IndexInput` handles `RandomAccessInput`:

```java
// In IndexOutput class
public void seek(long pos) throws IOException {
  throw new UnsupportedOperationException(
      "This IndexOutput implementation does not support random access writes");
}
```

## Design Considerations

### Checksum Handling

Random access writes complicate incremental checksum computation, because overwriting bytes that were already folded into a running checksum makes that checksum stale. Proposed approaches:

1. **Range-based checksums**: The `getChecksum(startOffset, endOffset)` method allows computing checksums for specific regions
2. **Invalidation on seek**: Seeking invalidates the current checksum; callers must explicitly recompute
3. **Implementation-specific**: Leave checksum strategy to implementations (e.g., in-memory implementations could maintain full checksums)

### Safety Guarantees

- Random access writes should only be used during index construction, not for modifying committed segments
- Implementations should validate that seeks don't extend beyond current file length
- Thread-safety remains the caller's responsibility (consistent with existing `IndexOutput` contract)

## Alternatives Considered

1. **External libraries**: Projects could implement random access writes outside Lucene, but this creates fragmentation and prevents sharing optimizations
2. **Custom Directory implementations**: Possible but requires duplicating significant infrastructure

## Open Questions

1. Should random access writes be restricted to specific Directory implementations?
2. What are the implications for index verification and corruption detection?
3. Should there be explicit markers in the index format to indicate random-access-written files?
4. How should this interact with encryption or compression layers?

## Backward Compatibility

This proposal is fully backward compatible:

- No changes to existing `IndexOutput` contract
- New functionality is opt-in via new class or optional methods
- Existing `IndexOutput` implementations remain unchanged
- Codecs that don't need random access continue using standard `IndexOutput`
- Existing codecs and applications continue working unchanged
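
As a rough illustration of the seek semantics and the checksum concern raised under Design Considerations, the sketch below back-patches a length header and compares range checksums before and after the overwrite. `InMemoryRandomAccessOutput` and `BackPatchDemo` are hypothetical, self-contained classes written for this example. They do not extend Lucene's `IndexOutput`, and the use of `CRC32` and little-endian byte order are assumptions of the sketch.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical in-memory stand-in for the proposed API; not part of Lucene.
final class InMemoryRandomAccessOutput {
  private byte[] buf = new byte[64];
  private long length = 0; // logical file length
  private long pos = 0; // position of the next write

  /** Moves the write position; rejects positions outside [0, length]. */
  void seek(long newPos) {
    if (newPos < 0 || newPos > length) {
      throw new IllegalArgumentException("pos out of range: " + newPos);
    }
    pos = newPos;
  }

  long getFilePointer() {
    return pos;
  }

  /** Writes one byte at the current position, overwriting or appending. */
  void writeByte(byte b) {
    if (pos == buf.length) {
      buf = Arrays.copyOf(buf, buf.length * 2);
    }
    buf[(int) pos++] = b;
    if (pos > length) {
      length = pos;
    }
  }

  /** Writes a long as 8 bytes, least significant byte first (arbitrary choice for the sketch). */
  void writeLong(long v) {
    for (int shift = 0; shift < 64; shift += 8) {
      writeByte((byte) (v >>> shift));
    }
  }

  /** Recomputes a CRC32 over [start, end) by re-reading the buffered bytes. */
  long getChecksum(long start, long end) {
    if (start < 0 || end < start || end > length) {
      throw new IllegalArgumentException("invalid range: " + start + ".." + end);
    }
    CRC32 crc = new CRC32();
    crc.update(buf, (int) start, (int) (end - start));
    return crc.getValue();
  }
}

final class BackPatchDemo {
  /** Returns {file length, checksum before patch, checksum after patch}. */
  static long[] run() {
    InMemoryRandomAccessOutput out = new InMemoryRandomAccessOutput();
    out.writeLong(0L); // placeholder header: payload length not yet known
    byte[] payload = {1, 2, 3, 4, 5};
    for (byte b : payload) {
      out.writeByte(b);
    }
    long end = out.getFilePointer();
    long before = out.getChecksum(0, end);

    out.seek(0); // go back and fill in the header
    out.writeLong(payload.length);
    out.seek(end); // resume appending at the end of the file

    long after = out.getChecksum(0, end);
    return new long[] {end, before, after};
  }

  public static void main(String[] args) {
    long[] r = run();
    System.out.println("length=" + r[0] + " checksumChanged=" + (r[1] != r[2]));
  }
}
```

The overwrite changes the checksum of the whole range, so any checksum accumulated incrementally before the `seek` would no longer match the file contents. This is the situation the range-based and invalidate-on-seek approaches are meant to address.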
