Re: [PR] Make FSTPostingFormat to build FST off-heap [lucene]
dungba88 commented on code in PR #12980:
URL: https://github.com/apache/lucene/pull/12980#discussion_r1436295737

## lucene/codecs/src/java/org/apache/lucene/codecs/memory/FSTTermsWriter.java: ##
@@ -187,33 +200,38 @@ public void write(Fields fields, NormsProducer norms) throws IOException {
   @Override
   public void close() throws IOException {
-    if (out != null) {
+    if (metaOut != null) {
+      assert dataOut != null;
       boolean success = false;
       try {
         // write field summary
-        final long dirStart = out.getFilePointer();
+        final long dirStart = metaOut.getFilePointer();
-        out.writeVInt(fields.size());
+        metaOut.writeVInt(fields.size());
         for (FieldMetaData field : fields) {
-          out.writeVInt(field.fieldInfo.number);
-          out.writeVLong(field.numTerms);
+          metaOut.writeVInt(field.fieldInfo.number);
+          metaOut.writeVLong(field.numTerms);
           if (field.fieldInfo.getIndexOptions() != IndexOptions.DOCS) {
-            out.writeVLong(field.sumTotalTermFreq);
+            metaOut.writeVLong(field.sumTotalTermFreq);
           }
-          out.writeVLong(field.sumDocFreq);
-          out.writeVInt(field.docCount);
-          field.dict.save(out, out);
+          metaOut.writeVLong(field.sumDocFreq);
+          metaOut.writeVInt(field.docCount);
+          // write the starting file pointer
+          metaOut.writeVLong(dataOut.getFilePointer());

Review Comment:
Oh, I think I found the bug, and why using an on-heap DataOutput works. This `close()` method is called after the FSTs for all fields have been saved (streamed), and thus `dataOut.getFilePointer()` always returns the same pointer (EOF).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
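The bug described in the review comment on PR #12980 can be shown with a small standalone sketch. This is not the PR's code and it does not use Lucene classes: `StartPointerSketch`, `pointersReadAtClose`, and `pointersCapturedPerField` are hypothetical names, a `ByteArrayOutputStream` stands in for `dataOut`, and `size()` stands in for `getFilePointer()`. The second method is only one possible fix (capturing the pointer before each field's data is streamed); the thread does not say which fix the PR adopts.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

/** Hypothetical illustration only; not part of Lucene or of PR #12980. */
public class StartPointerSketch {

  /** Buggy pattern: all data is streamed first, and the pointers are read afterwards. */
  public static long[] pointersReadAtClose(byte[][] blobs) {
    ByteArrayOutputStream dataOut = new ByteArrayOutputStream();
    for (byte[] blob : blobs) {
      dataOut.write(blob, 0, blob.length); // each field's data is already streamed here
    }
    long[] starts = new long[blobs.length];
    for (int i = 0; i < blobs.length; i++) {
      starts[i] = dataOut.size(); // every field sees the same end-of-stream position
    }
    return starts;
  }

  /** Assumed fix: remember the position just before each field's data is streamed. */
  public static long[] pointersCapturedPerField(byte[][] blobs) {
    ByteArrayOutputStream dataOut = new ByteArrayOutputStream();
    long[] starts = new long[blobs.length];
    for (int i = 0; i < blobs.length; i++) {
      starts[i] = dataOut.size(); // start of this field's data
      dataOut.write(blobs[i], 0, blobs[i].length);
    }
    return starts;
  }

  public static void main(String[] args) {
    // Three fake per-field payloads of 4, 2, and 3 bytes.
    byte[][] blobs = {new byte[4], new byte[2], new byte[3]};
    System.out.println(Arrays.toString(pointersReadAtClose(blobs)));      // [9, 9, 9]
    System.out.println(Arrays.toString(pointersCapturedPerField(blobs))); // [0, 4, 6]
  }
}
```

With the first pattern a reader cannot locate any field's data except by accident, because every recorded start equals the end of the stream.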
Re: [PR] Add support for index sorting with document blocks [lucene]
s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1437134740

## lucene/core/src/java/org/apache/lucene/index/FieldInfos.java: ##
@@ -188,6 +200,26 @@ public static FieldInfos getMergedFieldInfos(IndexReader reader) {
     }
   }
+  private static String getAndValidateParentField(List<LeafReaderContext> leaves) {
+    boolean set = false;
+    String theField = null;
+    for (LeafReaderContext ctx : leaves) {
+      String field = ctx.reader().getFieldInfos().getParentField();
+      if (set && Objects.equals(field, theField) == false) {
+        throw new IllegalStateException(

Review Comment:
yeah I think so too
Re: [PR] Add support for index sorting with document blocks [lucene]
s1monw commented on PR #12829:
URL: https://github.com/apache/lucene/pull/12829#issuecomment-1870447436

@mikemccand @jpountz I think it's ready. I added some more tests and stopped storing the number of children in the DV field to keep the impact as low as possible. We can still optimize this internally later if we want or need to.
Re: [I] Where should we stream FST to disk directly? [lucene]
dungba88 commented on issue #12902:
URL: https://github.com/apache/lucene/issues/12902#issuecomment-1870848548

Put up the first PR for `FSTPostingsFormat`: https://github.com/apache/lucene/pull/12980
Re: [I] TransitionAccessor for NFA: get transitions for a given state via random-access leads to wrong results. [lucene]
zhaih closed issue #12906: TransitionAccessor for NFA: get transitions for a given state via random-access leads to wrong results.
URL: https://github.com/apache/lucene/issues/12906
Re: [PR] Fix bug where NFARunAutomaton#getTransition does not set Transition correctly [lucene]
zhaih merged PR #12909:
URL: https://github.com/apache/lucene/pull/12909