[GitHub] [lucene] jtibshirani commented on a change in pull request #204: LUCENE-10020 DocComparator don't skip docs of same docID
jtibshirani commented on a change in pull request #204: URL: https://github.com/apache/lucene/pull/204#discussion_r664107815 ## File path: lucene/core/src/test/org/apache/lucene/search/TestSortOptimization.java ## @@ -332,6 +333,71 @@ public void testFloatSortOptimization() throws IOException { dir.close(); } + /** + * Test that a search with sort on [_doc, other fields] across multiple indices doesn't miss any + * documents. + */ + public void testDocSortOptimizationMultipleIndices() throws IOException { +final int numIndices = 3; +final int numDocsInIndex = atLeast(50); +Directory[] dirs = new Directory[numIndices]; +IndexReader[] readers = new IndexReader[numIndices]; +for (int i = 0; i < numIndices; i++) { + dirs[i] = newDirectory(); + final int remainder = i % 3; Review comment: Since `numIndices` is 3, do we need to take a mod here? ## File path: lucene/core/src/java/org/apache/lucene/search/comparators/DocComparator.java ## @@ -81,7 +87,12 @@ public Integer value(int slot) { public DocLeafComparator(LeafReaderContext context) { this.docBase = context.docBase; if (enableSkipping) { -this.minDoc = topValue + 1; +// For a single sort on _doc, we want to skip all docs before topValue. +// For multiple fields sort on [_doc, other fields], we want to include docs with the same +// docID. +// This is needed in a distributed search, where there are docs from different indices with +// the same docID. +this.minDoc = singleSort ? topValue + 1 : topValue; Review comment: This seems to work and matches the approach in `NumericComparator`. I guess it doesn't specifically address the case where `_doc` is the last sort, for example a sort on `["some_field", "_doc"]`, where we could also use `topValue + 1`. One thing I wondered: is keeping track of `singleSort` really important, or could we simplify and just always use `topValue`? At most we'd consider one extra document. A similar simplification would apply to `NumericComparator`. The skipping logic is a bit complex and I'm thinking about the performance/ simplicity trade-off. ## File path: lucene/core/src/test/org/apache/lucene/search/TestSortOptimization.java ## @@ -332,6 +333,71 @@ public void testFloatSortOptimization() throws IOException { dir.close(); } + /** + * Test that a search with sort on [_doc, other fields] across multiple indices doesn't miss any + * documents. + */ + public void testDocSortOptimizationMultipleIndices() throws IOException { +final int numIndices = 3; +final int numDocsInIndex = atLeast(50); +Directory[] dirs = new Directory[numIndices]; +IndexReader[] readers = new IndexReader[numIndices]; +for (int i = 0; i < numIndices; i++) { + dirs[i] = newDirectory(); + final int remainder = i % 3; + Function valueSupplier = docID -> (docID * 3 + remainder); Review comment: Maybe this could be a simple variable assignment instead of using a supplier. Also I think we can replace 3 with `numIndices`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] dsmiley commented on a change in pull request #205: LUCENE-9959: Rename TermVectorsReader to TermVectorsReaderBase
dsmiley commented on a change in pull request #205: URL: https://github.com/apache/lucene/pull/205#discussion_r664111224 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java ## @@ -57,7 +57,7 @@ final NormsProducer normsProducer; final StoredFieldsReader fieldsReaderOrig; - final TermVectorsReader termVectorsReaderOrig; + final TermVectorsReaderBase termVectorsReaderOrig; Review comment: In the subsequent PR that re-introduces "TermVectorsReader", I assume you will change some of the references of TermVectorsReaderBase back to TermVectorsReader, such as right here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] zacharymorn commented on pull request #205: LUCENE-9959: Rename TermVectorsReader to TermVectorsReaderBase
zacharymorn commented on pull request #205: URL: https://github.com/apache/lucene/pull/205#issuecomment-874499672 > I'm also okay with introducing the new/simpler TermVectorsReader abstraction if it would reduce this PR diff a lot. Up to you. Thanks @dsmiley for the review and approval! I've thought about having the new `TermVectorsReader` here as well, but I'm leaning more towards introducing it in the other PR after some digging, as I see so far the majority of `TermVectorsReaderBase` usage here involves using the existing codec APIs `checkIntegrity/clone/getMergeInstance`, so the newer `TermVectorsReader` class (that may not have any method yet) may not be usable directly. In addition, introducing it here may also lead to some git merge conflicts with the other PR later as well I feel, as `TermVectors` was already created there as the super class of `TermVectorsReader`. So I think saving it for the other PR may be an easier approach overall? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] zacharymorn commented on a change in pull request #205: LUCENE-9959: Rename TermVectorsReader to TermVectorsReaderBase
zacharymorn commented on a change in pull request #205: URL: https://github.com/apache/lucene/pull/205#discussion_r664278681 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java ## @@ -57,7 +57,7 @@ final NormsProducer normsProducer; final StoredFieldsReader fieldsReaderOrig; - final TermVectorsReader termVectorsReaderOrig; + final TermVectorsReaderBase termVectorsReaderOrig; Review comment: > In the subsequent PR that re-introduces "TermVectorsReader", I assume you will change some of the references of TermVectorsReaderBase back to TermVectorsReader Yes I'm planning to do that! > such as right here? For this particular one though, later in the code its `close` method is called via: https://github.com/apache/lucene/blob/167bd99c23520f8e252ad6f98d1e3065d20d5ae6/lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java#L185-L192 So it may still need to use `TermVectorsReaderBase` there, as that class still implements `Closeable` per https://github.com/apache/lucene/pull/180#issuecomment-871155896 ? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mayya-sharipova commented on a change in pull request #204: LUCENE-10020 DocComparator don't skip docs of same docID
mayya-sharipova commented on a change in pull request #204: URL: https://github.com/apache/lucene/pull/204#discussion_r664573881 ## File path: lucene/core/src/java/org/apache/lucene/search/comparators/DocComparator.java ## @@ -81,7 +87,12 @@ public Integer value(int slot) { public DocLeafComparator(LeafReaderContext context) { this.docBase = context.docBase; if (enableSkipping) { -this.minDoc = topValue + 1; +// For a single sort on _doc, we want to skip all docs before topValue. +// For multiple fields sort on [_doc, other fields], we want to include docs with the same +// docID. +// This is needed in a distributed search, where there are docs from different indices with +// the same docID. +this.minDoc = singleSort ? topValue + 1 : topValue; Review comment: Great comment! +1 for simplifying the code at the expense for extra single case in `DocComparator`. For `NumericComparator` though this is not the case, and there could be huge number of docs with the same value, so extra optimization for `singleSort` is important. > I guess it doesn't specifically address the case where _doc is the last sort, for example a sort on ["some_field", "_doc"], where we could also use topValue + 1. No, the sort optimizations in `DocComparator` are not applicable where `_doc` is the 1st sort. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mayya-sharipova commented on pull request #204: LUCENE-10020 DocComparator don't skip docs of same docID
mayya-sharipova commented on pull request #204: URL: https://github.com/apache/lucene/pull/204#issuecomment-874777903 @jtibshirani Thanks for the review. I've tried to address all your feedback in b5af441a39611d63df30168452cdec521ce4d578 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] dsmiley commented on a change in pull request #205: LUCENE-9959: Rename TermVectorsReader to TermVectorsReaderBase
dsmiley commented on a change in pull request #205: URL: https://github.com/apache/lucene/pull/205#discussion_r664625324 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java ## @@ -57,7 +57,7 @@ final NormsProducer normsProducer; final StoredFieldsReader fieldsReaderOrig; - final TermVectorsReader termVectorsReaderOrig; + final TermVectorsReaderBase termVectorsReaderOrig; Review comment: Perhaps then TermVectorsReader should also implement Closeable? (to be done in another PR) I called out the line above because it is not harmonious with the other "*Reader" suffixed classes declared next to it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude opened a new pull request #2522: SOLR-15460: Implement LIKE, IS NOT NULL, IS NULL, and support wildcard * in equals string literal
thelabdude opened a new pull request #2522: URL: https://github.com/apache/lucene-solr/pull/2522 Backport of https://github.com/apache/solr/pull/173 to branch_8x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] jtibshirani commented on a change in pull request #204: LUCENE-10020 DocComparator don't skip docs of same docID
jtibshirani commented on a change in pull request #204: URL: https://github.com/apache/lucene/pull/204#discussion_r664698038 ## File path: lucene/core/src/java/org/apache/lucene/search/comparators/DocComparator.java ## @@ -81,7 +87,12 @@ public Integer value(int slot) { public DocLeafComparator(LeafReaderContext context) { this.docBase = context.docBase; if (enableSkipping) { -this.minDoc = topValue + 1; +// For a single sort on _doc, we want to skip all docs before topValue. +// For multiple fields sort on [_doc, other fields], we want to include docs with the same +// docID. +// This is needed in a distributed search, where there are docs from different indices with +// the same docID. +this.minDoc = singleSort ? topValue + 1 : topValue; Review comment: > For `NumericComparator` though this is not the case, and there could be huge number of docs with the same value, so extra optimization for `singleSort` is important. Oh I see, `NumericComparator` isn't just used in 'search after' cases so there could be many docs that share the same value. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude merged pull request #2522: SOLR-15460: Implement LIKE, IS NOT NULL, IS NULL, and support wildcard * in equals string literal
thelabdude merged pull request #2522: URL: https://github.com/apache/lucene-solr/pull/2522 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] uschindler commented on pull request #203: LUCENE-10019: Align file starts in CFS files to have proper alignment (8 bytes)
uschindler commented on pull request #203: URL: https://github.com/apache/lucene/pull/203#issuecomment-874900622 After reviewing the code by DocValues and DirectPackedReader, I figured out that the alignment should be configurable. Depending of a RandomAccessInput based on an IndexInput slice, the alignment should be adapted to the inner entry size (byte, int, short, long). So I changed the IndexOutput method that allows to align the file pointer to take any power of 2 as alignment. I also added extensive tests, so it works correct. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10019) Align file starts in CFS files to have proper alignment (8 bytes)
[ https://issues.apache.org/jira/browse/LUCENE-10019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375832#comment-17375832 ] Uwe Schindler commented on LUCENE-10019: After reviewing the code by DocValues and DirectPackedReader, I figured out that the alignment should be configurable. Depending of a RandomAccessInput based on an IndexInput slice, the alignment should be adapted to the inner entry size (byte, int, short, long). So I changed the IndexOutput method that allows to align the file pointer to take any power of 2 as alignment. I also added extensive tests, so it works correct. Due to the changes I will wait a bit longer until I will merge the PR. > Align file starts in CFS files to have proper alignment (8 bytes) > - > > Key: LUCENE-10019 > URL: https://issues.apache.org/jira/browse/LUCENE-10019 > Project: Lucene - Core > Issue Type: Improvement > Components: core/codecs, core/store >Affects Versions: main (9.0) >Reporter: Uwe Schindler >Assignee: Uwe Schindler >Priority: Blocker > Fix For: 9.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > While discussing about MMapDirectory and fast access to file contents through > MMap (https://github.com/apache/lucene/pull/177 and previous versions of this > draft, also), I figured out that for most Lucene files, the data inside is > not aligned at all. > We can't fix this easily and it's also not always important, but some files > should really have a CPU fieldly alignment from beginning! This is > escpecially important when we use slices(). > I got many tests with aligned VarHandles to pass, but it broke instantly, if > the file was inside a Compound CFS file. > CompoundFormat.write() just appends all data to the IndexOutput and writes > the offset to the entries file. The fix to make at least file starts aligned > is to just write some null-bytes between the files, so startOffset is aligned > to multiples of 8 bytes. > At a later stage we could also think of aligning to LBA > blocks/sectors/whatever to make OS paging work better. But for performance of > index access, slices of compound files when memory mapped should at least > align to 8 bytes. > Fix is easy: Just add some modulo on startOffset and write some extra bytes > before the next file is serialized. The change is only 2 lines. It does not > even change index format! > I'd like to get this in for 9.0 so we can at least say: our CFS files are > aligned. Aligning other files like docvalues to better help CPU is then > possible. > I will provide a simple pull request for Lucene90CompoundFormat soon. If you > don't see any problems, this is a no-brainer. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude opened a new pull request #2523: SOLR-15451: SolrSchema (for Parallel SQL) should use PKI principal for internal request to /admin/luke to get table metadata
thelabdude opened a new pull request #2523: URL: https://github.com/apache/lucene-solr/pull/2523 Backport of https://github.com/apache/solr/pull/168 to branch_8x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude merged pull request #2523: SOLR-15451: SolrSchema (for Parallel SQL) should use PKI principal for internal request to /admin/luke to get table metadata
thelabdude merged pull request #2523: URL: https://github.com/apache/lucene-solr/pull/2523 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10003) Disallow C-style array declarations
[ https://issues.apache.org/jira/browse/LUCENE-10003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375864#comment-17375864 ] David Smiley commented on LUCENE-10003: --- I did some experimentation with using a regexp in {{validate-source-patterns.gradle}} {{invalidJavaOnlyPatterns}} and came up with this: {noformat} (~$/(?m)^(?!\s*(/[/*]|\*).*).*\b\w+\s+\w+(\[])+\s*[=,;]/$) : 'C style array declarations disallowed; move the brackets' {noformat} The first part is a negative lookahead to ensure the line doesn't start with a comment, because there are a number of comments that would otherwise match this expression. There is only one place where there's a false positive (MemoryIndex line 949), and it can be addressed trivially via adding a newline. I'll push a PR that just does the changes, and separately push another PR oriented on the automated detection using the above regexp. > Disallow C-style array declarations > --- > > Key: LUCENE-10003 > URL: https://issues.apache.org/jira/browse/LUCENE-10003 > Project: Lucene - Core > Issue Type: Improvement >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > > The Google Java Format, that which we adhere to, disallows c-style array > declarations: https://google.github.io/styleguide/javaguide.html#s4.8.3-arrays > It's also known to "Error Prone": > https://errorprone.info/bugpattern/MixedArrayDimensions -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude opened a new pull request #2524: SOLR-15456: Get field type info from luke for custom fields instead of defaulting to String in Parallel SQL
thelabdude opened a new pull request #2524: URL: https://github.com/apache/lucene-solr/pull/2524 Backport of https://github.com/apache/solr/pull/181 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude merged pull request #2524: SOLR-15456: Get field type info from luke for custom fields instead of defaulting to String in Parallel SQL
thelabdude merged pull request #2524: URL: https://github.com/apache/lucene-solr/pull/2524 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] dsmiley opened a new pull request #206: LUCENE-10003 No C style array declaration
dsmiley opened a new pull request #206: URL: https://github.com/apache/lucene/pull/206 https://issues.apache.org/jira/browse/LUCENE-10003 The first commit was purely automated by IntelliJ. The second commit are my hand edits to other spots IntellIJ didn't find because it was in a comment or some other file like a .jflex. These hand edits were found via ``` ^(?!\s*(\/\/|\/?\*).*).*\b((?!instanceof)\w+)\s+\w+(\[])+\s*[=,;)] ``` Expect a separate PR to add checks to prevent this syntax via tooling. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] rmuir commented on pull request #206: LUCENE-10003 No C style array declaration
rmuir commented on pull request #206: URL: https://github.com/apache/lucene/pull/206#issuecomment-874976665 Why are we not doing this with spotless? As mentioned on the JIRA, I'm really -1 on the approach of enforcing *formatting* with anything but spotless. Sorry, it will just be annoying. The purpose of spotless is to avoid troubles like these. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude opened a new pull request #2525: SOLR-15461: Upgrade Apache Calcite to latest release
thelabdude opened a new pull request #2525: URL: https://github.com/apache/lucene-solr/pull/2525 Backport https://github.com/apache/solr/pull/177 to branch_8x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] markrmiller commented on pull request #2525: SOLR-15461: Upgrade Apache Calcite to latest release
markrmiller commented on pull request #2525: URL: https://github.com/apache/lucene-solr/pull/2525#issuecomment-875004571 Didn't see anything that looks off to me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mayya-sharipova commented on a change in pull request #204: LUCENE-10020 DocComparator don't skip docs of same docID
mayya-sharipova commented on a change in pull request #204: URL: https://github.com/apache/lucene/pull/204#discussion_r664804322 ## File path: lucene/core/src/java/org/apache/lucene/search/comparators/DocComparator.java ## @@ -81,7 +87,12 @@ public Integer value(int slot) { public DocLeafComparator(LeafReaderContext context) { this.docBase = context.docBase; if (enableSkipping) { -this.minDoc = topValue + 1; +// For a single sort on _doc, we want to skip all docs before topValue. +// For multiple fields sort on [_doc, other fields], we want to include docs with the same +// docID. +// This is needed in a distributed search, where there are docs from different indices with +// the same docID. +this.minDoc = singleSort ? topValue + 1 : topValue; Review comment: Right, `singleSort` check is used not only for search after case. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mayya-sharipova merged pull request #204: LUCENE-10020 DocComparator don't skip docs of same docID
mayya-sharipova merged pull request #204: URL: https://github.com/apache/lucene/pull/204 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10020) DocComparator should not skip docs with the same docID on multiple sorts with search after
[ https://issues.apache.org/jira/browse/LUCENE-10020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375948#comment-17375948 ] ASF subversion and git services commented on LUCENE-10020: -- Commit 64d9f8c58710eaec8177408b8301a72064581fb1 in lucene's branch refs/heads/main from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=64d9f8c ] LUCENE-10020 DocComparator don't skip docs of same docID (#204) DocComparator should not skip docs with the same docID on multiple sorts with search after. Because of the optimization introduced in LUCENE-9449, currently when searching with sort on [_doc, other fields] with search after, DocComparator can efficiently skip all docs before and including the provided [search after docID]. This is a desirable behaviour in a single index search. But in a distributed search, where multiple indices have docs with the same docID, and when searching on [_doc, other fields], the sort optimization should NOT skip documents with the same docIDs. This PR fixes this. Relates to LUCENE-9449 > DocComparator should not skip docs with the same docID on multiple sorts with > search after > -- > > Key: LUCENE-10020 > URL: https://issues.apache.org/jira/browse/LUCENE-10020 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Because of the optimization introduced in > [LUCENE-9449|https://issues.apache.org/jira/browse/LUCENE-9449], when > searching with sort on [_doc, other fields] with search after, > DocComparator can efficiently skip all docs before and including the provided > [search after docID]. > This is a desirable behaviour in a single index search. But in a distributed > search, where multiple indices have docs with the same docID, and when > searching on [_doc, other fields], the sort optimization should not skip > documents with the same docIDs. > Relates to LUCENE-9449 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9449) Skip non-competitive documents when sort by _doc with search after
[ https://issues.apache.org/jira/browse/LUCENE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375950#comment-17375950 ] ASF subversion and git services commented on LUCENE-9449: - Commit 64d9f8c58710eaec8177408b8301a72064581fb1 in lucene's branch refs/heads/main from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=64d9f8c ] LUCENE-10020 DocComparator don't skip docs of same docID (#204) DocComparator should not skip docs with the same docID on multiple sorts with search after. Because of the optimization introduced in LUCENE-9449, currently when searching with sort on [_doc, other fields] with search after, DocComparator can efficiently skip all docs before and including the provided [search after docID]. This is a desirable behaviour in a single index search. But in a distributed search, where multiple indices have docs with the same docID, and when searching on [_doc, other fields], the sort optimization should NOT skip documents with the same docIDs. This PR fixes this. Relates to LUCENE-9449 > Skip non-competitive documents when sort by _doc with search after > -- > > Key: LUCENE-9449 > URL: https://issues.apache.org/jira/browse/LUCENE-9449 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Minor > Fix For: 8.7 > > Time Spent: 5.5h > Remaining Estimate: 0h > > Enhance DocComparator to provide an iterator over competitive documents when > search ing with "after" FieldDoc. > This iterator can quickly position on the desired "after" document, and skip > all documents or even segments that contain documents before "after" > This is especially efficient when "after" is high. > > Related to LUCENE-9280 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9449) Skip non-competitive documents when sort by _doc with search after
[ https://issues.apache.org/jira/browse/LUCENE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375949#comment-17375949 ] ASF subversion and git services commented on LUCENE-9449: - Commit 64d9f8c58710eaec8177408b8301a72064581fb1 in lucene's branch refs/heads/main from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene.git;h=64d9f8c ] LUCENE-10020 DocComparator don't skip docs of same docID (#204) DocComparator should not skip docs with the same docID on multiple sorts with search after. Because of the optimization introduced in LUCENE-9449, currently when searching with sort on [_doc, other fields] with search after, DocComparator can efficiently skip all docs before and including the provided [search after docID]. This is a desirable behaviour in a single index search. But in a distributed search, where multiple indices have docs with the same docID, and when searching on [_doc, other fields], the sort optimization should NOT skip documents with the same docIDs. This PR fixes this. Relates to LUCENE-9449 > Skip non-competitive documents when sort by _doc with search after > -- > > Key: LUCENE-9449 > URL: https://issues.apache.org/jira/browse/LUCENE-9449 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Minor > Fix For: 8.7 > > Time Spent: 5.5h > Remaining Estimate: 0h > > Enhance DocComparator to provide an iterator over competitive documents when > search ing with "after" FieldDoc. > This iterator can quickly position on the desired "after" document, and skip > all documents or even segments that contain documents before "after" > This is especially efficient when "after" is high. > > Related to LUCENE-9280 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] dsmiley commented on pull request #206: LUCENE-10003 No C style array declaration
dsmiley commented on pull request #206: URL: https://github.com/apache/lucene/pull/206#issuecomment-875011653 This PR isn't about automatic enforcement so I think your veto is intended at the conversation in the JIRA issue about how to enforce. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude merged pull request #2525: SOLR-15461: Upgrade Apache Calcite to latest release
thelabdude merged pull request #2525: URL: https://github.com/apache/lucene-solr/pull/2525 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude opened a new pull request #2526: SOLR-15489: Implement OFFSET & FETCH for LIMIT SQL queries
thelabdude opened a new pull request #2526: URL: https://github.com/apache/lucene-solr/pull/2526 Backport https://github.com/apache/solr/pull/191 to branch_8x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude merged pull request #2526: SOLR-15489: Implement OFFSET & FETCH for LIMIT SQL queries
thelabdude merged pull request #2526: URL: https://github.com/apache/lucene-solr/pull/2526 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude opened a new pull request #2527: SOLR-15208: Add the countDist aggregation to the stats, facet and timeseries Streaming Expressions
thelabdude opened a new pull request #2527: URL: https://github.com/apache/lucene-solr/pull/2527 Backport to branch_8x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude merged pull request #2527: SOLR-15208: Add the countDist aggregation to the stats, facet and timeseries Streaming Expressions
thelabdude merged pull request #2527: URL: https://github.com/apache/lucene-solr/pull/2527 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude opened a new pull request #2528: SOLR-15475: Implement COUNT and APPROX_COUNT_DISTINCT aggregation functions for Parallel SQL
thelabdude opened a new pull request #2528: URL: https://github.com/apache/lucene-solr/pull/2528 Backport https://github.com/apache/solr/pull/194 to branch_8x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude merged pull request #2528: SOLR-15475: Implement COUNT and APPROX_COUNT_DISTINCT aggregation functions for Parallel SQL
thelabdude merged pull request #2528: URL: https://github.com/apache/lucene-solr/pull/2528 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude opened a new pull request #2529: SOLR-15499: StatsStream implement ParallelMetricsRollup to allow for tiered computation of SQL metrics over collection aliases
thelabdude opened a new pull request #2529: URL: https://github.com/apache/lucene-solr/pull/2529 Backport https://github.com/apache/solr/pull/197 to branch_8x -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thelabdude merged pull request #2529: SOLR-15499: StatsStream implement ParallelMetricsRollup to allow for tiered computation of SQL metrics over collection aliases
thelabdude merged pull request #2529: URL: https://github.com/apache/lucene-solr/pull/2529 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova opened a new pull request #2530: LUCENE-10020 DocComparator don't skip docs of same docID
mayya-sharipova opened a new pull request #2530: URL: https://github.com/apache/lucene-solr/pull/2530 DocComparator should not skip docs with the same docID on multiple sorts with search after. Because of the optimization introduced in LUCENE-9449, currently when searching with sort on [_doc, other fields] with search after, DocComparator can efficiently skip all docs before and including the provided [search after docID]. This is a desirable behaviour in a single index search. But in a distributed search, where multiple indices have docs with the same docID, and when searching on [_doc, other fields], the sort optimization should NOT skip documents with the same docIDs. This PR fixes this. Backport for https://github.com/apache/lucene/pull/204 Relates to LUCENE-9449 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova merged pull request #2530: LUCENE-10020 DocComparator don't skip docs of same docID
mayya-sharipova merged pull request #2530: URL: https://github.com/apache/lucene-solr/pull/2530 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9449) Skip non-competitive documents when sort by _doc with search after
[ https://issues.apache.org/jira/browse/LUCENE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376132#comment-17376132 ] ASF subversion and git services commented on LUCENE-9449: - Commit bdef1be78fb46ab7bf1924386daab2a7db1a4316 in lucene-solr's branch refs/heads/branch_8x from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bdef1be ] LUCENE-10020 DocComparator don't skip docs of same docID (#2530) DocComparator should not skip docs with the same docID on multiple sorts with search after. Because of the optimization introduced in LUCENE-9449, currently when searching with sort on [_doc, other fields] with search after, DocComparator can efficiently skip all docs before and including the provided [search after docID]. This is a desirable behaviour in a single index search. But in a distributed search, where multiple indices have docs with the same docID, and when searching on [_doc, other fields], the sort optimization should NOT skip documents with the same docIDs. This PR fixes this. Backport for https://github.com/apache/lucene/pull/204 Relates to LUCENE-9449 > Skip non-competitive documents when sort by _doc with search after > -- > > Key: LUCENE-9449 > URL: https://issues.apache.org/jira/browse/LUCENE-9449 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Minor > Fix For: 8.7 > > Time Spent: 5.5h > Remaining Estimate: 0h > > Enhance DocComparator to provide an iterator over competitive documents when > search ing with "after" FieldDoc. > This iterator can quickly position on the desired "after" document, and skip > all documents or even segments that contain documents before "after" > This is especially efficient when "after" is high. > > Related to LUCENE-9280 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9449) Skip non-competitive documents when sort by _doc with search after
[ https://issues.apache.org/jira/browse/LUCENE-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376131#comment-17376131 ] ASF subversion and git services commented on LUCENE-9449: - Commit bdef1be78fb46ab7bf1924386daab2a7db1a4316 in lucene-solr's branch refs/heads/branch_8x from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bdef1be ] LUCENE-10020 DocComparator don't skip docs of same docID (#2530) DocComparator should not skip docs with the same docID on multiple sorts with search after. Because of the optimization introduced in LUCENE-9449, currently when searching with sort on [_doc, other fields] with search after, DocComparator can efficiently skip all docs before and including the provided [search after docID]. This is a desirable behaviour in a single index search. But in a distributed search, where multiple indices have docs with the same docID, and when searching on [_doc, other fields], the sort optimization should NOT skip documents with the same docIDs. This PR fixes this. Backport for https://github.com/apache/lucene/pull/204 Relates to LUCENE-9449 > Skip non-competitive documents when sort by _doc with search after > -- > > Key: LUCENE-9449 > URL: https://issues.apache.org/jira/browse/LUCENE-9449 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Minor > Fix For: 8.7 > > Time Spent: 5.5h > Remaining Estimate: 0h > > Enhance DocComparator to provide an iterator over competitive documents when > search ing with "after" FieldDoc. > This iterator can quickly position on the desired "after" document, and skip > all documents or even segments that contain documents before "after" > This is especially efficient when "after" is high. > > Related to LUCENE-9280 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-10020) DocComparator should not skip docs with the same docID on multiple sorts with search after
[ https://issues.apache.org/jira/browse/LUCENE-10020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376130#comment-17376130 ] ASF subversion and git services commented on LUCENE-10020: -- Commit bdef1be78fb46ab7bf1924386daab2a7db1a4316 in lucene-solr's branch refs/heads/branch_8x from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bdef1be ] LUCENE-10020 DocComparator don't skip docs of same docID (#2530) DocComparator should not skip docs with the same docID on multiple sorts with search after. Because of the optimization introduced in LUCENE-9449, currently when searching with sort on [_doc, other fields] with search after, DocComparator can efficiently skip all docs before and including the provided [search after docID]. This is a desirable behaviour in a single index search. But in a distributed search, where multiple indices have docs with the same docID, and when searching on [_doc, other fields], the sort optimization should NOT skip documents with the same docIDs. This PR fixes this. Backport for https://github.com/apache/lucene/pull/204 Relates to LUCENE-9449 > DocComparator should not skip docs with the same docID on multiple sorts with > search after > -- > > Key: LUCENE-10020 > URL: https://issues.apache.org/jira/browse/LUCENE-10020 > Project: Lucene - Core > Issue Type: Bug >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 1h 40m > Remaining Estimate: 0h > > Because of the optimization introduced in > [LUCENE-9449|https://issues.apache.org/jira/browse/LUCENE-9449], when > searching with sort on [_doc, other fields] with search after, > DocComparator can efficiently skip all docs before and including the provided > [search after docID]. > This is a desirable behaviour in a single index search. But in a distributed > search, where multiple indices have docs with the same docID, and when > searching on [_doc, other fields], the sort optimization should not skip > documents with the same docIDs. > Relates to LUCENE-9449 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] zacharymorn commented on a change in pull request #205: LUCENE-9959: Rename TermVectorsReader to TermVectorsReaderBase
zacharymorn commented on a change in pull request #205: URL: https://github.com/apache/lucene/pull/205#discussion_r665060449 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java ## @@ -57,7 +57,7 @@ final NormsProducer normsProducer; final StoredFieldsReader fieldsReaderOrig; - final TermVectorsReader termVectorsReaderOrig; + final TermVectorsReaderBase termVectorsReaderOrig; Review comment: Yeah this indeed looks inconsistent. I think the main thing here is, the overall understanding I got from https://github.com/apache/lucene/pull/180#discussion_r650720327 & https://github.com/apache/lucene/pull/180#discussion_r653357562 is that the new `TermVectors / TermVectorsReader` abstraction is supposed to be an index API, while the original `TermVectorsReader` / now `TermVectorsReaderBase` is a codec API, which has `clone/getMergeInstance/checkIntegrity` methods, and also implements `Closeable` (and all other `*Reader/Producer` codec classes also implement `Closeable` as well). If we were to move `Closeable` from `TermVectorsReaderBase` to the new `TermVectorsReader`, wouldn't that break the separation there? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] zacharymorn commented on a change in pull request #205: LUCENE-9959: Rename TermVectorsReader to TermVectorsReaderBase
zacharymorn commented on a change in pull request #205: URL: https://github.com/apache/lucene/pull/205#discussion_r665060449 ## File path: lucene/core/src/java/org/apache/lucene/index/SegmentCoreReaders.java ## @@ -57,7 +57,7 @@ final NormsProducer normsProducer; final StoredFieldsReader fieldsReaderOrig; - final TermVectorsReader termVectorsReaderOrig; + final TermVectorsReaderBase termVectorsReaderOrig; Review comment: Yeah this indeed looks inconsistent. I think the main thing here is, the overall understanding I got from https://github.com/apache/lucene/pull/180#discussion_r650720327 & https://github.com/apache/lucene/pull/180#discussion_r653357562 (this comment specifically asked not to move up `Closeable` from the original `TermVectorsReader`) is that, the new `TermVectors / TermVectorsReader` abstraction is supposed to be an index API, while the original `TermVectorsReader` / now `TermVectorsReaderBase` is a codec API, which has `clone/getMergeInstance/checkIntegrity` methods, and also implements `Closeable` (and all other `*Reader/Producer` codec classes also implement `Closeable` as well). If we were to move `Closeable` from `TermVectorsReaderBase` to the new `TermVectorsReader`, wouldn't that break the separation there? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org