[GitHub] [lucene] zf853109035 opened a new issue, #12441: IndexSearcher.doc(int docID, Set fieldsToLoad) method is so slow?
zf853109035 opened a new issue, #12441: URL: https://github.com/apache/lucene/issues/12441 ### Description I created an index for ten 1 MB files. When the file content is not stored, ten calls to the doc(int docID, Set fieldsToLoad) method of the IndexSearcher class take about 30 ms. When the file content is stored, ten calls take about 150 ms to 200 ms, even if fieldsToLoad does not contain the content field. How can I reduce this delay? Why is the call slow when fieldsToLoad does not contain the content field? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene] mkhludnev commented on issue #12441: IndexSearcher.doc(int docID, Set fieldsToLoad) method is so slow?
mkhludnev commented on issue #12441: URL: https://github.com/apache/lucene/issues/12441#issuecomment-1636702962 It's by design: the whole block of records needs to be decompressed and iterated through. Perhaps docValues (e.g. binary) might provide some sort of selectivity.
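To illustrate the point about block decompression, here is a minimal sketch in plain Java. It is a simplified model and not Lucene's actual stored fields format: the class `BlockStoreSketch`, its methods, and the layout (one large content field followed by a small title field, deflated together as one block) are all assumptions made for the example. It only shows why reading a small field can cost as much as inflating the large field stored next to it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical model: NOT Lucene code, just a block of fields compressed as one unit.
public final class BlockStoreSketch {

  /** Compresses the raw bytes of a whole block in one go. */
  static byte[] compress(byte[] raw) {
    Deflater deflater = new Deflater();
    deflater.setInput(raw);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[8192];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    deflater.end();
    return out.toByteArray();
  }

  /** Inflates a whole block; there is no way to jump straight to one field. */
  static byte[] decompress(byte[] compressed, int rawLength) {
    Inflater inflater = new Inflater();
    inflater.setInput(compressed);
    byte[] raw = new byte[rawLength];
    int off = 0;
    try {
      while (off < rawLength && !inflater.finished()) {
        off += inflater.inflate(raw, off, rawLength - off);
      }
    } catch (DataFormatException e) {
      throw new IllegalStateException("corrupt block", e);
    } finally {
      inflater.end();
    }
    return raw;
  }

  /** Reads only the title field and returns how many bytes had to be inflated to get it. */
  static int bytesInflatedToReadTitle(String title, int contentBytes) {
    byte[] titleBytes = title.getBytes(StandardCharsets.UTF_8);
    byte[] raw = new byte[contentBytes + titleBytes.length];
    // Large "content" field first, filled with deterministic letters.
    for (int i = 0; i < contentBytes; i++) {
      raw[i] = (byte) ('a' + (i * 31 % 26));
    }
    // Small "title" field stored after it in the same block.
    System.arraycopy(titleBytes, 0, raw, contentBytes, titleBytes.length);

    byte[] block = compress(raw);
    byte[] inflated = decompress(block, raw.length);
    String read =
        new String(inflated, contentBytes, titleBytes.length, StandardCharsets.UTF_8);
    if (!read.equals(title)) {
      throw new IllegalStateException("title round trip failed");
    }
    return inflated.length;
  }

  public static void main(String[] args) {
    int inflated = bytesInflatedToReadTitle("report.txt", 1_000_000);
    System.out.println("bytes inflated to read a 10-byte title: " + inflated);
  }
}
```

In this model the cost of reading the title scales with the size of the stored content, which matches the behaviour reported in the issue. Keeping the large field out of the stored block avoids that cost, and the docValues suggestion above is one way to do so.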
[GitHub] [lucene] mkhludnev closed issue #12441: IndexSearcher.doc(int docID, Set fieldsToLoad) method is so slow?
mkhludnev closed issue #12441: IndexSearcher.doc(int docID, Set fieldsToLoad) method is so slow? URL: https://github.com/apache/lucene/issues/12441
[GitHub] [lucene] ChrisHegarty commented on pull request #12417: forutil add vectorized and scalar code
ChrisHegarty commented on PR #12417: URL: https://github.com/apache/lucene/pull/12417#issuecomment-1636713748 Apologies for my tardy and terse interaction here. I've been otherwise preoccupied. I hope to spend time on this soon.
[GitHub] [lucene] uschindler commented on pull request #12417: forutil add vectorized and scalar code
uschindler commented on PR #12417: URL: https://github.com/apache/lucene/pull/12417#issuecomment-1636714292
> Note that these benchmarks were running with jdk19 (not 20), so it's possible we'd see something different with 20?
Lucene enables and compiles the vectorized code only for JDK 20 and 21. On JDK 19 it won't be enabled. Be sure to also show the message logged on startup by the `VectorizationProvider`!
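As a quick way to check which side of that version gate a benchmark JVM falls on, here is a minimal sketch in plain Java. The class `VectorGateSketch` and its method are hypothetical and do not reproduce the logic of `VectorizationProvider`, which may apply further conditions. The sketch only encodes the version rule stated in the comment above.

```java
// Hypothetical helper: NOT Lucene code, only the "JDK 20 and 21" rule from the comment.
public final class VectorGateSketch {

  /** Returns true if the given JDK feature version is one the comment names as eligible. */
  static boolean vectorizedCodeEligible(int featureVersion) {
    return featureVersion == 20 || featureVersion == 21;
  }

  public static void main(String[] args) {
    // Runtime.version().feature() is the major release number, e.g. 19 or 20.
    int feature = Runtime.version().feature();
    System.out.println(
        "JDK " + feature + ": vectorized code eligible = " + vectorizedCodeEligible(feature));
  }
}
```

Running this on the benchmark machine next to the provider's startup message makes it easier to tell whether a result was measured with or without the vectorized path.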
[GitHub] [lucene] stefanvodita opened a new pull request, #12442: Assert IdxOrDvQuery subqueries and document useful fields
stefanvodita opened a new pull request, #12442: URL: https://github.com/apache/lucene/pull/12442 This is a follow-up from #12426. We introduce assertions in `TestIndexOrDocValuesQuery` that the two wrapped queries are behaving the same way and we document fields that produce indexed structures and doc values, which are good candidates for being used with `IndexOrDocValuesQuery`.
[GitHub] [lucene] stefanvodita commented on a diff in pull request #12442: Assert IdxOrDvQuery subqueries and document useful fields
stefanvodita commented on code in PR #12442: URL: https://github.com/apache/lucene/pull/12442#discussion_r1264364625
## lucene/test-framework/src/java/org/apache/lucene/tests/search/QueryUtils.java: ##
@@ -675,7 +675,14 @@ public static void checkBulkScorerSkipTo(Random r, Query query, IndexSearcher se
     query = searcher.rewrite(query);
     Weight weight = searcher.createWeight(query, ScoreMode.COMPLETE, 1);
     for (LeafReaderContext context : searcher.getIndexReader().leaves()) {
-      final Scorer scorer = weight.scorer(context);
+      final Scorer scorer;
+      if (weight.scorerSupplier(context) != null) {
+        // For IndexOrDocValuesQuey, the bulk scorer will use the indexed structure query
+        // and the scorer with a lead cost of 0 will use the doc values query.
+        scorer = weight.scorerSupplier(context).get(0);
Review Comment: I had some doubts if we should use a lead cost of 0 across the board, but it doesn't seem as if any tests relied on it.
[GitHub] [lucene] stefanvodita commented on pull request #12426: Introduce VerifyingQuery
stefanvodita commented on PR #12426: URL: https://github.com/apache/lucene/pull/12426#issuecomment-1636716527 Thank you for the suggestions for `IndexOrDocValuesQuery`! I’ve opened a separate [PR](https://github.com/apache/lucene/pull/12442) to address them. Let me know if it matches what you had in mind.
[GitHub] [lucene] rmuir commented on pull request #12417: forutil add vectorized and scalar code
rmuir commented on PR #12417: URL: https://github.com/apache/lucene/pull/12417#issuecomment-1636842807 Please, let's not use this integer vectorization when `hasFastIntegerVectors` is false. Otherwise we can see a 30x or so slowdown on virtual machines without properly plumbed AVX.
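The guard requested here can be sketched as a capability flag that selects the implementation once, up front. This is a minimal plain-Java illustration, not the code in the PR. Only the flag name `hasFastIntegerVectors` comes from the comment above. The class `DecoderChoiceSketch`, the `BlockSummer` interface, and the unrolled loop standing in for a vectorized implementation are all hypothetical.

```java
// Hypothetical sketch: NOT the PR's code. An unrolled loop stands in for a vectorized path.
public final class DecoderChoiceSketch {

  /** Stand-in for an operation over a block of integers. */
  interface BlockSummer {
    long sum(int[] block);
  }

  /** Plain scalar loop: the safe default. */
  static final BlockSummer SCALAR = block -> {
    long total = 0;
    for (int v : block) {
      total += v;
    }
    return total;
  };

  /** Placeholder for a vectorized implementation (four independent accumulators). */
  static final BlockSummer VECTOR_STAND_IN = block -> {
    long a = 0, b = 0, c = 0, d = 0;
    int i = 0;
    for (; i + 3 < block.length; i += 4) {
      a += block[i];
      b += block[i + 1];
      c += block[i + 2];
      d += block[i + 3];
    }
    for (; i < block.length; i++) {
      a += block[i]; // leftover tail elements
    }
    return a + b + c + d;
  };

  /** Picks the vectorized path only when the platform reports fast integer vectors. */
  static BlockSummer choose(boolean hasFastIntegerVectors) {
    return hasFastIntegerVectors ? VECTOR_STAND_IN : SCALAR;
  }

  public static void main(String[] args) {
    int[] block = {1, 2, 3, 4, 5, 6, 7};
    System.out.println(choose(false).sum(block)); // scalar fallback
    System.out.println(choose(true).sum(block)); // stand-in for the vectorized path
  }
}
```

Both paths return the same result, so callers need no knowledge of which one was selected. The flag only decides which path is faster on the current machine.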