[GitHub] [lucene-jira-archive] vlsi commented on issue #137: Consider spreading attachment folders to subfolders to avoid 10000+ folders under a single root

2022-08-21 Thread GitBox


vlsi commented on issue #137:
URL: 
https://github.com/apache/lucene-jira-archive/issues/137#issuecomment-1221506938

   GitHub limits the listing to 1000 entries only: 
https://github.com/DefinitelyTyped/DefinitelyTyped/tree/master/types
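
A minimal sketch of the subfolder idea from the issue title, assuming attachment folders are named after Jira keys such as `LUCENE-8810` and are grouped by issue number. The class `AttachmentPaths`, the bucket size of 1000, and the `attachments` root are assumptions made for illustration, not the layout the project uses.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class AttachmentPaths {
  // Hypothetical bucket size, chosen so that no directory holds more than 1000 issue folders.
  static final int BUCKET_SIZE = 1000;

  /** Maps a key such as "LUCENE-8810" to root/8000-8999/LUCENE-8810. */
  static Path shardedPath(Path root, String issueKey) {
    int dash = issueKey.lastIndexOf('-');
    int number = Integer.parseInt(issueKey.substring(dash + 1));
    int start = (number / BUCKET_SIZE) * BUCKET_SIZE;
    String bucket = start + "-" + (start + BUCKET_SIZE - 1);
    return root.resolve(bucket).resolve(issueKey);
  }

  public static void main(String[] args) {
    // Prints the sharded location for one issue key.
    System.out.println(shardedPath(Paths.get("attachments"), "LUCENE-8810"));
  }
}
```

With buckets of 1000 issue numbers, each bucket directory stays within the listing limit mentioned above, and the root holds only one entry per bucket.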


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene] jtibshirani merged pull request #1077: LUCENE-10577: Remove KnnVectorsFormat#currentVersion

2022-08-21 Thread GitBox


jtibshirani merged PR #1077:
URL: https://github.com/apache/lucene/pull/1077





[GitHub] [lucene] jtibshirani commented on a diff in pull request #1074: Fix for bad cast when sorting a KnnVectors index over BytesRef

2022-08-21 Thread GitBox


jtibshirani commented on code in PR #1074:
URL: https://github.com/apache/lucene/pull/1074#discussion_r950897203


##
lucene/codecs/src/java/org/apache/lucene/codecs/simpletext/SimpleTextKnnVectorsWriter.java:
##
@@ -76,6 +77,10 @@ public class SimpleTextKnnVectorsWriter extends BufferingKnnVectorsWriter {
   public void writeField(FieldInfo fieldInfo, KnnVectorsReader knnVectorsReader, int maxDoc)
       throws IOException {
     VectorValues vectors = knnVectorsReader.getVectorValues(fieldInfo.name);
+    if (fieldInfo.getVectorEncoding() != VectorEncoding.FLOAT32) {

Review Comment:
   Since `VectorEncoding` belongs to `FieldInfo`, every codec implementation 
is expected to support it. (It just isn't supported by the old HNSW codecs, 
which makes sense.) So it seems like we should update 
`SimpleTextKnnVectorsWriter` to support byte encodings. Maybe we could at least 
file an issue to track it?
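
   As a rough illustration of what supporting byte encodings in a text-based writer could look like, here is a self-contained sketch that does not use Lucene's classes. The enum `Encoding`, the class `VectorTextSketch`, and the method `formatVector` are hypothetical stand-ins, and the assumption that values arrive as a `float[]` is made only to keep the example small.

```java
import java.util.StringJoiner;

final class VectorTextSketch {
  /** Hypothetical stand-in for Lucene's VectorEncoding; not the real enum. */
  enum Encoding { BYTE, FLOAT32 }

  /** Renders one vector as text, choosing the element format by encoding. */
  static String formatVector(Encoding encoding, float[] values) {
    StringJoiner out = new StringJoiner(",", "[", "]");
    for (float v : values) {
      switch (encoding) {
        case BYTE:
          // Assumes byte-encoded values are whole numbers in [-128, 127].
          out.add(Byte.toString((byte) v));
          break;
        case FLOAT32:
          out.add(Float.toString(v));
          break;
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(formatVector(Encoding.BYTE, new float[] {1f, -2f, 127f}));
    System.out.println(formatVector(Encoding.FLOAT32, new float[] {0.5f, 1.25f}));
  }
}
```

   The point of the sketch is only that the writer branches on the encoding instead of rejecting everything other than `FLOAT32`.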






[GitHub] [lucene] visionarywind closed pull request #1063: help scorer scan of memory codec

2022-08-21 Thread GitBox


visionarywind closed pull request #1063: help scorer scan of memory codec
URL: https://github.com/apache/lucene/pull/1063





[jira] [Commented] (LUCENE-8810) Flattening of nested disjunctions does not take into account number of clause limitation of builder

2022-08-21 Thread Wads (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582751#comment-17582751
 ] 

Wads commented on LUCENE-8810:
--

[~jpountz], TermInSetQuery works, but seems to only cover SHOULD cases. I do 
not see anything that could cover MUST cases. Am I missing something?

> Flattening of nested disjunctions does not take into account number of clause 
> limitation of builder
> ---
>
>                 Key: LUCENE-8810
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8810
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 8.0
>            Reporter: Mickaël Sauvée
>            Priority: Minor
>             Fix For: 8.2
>
>         Attachments: LUCENE-8810.patch
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In org.apache.lucene.search.BooleanQuery, at the end of the function 
> rewrite(IndexReader reader), the query is rewritten to flatten nested 
> disjunctions.
> This does not take into account the limitation on the number of clauses in a 
> builder (1024).
> In some circumstances, this limit can be exceeded, and an exception is 
> thrown.
> Here is a unit test that highlights this.
> {code:java}
>   public void testFlattenInnerDisjunctionsWithMoreThan1024Terms() throws IOException {
>     IndexSearcher searcher = newSearcher(new MultiReader());
>     BooleanQuery.Builder builder1024 = new BooleanQuery.Builder();
>     for (int i = 0; i < 1024; i++) {
>       builder1024.add(new TermQuery(new Term("foo", "bar-" + i)), Occur.SHOULD);
>     }
>     Query inner = builder1024.build();
>     Query query = new BooleanQuery.Builder()
>         .add(inner, Occur.SHOULD)
>         .add(new TermQuery(new Term("foo", "baz")), Occur.SHOULD)
>         .build();
>     searcher.rewrite(query);
>   }
> {code}
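
The arithmetic behind the quoted test can be shown with a toy model that does not depend on Lucene. The classes `FlattenSketch` and `Disjunction` and the method `flattenedClauseCount` are hypothetical names for illustration only; this is not how `BooleanQuery` is implemented. Only the limit of 1024 comes from the issue description.

```java
import java.util.ArrayList;
import java.util.List;

final class FlattenSketch {
  // Clause limit cited in the issue description.
  static final int MAX_CLAUSES = 1024;

  /** Toy disjunction holding leaf terms and nested disjunctions. Not Lucene's BooleanQuery. */
  static final class Disjunction {
    final List<String> terms = new ArrayList<>();
    final List<Disjunction> nested = new ArrayList<>();

    /** Number of direct clauses before any flattening. */
    int clauseCount() {
      return terms.size() + nested.size();
    }
  }

  /** Clause count after every nested disjunction has been inlined into its parent. */
  static int flattenedClauseCount(Disjunction d) {
    int count = d.terms.size();
    for (Disjunction inner : d.nested) {
      count += flattenedClauseCount(inner);
    }
    return count;
  }

  /** Builds the same shape as the quoted test: 1024 inner terms plus one outer term. */
  static Disjunction example() {
    Disjunction inner = new Disjunction();
    for (int i = 0; i < MAX_CLAUSES; i++) {
      inner.terms.add("bar-" + i);
    }
    Disjunction outer = new Disjunction();
    outer.nested.add(inner);
    outer.terms.add("baz");
    return outer;
  }

  public static void main(String[] args) {
    Disjunction outer = example();
    System.out.println(outer.clauseCount());                        // prints 2
    System.out.println(flattenedClauseCount(outer));                // prints 1025
    System.out.println(flattenedClauseCount(outer) > MAX_CLAUSES);  // prints true
  }
}
```

Both the inner and the outer query are within the limit on their own (1024 and 2 clauses), but inlining the inner one yields 1025 clauses, which is one more than the limit allows.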



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
