[jira] [Commented] (SOLR-13894) Solr 8.3 streaming expreessions do not return all fields (select)
[ https://issues.apache.org/jira/browse/SOLR-13894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969029#comment-16969029 ] Christian Spitzlay commented on SOLR-13894:
---
The above streaming expression does not work for me as the quotes are not the standard ASCII ones. This one can be parsed on my machine:
{code:java}
select(search(testcollection,q="test",df="Default",defType="edismax",fl="id", qt="/export", sort="id asc"),id,if(eq(1,1),Y,N) as found)
{code}

> Solr 8.3 streaming expreessions do not return all fields (select)
> -
>
> Key: SOLR-13894
> URL: https://issues.apache.org/jira/browse/SOLR-13894
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud, streaming expressions
> Affects Versions: 8.3.0
> Reporter: Jörn Franke
> Priority: Major
>
> I use streaming expressions, e.g.
> sort(select(search(...),id,if(eq(1,1),Y,N) as found), by="field A asc")
> (Using the export handler; sort is not really mandatory, I will remove it later anyway.)
> This works perfectly fine if I use Solr 8.2.0 (server + client). It returns tuples in the form { "id":"12345", "found":"Y" }
> However, if I use Solr 8.2.0 as server and Solr 8.3.0 as client then the above statement only returns the id field, but not the "found" field.
> Questions:
> 1) Is this expected behavior, i.e. Solr client 8.3.0 is in this case not compatible with Solr 8.2.0, and a server upgrade to Solr 8.3.0 will fix this?
> 2) Has the syntax for the above expression changed? If so, how?
> 3) Is this not expected behavior and I should create a Jira for it?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
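[Editor's note] For anyone wanting to check the corrected expression programmatically, below is a minimal SolrJ sketch that submits it to the /stream handler and prints both fields. The base URL and collection name are assumptions; only the expression itself comes from the comment above.

{code:java}
import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.SolrStream;
import org.apache.solr.common.params.ModifiableSolrParams;

public class StreamCheck {
  public static void main(String[] args) throws Exception {
    String expr = "select(search(testcollection,q=\"test\",df=\"Default\","
        + "defType=\"edismax\",fl=\"id\",qt=\"/export\",sort=\"id asc\"),"
        + "id,if(eq(1,1),Y,N) as found)";
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("expr", expr);
    params.set("qt", "/stream");  // send the expression to the stream handler
    SolrStream stream = new SolrStream("http://localhost:8983/solr/testcollection", params);
    try {
      stream.open();
      for (Tuple t = stream.read(); !t.EOF; t = stream.read()) {
        // Both fields should come back; on the buggy setup "found" is missing.
        System.out.println(t.getString("id") + " -> " + t.getString("found"));
      }
    } finally {
      stream.close();
    }
  }
}
{code}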
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #985: LUCENE-9031: test unified highlighter on intervals queries
romseygeek commented on a change in pull request #985: LUCENE-9031: test unified highlighter on intervals queries
URL: https://github.com/apache/lucene-solr/pull/985#discussion_r343632341

## File path: lucene/queries/src/java/org/apache/lucene/queries/intervals/IntervalMatches.java
##
@@ -74,7 +75,7 @@ public MatchesIterator getSubMatches() throws IOException {
 @Override
 public Query getQuery() {
-throw new UnsupportedOperationException();
+return query==null ? query = new IntervalQuery(field, intervalsSource) : query;

Review comment:
I mean literally replace `throw new UnsupportedOperationException` with `return source.getQuery()` - just delegate, like we're doing with submatches four lines above.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
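[Editor's note] Spelled out, the delegation the reviewer is asking for would look roughly like the sketch below; `source` stands for the wrapped MatchesIterator (the actual field name in IntervalMatches may differ).

```java
// Sketch of the suggested change: delegate getQuery() to the wrapped
// iterator, mirroring how getSubMatches() already delegates above.
@Override
public Query getQuery() {
  return source.getQuery();  // "source" is assumed to be the wrapped MatchesIterator
}
```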
[GitHub] [lucene-solr] romseygeek commented on a change in pull request #985: LUCENE-9031: test unified highlighter on intervals queries
romseygeek commented on a change in pull request #985: LUCENE-9031: test unified highlighter on intervals queries
URL: https://github.com/apache/lucene-solr/pull/985#discussion_r343632519

## File path: lucene/queries/src/java/org/apache/lucene/queries/intervals/TermIntervalsSource.java
##
@@ -200,7 +207,7 @@ public MatchesIterator getSubMatches() {
 @Override
 public Query getQuery() {
-throw new UnsupportedOperationException();
+return query;

Review comment:
Have a look at how `SpanWeight` implements this.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13782) Make HTML Ref Guide the primary release vehicle instead of PDF
[ https://issues.apache.org/jira/browse/SOLR-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969280#comment-16969280 ] ASF subversion and git services commented on SOLR-13782: Commit b5a4f672b9840337f844d91de0d1da0700ed6d37 in lucene-solr's branch refs/heads/jira/SOLR-13452_gradle_7_refguide from Cassandra Targett [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b5a4f67 ] Update Ref Guide Gradle build with changes from SOLR-13782 to remove PDF build > Make HTML Ref Guide the primary release vehicle instead of PDF > -- > > Key: SOLR-13782 > URL: https://issues.apache.org/jira/browse/SOLR-13782 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Major > Fix For: 8.3 > > Time Spent: 10m > Remaining Estimate: 0h > > As discussed in a recent mail thread [1], we have agreed that it is time for > us to stop treating the PDF version of the Ref Guide as the "official" > version and instead emphasize the HTML version as the official version. > The arguments for/against this decision are in the linked thread, but for the > purpose of this issue there are a couple of things to do: > - Modify the publication process docs (under > {{solr/solr-ref-guide/src/meta-docs}} > - Announce to the solr-user list that this is happening > A separate issue will be created to automate parts of the publication > process, since they require some discussion and possibly coordination with > Infra on the options there. > [1] > https://lists.apache.org/thread.html/f517b3b74a0a33e5e6fa87e888459fc007decc49d27a4f49822ca2ee@%3Cdev.lucene.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-11492) More Modern cloud dev script
[ https://issues.apache.org/jira/browse/SOLR-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gus Heck resolved SOLR-11492. - Fix Version/s: 8.3 Resolution: Implemented > More Modern cloud dev script > > > Key: SOLR-11492 > URL: https://issues.apache.org/jira/browse/SOLR-11492 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 8.0 >Reporter: Gus Heck >Assignee: Gus Heck >Priority: Minor > Fix For: 8.3 > > Attachments: SOLR-11492.patch, cloud.sh, cloud.sh, cloud.sh, > cloud.sh, cloud.sh, cloud.sh, cloud.sh > > > Most of the scripts in solr/cloud-dev do things like start using java -jar > and other similarly ancient techniques. I recently decided I really didn't > like that it was a pain to setup a cloud to test a patch/feature and that > often one winds up needing to blow away existing testing so working on more > than one thing at a time is irritating... so here's a script I wrote, if > folks like it I'd be happy for it to be included in solr/cloud-dev -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13900) Permissions deleting works wrong
[ https://issues.apache.org/jira/browse/SOLR-13900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969305#comment-16969305 ] Yuliia Sydoruk commented on SOLR-13900:
---
[~janhoy] Yes, I mean deleting permissions via the Authorization API. My proposal is just to remove the setIndex(p) call, but I can't understand why it was added there or what the impact of removing this line would be.

> Permissions deleting works wrong
>
> Key: SOLR-13900
> URL: https://issues.apache.org/jira/browse/SOLR-13900
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Authorization, security
> Reporter: Yuliia Sydoruk
> Priority: Major
>
> Permission indexes in the security.json file do not correspond to the indexes used while deleting.
> The line
> {{(141) setIndex(p);}}
> in
> [https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/security/AutorizationEditOperation.java]
> renumbers the indexes before deleting, and this leads to wrong behavior.
> *USE CASE 1:*
> There are 2 new permissions added to security.json (with indexes 13 and 14):
> {code:java}
> {
>   "role":"admin",
>   "name":"schema-edit",
>   "index":12},
> {
>   "collection":"",
>   "path":"/schema/*",
>   "role":"test-role",
>   "index":13},
> {
>   "path":"/admin/collections",
>   "params":{"collection":["testCollection"]},
>   "role":"test-role",
>   "index":14}
> {code}
> Step 1: remove the permission with index=13; result: the permission is deleted correctly, and security.json is now:
> {code:java}
> {
>   "role":"admin",
>   "name":"schema-edit",
>   "index":12},
> {
>   "path":"/admin/collections",
>   "params":{"collection":["testCollection"]},
>   "role":"test-role",
>   "index":14}
> {code}
> Step 2: try to remove the permission with index=14; result: a "No such index: 14" error is returned.
> *USE CASE 2:*
> There are 3 new permissions added to security.json (with indexes 13, 14 and 15):
> {code:json}
> {
>   "role":"admin",
>   "name":"schema-edit",
>   "index":12},
> {
>   "collection":"",
>   "path":"/schema/*",
>   "role":"test-role",
>   "index":13},
> {
>   "path":"/admin/collections",
>   "params":{"collection":["testCollection"]},
>   "role":"test-role",
>   "index":14},
> {
>   "path":"/admin/collections",
>   "params":{"collection":["anotherTestCollection"]},
>   "role":"test-role",
>   "index":15}
> {code}
> Step 1: remove the permission with index=13; result: the permission is deleted correctly, and security.json becomes:
> {code:json}
> {
>   "role":"admin",
>   "name":"schema-edit",
>   "index":12},
> {
>   "path":"/admin/collections",
>   "params":{"collection":["testCollection"]},
>   "role":"test-role",
>   "index":14},
> {
>   "path":"/admin/collections",
>   "params":{"collection":["anotherTestCollection"]},
>   "role":"test-role",
>   "index":15}
> {code}
> Step 2: try to remove the permission with index=14; result: the permission with index 15 is deleted, which is *wrong*

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
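[Editor's note] To illustrate the behavior the reporter expects, here is a hypothetical sketch (not Solr's actual code; method and variable names are invented) of deleting by the stored "index" value without renumbering first:

{code:java}
import java.util.List;
import java.util.Map;

class PermissionDeleteSketch {
  // Delete the permission whose stored "index" matches; leave all other
  // entries untouched so a later delete of index 14 still hits index 14.
  static void deletePermission(List<Map<String, Object>> permissions, int index) {
    boolean found = permissions.removeIf(
        p -> Integer.valueOf(index).equals(p.get("index")));  // assumes "index" is stored as Integer
    if (!found) {
      throw new IllegalArgumentException("No such index: " + index);
    }
  }
}
{code}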
[jira] [Commented] (LUCENE-9037) ArrayIndexOutOfBoundsException due to repeated IOException during indexing
[ https://issues.apache.org/jira/browse/LUCENE-9037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969319#comment-16969319 ] Michael McCandless commented on LUCENE-9037:
---
What a fun test case :) This is indeed a bug in {{IndexWriter}} ... we already added "best effort" checks to detect when a single in-memory segment ({{DocumentsWriterPerThread}}) was close to its limit, through {{IndexWriterConfig.setRAMPerThreadHardLimitMB}}, but obviously they don't detect this case properly.
I don't think we should make all {{IOException}}s aborting – that's overkill and would cause "normal" cases of {{IOException}} to abort your {{IndexWriter}} unexpectedly. On {{IOException}} I think IW should simply delete that one document because something went wrong while iterating its tokens.
I think, instead, we should fix {{DocumentsWriterPerThread}} to better detect when it has hit the {{setRAMPerThreadHardLimitMB}} limit and throw a meaningful exception, deleting the unlucky document that ran into that limit. We should improve the best-effort check we have today.

> ArrayIndexOutOfBoundsException due to repeated IOException during indexing
> --
>
> Key: LUCENE-9037
> URL: https://issues.apache.org/jira/browse/LUCENE-9037
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/index
> Affects Versions: 7.1
> Reporter: Ilan Ginzburg
> Priority: Minor
> Attachments: TestIndexWriterTermsHashOverflow.java
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> There is a limit to the number of tokens that can be held in memory by Lucene when docs are indexed using DocumentsWriter; beyond it, bad things happen. The limit can be reached by submitting a really large document, by submitting a large number of documents without doing a commit (see LUCENE-8118), or by repeatedly submitting documents that fail to get indexed in some specific ways, leading to Lucene not cleaning up the in-memory data structures that eventually overflow.
> The overflow is due to a 32-bit (signed) integer wrapping around to negative territory, then causing an ArrayIndexOutOfBoundsException.
> The failure path that we are reliably hitting is due to an IOException during doc tokenization. A tokenizer implementing TokenStream throws an exception from incrementToken() which causes indexing of that doc to fail.
> The IOException bubbles back up to DocumentsWriter.updateDocument() (or DocumentsWriter.updateDocuments() in some other cases) where it is not treated as an AbortingException; therefore it does not cause a reset of the DocumentsWriterPerThread.
> On repeated failures (without any successful indexing in between), if the upper layer (client via Solr) resubmits the doc that fails again, DocumentsWriterPerThread will eventually cause TermsHashPerField data structures to grow and overflow, leading to an exception stack similar to the one in LUCENE-8118 (below stack trace copied from a test run repro on 7.1):
> java.lang.ArrayIndexOutOfBoundsException: -65536
> at __randomizedtesting.SeedInfo.seed([394FAB2B91B1D90A:C86FB3F3CE001AA8]:0)
> at org.apache.lucene.index.TermsHashPerField.writeByte(TermsHashPerField.java:198)
> at org.apache.lucene.index.TermsHashPerField.writeVInt(TermsHashPerField.java:221)
> at org.apache.lucene.index.FreqProxTermsWriterPerField.writeProx(FreqProxTermsWriterPerField.java:80)
> at org.apache.lucene.index.FreqProxTermsWriterPerField.addTerm(FreqProxTermsWriterPerField.java:171)
> at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:185)
> at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:792)
> at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:430)
> at org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:392)
> at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:239)
> at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:481)
> at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1717)
> at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1462)
> Using tokens composed only of lowercase letters, it takes less than 130,000,000 different tokens (the shortest ones) to overflow TermsHashPerField.
> Using a single document (composed of the 20,000 shortest lowercase tokens) submitted repeatedly for indexing, it requires 6352 submissions, all failing with an IOException on incrementToken(), to trigger the ArrayIndexOutOfBoundsException.
> A proposed fix is to treat in DocumentsWriter.updateDo
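[Editor's note] A minimal sketch of the reproduction path described above; class and method names are illustrative, not taken from the attached test. It mirrors the issue's setup: a stream that emits many distinct short tokens and then fails, resubmitted many times without a commit.

{code:java}
import java.io.IOException;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;

class OverflowReproSketch {

  // Emits 20,000 distinct short tokens, then fails, mimicking the
  // tokenizer failure described in the issue.
  static final class FailAfterTokens extends TokenStream {
    private final CharTermAttribute term = addAttribute(CharTermAttribute.class);
    private int emitted = 0;

    @Override
    public boolean incrementToken() throws IOException {
      if (emitted >= 20_000) {
        throw new IOException("simulated tokenization failure");
      }
      clearAttributes();
      term.setEmpty().append("tok" + (emitted++));
      return true;
    }

    @Override
    public void reset() throws IOException {
      super.reset();
      emitted = 0;
    }
  }

  // Each failed submission leaves the already-indexed tokens behind in the
  // DocumentsWriterPerThread; per the issue, ~6352 attempts overflow
  // TermsHashPerField's signed-int offsets.
  static void hammer(IndexWriter writer, int attempts) {
    for (int i = 0; i < attempts; i++) {
      Document doc = new Document();
      doc.add(new TextField("body", new FailAfterTokens()));
      try {
        writer.addDocument(doc);
      } catch (IOException expected) {
        // not treated as an aborting exception, so the DWPT is not reset (the bug)
      }
    }
  }
}
{code}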
[jira] [Commented] (SOLR-13125) Optimize Queries when sorting by router.field
[ https://issues.apache.org/jira/browse/SOLR-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969325#comment-16969325 ] Gus Heck commented on SOLR-13125:
---
[~janhoy] I generally share that sentiment; the how is the tough part. I'm pretty sure at a minimum a hook of some sort needs to be added. I'm definitely always keen to factor feature-specific stuff out of central classes where it makes sense to do so.

> Optimize Queries when sorting by router.field
> -
>
> Key: SOLR-13125
> URL: https://issues.apache.org/jira/browse/SOLR-13125
> Project: Solr
> Issue Type: Sub-task
> Reporter: mosh
> Assignee: Gus Heck
> Priority: Minor
> Attachments: SOLR-13125-no-commit.patch, SOLR-13125.patch, SOLR-13125.patch, SOLR-13125.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We are currently testing TRA using Solr 7.7, having >300 shards in the alias, with much growth in the coming months.
> The "hot" data (in our case, more recent) will be stored on stronger nodes (SSD, more RAM, etc.).
> A proposal has emerged to optimize queries sorted by router.field (the field which the TRA uses to route the data to the correct collection).
> Perhaps, in queries sorted by router.field, Solr could be smart enough to wait for the more recent collections, and in case the limit was reached, cancel the other queries (or just not block and wait for their results)?
> For example:
> When querying a TRA with a filter on a different field than router.field, but sorting by router.field desc, limit=100.
> Since this is a TRA, Solr will issue queries for all the collections in the alias.
> But to optimize this particular type of query, Solr could wait for the most recent collection in the TRA and see whether the result set matches or exceeds the limit. If so, the query could be returned to the user without waiting for the rest of the shards. If not, the issuing node will block until the second query returns, and so forth, until the limit of the request is reached.
> This might also be useful for deep paging, querying each collection and only skipping to the next once there are no more results in the specified collection.
> Thoughts or inputs are always welcome.
> This is just my two cents, and I'm always happy to brainstorm.
> Thanks in advance.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
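[Editor's note] To make the proposed flow concrete, here is a hypothetical sketch of the newest-first, early-terminating fan-out the description outlines. None of these names exist in Solr; the per-collection search is stubbed out.

{code:java}
import java.util.ArrayList;
import java.util.List;

class NewestFirstQuerySketch {
  interface ShardQuery {
    // Placeholder for a per-collection search returning at most `rows` ids.
    List<String> top(String collection, int rows);
  }

  // Query TRA collections newest-first; once `limit` docs are gathered,
  // older collections cannot improve a router.field desc sort, so stop.
  static List<String> query(List<String> collectionsNewestFirst,
                            ShardQuery sq, int limit) {
    List<String> results = new ArrayList<>();
    for (String collection : collectionsNewestFirst) {
      results.addAll(sq.top(collection, limit - results.size()));
      if (results.size() >= limit) {
        break;  // skip (or cancel) queries to the remaining collections
      }
    }
    return results;
  }
}
{code}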
[jira] [Commented] (LUCENE-9027) SIMD-based decoding of postings lists
[ https://issues.apache.org/jira/browse/LUCENE-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969339#comment-16969339 ] Michael McCandless commented on LUCENE-9027:
---
Hmm why are your {{Term}} results crazy fast (~1400 QPS) on {{wikibigall}}? I thought {{Term}} would also show gains here, since it just walks through all postings blocks for the term, decoding and collecting?

> SIMD-based decoding of postings lists
> -
>
> Key: LUCENE-9027
> URL: https://issues.apache.org/jira/browse/LUCENE-9027
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> [~rcmuir] has been mentioning the idea for quite some time that we might be able to write the decoding logic in such a way that Java would use SIMD instructions. More recently [~paul.masurel] wrote a [blog post|https://fulmicoton.com/posts/bitpacking/] that raises the point that Lucene could still decode multiple ints at once in a single instruction by packing two ints in a long, and we've had some discussions about what we could try in Lucene to speed up the decoding of postings. This made me want to look a bit deeper at what we could do.
> Our current decoding logic reads data in a byte[] and decodes packed integers from it. Unfortunately it doesn't make use of SIMD instructions and looks like [this|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/NaiveByteDecoder.java].
> I confirmed by looking at the generated assembly that if I take an array of integers and shift them all by the same number of bits then Java will use SIMD instructions to shift multiple integers at once. This led me to writing this [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SimpleSIMDDecoder.java] that tries as much as possible to shift long sequences of ints by the same number of bits to speed up decoding. It is indeed faster than the current logic we have, up to about 2x faster for some numbers of bits per value.
> Currently the best [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SIMDDecoder.java] I've been able to come up with combines the above idea with the idea that Paul mentioned in his blog, which consists of emulating SIMD from Java by packing multiple integers into a long: 2 ints, 4 shorts or 8 bytes. It is a bit harder to read but gives another speedup on top of the above implementation.
> I have a [JMH benchmark|https://github.com/jpountz/decode-128-ints-benchmark/] available in case someone would like to play with this and maybe even come up with an even faster implementation. It is 2-2.5x faster than our current implementation for most numbers of bits per value. I'm copying results here:
> {noformat}
> * `readLongs` just reads 2*bitsPerValue longs from the ByteBuffer; it serves as a baseline.
> * `decodeNaiveFromBytes` reads a byte[] and decodes from it. This is what the current Lucene codec does.
> * `decodeNaiveFromLongs` decodes from longs on the fly.
> * `decodeSimpleSIMD` is a simple implementation that relies on how Java recognizes some patterns and uses SIMD instructions.
> * `decodeSIMD` is a more complex implementation that both relies on the C2 compiler to generate SIMD instructions and encodes 8 bytes, 4 shorts or 2 ints in a long in order to decompress multiple values at once.
> Benchmark                                       (bitsPerValue) (byteOrder)  Mode  Cnt   Score   Error  Units
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               1          LE  thrpt   5  12.912 ± 0.393  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               1          BE  thrpt   5  12.862 ± 0.395  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               2          LE  thrpt   5  13.040 ± 1.162  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               2          BE  thrpt   5  13.027 ± 0.270  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               3          LE  thrpt   5  12.409 ± 0.637  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               3          BE  thrpt   5  12.268 ± 0.947  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               4          LE  thrpt   5  14.177 ± 2.263  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               4          BE  thrpt   5  11.457 ± 0.150  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               5          LE  thrpt   5  10.988 ± 1.179  ops/us
> PackedIntsDecodeBenchmark.decodeNaiveFromBytes               5
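[Editor's note] To make the two ideas in the issue concrete, here is a small illustrative Java sketch (not taken from the benchmark repository). A plain loop applying the same shift to every element is the shape C2 can auto-vectorize; packing two ints into one long lets a single 64-bit shift move two values at once (the "emulated SIMD" trick).

{code:java}
final class ShiftSketch {
  // Auto-vectorizable: the same shift applied across the whole array.
  static void shiftAll(int[] values, int shift) {
    for (int i = 0; i < values.length; i++) {
      values[i] >>>= shift;
    }
  }

  // Emulated SIMD: one 64-bit shift operates on two packed 32-bit lanes.
  // The caller supplies a lane mask so bits do not bleed between lanes.
  static long shiftTwoPacked(long twoInts, int shift, long laneMask) {
    return (twoInts >>> shift) & laneMask;
  }
}
{code}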
[GitHub] [lucene-solr] HoustonPutman commented on issue #984: SOLR-12217: Support shards.preference for individual shard requests
HoustonPutman commented on issue #984: SOLR-12217: Support shards.preference for individual shard requests URL: https://github.com/apache/lucene-solr/pull/984#issuecomment-551215512 Want to hear your opinion @tflobbe . So I changed the logic for [non-update/admin/v2 request routing](https://github.com/apache/lucene-solr/pull/984/files#diff-e188c2b5189b7a912c7c77b50beddf69L1102) to use core URLs instead of collection URLs, unless the request is querying multiple collections. In this way, if there are certain replicas on a host that you want to target you can make that distinction. However that does change the default behavior in a non-trivial way. The only place where I see this breaks any test is weirdly enough in the [Hadoop auth tests for sending a commit request](https://github.com/apache/lucene-solr/blob/master/solr/core/src/test/org/apache/solr/security/hadoop/TestSolrCloudWithHadoopAuthPlugin.java#L127). It seems that since the commit request is sent directly to a core, instead of the collection url, it is missing an additional count of unauthenticated requests. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13894) Solr 8.3 streaming expreessions do not return all fields (select)
[ https://issues.apache.org/jira/browse/SOLR-13894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969497#comment-16969497 ] Jörn Franke commented on SOLR-13894:
---
Yes, the quotes in the above streaming expression are wrong because I wrote it from my mobile. I have the same streaming expression as you, and in Solr 8.3 it returns only the field "id" but not "found" (no error/exception!). In Solr 8.2 it returns "id" and "found". Did you test with Solr 8.3, and can you confirm that it returns both fields for you, or did you test another Solr version?

> Solr 8.3 streaming expreessions do not return all fields (select)
> -
>
> Key: SOLR-13894
> URL: https://issues.apache.org/jira/browse/SOLR-13894
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud, streaming expressions
> Affects Versions: 8.3.0
> Reporter: Jörn Franke
> Priority: Major
>
> I use streaming expressions, e.g.
> sort(select(search(...),id,if(eq(1,1),Y,N) as found), by="field A asc")
> (Using the export handler; sort is not really mandatory, I will remove it later anyway.)
> This works perfectly fine if I use Solr 8.2.0 (server + client). It returns tuples in the form { "id":"12345", "found":"Y" }
> However, if I use Solr 8.2.0 as server and Solr 8.3.0 as client then the above statement only returns the id field, but not the "found" field.
> Questions:
> 1) Is this expected behavior, i.e. Solr client 8.3.0 is in this case not compatible with Solr 8.2.0, and a server upgrade to Solr 8.3.0 will fix this?
> 2) Has the syntax for the above expression changed? If so, how?
> 3) Is this not expected behavior and I should create a Jira for it?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13782) Make HTML Ref Guide the primary release vehicle instead of PDF
[ https://issues.apache.org/jira/browse/SOLR-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13782: - Fix Version/s: (was: 8.3) 8.4 > Make HTML Ref Guide the primary release vehicle instead of PDF > -- > > Key: SOLR-13782 > URL: https://issues.apache.org/jira/browse/SOLR-13782 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: documentation >Reporter: Cassandra Targett >Assignee: Cassandra Targett >Priority: Major > Fix For: 8.4 > > Time Spent: 10m > Remaining Estimate: 0h > > As discussed in a recent mail thread [1], we have agreed that it is time for > us to stop treating the PDF version of the Ref Guide as the "official" > version and instead emphasize the HTML version as the official version. > The arguments for/against this decision are in the linked thread, but for the > purpose of this issue there are a couple of things to do: > - Modify the publication process docs (under > {{solr/solr-ref-guide/src/meta-docs}} > - Announce to the solr-user list that this is happening > A separate issue will be created to automate parts of the publication > process, since they require some discussion and possibly coordination with > Infra on the options there. > [1] > https://lists.apache.org/thread.html/f517b3b74a0a33e5e6fa87e888459fc007decc49d27a4f49822ca2ee@%3Cdev.lucene.apache.org%3E -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13665) Connecting to ZK on SSL port (secureClient: ClassNotDef found error)
[ https://issues.apache.org/jira/browse/SOLR-13665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13665:
-
Fix Version/s: (was: 8.3.0) 8.3

> Connecting to ZK on SSL port (secureClient: ClassNotDef found error)
>
> Key: SOLR-13665
> URL: https://issues.apache.org/jira/browse/SOLR-13665
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 8.2
> Reporter: Jörn Franke
> Assignee: Jan Høydahl
> Priority: Blocker
> Fix For: 8.3
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> I managed to set up Zookeeper 3.5.5 with secureClient enabled and configured the zookeeper properties in solr.in.sh to use that port, which offers SSL.
> However, I see the following error in the logfiles when starting up Solr:
> 2019-07-30 14:59:09.704 INFO (main) [ ] o.a.z.c.X509Util Setting -Djdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
> 2019-07-30 14:59:09.710 ERROR (main) [ ] o.a.s.s.SolrDispatchFilter Could not start Solr. Check solr/home property and the logs
> 2019-07-30 14:59:09.743 ERROR (main) [ ] o.a.s.c.SolrCore null:java.lang.NoClassDefFoundError: io/netty/channel/ChannelHandler
> at java.base/java.lang.Class.forName0(Native Method)
> at java.base/java.lang.Class.forName(Class.java:315)
> at org.apache.zookeeper.ZooKeeper.getClientCnxnSocket(ZooKeeper.java:3063)
> at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:883)
> at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:801)
> at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:950)
> at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:688)
> at org.apache.solr.common.cloud.SolrZooKeeper.<init>(SolrZooKeeper.java:43)
> at org.apache.solr.common.cloud.ZkClientConnectionStrategy.createSolrZooKeeper(ZkClientConnectionStrategy.java:105)
> at org.apache.solr.common.cloud.DefaultConnectionStrategy.connect(DefaultConnectionStrategy.java:37)
> at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:166)
> at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:125)
> at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:120)
> at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:107)
> at org.apache.solr.servlet.SolrDispatchFilter.loadNodeConfig(SolrDispatchFilter.java:282)
> at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:259)
> at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:181)
> at org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:136)
> at org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$0(ServletHandler.java:750)
> at java.base/java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
> at java.base/java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:734)
> at java.base/java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:734)
> at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
> at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:744)
> at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:369)
> at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1497)
> at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1459)
> at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:854)
> at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:278)
> at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:545)
> at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
> at org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:46)
> at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:192)
> at org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:510)
> at org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:153)
> at org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:172)
> at org.eclipse.jetty.deploy.providers.WebAppProvider.fileAdded(WebAppProvider.java:436)
> at org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:65)
> at org.e
[jira] [Updated] (SOLR-13835) HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
[ https://issues.apache.org/jira/browse/SOLR-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13835:
-
Fix Version/s: (was: 8.3.0) 8.3

> HttpSolrCall produces incorrect extra AuditEvent on AuthorizationResponse.PROMPT
>
> Key: SOLR-13835
> URL: https://issues.apache.org/jira/browse/SOLR-13835
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Authentication, Authorization
> Reporter: Chris M. Hostetter
> Assignee: Jan Høydahl
> Priority: Major
> Fix For: 8.3
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> spinning this out of SOLR-13741...
> {quote}
> Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe there is a code bug, not a test bug. In HttpSolrCall#471 in the {{authorize()}} call, if authResponse == PROMPT, it will actually match both blocks and emit two audit events:
> [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493]
> {code:java}
> if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...}
> if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && !(authResponse.statusCode == HttpStatus.SC_OK)) {...}
> {code}
> When code==401, it is also true that code!=200. Intuitively there should be both a sendError and a {{return RETURN}} before line #484 in the first if block?
> {quote}
> This causes any and all {{REJECTED}} AuditEvent messages to be accompanied by a corresponding {{UNAUTHORIZED}} AuditEvent.
> It's not yet clear if, from the perspective of the external client, there are any other bugs in behavior (TBD)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
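[Editor's note] A sketch of the control flow the quoted comment suggests (not the committed fix): returning right after the 401/PROMPT case means the generic error branch below can no longer fire a second audit event for the same request. Identifier names are taken from the quoted snippet; the sendError signature is assumed.

{code:java}
// Inside HttpSolrCall.authorize(), conceptually:
if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {
  // ... emit the REJECTED audit event and send the 401 challenge ...
  sendError(401, "Authentication required");
  return RETURN;                          // stop here; do not fall through
}
if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED)
    && !(authResponse.statusCode == HttpStatus.SC_OK)) {
  // ... emit the UNAUTHORIZED audit event only for genuinely forbidden requests ...
}
{code}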
[jira] [Updated] (SOLR-13899) zkstatus page incorrectly reports zookeeper in error when Zookeeper observers are present
[ https://issues.apache.org/jira/browse/SOLR-13899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13899:
-
Affects Version/s: (was: 8.3.0) 8.3

> zkstatus page incorrectly reports zookeeper in error when Zookeeper observers are present
> -
>
> Key: SOLR-13899
> URL: https://issues.apache.org/jira/browse/SOLR-13899
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 8.3
> Reporter: Salvatore
> Priority: Trivial
> Labels: easyfix
> Attachments: zkstatus.png
>
> When a zookeeper ensemble has 'observers', the zkstatus page incorrectly says the Zookeeper status is in error (see attachment).
> This is because the [ZookeeperStatusHandler|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/admin/ZookeeperStatusHandler.java] does not account for the '[observer|https://zookeeper.apache.org/doc/current/zookeeperObservers.html]' role whatsoever.
> This should be an easy fix. I see two options:
> 1. Treat observers as followers by changing [L112|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/admin/ZookeeperStatusHandler.java#L112] to
> {code:java}
> if ("follower".equals(state) || "observer".equals(state)) {
> {code}
> 2. Ignore observers in the required follower count by changing [L116|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/handler/admin/ZookeeperStatusHandler.java#L116] to
> {code:java}
> reportedFollowers = Integer.parseInt(String.valueOf(stat.get("zk_synced_followers")));
> {code}
> Option 1 will make the zkstatus page show an error when an observer is down; option 2 will not.
> *Ideally*, additional logic to account for observers should be added, showing a STATUS_YELLOW when any observers are down (but followers are all up), as this means the ensemble is in a degraded but functional state.
> Happy to create a PR; however, I don't have a lot of free time at home at the moment, so it may take a week or two.
>
> Additional info:
> See below for example mntr output for the Leader/Follower/Observer roles, noting the Leader's zk_followers and zk_synced_followers values, and the values of zk_server_state.
> Leader:
> {code:java}
> [root@master1 ~]# echo mntr | nc master3 12181
> zk_version 3.5.6-c11b7e26bc554b8523dc929761dd28808913f091, built on 10/08/2019 20:18 GMT
> zk_avg_latency 0
> zk_max_latency 2
> zk_min_latency 0
> zk_packets_received 97
> zk_packets_sent 96
> zk_num_alive_connections 2
> zk_outstanding_requests 0
> zk_server_state leader
> zk_znode_count 92
> zk_watch_count 7
> zk_ephemerals_count 9
> zk_approximate_data_size 236333
> zk_open_file_descriptor_count 64
> zk_max_file_descriptor_count 4096
> zk_followers 4
> zk_synced_followers 2
> zk_pending_syncs 0
> zk_last_proposal_size -1
> zk_max_proposal_size -1
> zk_min_proposal_size -1
> {code}
> Follower:
> {code:java}
> [root@master1 ~]# echo mntr | nc master2 12181
> zk_version 3.5.6-c11b7e26bc554b8523dc929761dd28808913f091, built on 10/08/2019 20:18 GMT
> zk_avg_latency 0
> zk_max_latency 6
> zk_min_latency 0
> zk_packets_received 97
> zk_packets_sent 96
> zk_num_alive_connections 2
> zk_outstanding_requests 0
> zk_server_state follower
> zk_znode_count 92
> zk_watch_count 7
> zk_ephemerals_count 9
> zk_approximate_data_size 236333
> zk_open_file_descriptor_count 60
> zk_max_file_descriptor_count 4096
> {code}
> Observer:
> {code:java}
> [root@master1 ~]# echo mntr | nc slave1 12181
> zk_version 3.5.6-c11b7e26bc554b8523dc929761dd28808913f091, built on 10/08/2019 20:18 GMT
> zk_avg_latency 0
> zk_max_latency 8
> zk_min_latency 0
> zk_packets_received 174
> zk_packets_sent 173
> zk_num_alive_connections 2
> zk_outstanding_requests 0
> zk_server_state observer
> zk_znode_count 92
> zk_watch_count 7
> zk_ephemerals_count 9
> zk_approximate_data_size 236333
> zk_open_file_descriptor_count 59
> zk_max_file_descriptor_count 4096
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
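[Editor's note] The "ideally" option above boils down to a three-state decision. Here is a hypothetical helper illustrating it (not ZookeeperStatusHandler code; it assumes followers and observers can be counted separately from each node's mntr output):

{code:java}
class EnsembleStatusSketch {
  static String ensembleStatus(int missingFollowers, int missingObservers) {
    if (missingFollowers > 0) {
      return "red";     // voting members missing: ensemble genuinely unhealthy
    }
    if (missingObservers > 0) {
      return "yellow";  // only observers missing: degraded but functional
    }
    return "green";
  }
}
{code}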
[jira] [Updated] (SOLR-13843) The MOVEREPLICA API ignores replica type and always adds 'nrt' replicas
[ https://issues.apache.org/jira/browse/SOLR-13843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13843:
-
Fix Version/s: (was: 8.3.0)

> The MOVEREPLICA API ignores replica type and always adds 'nrt' replicas
> ---
>
> Key: SOLR-13843
> URL: https://issues.apache.org/jira/browse/SOLR-13843
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Amrit Sarkar
> Assignee: Shalin Shekhar Mangar
> Priority: Major
> Fix For: master (9.0), 8.3
>
> Attachments: SOLR-13843.patch
>
> The MOVEREPLICA API in Solr always creates an NRT-type replica in the target node, no matter what type the source replica is. This leads to an inconsistent cluster state.
> MoveReplicaCmd
> {code}
> ZkNodeProps addReplicasProps = new ZkNodeProps(
>     COLLECTION_PROP, coll.getName(),
>     SHARD_ID_PROP, slice.getName(),
>     CoreAdminParams.NODE, targetNode,
>     CoreAdminParams.NAME, newCoreName);
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
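[Editor's note] A sketch of the kind of change the issue implies (not the committed patch): carry the source replica's type through to the add-replica call instead of letting it default to NRT. The ZkStateReader.REPLICA_TYPE key and the `replica` variable are assumptions about the surrounding MoveReplicaCmd code.

{code:java}
ZkNodeProps addReplicasProps = new ZkNodeProps(
    COLLECTION_PROP, coll.getName(),
    SHARD_ID_PROP, slice.getName(),
    CoreAdminParams.NODE, targetNode,
    CoreAdminParams.NAME, newCoreName,
    ZkStateReader.REPLICA_TYPE, replica.getType().name());  // preserve source type
{code}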
[jira] [Updated] (SOLR-13894) Solr 8.3 streaming expreessions do not return all fields (select)
[ https://issues.apache.org/jira/browse/SOLR-13894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13894:
-
Affects Version/s: (was: 8.3.0) 8.3

> Solr 8.3 streaming expreessions do not return all fields (select)
> -
>
> Key: SOLR-13894
> URL: https://issues.apache.org/jira/browse/SOLR-13894
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud, streaming expressions
> Affects Versions: 8.3
> Reporter: Jörn Franke
> Priority: Major
>
> I use streaming expressions, e.g.
> sort(select(search(...),id,if(eq(1,1),Y,N) as found), by="field A asc")
> (Using the export handler; sort is not really mandatory, I will remove it later anyway.)
> This works perfectly fine if I use Solr 8.2.0 (server + client). It returns tuples in the form { "id":"12345", "found":"Y" }
> However, if I use Solr 8.2.0 as server and Solr 8.3.0 as client then the above statement only returns the id field, but not the "found" field.
> Questions:
> 1) Is this expected behavior, i.e. Solr client 8.3.0 is in this case not compatible with Solr 8.2.0, and a server upgrade to Solr 8.3.0 will fix this?
> 2) Has the syntax for the above expression changed? If so, how?
> 3) Is this not expected behavior and I should create a Jira for it?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13106) Multiple mlt.fl does not work well if the termvectors is repeated
[ https://issues.apache.org/jira/browse/SOLR-13106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13106:
-
Affects Version/s: (was: 5.5.6) 5.5.5

> Multiple mlt.fl does not work well if the termvectors is repeated
> -
>
> Key: SOLR-13106
> URL: https://issues.apache.org/jira/browse/SOLR-13106
> Project: Solr
> Issue Type: Bug
> Components: MoreLikeThis
> Affects Versions: 5.5.5
> Reporter: luyi
> Priority: Minor
>
> For example, my data is:
> {
>   "id":"100079750",
>   "title":"I like cat, don't like dog",
>   "tags":["cat"],
>   "desc":["my cat photo"]
> }
> The tokenizer for title and desc is IK, and the tags field's type is text_ws.
> When using mlt.fl=title,tags,desc with debugQuery, the result shows:
> "interestingTerms":[ "desc:my",1.0, "desc:photo",1.0, "desc:don",1.0, "title:dog",1.0, "desc:cat",1.0, "title:like",1.0],
> "debug":{
> "rawquerystring":"id:61",
> "querystring":"id:61",
> "parsedquery":"desc:my desc:photo desc:don title:dog desc:cat title:like",
> "parsedquery_toString":"desc:my desc:photo desc:don title:dog desc:cat title:like",
> ..
> Look at the word "cat": it appears in the fields tags, desc, and title, but the result shows it used only in the desc field; it was ignored in tags and title.
> I found the cause: when a term is repeated in more than one field, only one of those fields is used in the MLT query.
> Conversely, a term that exists only in the tags field is sometimes attributed to another field such as title or desc while doing the MLT, even though it never appears in those fields!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13094) NPE while doing regular Facet
[ https://issues.apache.org/jira/browse/SOLR-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13094: - Affects Version/s: (was: 7.5.0) 7.5 > NPE while doing regular Facet > - > > Key: SOLR-13094 > URL: https://issues.apache.org/jira/browse/SOLR-13094 > Project: Solr > Issue Type: Bug > Components: Facet Module >Affects Versions: 7.5 >Reporter: Amrit Sarkar >Priority: Major > > I am issuing a regular facet query: > {code} > params = new ModifiableSolrParams() > .add("q", query.trim()) > .add("rows", "0") > .add("facet", "true") > .add("facet.field", "description") > .add("facet.limit", "200"); > {code} > Exception: > {code} > 2018-12-24 15:50:20.843 ERROR (qtp690521419-130) [c:wiki s:shard2 > r:core_node4 x:wiki_shard2_replica_n2] o.a.s.s.HttpSolrCall > null:org.apache.solr.common.SolrException: Exception during facet.field: > description > at > org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(SimpleFacets.java:832) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at org.apache.solr.request.SimpleFacets$3.execute(SimpleFacets.java:765) > at > org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:841) > at > org.apache.solr.handler.component.FacetComponent.getFacetCounts(FacetComponent.java:329) > at > org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:273) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2541) > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:709) > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:515) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257) > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595) > at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1317) > at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473) > at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564) > at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201) > at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1219) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144) > at > org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219) > at > org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126) > at > 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) > at > org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335) > at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) > at org.eclipse.jetty.server.Server.handle(Server.java:531) > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:352) > at > org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260) > at > org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281) > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102) > at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118) > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333) > at > org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310) >
[jira] [Updated] (SOLR-13026) Admin UI - dataimport status has green bar even when import fails
[ https://issues.apache.org/jira/browse/SOLR-13026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13026: - Affects Version/s: (was: 7.5.0) 7.5 > Admin UI - dataimport status has green bar even when import fails > - > > Key: SOLR-13026 > URL: https://issues.apache.org/jira/browse/SOLR-13026 > Project: Solr > Issue Type: Bug > Components: Admin UI >Affects Versions: 7.5 > Environment: For screenshot, Solr 7.5.0 on Windows. > The production setup is a patched 7.1.0 version on Linux. >Reporter: Shawn Heisey >Priority: Major > Attachments: DIH-failed-UI-green.png > > > In the admin UI, the dataimport status screen is showing a green status bar > even when an import fails to run. The error that occurred in attached > screenshot was a connection problem -- in this case the database didn't > exist. I have seen this in production when a URL for SQL Server is > incorrect. The raw status output clearly shows "Full import failed". > I believe that the status should show in a different color, probably red. > There is an icon of a green check mark in the status also. For those who are > color blind, that should change to an icon with an X in it for a visual > indicator not related to color. > I am painfully aware of how terrible the DIH status output is. It is great > for human readability, but extremely difficult for a computer to understand. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13126) Multiplicative boost of isn't applied when one of the summed or multiplied queries doesn't match
[ https://issues.apache.org/jira/browse/SOLR-13126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13126:
-
Affects Version/s: (was: 7.5.0) 7.5

> Multiplicative boost of isn't applied when one of the summed or multiplied queries doesn't match
> -
>
> Key: SOLR-13126
> URL: https://issues.apache.org/jira/browse/SOLR-13126
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 7.3, 7.4, 7.5, 7.6, 7.7, 7.7.1
> Environment: Reproduced with macOS 10.14.1, a quick test with Windows 10 showed the same result.
> Reporter: Thomas Aglassinger
> Assignee: Alan Woodward
> Priority: Major
> Fix For: 7.7.2, 8.0
>
> Attachments: 0001-use-deprecated-classes-to-fix-regression-introduced-.patch, 0002-SOLR-13126-Added-test-case.patch, 2019-02-14_1715.png, SOLR-13126.patch, SOLR-13126.patch, debugQuery.json, image-2019-02-13-16-17-56-272.png, screenshot-1.png, solr_match_neither_nextteil_nor_sony.json, solr_match_neither_nextteil_nor_sony.txt, solr_match_netzteil_and_sony.json, solr_match_netzteil_and_sony.txt, solr_match_netzteil_only.json, solr_match_netzteil_only.txt
>
> Under certain circumstances, search results from queries with multiple multiplicative boosts using the Solr functions {{product()}} and {{query()}} result in a score that is inconsistent with the one from the debugQuery information. Only the debug score is correct, while the actual search results show a wrong score.
> This seems somewhat similar to the behaviour described in https://issues.apache.org/jira/browse/LUCENE-7132, though that issue was resolved a while ago.
> A little background: we are using Solr as a search platform for the e-commerce framework SAP Hybris. There the shop administrator can create multiplicative boost rules (see below for an example) where a value like 2.0 means that an item gets boosted to 200%. This works fine in the demo shop distributed by SAP but breaks in our shop. We encountered the issue when upgrading from Solr 7.2.1 / Hybris 6.7 to Solr 7.5 / Hybris 18.8.3 (which would have been named Hybris 6.8 but the version naming schema changed).
> We reduced the Solr query generated by Hybris to the relevant parts and could reproduce the issue in the Solr admin without any Hybris connection.
> I attached the JSON result of a test query, but here's a description of the parts that seemed most relevant to me.
> The {{responseHeader.params}} reads (slightly rearranged):
> {code:java}
> "q":"{!boost b=$ymb}(+{!lucene v=$yq})",
> "ymb":"product(query({!v=\"name_text_de\\:Netzteil\\^=2.0\"},1),query({!v=\"name_text_de\\:Sony\\^=3.0\"},1))",
> "yq":"*:*",
> "sort":"score desc",
> "debugQuery":"true",
> // Added to keep the output small but probably unrelated to the actual issue
> "fl":"score,id,code_string,name_text_de",
> "fq":"catalogId:\"someProducts\"",
> "rows":"10",
> {code}
> This example boosts the German product name (field {{name_text_de}}) in case it contains certain terms:
> * "Netzteil" (power supply) is boosted to 200%
> * "Sony" is boosted to 300%
> Consequently a product containing both terms should be boosted to 600%.
> Also, the query function has the value 1 specified as default in case the name does not contain the respective term, resulting in a pseudo boost that preserves the score.
> According to the debug information the parser used is the LuceneQParser, > which translates this to the following parsed query: > {quote}FunctionScoreQuery(FunctionScoreQuery(+*:*, scored by > boost(product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0),query((ConstantScore(name_text_de:sony))^3.0,def=1.0) > {quote} > And the translated boost is: > {quote}org.apache.lucene.queries.function.valuesource.ProductFloatFunction:product(query((ConstantScore(name_text_de:netzteil))^2.0,def=1.0),query((ConstantScore(name_text_de:sony))^3.0,def=1.0)) > {quote} > When taking a look at the search result, among other the following products > are included (see the JSON comments for an analysis of each result): > {code:javascript} > { > "id":"someProducts/Online/test711", > "name_text_de":"Original Sony Vaio Netzteil", > "code_string":"test711", > // CORRECT, both "Netzteil" and "Sony" are included in the name > "score":6.0}, > { > "id":"someProducts/Online/taxTestingProductThree", > "name_text_de":"Steuertestprodukt Zwei", > "code_string":"taxTestingProductThree", > // CORRECT, neither "Netzteil" nor "Sony" are included in the name > "score":1.0}, > { > "id":"someProduct
[jira] [Updated] (SOLR-13699) maxChars no longer working on CopyField with Javabin
[ https://issues.apache.org/jira/browse/SOLR-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13699: - Affects Version/s: (was: 8.0.1) > maxChars no longer working on CopyField with Javabin > > > Key: SOLR-13699 > URL: https://issues.apache.org/jira/browse/SOLR-13699 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 7.7, 7.7.1, 7.7.2, 8.0, 8.1, 8.2, 7.7.3, 8.1.1, 8.1.2 >Reporter: Chris Troullis >Assignee: Noble Paul >Priority: Major > Fix For: 8.3 > > Attachments: SOLR-13699.patch, SOLR-13699.patch > > Time Spent: 20m > Remaining Estimate: 0h > > We recently upgraded from Solr 7.3 to 8.1, and noticed that the maxChars > property on a copy field is no longer functioning as designed, while indexing > via SolrJ. Per the most recent documentation it looks like there have been no > intentional changes as to the functionality of this property, so I assume > this is a bug. > > In debugging the issue, it looks like the bug was caused by SOLR-12992. In > DocumentBuilder where the maxChar limit is applied, it first checks if the > value is instanceof String. As of SOLR-12992, string values are now coming in > as ByteArrayUtf8CharSequence (unless they are above a certain size as defined > by JavaBinCodec.MAX_UTF8_SZ), so they are failing the instanceof String > check, and the maxChar truncation is not being applied. > > The issue seems to be limited to Javabin, docs indexed in other formats > (where values come in as strings) are working fine. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
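[Editor's note] A sketch of the kind of fix the description implies (not the committed patch): test against CharSequence so ByteArrayUtf8CharSequence values arriving via javabin get truncated too, not just plain Strings. The `srcFieldValue` and `maxChars` names are placeholders for the corresponding DocumentBuilder locals.

{code:java}
Object v = srcFieldValue;
if (v instanceof CharSequence) {          // matches String *and* ByteArrayUtf8CharSequence
  CharSequence cs = (CharSequence) v;
  if (maxChars > 0 && cs.length() > maxChars) {
    v = cs.subSequence(0, maxChars).toString();  // apply the copyField maxChars limit
  }
}
{code}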
[jira] [Updated] (SOLR-13385) Upgrade dependency jackson-databind in solr package contrib/prometheus-exporter/lib
[ https://issues.apache.org/jira/browse/SOLR-13385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13385: - Affects Version/s: (was: 8.0.1) 8.0 > Upgrade dependency jackson-databind in solr package > contrib/prometheus-exporter/lib > --- > > Key: SOLR-13385 > URL: https://issues.apache.org/jira/browse/SOLR-13385 > Project: Solr > Issue Type: Bug >Affects Versions: 7.6, 8.0 >Reporter: DW >Assignee: Kevin Risden >Priority: Major > Fix For: 8.1, master (9.0) > > > The current used jackson-databind in > /contrib/prometheus-exporter/lib/jackson-databind-2.9.6.jar has known > Security Vulnerabilities record. Please upgrade to 2.9.8+. Thanks. > > Please let me know if you would like detailed CVE records. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13838) igain query parser generating invalid output
[ https://issues.apache.org/jira/browse/SOLR-13838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cassandra Targett updated SOLR-13838: - Fix Version/s: (was: 8.3) > igain query parser generating invalid output > > > Key: SOLR-13838 > URL: https://issues.apache.org/jira/browse/SOLR-13838 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query parsers >Affects Versions: 8.2 > Environment: The issue is a generic Java defect and therefore will be > independent of the operating system or software platform. >Reporter: Peter Davie >Priority: Major > Attachments: IGainTermsQParserPlugin.java.patch > > > Investigating the output from the "features()" stream source, terms are being > returned with NaN for the score_f field: > "docs": [ > { > "featureSet_s": "business", > "score_f": "NaN", > "term_s": "1,011.15", > "idf_d": "-Infinity", > "index_i": 1, > "id": "business_1" > }, > { > "featureSet_s": "business", > "score_f": "NaN", > "term_s": "10.3m", > "idf_d": "-Infinity", > "index_i": 2, > "id": "business_2" > }, > { > "featureSet_s": "business", > "score_f": "NaN", > "term_s": "01", > "idf_d": "-Infinity", > "index_i": 3, > "id": "business_3" > },... > Looking into{{ org/apache/solr/search/IGainTermsQParserPlugin.java}}, it > seems that when a term is not included in the positive or negative documents, > the docFreq calculation (docFreq = xc + nc) is 0, which means that subsequent > calculations result in NaN (division by 0). > Attached is a patch which skips terms for which docFreq > is 0 in the finish() method of IGainTermsQParserPlugin and this resolves the > issues with NaN scores in the features() output. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
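A minimal sketch of the guard the report outlines; the class and method names are hypothetical and the attached patch may differ:

{code:java}
import java.util.Map;

// Hypothetical sketch: a term absent from both the positive and negative
// document sets has docFreq == 0, so dividing by it yields the NaN scores
// seen in the features() output. Skipping such terms avoids the problem.
class IGainDocFreqGuard {
  static void collectTerm(String term, int xc, int nc, Map<String, Double> out) {
    int docFreq = xc + nc;
    if (docFreq == 0) {
      return; // term matched neither set; emitting it would produce NaN
    }
    out.put(term, (double) xc / docFreq); // division by docFreq is now safe
  }
}
{code}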
[jira] [Commented] (SOLR-12217) Add support for shards.preference in single shard cases
[ https://issues.apache.org/jira/browse/SOLR-12217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969548#comment-16969548 ] Noble Paul commented on SOLR-12217: --- Can you please give me examples of how a user would use it? Over http as well as SolrJ? > Add support for shards.preference in single shard cases > --- > > Key: SOLR-12217 > URL: https://issues.apache.org/jira/browse/SOLR-12217 > Project: Solr > Issue Type: New Feature >Reporter: Tomas Eduardo Fernandez Lobbe >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > SOLR-11982 Added support for {{shards.preference}}, a way to define the > sorting of replicas within a shard by preference (replica types/location). > This only works on multi-shard cases. We should add support for the case of > single shards when using CloudSolrClient -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ctargett commented on issue #990: [SOLR-13885] Typo corrections.
ctargett commented on issue #990: [SOLR-13885] Typo corrections. URL: https://github.com/apache/lucene-solr/pull/990#issuecomment-551261746 I looked at this PR with an eye to commit it, but since the precommit check failed here I thought I'd just make sure on my local machine. When I download this PR as a patch to my machine, precommit fails for me locally in the same spot the GitHub check failed. I checked the changes and can't see any reason why it would fail precommit. I checked for errant EOL spaces, etc., and don't see any. Building the Ref Guide with these changes also works fine. I ran precommit on master after this precommit had failed, and it was fine so it's definitely something in this patch. Unfortunately, we can't commit this until we can get precommit to pass. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-12217) Add support for shards.preference in single shard cases
[ https://issues.apache.org/jira/browse/SOLR-12217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969555#comment-16969555 ] Houston Putman commented on SOLR-12217: --- So for single-sharded collections, you can't really do this over http, because whichever core receives the request will handle it. Therefore the SolrJ options are needed to send the request to the correct core in the first place. All you need to do is use the {{CloudSolrClient}} or {{CloudHttp2SolrClient}} and use the [shards.preference parameter|https://lucene.apache.org/solr/guide/7_4/distributed-requests.html#shards-preference-parameter] in your request. The client will take your request params and route the request accordingly. For Streaming Expressions, you can add the {{shards.preference}} parameter in the URL params with your streaming expression, or as an argument to the expression itself, e.g. {{search(collection, field:a, , shards.preference=...)}}. The PR has documentation added to the Ref Guide for both SolrJ and Streaming Expressions. > Add support for shards.preference in single shard cases > --- > > Key: SOLR-12217 > URL: https://issues.apache.org/jira/browse/SOLR-12217 > Project: Solr > Issue Type: New Feature >Reporter: Tomas Eduardo Fernandez Lobbe >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > SOLR-11982 Added support for {{shards.preference}}, a way to define the > sorting of replicas within a shard by preference (replica types/location). > This only works on multi-shard cases. We should add support for the case of > single shards when using CloudSolrClient -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
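For reference, a minimal SolrJ sketch of the usage described in the comment above; the Solr URL, collection name, and preference value are placeholders:

{code:java}
import java.util.Collections;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ShardsPreferenceSketch {
  public static void main(String[] args) throws Exception {
    // The client routes the request itself, so the preference takes effect
    // even for a single-shard collection.
    try (CloudSolrClient client = new CloudSolrClient.Builder(
        Collections.singletonList("http://localhost:8983/solr")).build()) {
      SolrQuery query = new SolrQuery("*:*");
      // Prefer PULL replicas when one is available:
      query.set("shards.preference", "replica.type:PULL");
      QueryResponse rsp = client.query("collection1", query);
      System.out.println(rsp.getResults().getNumFound());
    }
  }
}
{code}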
[GitHub] [lucene-solr] ctargett closed pull request #955: Fix ref guide for autoscaling metric trigger SOLR-13847
ctargett closed pull request #955: Fix ref guide for autoscaling metric trigger SOLR-13847 URL: https://github.com/apache/lucene-solr/pull/955 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ctargett commented on issue #955: Fix ref guide for autoscaling metric trigger SOLR-13847
ctargett commented on issue #955: Fix ref guide for autoscaling metric trigger SOLR-13847 URL: https://github.com/apache/lucene-solr/pull/955#issuecomment-551268133 The changes in this PR were committed from a patch - I didn't notice this PR then. Closing this since it's no longer needed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ctargett commented on issue #952: SOLR-12786: update Ref Guide build tooling versions
ctargett commented on issue #952: SOLR-12786: update Ref Guide build tooling versions URL: https://github.com/apache/lucene-solr/pull/952#issuecomment-551270142 So I guess I put up this PR and then made a patch for myself and committed the patch instead of merging this PR? Closing this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ctargett closed pull request #952: SOLR-12786: update Ref Guide build tooling versions
ctargett closed pull request #952: SOLR-12786: update Ref Guide build tooling versions URL: https://github.com/apache/lucene-solr/pull/952 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9027) SIMD-based decoding of postings lists
[ https://issues.apache.org/jira/browse/LUCENE-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969571#comment-16969571 ] Adrien Grand commented on LUCENE-9027: -- This QPS is consistent with what we are seeing on the nightly benchmarks http://people.apache.org/~mikemccand/lucenebench/Term.html. I think Term doesn't show a speedup because decoding postings is not the bottleneck of term queries. When running a term query, Lucene would only decode blocks of postings that have a competitive match. On the other hand, for queries like AndHighMed, the high-cardinality clause needs to decode lots of blocks, and it's not unlikely that many decoded blocks don't even translate to a match for the conjunction. > SIMD-based decoding of postings lists > - > > Key: LUCENE-9027 > URL: https://issues.apache.org/jira/browse/LUCENE-9027 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > [~rcmuir] has been mentioning the idea for quite some time that we might be > able to write the decoding logic in such a way that Java would use SIMD > instructions. More recently [~paul.masurel] wrote a [blog > post|https://fulmicoton.com/posts/bitpacking/] that raises the point that > Lucene could still do decode multiple ints at once in a single instruction by > packing two ints in a long and we've had some discussions about what we could > try in Lucene to speed up the decoding of postings. This made me want to look > a bit deeper at what we could do. > Our current decoding logic reads data in a byte[] and decodes packed integers > from it. Unfortunately it doesn't make use of SIMD instructions and looks > like > [this|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/NaiveByteDecoder.java]. > I confirmed by looking at the generated assembly that if I take an array of > integers and shift them all by the same number of bits then Java will use > SIMD instructions to shift multiple integers at once. This led me to writing > this > [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SimpleSIMDDecoder.java] > that tries as much as possible to shift long sequences of ints by the same > number of bits to speed up decoding. It is indeed faster than the current > logic we have, up to about 2x faster for some numbers of bits per value. > Currently the best > [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SIMDDecoder.java] > I've been able to come up with combines the above idea with the idea that > Paul mentioned in his blog that consists of emulating SIMD from Java by > packing multiple integers into a long: 2 ints, 4 shorts or 8 bytes. It is a > bit harder to read but gives another speedup on top of the above > implementation. > I have a [JMH > benchmark|https://github.com/jpountz/decode-128-ints-benchmark/] available in > case someone would like to play with this and maybe even come up with an even > faster implementation. It is 2-2.5x faster than our current implementation > for most numbers of bits per value. I'm copying results here: > {noformat} > * `readLongs` just reads 2*bitsPerValue longs from the ByteBuffer, it serves > as >a baseline. > * `decodeNaiveFromBytes` reads a byte[] and decodes from it. This is what the >current Lucene codec does. > * `decodeNaiveFromLongs` decodes from longs on the fly. 
> * `decodeSimpleSIMD` is a simple implementation that relies on how Java >recognizes some patterns and uses SIMD instructions. > * `decodeSIMD` is a more complex implementation that both relies on the C2 >compiler to generate SIMD instructions and encodes 8 bytes, 4 shorts or >2 ints in a long in order to decompress multiple values at once. > Benchmark (bitsPerValue) (byteOrder) > Mode Cnt Score Error Units > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 1 LE > thrpt5 12.912 ± 0.393 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 1 BE > thrpt5 12.862 ± 0.395 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 2 LE > thrpt5 13.040 ± 1.162 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 2 BE > thrpt5 13.027 ± 0.270 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 3 LE > thrpt5 12.409 ± 0.637 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 3 BE > thrpt5 12.268 ± 0.947 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 4 LE > thrpt5 14.17
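A toy sketch of the packing idea described in the issue above, not the benchmarked code from the linked repository: one long holds eight 8-bit values, so a single shift and mask operates on all eight lanes at once.

{code:java}
// Toy illustration of "SIMD emulation" via packing: decode eight
// 8-bits-per-value integers from each long with plain shifts and masks.
static void decode8BitsPerValue(long[] packed, int[] out) {
  int o = 0;
  for (long word : packed) {
    for (int shift = 56; shift >= 0; shift -= 8) {
      out[o++] = (int) ((word >>> shift) & 0xFFL);
    }
  }
}
{code}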
[jira] [Commented] (LUCENE-9027) SIMD-based decoding of postings lists
[ https://issues.apache.org/jira/browse/LUCENE-9027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969575#comment-16969575 ] Adrien Grand commented on LUCENE-9027: -- I'm curious whether you have thoughts about how the pull request specializes for the endianness of the machine that creates the index. I'm a bit unhappy about this, but on the other hand reversing bytes proved to be a bottleneck on the benchmarks that I ran. > SIMD-based decoding of postings lists > - > > Key: LUCENE-9027 > URL: https://issues.apache.org/jira/browse/LUCENE-9027 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > [~rcmuir] has been mentioning the idea for quite some time that we might be > able to write the decoding logic in such a way that Java would use SIMD > instructions. More recently [~paul.masurel] wrote a [blog > post|https://fulmicoton.com/posts/bitpacking/] that raises the point that > Lucene could still do decode multiple ints at once in a single instruction by > packing two ints in a long and we've had some discussions about what we could > try in Lucene to speed up the decoding of postings. This made me want to look > a bit deeper at what we could do. > Our current decoding logic reads data in a byte[] and decodes packed integers > from it. Unfortunately it doesn't make use of SIMD instructions and looks > like > [this|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/NaiveByteDecoder.java]. > I confirmed by looking at the generated assembly that if I take an array of > integers and shift them all by the same number of bits then Java will use > SIMD instructions to shift multiple integers at once. This led me to writing > this > [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SimpleSIMDDecoder.java] > that tries as much as possible to shift long sequences of ints by the same > number of bits to speed up decoding. It is indeed faster than the current > logic we have, up to about 2x faster for some numbers of bits per value. > Currently the best > [implementation|https://github.com/jpountz/decode-128-ints-benchmark/blob/master/src/main/java/jpountz/SIMDDecoder.java] > I've been able to come up with combines the above idea with the idea that > Paul mentioned in his blog that consists of emulating SIMD from Java by > packing multiple integers into a long: 2 ints, 4 shorts or 8 bytes. It is a > bit harder to read but gives another speedup on top of the above > implementation. > I have a [JMH > benchmark|https://github.com/jpountz/decode-128-ints-benchmark/] available in > case someone would like to play with this and maybe even come up with an even > faster implementation. It is 2-2.5x faster than our current implementation > for most numbers of bits per value. I'm copying results here: > {noformat} > * `readLongs` just reads 2*bitsPerValue longs from the ByteBuffer, it serves > as >a baseline. > * `decodeNaiveFromBytes` reads a byte[] and decodes from it. This is what the >current Lucene codec does. > * `decodeNaiveFromLongs` decodes from longs on the fly. > * `decodeSimpleSIMD` is a simple implementation that relies on how Java >recognizes some patterns and uses SIMD instructions. 
> * `decodeSIMD` is a more complex implementation that both relies on the C2 >compiler to generate SIMD instructions and encodes 8 bytes, 4 shorts or >2 ints in a long in order to decompress multiple values at once. > Benchmark (bitsPerValue) (byteOrder) > Mode Cnt Score Error Units > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 1 LE > thrpt5 12.912 ± 0.393 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 1 BE > thrpt5 12.862 ± 0.395 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 2 LE > thrpt5 13.040 ± 1.162 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 2 BE > thrpt5 13.027 ± 0.270 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 3 LE > thrpt5 12.409 ± 0.637 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 3 BE > thrpt5 12.268 ± 0.947 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 4 LE > thrpt5 14.177 ± 2.263 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 4 BE > thrpt5 11.457 ± 0.150 ops/us > PackedIntsDecodeBenchmark.decodeNaiveFromBytes 5 LE > thrpt5 10.988 ± 1.179 ops/us > PackedIntsDecodeBenchmark.decodeNaiv
[jira] [Commented] (LUCENE-9038) Evaluate Caffeine for LruQueryCache
[ https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969604#comment-16969604 ] Adrien Grand commented on LUCENE-9038: -- Hi Ben. This looks like a good summary of the issues that affect our cache. I'm quite impressed by how thorough this list is! bq. In retrospect, a separate QueryCache should be implemented. LruQueryCache declares in its contract that methods like onHit, onQueryEviction, etc. are executed under the global lock. This means implementations may rely on this exclusive read/write access to data structures, a requirement that cannot be supported efficiently. If we can build a better cache then we will find a way to transition our users to it; I wouldn't worry about the migration path yet. We'll figure something out. bq. Since the developers are experts on search, not caching, it seems justified to evaluate if an off-the-shelf library would be more helpful in terms of developer time, code complexity, and performance. We want lucene-core to be dependency-free, so we couldn't add caffeine as a dependency of lucene-core. However other options include having it as a dependency of a module that would expose a different cache implementation, reuse some of its ideas in the current cache implementation or fork the code that we need. bq. It appears that the cache's overhead can be just as much of a benefit as a liability, causing various workarounds and complexity. FYI when I implemented this cache, I went for simplicity in terms of locking, so there is certainly room for improvement. One thing that is not obvious immediately and makes implementing a query cache for Lucene a bit tricky is that it needs to be able to efficiently evict all cache entries for a given segment. This is the reason why the current implementation uses two levels of maps instead of a single map that would take (Query, CacheHelper.Key) pairs as keys. > Evaluate Caffeine for LruQueryCache > --- > > Key: LUCENE-9038 > URL: https://issues.apache.org/jira/browse/LUCENE-9038 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ben Manes >Priority: Major > > [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java] > appears to play a central role in Lucene's performance. There are many > issues discussing its performance, such as LUCENE-7235, LUCENE-7237, > LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's > overhead can be just as much of a benefit as a liability, causing various > workarounds and complexity. > When reviewing the discussions and code, the following issues are concerning: > # The cache is guarded by a single lock for all reads and writes. > # All computations are performed outside of the any > locking to avoid > penalizing other callers. This doesn't handle the cache stampedes meaning > that multiple threads may cache miss, compute the value, and try to store it. > That redundant work becomes expensive under load and can be mitigated with ~ > per-key locks. > # The cache queries the entry to see if it's even worth caching. At first > glance one assumes that is so that inexpensive entries don't bang on the lock > or thrash the LRU. However, this is also used to indicate data dependencies > for uncachable items (per JIRA), which perhaps shouldn't be invoking the > cache. > # The cache lookup is skipped if the global lock is held and the value is > computed, but not stored. 
This means a busy lock reduces performance across > all usages and the cache's effectiveness degrades. This is not counted in the > miss rate, giving a false impression. > # An attempt was made to perform computations asynchronously, due to their > heavy cost on tail latencies. That work was reverted due to test failures and > is being worked on. > # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] > tries to avoid LRU thrashing due to large, infrequently used items being > cached. > # The cache is tightly intertwined with business logic, making it hard to > tease apart core algorithms and data structures from the usage scenarios. > It seems that more and more items skip being cached because of concurrency > and hit rate performance, causing special case fixes based on knowledge of > the external code flows. Since the developers are experts on search, not > caching, it seems justified to evaluate if an off-the-shelf library would be > more helpful in terms of developer time, code complexity, and performance. > Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] > in SOLR-8241 and SOLR-13817. > The proposal is to replace the internals {{LruQueryCache}} so that external > usages are not affected in terms of the API. H
[jira] [Commented] (SOLR-11492) More Modern cloud dev script
[ https://issues.apache.org/jira/browse/SOLR-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969610#comment-16969610 ] Robert Bunch commented on SOLR-11492: - Now that 8.3 is out, please give my 9/21 version of cloud.sh a shot. As I recall I did a lot of cleanup and whatnot, and the override file. > More Modern cloud dev script > > > Key: SOLR-11492 > URL: https://issues.apache.org/jira/browse/SOLR-11492 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 8.0 >Reporter: Gus Heck >Assignee: Gus Heck >Priority: Minor > Fix For: 8.3 > > Attachments: SOLR-11492.patch, cloud.sh, cloud.sh, cloud.sh, > cloud.sh, cloud.sh, cloud.sh, cloud.sh > > > Most of the scripts in solr/cloud-dev do things like start using java -jar > and other similarly ancient techniques. I recently decided I really didn't > like that it was a pain to setup a cloud to test a patch/feature and that > often one winds up needing to blow away existing testing so working on more > than one thing at a time is irritating... so here's a script I wrote, if > folks like it I'd be happy for it to be included in solr/cloud-dev -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-11492) More Modern cloud dev script
[ https://issues.apache.org/jira/browse/SOLR-11492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969610#comment-16969610 ] Robert Bunch edited comment on SOLR-11492 at 11/7/19 10:43 PM: --- Now that 8.3 is out, please give my 9/21 version of cloud.sh a shot. As I recall I did a lot of cleanup and whatnot, and the override file. Hmm... though this ticket is closed, guess I should just open a ticket and submit the patch? was (Author: bunchr): Now that 8.3 is out, please give my 9/21 version of cloud.sh a shot. As I recall I did a lot of cleanup and whatnot, and the override file. > More Modern cloud dev script > > > Key: SOLR-11492 > URL: https://issues.apache.org/jira/browse/SOLR-11492 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 8.0 >Reporter: Gus Heck >Assignee: Gus Heck >Priority: Minor > Fix For: 8.3 > > Attachments: SOLR-11492.patch, cloud.sh, cloud.sh, cloud.sh, > cloud.sh, cloud.sh, cloud.sh, cloud.sh > > > Most of the scripts in solr/cloud-dev do things like start using java -jar > and other similarly ancient techniques. I recently decided I really didn't > like that it was a pain to setup a cloud to test a patch/feature and that > often one winds up needing to blow away existing testing so working on more > than one thing at a time is irritating... so here's a script I wrote, if > folks like it I'd be happy for it to be included in solr/cloud-dev -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13901) Update jackson-databind to 2.10.0.pr1 for security vulnerabilities
[ https://issues.apache.org/jira/browse/SOLR-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Dumont updated SOLR-13901: -- Security: Public (was: Private (Security Issue)) > Update jackson-databind to 2.10.0.pr1 for security vulnerabilities > -- > > Key: SOLR-13901 > URL: https://issues.apache.org/jira/browse/SOLR-13901 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: Build >Affects Versions: 8.3 >Reporter: Charles Dumont >Priority: Major > > This is needed to resolve the following security vulnerabilities: > [CVE-2019-14540|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14540]], > > [CVE-2019-16335|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16335]], > > [CVE-2019-16942|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16942]], > > [CVE-2019-16943|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16943]], > > [CVE-2019-17267|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17267]] > and > [CVE-2019-17531|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17531]]. > If solr is not impacted by these vulnerabilities then go ahead and > de-escalate this issue. Thanks. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9036) ExitableDirectoryReader to interrupt DocValues as well
[ https://issues.apache.org/jira/browse/LUCENE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969618#comment-16969618 ] Adrien Grand commented on LUCENE-9036: -- Besides formatting issues (constants should be final and capitalized, spaces around equals signs), the approach looks good to me. > ExitableDirectoryReader to interrupt DocValues as well > -- > > Key: LUCENE-9036 > URL: https://issues.apache.org/jira/browse/LUCENE-9036 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mikhail Khludnev >Priority: Major > Attachments: LUCENE-9036.patch > > > This allow to make AnalyticsComponent and json.facet sensitive to time > allowed. > Does it make sense? Is it enough to check on DV creation ie per field/segment > or it's worth to check every Nth doc? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
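To make the "check every Nth doc" option from the issue description concrete, here is a minimal sketch under stated assumptions; the wrapper class and the sampling interval are hypothetical, not the attached patch:

{code:java}
import java.io.IOException;

import org.apache.lucene.index.FilterNumericDocValues;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.index.QueryTimeout;

// Hypothetical sketch: sample the timeout check every Nth document
// instead of checking only once when the doc values are created.
class SampledTimeoutDocValues {
  private static final int SAMPLE_INTERVAL = 1000; // assumed constant

  static NumericDocValues wrap(NumericDocValues in, QueryTimeout timeout) {
    return new FilterNumericDocValues(in) {
      private int docsSinceCheck;

      @Override
      public int nextDoc() throws IOException {
        if (++docsSinceCheck >= SAMPLE_INTERVAL) {
          docsSinceCheck = 0;
          if (timeout.shouldExit()) {
            // the real reader would throw ExitableDirectoryReader.ExitingReaderException
            throw new RuntimeException("Timeout while iterating over doc values");
          }
        }
        return super.nextDoc();
      }
    };
  }
}
{code}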
[jira] [Updated] (SOLR-13901) Update jackson-databind to 2.10.0.pr1 for security vulnerabilities
[ https://issues.apache.org/jira/browse/SOLR-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Dumont updated SOLR-13901: -- Description: This is needed to resolve the following security vulnerabilities: [CVE-2019-14540 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14540]], [CVE-2019-16335 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16335]], [CVE-2019-16942 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16942]], [CVE-2019-16943 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16943]], [CVE-2019-17267 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17267]] and [CVE-2019-17531 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17531]]. If solr is not impacted by these vulnerabilities then go ahead and de-escalate this issue. Thanks. was: This is needed to resolve the following security vulnerabilities: [CVE-2019-14540|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14540]], [CVE-2019-16335|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16335]], [CVE-2019-16942|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16942]], [CVE-2019-16943|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16943]], [CVE-2019-17267|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17267]] and [CVE-2019-17531|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17531]]. If solr is not impacted by these vulnerabilities then go ahead and de-escalate this issue. Thanks. > Update jackson-databind to 2.10.0.pr1 for security vulnerabilities > -- > > Key: SOLR-13901 > URL: https://issues.apache.org/jira/browse/SOLR-13901 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: Build >Affects Versions: 8.3 >Reporter: Charles Dumont >Priority: Major > > This is needed to resolve the following security vulnerabilities: > [CVE-2019-14540 > |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14540]], > [CVE-2019-16335 > |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16335]], > [CVE-2019-16942 > |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16942]], > [CVE-2019-16943 > |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16943]], > [CVE-2019-17267 > |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17267]] and > [CVE-2019-17531 > |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17531]]. > If solr is not impacted by these vulnerabilities then go ahead and > de-escalate this issue. Thanks. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13901) Update jackson-databind to 2.10.0.pr1 for security vulnerabilities
[ https://issues.apache.org/jira/browse/SOLR-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Dumont updated SOLR-13901: -- Description: This is needed to resolve the following security vulnerabilities: [CVE-2019-14540|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14540]], [CVE-2019-16335|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16335]], [CVE-2019-16942|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16942]], [CVE-2019-16943|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16943]], [CVE-2019-17267|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17267]] and [CVE-2019-17531|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17531]]. If solr is not impacted by these vulnerabilities then go ahead and de-escalate this issue. Thanks. was: This is needed to resolve the following security vulnerabilities: [CVE-2019-14540 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14540]], [CVE-2019-16335 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16335]], [CVE-2019-16942 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16942]], [CVE-2019-16943 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16943]], [CVE-2019-17267 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17267]] and [CVE-2019-17531 |[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17531]]. If solr is not impacted by these vulnerabilities then go ahead and de-escalate this issue. Thanks. > Update jackson-databind to 2.10.0.pr1 for security vulnerabilities > -- > > Key: SOLR-13901 > URL: https://issues.apache.org/jira/browse/SOLR-13901 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: Build >Affects Versions: 8.3 >Reporter: Charles Dumont >Priority: Major > > This is needed to resolve the following security vulnerabilities: > [CVE-2019-14540|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-14540]], > > [CVE-2019-16335|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16335]], > > [CVE-2019-16942|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16942]], > > [CVE-2019-16943|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16943]], > > [CVE-2019-17267|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17267]] > and > [CVE-2019-17531|[https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-17531]]. > If solr is not impacted by these vulnerabilities then go ahead and > de-escalate this issue. Thanks. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries
jpountz commented on a change in pull request #940: LUCENE-9002: Query caching leads to absurdly slow queries URL: https://github.com/apache/lucene-solr/pull/940#discussion_r343917507 ## File path: lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java ## @@ -114,12 +115,20 @@ * Expert: Create a new instance that will cache at most maxSize * queries with at most maxRamBytesUsed bytes of memory, only on * leaves that satisfy {@code leavesToCache}. + * + * Also, clauses whose cost is {@code skipCacheFactor} times more than the cost of the top-level query + * will not be cached in order to not slow down queries too much. */ public LRUQueryCache(int maxSize, long maxRamBytesUsed, - Predicate<LeafReaderContext> leavesToCache) { + Predicate<LeafReaderContext> leavesToCache, float skipCacheFactor) { this.maxSize = maxSize; this.maxRamBytesUsed = maxRamBytesUsed; this.leavesToCache = leavesToCache; +if (skipCacheFactor < 1) { Review comment: ```suggestion if (skipCacheFactor >= 1 == false) { // NaN >= 1 evaluates false ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8213) Cache costly subqueries asynchronously
[ https://issues.apache.org/jira/browse/LUCENE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969632#comment-16969632 ] Adrien Grand commented on LUCENE-8213: -- If you want to give it a try, I'd be happy to review it. > Cache costly subqueries asynchronously > -- > > Key: LUCENE-8213 > URL: https://issues.apache.org/jira/browse/LUCENE-8213 > Project: Lucene - Core > Issue Type: Improvement > Components: core/query/scoring >Affects Versions: 7.2.1 >Reporter: Amir Hadadi >Priority: Minor > Labels: performance > Attachments: > 0001-Reproduce-across-segment-caching-of-same-query.patch, > thetaphi_Lucene-Solr-master-Linux_24839.log.txt > > Time Spent: 20h 20m > Remaining Estimate: 0h > > IndexOrDocValuesQuery allows to combine costly range queries with a selective > lead iterator in an optimized way. However, the range query at some point > gets cached by a querying thread in LRUQueryCache, which negates the > optimization of IndexOrDocValuesQuery for that specific query. > It would be nice to see an asynchronous caching implementation in such cases, > so that queries involving IndexOrDocValuesQuery would have consistent > performance characteristics. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8213) Cache costly subqueries asynchronously
[ https://issues.apache.org/jira/browse/LUCENE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969631#comment-16969631 ] Adrien Grand commented on LUCENE-8213: -- [~ben.manes] I believe this would work. Regarding Caffeine, I commented on the other issue. > Cache costly subqueries asynchronously > -- > > Key: LUCENE-8213 > URL: https://issues.apache.org/jira/browse/LUCENE-8213 > Project: Lucene - Core > Issue Type: Improvement > Components: core/query/scoring >Affects Versions: 7.2.1 >Reporter: Amir Hadadi >Priority: Minor > Labels: performance > Attachments: > 0001-Reproduce-across-segment-caching-of-same-query.patch, > thetaphi_Lucene-Solr-master-Linux_24839.log.txt > > Time Spent: 20h 20m > Remaining Estimate: 0h > > IndexOrDocValuesQuery allows to combine costly range queries with a selective > lead iterator in an optimized way. However, the range query at some point > gets cached by a querying thread in LRUQueryCache, which negates the > optimization of IndexOrDocValuesQuery for that specific query. > It would be nice to see an asynchronous caching implementation in such cases, > so that queries involving IndexOrDocValuesQuery would have consistent > performance characteristics. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-site] adamwalz opened a new pull request #3: Refactor Pelican templates to reduce code duplication
adamwalz opened a new pull request #3: Refactor Pelican templates to reduce code duplication URL: https://github.com/apache/lucene-site/pull/3 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-site] adamwalz commented on issue #3: Refactor Pelican templates to reduce code duplication
adamwalz commented on issue #3: Refactor Pelican templates to reduce code duplication URL: https://github.com/apache/lucene-site/pull/3#issuecomment-551308022 +3,512 −33,736 That should say enough about code duplication. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9029) Deprecate SloppyMath toRadians/toDegrees in favor of Java Math
[ https://issues.apache.org/jira/browse/LUCENE-9029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969638#comment-16969638 ] Adrien Grand commented on LUCENE-9029: -- Thanks Jack, I'll merge soon! > Deprecate SloppyMath toRadians/toDegrees in favor of Java Math > -- > > Key: LUCENE-9029 > URL: https://issues.apache.org/jira/browse/LUCENE-9029 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Jack Conradson >Priority: Trivial > Attachments: LUCENE-9029.patch, LUCENE-9029.patch > > > This change follows a TODO left in SloppyMath to remove toRadians/toDegrees > since from Java 9 forward Math toRadians/toDegrees is now identical. Since > these methods/constants are public, deprecation messages are added to each > one. Internally, in Lucene, all instances of the SloppyMath versions are > replaced with the standard Java Math versions. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9012) Setup minimal site with working Pelican build
[ https://issues.apache.org/jira/browse/LUCENE-9012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969639#comment-16969639 ] Adam Walz commented on LUCENE-9012: --- [~janhoy] [https://github.com/apache/lucene-site/pull/3] is ready which reduces code duplication by using Pelican extends and include syntax. git diff is +3,512 −33,736 :D > Setup minimal site with working Pelican build > - > > Key: LUCENE-9012 > URL: https://issues.apache.org/jira/browse/LUCENE-9012 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Jan Høydahl >Assignee: Jan Høydahl >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9038) Evaluate Caffeine for LruQueryCache
[ https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969663#comment-16969663 ] Ben Manes commented on LUCENE-9038: --- On the train ride to work, I started to play with stubbing out an implementation to better understand what an implementation could look like. For now I'm just untangling things in my head due to lack of familiarity and not expecting anything to be adopted. > We want lucene-core to be dependency-free, so we couldn't add caffeine as a > dependency of lucene-core. I am certainly fine with that and worry about it if I can offer something promising. In addition to the options you mentioned, we could [shade|https://maven.apache.org/plugins/maven-shade-plugin] / [shadow|https://github.com/johnrengelman/shadow] the dependency to an internal package name. > One thing that is not obvious immediately and makes implementing a query > cache for Lucene a bit tricky is that it needs to be able to efficiently > evict all cache entries for a given segment. Thank you. I was trying to understand the {{LeafCache}} and was still under the impression that it was unnecessary complexity. Can you explain why caching of segments is needed? This certainly makes it a lot harder since they grow, as you cache the queries at the segment level. Is this so that when updates occur all of the related cached queries are invalidated, to avoid stale responses? If so, would some versioning / generation field be applicable to maintain a single level cache? In that model the generation id is part of the key, allowing a simple increment to cause all of the prior content to not be fetched. This is common in remote caches (e.g. memcached) and, if doable here, we could maintain an index to proactively remove those stale entries. > Evaluate Caffeine for LruQueryCache > --- > > Key: LUCENE-9038 > URL: https://issues.apache.org/jira/browse/LUCENE-9038 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ben Manes >Priority: Major > > [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java] > appears to play a central role in Lucene's performance. There are many > issues discussing its performance, such as LUCENE-7235, LUCENE-7237, > LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's > overhead can be just as much of a benefit as a liability, causing various > workarounds and complexity. > When reviewing the discussions and code, the following issues are concerning: > # The cache is guarded by a single lock for all reads and writes. > # All computations are performed outside of the any locking to avoid > penalizing other callers. This doesn't handle the cache stampedes meaning > that multiple threads may cache miss, compute the value, and try to store it. > That redundant work becomes expensive under load and can be mitigated with ~ > per-key locks. > # The cache queries the entry to see if it's even worth caching. At first > glance one assumes that is so that inexpensive entries don't bang on the lock > or thrash the LRU. However, this is also used to indicate data dependencies > for uncachable items (per JIRA), which perhaps shouldn't be invoking the > cache. > # The cache lookup is skipped if the global lock is held and the value is > computed, but not stored. This means a busy lock reduces performance across > all usages and the cache's effectiveness degrades. This is not counted in the > miss rate, giving a false impression. 
> # An attempt was made to perform computations asynchronously, due to their > heavy cost on tail latencies. That work was reverted due to test failures and > is being worked on. > # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] > tries to avoid LRU thrashing due to large, infrequently used items being > cached. > # The cache is tightly intertwined with business logic, making it hard to > tease apart core algorithms and data structures from the usage scenarios. > It seems that more and more items skip being cached because of concurrency > and hit rate performance, causing special case fixes based on knowledge of > the external code flows. Since the developers are experts on search, not > caching, it seems justified to evaluate if an off-the-shelf library would be > more helpful in terms of developer time, code complexity, and performance. > Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] > in SOLR-8241 and SOLR-13817. > The proposal is to replace the internals {{LruQueryCache}} so that external > usages are not affected in terms of the API. However, like in {{SolrCache}}, > a difference is that Caffeine only bounds by either the number of entries or > an accumulated
[jira] [Commented] (LUCENE-9038) Evaluate Caffeine for LruQueryCache
[ https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969675#comment-16969675 ] Adrien Grand commented on LUCENE-9038: -- This is not due to invalidation but to how Lucene groups data into segments, that get regularly merged together into fewer bigger segments. When segments get merged away, they are closed, which triggers a callback on the cache that tells it that it may remove all entries that are about these segments, since they will never be used again. Before Lucene introduced a query cache, Elasticsearch used to have its own query cache that was based on Guava and used (Query, CacheHelper.Key) pairs as keys, and used to evict all entries for a segment by iterating over all cached entries and removing those that were about this segment. It triggered some interesting behaviors when closing top-level readers, which in turn closes all their segments in sequence, which in turn iterates over all remaining cached entries. So if you want to cache Q queries and have S segments, then you may have up to QxS entries in your cache, and thus closing the reader runs in O(QxS^2), and we were seeing users whose clusters would take ages to close indices because of this. One could make the argument that it is not required to evict those entries and that we could wait for them to get evicted naturally, but I don't like the idea of spending some of the JVM memory on unused data. > Evaluate Caffeine for LruQueryCache > --- > > Key: LUCENE-9038 > URL: https://issues.apache.org/jira/browse/LUCENE-9038 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ben Manes >Priority: Major > > [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java] > appears to play a central role in Lucene's performance. There are many > issues discussing its performance, such as LUCENE-7235, LUCENE-7237, > LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's > overhead can be just as much of a benefit as a liability, causing various > workarounds and complexity. > When reviewing the discussions and code, the following issues are concerning: > # The cache is guarded by a single lock for all reads and writes. > # All computations are performed outside of the any > locking to avoid > penalizing other callers. This doesn't handle the cache stampedes meaning > that multiple threads may cache miss, compute the value, and try to store it. > That redundant work becomes expensive under load and can be mitigated with ~ > per-key locks. > # The cache queries the entry to see if it's even worth caching. At first > glance one assumes that is so that inexpensive entries don't bang on the lock > or thrash the LRU. However, this is also used to indicate data dependencies > for uncachable items (per JIRA), which perhaps shouldn't be invoking the > cache. > # The cache lookup is skipped if the global lock is held and the value is > computed, but not stored. This means a busy lock reduces performance across > all usages and the cache's effectiveness degrades. This is not counted in the > miss rate, giving a false impression. > # An attempt was made to perform computations asynchronously, due to their > heavy cost on tail latencies. That work was reverted due to test failures and > is being worked on. > # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] > tries to avoid LRU thrashing due to large, infrequently used items being > cached. 
> # The cache is tightly intertwined with business logic, making it hard to > tease apart core algorithms and data structures from the usage scenarios. > It seems that more and more items skip being cached because of concurrency > and hit rate performance, causing special case fixes based on knowledge of > the external code flows. Since the developers are experts on search, not > caching, it seems justified to evaluate if an off-the-shelf library would be > more helpful in terms of developer time, code complexity, and performance. > Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] > in SOLR-8241 and SOLR-13817. > The proposal is to replace the internals {{LruQueryCache}} so that external > usages are not affected in terms of the API. However, like in {{SolrCache}}, > a difference is that Caffeine only bounds by either the number of entries or > an accumulated size (e.g. bytes), but not both constraints. This likely is an > acceptable divergence in how the configuration is honored. > cc [~ab], [~dsmiley] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For addi
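A minimal sketch of the two-level layout described in the comment above; the value type is simplified (the real LRUQueryCache stores more per entry), but it shows why per-segment eviction is cheap:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.Query;

// Hypothetical sketch: with a per-segment outer map, evicting everything for
// a closed segment is a single removal rather than a scan of all
// (query, segment) pairs.
class TwoLevelCacheSketch {
  private final Map<IndexReader.CacheKey, Map<Query, DocIdSet>> perSegment =
      new ConcurrentHashMap<>();

  void onSegmentClosed(IndexReader.CacheKey segmentKey) {
    perSegment.remove(segmentKey); // one removal; the inner map is reclaimed as a whole
  }
}
{code}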
[GitHub] [lucene-solr] noblepaul closed pull request #910: SOLR-13661 : SOLR-13661 A package management system for Solr
noblepaul closed pull request #910: SOLR-13661 : SOLR-13661 A package management system for Solr URL: https://github.com/apache/lucene-solr/pull/910 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9038) Evaluate Caffeine for LruQueryCache
[ https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969700#comment-16969700 ] Ben Manes commented on LUCENE-9038: --- Interesting. Off the cuff... It sounds like we'd want to orchestrate it such that a write to level-2 needs to communicate to level-1 that the instance was modified, e.g. {{replace(key, v1, v1)}}. That would trigger Guava/Caffeine to re-weigh the entry and trigger an eviction. If that write-back failed, e.g. removed or became {{v2}}, then the caller would have to manually call the entry's eviction logic (e.g. if closable). The L1 would be bounded to evict segments and L2 unbounded, which matches the current implementation. The coordination would need to be handled, but shouldn't be overly tricky, if I understand correctly. > Evaluate Caffeine for LruQueryCache > --- > > Key: LUCENE-9038 > URL: https://issues.apache.org/jira/browse/LUCENE-9038 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ben Manes >Priority: Major > > [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java] > appears to play a central role in Lucene's performance. There are many > issues discussing its performance, such as LUCENE-7235, LUCENE-7237, > LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's > overhead can be just as much of a benefit as a liability, causing various > workarounds and complexity. > When reviewing the discussions and code, the following issues are concerning: > # The cache is guarded by a single lock for all reads and writes. > # All computations are performed outside of the any > locking to avoid > penalizing other callers. This doesn't handle the cache stampedes meaning > that multiple threads may cache miss, compute the value, and try to store it. > That redundant work becomes expensive under load and can be mitigated with ~ > per-key locks. > # The cache queries the entry to see if it's even worth caching. At first > glance one assumes that is so that inexpensive entries don't bang on the lock > or thrash the LRU. However, this is also used to indicate data dependencies > for uncachable items (per JIRA), which perhaps shouldn't be invoking the > cache. > # The cache lookup is skipped if the global lock is held and the value is > computed, but not stored. This means a busy lock reduces performance across > all usages and the cache's effectiveness degrades. This is not counted in the > miss rate, giving a false impression. > # An attempt was made to perform computations asynchronously, due to their > heavy cost on tail latencies. That work was reverted due to test failures and > is being worked on. > # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] > tries to avoid LRU thrashing due to large, infrequently used items being > cached. > # The cache is tightly intertwined with business logic, making it hard to > tease apart core algorithms and data structures from the usage scenarios. > It seems that more and more items skip being cached because of concurrency > and hit rate performance, causing special case fixes based on knowledge of > the external code flows. Since the developers are experts on search, not > caching, it seems justified to evaluate if an off-the-shelf library would be > more helpful in terms of developer time, code complexity, and performance. 
> Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] > in SOLR-8241 and SOLR-13817. > The proposal is to replace the internals {{LruQueryCache}} so that external > usages are not affected in terms of the API. However, like in {{SolrCache}}, > a difference is that Caffeine only bounds by either the number of entries or > an accumulated size (e.g. bytes), but not both constraints. This likely is an > acceptable divergence in how the configuration is honored. > cc [~ab], [~dsmiley] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
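[Editor's note: to make the {{replace(key, v1, v1)}} write-back above concrete, here is a minimal hypothetical sketch against the actual Caffeine API ({{Caffeine.newBuilder()}}, {{maximumWeight}}, {{weigher}}, and the {{asMap()}} view). {{CachedSet}}, the byte budget, and the key type are illustrative stand-ins, not Lucene or Caffeine types.]
{code:java}
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

// Illustrative value type standing in for a cached per-segment doc-id set.
record CachedSet(long ramBytesUsed) {}

class TwoLevelSketch {
  // L1 bounded by accumulated weight (bytes); L2 would stay unbounded.
  final Cache<Object, CachedSet> l1 = Caffeine.newBuilder()
      .maximumWeight(64L * 1024 * 1024)
      .weigher((Object key, CachedSet set) ->
          (int) Math.min(set.ramBytesUsed(), Integer.MAX_VALUE))
      .build();

  /** Called after a level-2 write has mutated the cached instance in place. */
  void onL2Write(Object key, CachedSet set) {
    // A same-value replace makes Caffeine re-run the weigher for this
    // entry, which can trigger an eviction if the entry grew.
    boolean stillMapped = l1.asMap().replace(key, set, set);
    if (!stillMapped) {
      // Write-back failed: the entry was removed or became v2, so the
      // caller must run the entry's eviction/cleanup logic manually.
    }
  }
}
{code}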
[jira] [Commented] (SOLR-13822) Isolated Classloading from packages
[ https://issues.apache.org/jira/browse/SOLR-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969735#comment-16969735 ] ASF subversion and git services commented on SOLR-13822: Commit edb5f63869dce3be1e9b08e8869b68bba952b47b in lucene-solr's branch refs/heads/branch_8x from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=edb5f63 ] SOLR-13822: precommit error fixed > Isolated Classloading from packages > --- > > Key: SOLR-13822 > URL: https://issues.apache.org/jira/browse/SOLR-13822 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Attachments: SOLR-13822.patch > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Design is here: > [https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?ts=5d86a8ad#] > > main features: > * A new file for packages definition (/packages.json) in ZK > * Public APIs to edit/read the file > * The APIs are registered at {{/api/cluster/package}} > * Classes can be loaded from the package classloader using the > {{:}} syntax -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
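[Editor's note: the quoted design registers the package APIs at {{/api/cluster/package}} and stores definitions in {{/packages.json}} in ZK. As a rough illustration only — the JSON payload below is an assumption, not the committed schema — a client could exercise that endpoint with plain {{java.net.http}}:]
{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PackageApiSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical payload; the field names are illustrative, not the final schema.
    String body = "{\"add\": {\"name\": \"my-pkg\", \"version\": \"1.0\"}}";
    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("http://localhost:8983/api/cluster/package"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.statusCode() + " " + response.body());
  }
}
{code}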
[jira] [Commented] (LUCENE-9036) ExitableDirectoryReader to interrupt DocValues as well
[ https://issues.apache.org/jira/browse/LUCENE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969783#comment-16969783 ] Lucene/Solr QA commented on LUCENE-9036: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || || || || || {color:brown} Prechecks {color} || | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 29s{color} | {color:green} core in the patch passed. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 15m 56s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | LUCENE-9036 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985164/LUCENE-9036.patch | | Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns | | uname | Linux lucene1-us-west 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | ant | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh | | git revision | master / 5c7215fabfc | | ant | version: Apache Ant(TM) version 1.10.5 compiled on March 28 2019 | | Default Java | LTS | | Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/223/testReport/ | | modules | C: lucene/core U: lucene/core | | Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/223/console | | Powered by | Apache Yetus 0.7.0 http://yetus.apache.org | This message was automatically generated. > ExitableDirectoryReader to interrupt DocValues as well > -- > > Key: LUCENE-9036 > URL: https://issues.apache.org/jira/browse/LUCENE-9036 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mikhail Khludnev >Priority: Major > Attachments: LUCENE-9036.patch > > > This allow to make AnalyticsComponent and json.facet sensitive to time > allowed. > Does it make sense? Is it enough to check on DV creation ie per field/segment > or it's worth to check every Nth doc? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13822) Isolated Classloading from packages
[ https://issues.apache.org/jira/browse/SOLR-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969806#comment-16969806 ] ASF subversion and git services commented on SOLR-13822: Commit a09f2df21665d5c368411207463cdd45ca2d55b7 in lucene-solr's branch refs/heads/branch_8x from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a09f2df ] SOLR-13822: Missing package-info files > Isolated Classloading from packages > --- > > Key: SOLR-13822 > URL: https://issues.apache.org/jira/browse/SOLR-13822 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Attachments: SOLR-13822.patch > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Design is here: > [https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?ts=5d86a8ad#] > > main features: > * A new file for packages definition (/packages.json) in ZK > * Public APIs to edit/read the file > * The APIs are registered at {{/api/cluster/package}} > * Classes can be loaded from the package classloader using the > {{:}} syntax -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-13902) "ant precommit" is inconsistent in branch_8x vs. master
Noble Paul created SOLR-13902: - Summary: "ant precommit" is inconsistent in branch_8x vs. master Key: SOLR-13902 URL: https://issues.apache.org/jira/browse/SOLR-13902 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: Noble Paul A missing {{package-info.java}} does not fail {{ant precommit}} in master, but it fails in {{branch_8x}}. If it's required, it should fail everywhere. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
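[Editor's note: for context, the file the check wants is a per-package {{package-info.java}}. A minimal sketch of the kind of file that satisfies it follows; the package name and javadoc sentence are illustrative, and whether the check requires the javadoc text itself is an assumption.]
{code:java}
/**
 * Support classes for package loading; this javadoc sentence is the
 * package-level documentation the precommit check looks for.
 */
package org.apache.solr.example;
{code}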
[jira] [Commented] (SOLR-13822) Isolated Classloading from packages
[ https://issues.apache.org/jira/browse/SOLR-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969811#comment-16969811 ] Noble Paul commented on SOLR-13822: --- Interestingly, {{ant precommit}} works differently in master and branch_8x; I've opened a ticket: SOLR-13902 > Isolated Classloading from packages > --- > > Key: SOLR-13822 > URL: https://issues.apache.org/jira/browse/SOLR-13822 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) > Reporter: Ishan Chattopadhyaya > Assignee: Noble Paul > Priority: Major > Attachments: SOLR-13822.patch > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Design is here: > [https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?ts=5d86a8ad#] > > main features: > * A new file for packages definition (/packages.json) in ZK > * Public APIs to edit/read the file > * The APIs are registered at {{/api/cluster/package}} > * Classes can be loaded from the package classloader using the > {{:}} syntax -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-13903) Classification Model Confusion Matrix Discrepancy
Ahmed Adel created SOLR-13903: - Summary: Classification Model Confusion Matrix Discrepancy Key: SOLR-13903 URL: https://issues.apache.org/jira/browse/SOLR-13903 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: streaming expressions Affects Versions: 8.2 Reporter: Ahmed Adel Attachments: cellphones.csv Using the features and train stream sources generates a model with TP, TN, FP, FN fields. For some reason, the sum of the values of these fields is sometimes less than the training set size. How to reproduce: # Create two collections: cellphones and cellphones-model # Index the attached dataset into cellphones # Run the following expression: {{commit(cellphones-model,update(cellphones-model,batchSize=500, train(cellphones, features(cellphones, q="*:*", featureSet="featureSet", field="title_t", outcome="brand_i", numTerms=25), q="*:*", name="cellphones-classification-model", field="title_t", outcome="brand_i", maxIterations=100)))}} # Run the following query to retrieve the confusion matrix: {{search q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} The sum of the metrics TP, TN, FP, FN is always less than the training set size by one in this instance for all iterations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
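[Editor's note: the reported invariant is that, for every iteration, {{truePositive_i + trueNegative_i + falsePositive_i + falseNegative_i}} should equal the training set size. A small SolrJ sketch of that check follows; the base URL and the expected count are assumptions for illustration.]
{code:java}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;

public class ConfusionMatrixCheck {
  public static void main(String[] args) throws Exception {
    long trainingSetSize = 1000; // assumed size of the indexed cellphones data
    try (HttpSolrClient client =
             new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
      SolrQuery q = new SolrQuery("*:*")
          .setRows(100)
          .setSort("iteration_i", SolrQuery.ORDER.desc)
          .setFields("iteration_i", "truePositive_i", "trueNegative_i",
              "falsePositive_i", "falseNegative_i");
      for (SolrDocument doc : client.query("cellphones-model", q).getResults()) {
        long sum = 0;
        for (String f : new String[] {"truePositive_i", "trueNegative_i",
            "falsePositive_i", "falseNegative_i"}) {
          sum += ((Number) doc.getFieldValue(f)).longValue();
        }
        // Per the report, sum comes back as trainingSetSize - 1 for every iteration.
        System.out.println("iteration=" + doc.getFieldValue("iteration_i")
            + " sum=" + sum + " expected=" + trainingSetSize);
      }
    }
  }
}
{code}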
[jira] [Updated] (SOLR-13903) Classification Model Confusion Matrix Discrepancy
[ https://issues.apache.org/jira/browse/SOLR-13903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Adel updated SOLR-13903: -- Description: Using features and train stream sources generate a model with TP, TN, FP, FN fields. For some reason, the summation of the values of these fields is sometimes less than the training set size. How to regenerate: # Create two collections: cellphones and cellphones-model # Indexing the attached dataset into cellphones # Run the following expression: {{commit(cellphones-model,update(cellphones-model,batchSize=500, train(cellphones, features(cellphones, q="*:*", featureSet="featureSet", field="title_t", outcome="brand_i", numTerms=25), q="*:*", name="cellphones-classification-model", field="title_t", outcome="brand_i", maxIterations=100))) }} # Run the following query to retrieve confusion matrix: {{search q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} The summation of the metrics TP, TN, FP, FN is always less than the training set size by one in this instance for all iterations. was: Using features and train stream sources generate a model with TP, TN, FP, FN fields. For some reason, the summation of the values of these fields is sometimes less than the training set size. How to regenerate: # Create two collections: cellphones and cellphones-model # Indexing the attached dataset into cellphones # Run the following expression: {{commit(cellphones-model,update(cellphones-model,batchSize=500, }}{{ train(cellphones, }}{{ features(cellphones, q="*:*", featureSet="featureSet", field="title_t", outcome="brand_i", numTerms=25), }}{{ q="*:*", }}{{ name="cellphones-classification-model", }}{{ field="title_t", }}{{ outcome="brand_i", }}{{ maxIterations=100))) }} 4) Run the following query to retrieve confusion matrix: {{search q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100 }} The summation of the metrics TP, TN, FP, FN is always less than the training set size by one in this instance for all iterations. > Classification Model Confusion Matrix Discrepancy > - > > Key: SOLR-13903 > URL: https://issues.apache.org/jira/browse/SOLR-13903 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: streaming expressions >Affects Versions: 8.2 >Reporter: Ahmed Adel >Priority: Major > Labels: classification > Attachments: cellphones.csv > > > Using features and train stream sources generate a model with TP, TN, FP, FN > fields. For some reason, the summation of the values of these fields is > sometimes less than the training set size. 
> How to regenerate: > # Create two collections: cellphones and cellphones-model > # Indexing the attached dataset into cellphones > # Run the following expression: > {{commit(cellphones-model,update(cellphones-model,batchSize=500, > train(cellphones, > features(cellphones, q="*:*", featureSet="featureSet", > field="title_t", > outcome="brand_i", numTerms=25), > q="*:*", > name="cellphones-classification-model", > field="title_t", > outcome="brand_i", > maxIterations=100))) > }} > # Run the following query to retrieve confusion matrix: > {{search > q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} > The summation of the metrics TP, TN, FP, FN is always less than the training > set size by one in this instance for all iterations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13903) Classification Model Confusion Matrix Discrepancy
[ https://issues.apache.org/jira/browse/SOLR-13903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Adel updated SOLR-13903: -- Description: Using features and train stream sources generate a model with TP, TN, FP, FN fields. For some reason, the summation of the values of these fields is sometimes less than the training set size. How to regenerate: 1. Create two collections: cellphones and cellphones-model 2. Indexing the attached dataset into cellphones 3. Run the following expression: commit(cellphones-model,update(cellphones-model,batchSize=500, {{ train(cellphones,}} {{ features(cellphones, q="*:*", featureSet="featureSet",}} {{ field="title_t",}} {{ outcome="brand_i", numTerms=25),}} {{ q="*:*",}} {{ name="cellphones-classification-model",}} {{ field="title_t",}} {{ outcome="brand_i",}} {{ maxIterations=100)))}} 4. Run the following query to retrieve confusion matrix: {{search q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} The summation of the metrics TP, TN, FP, FN is always less than the training set size by one in this instance for all iterations. was: Using features and train stream sources generate a model with TP, TN, FP, FN fields. For some reason, the summation of the values of these fields is sometimes less than the training set size. How to regenerate: 1. Create two collections: cellphones and cellphones-model 2. Indexing the attached dataset into cellphones 3. Run the following expression: {{commit(cellphones-model,update(cellphones-model,batchSize=500,}} {{ train(cellphones,}} {{ features(cellphones, q="*:*", featureSet="featureSet",}} {{ field="title_t",}} {{ outcome="brand_i", numTerms=25),}} {{ q="*:*",}} {{ name="cellphones-classification-model",}} {{ field="title_t",}} {{ outcome="brand_i",}} {{ maxIterations=100)))}} 4. Run the following query to retrieve confusion matrix: {{search q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} The summation of the metrics TP, TN, FP, FN is always less than the training set size by one in this instance for all iterations. > Classification Model Confusion Matrix Discrepancy > - > > Key: SOLR-13903 > URL: https://issues.apache.org/jira/browse/SOLR-13903 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: streaming expressions >Affects Versions: 8.2 >Reporter: Ahmed Adel >Priority: Major > Labels: classification > Attachments: cellphones.csv > > > Using features and train stream sources generate a model with TP, TN, FP, FN > fields. For some reason, the summation of the values of these fields is > sometimes less than the training set size. > How to regenerate: > 1. Create two collections: cellphones and cellphones-model > 2. Indexing the attached dataset into cellphones > 3. Run the following expression: > commit(cellphones-model,update(cellphones-model,batchSize=500, > {{ train(cellphones,}} > {{ features(cellphones, q="*:*", featureSet="featureSet",}} > {{ field="title_t",}} > {{ outcome="brand_i", numTerms=25),}} > {{ q="*:*",}} > {{ name="cellphones-classification-model",}} > {{ field="title_t",}} > {{ outcome="brand_i",}} > {{ maxIterations=100)))}} > 4. 
Run the following query to retrieve confusion matrix: > {{search > q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} > The summation of the metrics TP, TN, FP, FN is always less than the training > set size by one in this instance for all iterations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13903) Classification Model Confusion Matrix Discrepancy
[ https://issues.apache.org/jira/browse/SOLR-13903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Adel updated SOLR-13903: -- Description: Using features and train stream sources generate a model with TP, TN, FP, FN fields. For some reason, the summation of the values of these fields is sometimes less than the training set size. How to regenerate: 1. Create two collections: cellphones and cellphones-model 2. Indexing the attached dataset into cellphones 3. Run the following expression: {{commit(cellphones-model,update(cellphones-model,batchSize=500,}} {{ train(cellphones,}} {{ features(cellphones, q="*:*", featureSet="featureSet",}} {{ field="title_t",}} {{ outcome="brand_i", numTerms=25),}} {{ q="*:*",}} {{ name="cellphones-classification-model",}} {{ field="title_t",}} {{ outcome="brand_i",}} {{ maxIterations=100)))}} 4. Run the following query to retrieve confusion matrix: {{search q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} The summation of the metrics TP, TN, FP, FN is always less than the training set size by one in this instance for all iterations. was: Using features and train stream sources generate a model with TP, TN, FP, FN fields. For some reason, the summation of the values of these fields is sometimes less than the training set size. How to regenerate: # Create two collections: cellphones and cellphones-model # Indexing the attached dataset into cellphones # Run the following expression: {{commit(cellphones-model,update(cellphones-model,batchSize=500, train(cellphones, features(cellphones, q="*:*", featureSet="featureSet", field="title_t", outcome="brand_i", numTerms=25), q="*:*", name="cellphones-classification-model", field="title_t", outcome="brand_i", maxIterations=100))) }} # Run the following query to retrieve confusion matrix: {{search q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} The summation of the metrics TP, TN, FP, FN is always less than the training set size by one in this instance for all iterations. > Classification Model Confusion Matrix Discrepancy > - > > Key: SOLR-13903 > URL: https://issues.apache.org/jira/browse/SOLR-13903 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: streaming expressions >Affects Versions: 8.2 >Reporter: Ahmed Adel >Priority: Major > Labels: classification > Attachments: cellphones.csv > > > Using features and train stream sources generate a model with TP, TN, FP, FN > fields. For some reason, the summation of the values of these fields is > sometimes less than the training set size. > How to regenerate: > 1. Create two collections: cellphones and cellphones-model > 2. Indexing the attached dataset into cellphones > 3. Run the following expression: > {{commit(cellphones-model,update(cellphones-model,batchSize=500,}} > {{ train(cellphones,}} > {{ features(cellphones, q="*:*", featureSet="featureSet",}} > {{ field="title_t",}} > {{ outcome="brand_i", numTerms=25),}} > {{ q="*:*",}} > {{ name="cellphones-classification-model",}} > {{ field="title_t",}} > {{ outcome="brand_i",}} > {{ maxIterations=100)))}} > 4. 
Run the following query to retrieve confusion matrix: > {{search > q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} > The summation of the metrics TP, TN, FP, FN is always less than the training > set size by one in this instance for all iterations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13903) Classification Model Confusion Matrix Discrepancy
[ https://issues.apache.org/jira/browse/SOLR-13903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Adel updated SOLR-13903: -- Description: Using features and train stream sources generate a model with TP, TN, FP, FN fields. For some reason, the summation of the values of these fields is sometimes less than the training set size. How to regenerate: 1. Create two collections: cellphones and cellphones-model 2. Indexing the attached dataset into cellphones 3. Run the following expression: {{commit(cellphones-model,update(cellphones-model,batchSize=500, {{ train(cellphones,}} {{ features(cellphones, q="*:*", featureSet="featureSet",}} {{ field="title_t",}} {{ outcome="brand_i", numTerms=25),}} {{ q="*:*",}} {{ name="cellphones-classification-model",}} {{ field="title_t",}} {{ outcome="brand_i",}} {{ maxIterations=100)))}} 4. Run the following query to retrieve confusion matrix: {{search q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} The summation of the metrics TP, TN, FP, FN is always less than the training set size by one in this instance for all iterations. was: Using features and train stream sources generate a model with TP, TN, FP, FN fields. For some reason, the summation of the values of these fields is sometimes less than the training set size. How to regenerate: 1. Create two collections: cellphones and cellphones-model 2. Indexing the attached dataset into cellphones 3. Run the following expression: commit(cellphones-model,update(cellphones-model,batchSize=500, {{ train(cellphones,}} {{ features(cellphones, q="*:*", featureSet="featureSet",}} {{ field="title_t",}} {{ outcome="brand_i", numTerms=25),}} {{ q="*:*",}} {{ name="cellphones-classification-model",}} {{ field="title_t",}} {{ outcome="brand_i",}} {{ maxIterations=100)))}} 4. Run the following query to retrieve confusion matrix: {{search q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} The summation of the metrics TP, TN, FP, FN is always less than the training set size by one in this instance for all iterations. > Classification Model Confusion Matrix Discrepancy > - > > Key: SOLR-13903 > URL: https://issues.apache.org/jira/browse/SOLR-13903 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: streaming expressions >Affects Versions: 8.2 >Reporter: Ahmed Adel >Priority: Major > Labels: classification > Attachments: cellphones.csv > > > Using features and train stream sources generate a model with TP, TN, FP, FN > fields. For some reason, the summation of the values of these fields is > sometimes less than the training set size. > How to regenerate: > 1. Create two collections: cellphones and cellphones-model > 2. Indexing the attached dataset into cellphones > 3. Run the following expression: > {{commit(cellphones-model,update(cellphones-model,batchSize=500, > {{ train(cellphones,}} > {{ features(cellphones, q="*:*", featureSet="featureSet",}} > {{ field="title_t",}} > {{ outcome="brand_i", numTerms=25),}} > {{ q="*:*",}} > {{ name="cellphones-classification-model",}} > {{ field="title_t",}} > {{ outcome="brand_i",}} > {{ maxIterations=100)))}} > 4. 
Run the following query to retrieve confusion matrix: > {{search > q=*:*&collection=cellphones-model&fl=name_s,trueNegative_i,truePositive_i,falseNegative_i,falsePositive_i,iteration_i&sort=iteration_i%20desc&rows=100}} > The summation of the metrics TP, TN, FP, FN is always less than the training > set size by one in this instance for all iterations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13822) Isolated Classloading from packages
[ https://issues.apache.org/jira/browse/SOLR-13822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969825#comment-16969825 ] ASF subversion and git services commented on SOLR-13822: Commit 7a207a935373574956894516fa5e772844fa702e in lucene-solr's branch refs/heads/master from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7a207a9 ] SOLR-13822: Missing package-info files > Isolated Classloading from packages > --- > > Key: SOLR-13822 > URL: https://issues.apache.org/jira/browse/SOLR-13822 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Ishan Chattopadhyaya >Assignee: Noble Paul >Priority: Major > Attachments: SOLR-13822.patch > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Design is here: > [https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?ts=5d86a8ad#] > > main features: > * A new file for packages definition (/packages.json) in ZK > * Public APIs to edit/read the file > * The APIs are registered at {{/api/cluster/package}} > * Classes can be loaded from the package classloader using the > {{:}} syntax -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13888) SolrCloud 2
[ https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969839#comment-16969839 ] Mark Robert Miller commented on SOLR-13888: --- {quote}We are going to have SolrCloud test that do nothing but spin up one shard and index 1 doc. And then 10 shards. And then 100. {quote} This is not patronizing. It's literally a huge part of fixing things, once you clear some of the perf issues I've also already dropped. I'm being nearly as obtuse as you guys may think. > SolrCloud 2 > --- > > Key: SOLR-13888 > URL: https://issues.apache.org/jira/browse/SOLR-13888 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Major > Attachments: solrscreen.png > > > As devs discuss dropping the SolrCloud name on the dev list, here is an issue > titled SolrCloud 2. > A couple times now I've pulled on the sweater thread that is our broken > tests. It leads to one place - SolrCloud is sick and devs are adding spotty > code on top of it at a rate that will lead to the system falling in on > itself. As it is, it's a very slow, very inefficient, very unreliable, very > buggy system. > This is not why I am here. This is the opposite of why I am here. > So please, let's stop. We can't build on that thing as it is. > > I need some time, I lost a lot of work at one point, the scope has expanded > since I realized how problematic some things really are, but I have an > alternative path that is not so duct tape and straw. As the building climbs, > that foundation is going to kill us all. > > This i not about an architecture change - the architecture is fine. The > implementation is broken and getting worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-13888) SolrCloud 2
[ https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969839#comment-16969839 ] Mark Robert Miller edited comment on SOLR-13888 at 11/8/19 5:10 AM: {quote}We are going to have SolrCloud test that do nothing but spin up one shard and index 1 doc. And then 10 shards. And then 100. {quote} This is not patronizing. It's literally a huge part of fixing things, once you clear some of the perf issues I've also already dropped. I'm not being nearly as obtuse as you guys may think. was (Author: markrmiller): {quote}We are going to have SolrCloud test that do nothing but spin up one shard and index 1 doc. And then 10 shards. And then 100. {quote} This is not patronizing. It's literally a huge part of fixing things, once you clear some of the perf issues I've also already dropped. I'm being nearly as obtuse as you guys may think. > SolrCloud 2 > --- > > Key: SOLR-13888 > URL: https://issues.apache.org/jira/browse/SOLR-13888 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Major > Attachments: solrscreen.png > > > As devs discuss dropping the SolrCloud name on the dev list, here is an issue > titled SolrCloud 2. > A couple times now I've pulled on the sweater thread that is our broken > tests. It leads to one place - SolrCloud is sick and devs are adding spotty > code on top of it at a rate that will lead to the system falling in on > itself. As it is, it's a very slow, very inefficient, very unreliable, very > buggy system. > This is not why I am here. This is the opposite of why I am here. > So please, let's stop. We can't build on that thing as it is. > > I need some time, I lost a lot of work at one point, the scope has expanded > since I realized how problematic some things really are, but I have an > alternative path that is not so duct tape and straw. As the building climbs, > that foundation is going to kill us all. > > This i not about an architecture change - the architecture is fine. The > implementation is broken and getting worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13888) SolrCloud 2
[ https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969840#comment-16969840 ] Mark Robert Miller commented on SOLR-13888: --- You can do a bunch of fun things with a working. I have some fun stuff too. With a working you can also build fun stuff. The fixing is boring. > SolrCloud 2 > --- > > Key: SOLR-13888 > URL: https://issues.apache.org/jira/browse/SOLR-13888 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Major > Attachments: solrscreen.png > > > As devs discuss dropping the SolrCloud name on the dev list, here is an issue > titled SolrCloud 2. > A couple times now I've pulled on the sweater thread that is our broken > tests. It leads to one place - SolrCloud is sick and devs are adding spotty > code on top of it at a rate that will lead to the system falling in on > itself. As it is, it's a very slow, very inefficient, very unreliable, very > buggy system. > This is not why I am here. This is the opposite of why I am here. > So please, let's stop. We can't build on that thing as it is. > > I need some time, I lost a lot of work at one point, the scope has expanded > since I realized how problematic some things really are, but I have an > alternative path that is not so duct tape and straw. As the building climbs, > that foundation is going to kill us all. > > This i not about an architecture change - the architecture is fine. The > implementation is broken and getting worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-13888) SolrCloud 2
[ https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969840#comment-16969840 ] Mark Robert Miller edited comment on SOLR-13888 at 11/8/19 5:14 AM: You can do a bunch of fun things with a working system. I have some fun stuff too. With a working system *you* can also build fun stuff. The fixing is boring. was (Author: markrmiller): You can do a bunch of fun things with a working. I have some fun stuff too. With a working you can also build fun stuff. The fixing is boring. > SolrCloud 2 > --- > > Key: SOLR-13888 > URL: https://issues.apache.org/jira/browse/SOLR-13888 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Major > Attachments: solrscreen.png > > > As devs discuss dropping the SolrCloud name on the dev list, here is an issue > titled SolrCloud 2. > A couple times now I've pulled on the sweater thread that is our broken > tests. It leads to one place - SolrCloud is sick and devs are adding spotty > code on top of it at a rate that will lead to the system falling in on > itself. As it is, it's a very slow, very inefficient, very unreliable, very > buggy system. > This is not why I am here. This is the opposite of why I am here. > So please, let's stop. We can't build on that thing as it is. > > I need some time, I lost a lot of work at one point, the scope has expanded > since I realized how problematic some things really are, but I have an > alternative path that is not so duct tape and straw. As the building climbs, > that foundation is going to kill us all. > > This i not about an architecture change - the architecture is fine. The > implementation is broken and getting worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13888) SolrCloud 2
[ https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969845#comment-16969845 ] Mark Robert Miller commented on SOLR-13888: --- And if you spend a bunch of time and go through a LOT of the fixes, maybe you get some other ideas about what is causing problems and what could help. I've dropped a ton of that too. Whether you want it spoon-fed or not, I don't know. I can tell that if I just give my full dump right now, you'll just think it's more patronizing shit or something. You will get all the info; nobody is trying to hold it back from you. You have enough to cover a lot of hours of work already. > SolrCloud 2 > --- > > Key: SOLR-13888 > URL: https://issues.apache.org/jira/browse/SOLR-13888 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Reporter: Mark Miller > Assignee: Mark Miller > Priority: Major > Attachments: solrscreen.png > > > As devs discuss dropping the SolrCloud name on the dev list, here is an issue titled SolrCloud 2. > A couple times now I've pulled on the sweater thread that is our broken tests. It leads to one place - SolrCloud is sick and devs are adding spotty code on top of it at a rate that will lead to the system falling in on itself. As it is, it's a very slow, very inefficient, very unreliable, very buggy system. > This is not why I am here. This is the opposite of why I am here. > So please, let's stop. We can't build on that thing as it is. > > I need some time, I lost a lot of work at one point, the scope has expanded since I realized how problematic some things really are, but I have an alternative path that is not so duct tape and straw. As the building climbs, that foundation is going to kill us all. > > This is not about an architecture change - the architecture is fine. The implementation is broken and getting worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13888) SolrCloud 2
[ https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969852#comment-16969852 ] Mark Robert Miller commented on SOLR-13888: --- Reimpl, reimpl, not full reimpl; we have parts that work. Shortest path to a safe point. I can draw a map, but I'd have to rewrite a lot of words I already have. Sure, you can find thousands more bugs, but you start on the speed, and eventually you can squeeze wherever you want. > SolrCloud 2 > --- > > Key: SOLR-13888 > URL: https://issues.apache.org/jira/browse/SOLR-13888 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Reporter: Mark Miller > Assignee: Mark Miller > Priority: Major > Attachments: solrscreen.png > > > As devs discuss dropping the SolrCloud name on the dev list, here is an issue titled SolrCloud 2. > A couple times now I've pulled on the sweater thread that is our broken tests. It leads to one place - SolrCloud is sick and devs are adding spotty code on top of it at a rate that will lead to the system falling in on itself. As it is, it's a very slow, very inefficient, very unreliable, very buggy system. > This is not why I am here. This is the opposite of why I am here. > So please, let's stop. We can't build on that thing as it is. > > I need some time, I lost a lot of work at one point, the scope has expanded since I realized how problematic some things really are, but I have an alternative path that is not so duct tape and straw. As the building climbs, that foundation is going to kill us all. > > This is not about an architecture change - the architecture is fine. The implementation is broken and getting worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9038) Evaluate Caffeine for LruQueryCache
[ https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Manes updated LUCENE-9038: -- Attachment: CaffeineQueryCache.java > Evaluate Caffeine for LruQueryCache > --- > > Key: LUCENE-9038 > URL: https://issues.apache.org/jira/browse/LUCENE-9038 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ben Manes >Priority: Major > Attachments: CaffeineQueryCache.java > > > [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java] > appears to play a central role in Lucene's performance. There are many > issues discussing its performance, such as LUCENE-7235, LUCENE-7237, > LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's > overhead can be just as much of a benefit as a liability, causing various > workarounds and complexity. > When reviewing the discussions and code, the following issues are concerning: > # The cache is guarded by a single lock for all reads and writes. > # All computations are performed outside of the any locking to avoid > penalizing other callers. This doesn't handle the cache stampedes meaning > that multiple threads may cache miss, compute the value, and try to store it. > That redundant work becomes expensive under load and can be mitigated with ~ > per-key locks. > # The cache queries the entry to see if it's even worth caching. At first > glance one assumes that is so that inexpensive entries don't bang on the lock > or thrash the LRU. However, this is also used to indicate data dependencies > for uncachable items (per JIRA), which perhaps shouldn't be invoking the > cache. > # The cache lookup is skipped if the global lock is held and the value is > computed, but not stored. This means a busy lock reduces performance across > all usages and the cache's effectiveness degrades. This is not counted in the > miss rate, giving a false impression. > # An attempt was made to perform computations asynchronously, due to their > heavy cost on tail latencies. That work was reverted due to test failures and > is being worked on. > # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] > tries to avoid LRU thrashing due to large, infrequently used items being > cached. > # The cache is tightly intertwined with business logic, making it hard to > tease apart core algorithms and data structures from the usage scenarios. > It seems that more and more items skip being cached because of concurrency > and hit rate performance, causing special case fixes based on knowledge of > the external code flows. Since the developers are experts on search, not > caching, it seems justified to evaluate if an off-the-shelf library would be > more helpful in terms of developer time, code complexity, and performance. > Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] > in SOLR-8241 and SOLR-13817. > The proposal is to replace the internals {{LruQueryCache}} so that external > usages are not affected in terms of the API. However, like in {{SolrCache}}, > a difference is that Caffeine only bounds by either the number of entries or > an accumulated size (e.g. bytes), but not both constraints. This likely is an > acceptable divergence in how the configuration is honored. > cc [~ab], [~dsmiley] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13888) SolrCloud 2
[ https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969866#comment-16969866 ] Mark Robert Miller commented on SOLR-13888: --- Oh, and the doc I made - also, not patronizing. Based on experience fixing things. > SolrCloud 2 > --- > > Key: SOLR-13888 > URL: https://issues.apache.org/jira/browse/SOLR-13888 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Major > Attachments: solrscreen.png > > > As devs discuss dropping the SolrCloud name on the dev list, here is an issue > titled SolrCloud 2. > A couple times now I've pulled on the sweater thread that is our broken > tests. It leads to one place - SolrCloud is sick and devs are adding spotty > code on top of it at a rate that will lead to the system falling in on > itself. As it is, it's a very slow, very inefficient, very unreliable, very > buggy system. > This is not why I am here. This is the opposite of why I am here. > So please, let's stop. We can't build on that thing as it is. > > I need some time, I lost a lot of work at one point, the scope has expanded > since I realized how problematic some things really are, but I have an > alternative path that is not so duct tape and straw. As the building climbs, > that foundation is going to kill us all. > > This i not about an architecture change - the architecture is fine. The > implementation is broken and getting worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9038) Evaluate Caffeine for LruQueryCache
[ https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969869#comment-16969869 ] Ben Manes commented on LUCENE-9038: --- Attached a very rough sketch of what this could look like. A cache hit would be lock-free and a miss would be performed under a per-segment {{computeIfAbsent}}. A cheap computation back would cause the segment to be re-weighed, perhaps triggering an eviction. A lot of {{LruQueryCache}} needs to be ported over, but I think that is straightforward. It may look a lot like the current cache in the end, but benefit from having concurrent data structures to work off of. Let me know if you think this is the right direction. > Evaluate Caffeine for LruQueryCache > --- > > Key: LUCENE-9038 > URL: https://issues.apache.org/jira/browse/LUCENE-9038 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ben Manes >Priority: Major > Attachments: CaffeineQueryCache.java > > > [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java] > appears to play a central role in Lucene's performance. There are many > issues discussing its performance, such as LUCENE-7235, LUCENE-7237, > LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's > overhead can be just as much of a benefit as a liability, causing various > workarounds and complexity. > When reviewing the discussions and code, the following issues are concerning: > # The cache is guarded by a single lock for all reads and writes. > # All computations are performed outside of the any locking to avoid > penalizing other callers. This doesn't handle the cache stampedes meaning > that multiple threads may cache miss, compute the value, and try to store it. > That redundant work becomes expensive under load and can be mitigated with ~ > per-key locks. > # The cache queries the entry to see if it's even worth caching. At first > glance one assumes that is so that inexpensive entries don't bang on the lock > or thrash the LRU. However, this is also used to indicate data dependencies > for uncachable items (per JIRA), which perhaps shouldn't be invoking the > cache. > # The cache lookup is skipped if the global lock is held and the value is > computed, but not stored. This means a busy lock reduces performance across > all usages and the cache's effectiveness degrades. This is not counted in the > miss rate, giving a false impression. > # An attempt was made to perform computations asynchronously, due to their > heavy cost on tail latencies. That work was reverted due to test failures and > is being worked on. > # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] > tries to avoid LRU thrashing due to large, infrequently used items being > cached. > # The cache is tightly intertwined with business logic, making it hard to > tease apart core algorithms and data structures from the usage scenarios. > It seems that more and more items skip being cached because of concurrency > and hit rate performance, causing special case fixes based on knowledge of > the external code flows. Since the developers are experts on search, not > caching, it seems justified to evaluate if an off-the-shelf library would be > more helpful in terms of developer time, code complexity, and performance. > Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] > in SOLR-8241 and SOLR-13817. 
> The proposal is to replace the internals {{LruQueryCache}} so that external > usages are not affected in terms of the API. However, like in {{SolrCache}}, > a difference is that Caffeine only bounds by either the number of entries or > an accumulated size (e.g. bytes), but not both constraints. This likely is an > acceptable divergence in how the configuration is honored. > cc [~ab], [~dsmiley] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
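[Editor's note: the attached CaffeineQueryCache.java isn't reproduced in this thread, so the following is only a guess at the shape being described — a lock-free read on a hit, with misses computed under Caffeine's per-key locking so concurrent misses for the same query don't stampede. {{computeDocIdSet}} is a hypothetical stand-in for the real scoring work.]
{code:java}
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.Query;

class PerSegmentCacheSketch {
  private final Cache<Query, DocIdSet> cache =
      Caffeine.newBuilder().maximumSize(1_000).build();

  DocIdSet getOrCompute(Query query) {
    // Hit: lock-free read. Miss: computed at most once; other threads
    // asking for the same query block on that single computation.
    return cache.get(query, this::computeDocIdSet);
  }

  private DocIdSet computeDocIdSet(Query query) {
    // Hypothetical: run the query against the segment and collect a DocIdSet.
    return DocIdSet.EMPTY;
  }
}
{code}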
[jira] [Commented] (SOLR-13888) SolrCloud 2
[ https://issues.apache.org/jira/browse/SOLR-13888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969868#comment-16969868 ] Mark Robert Miller commented on SOLR-13888: --- and finally, my comment to Shalin - I had recently found that same bug that same day with the info I gave. > SolrCloud 2 > --- > > Key: SOLR-13888 > URL: https://issues.apache.org/jira/browse/SOLR-13888 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mark Miller >Assignee: Mark Miller >Priority: Major > Attachments: solrscreen.png > > > As devs discuss dropping the SolrCloud name on the dev list, here is an issue > titled SolrCloud 2. > A couple times now I've pulled on the sweater thread that is our broken > tests. It leads to one place - SolrCloud is sick and devs are adding spotty > code on top of it at a rate that will lead to the system falling in on > itself. As it is, it's a very slow, very inefficient, very unreliable, very > buggy system. > This is not why I am here. This is the opposite of why I am here. > So please, let's stop. We can't build on that thing as it is. > > I need some time, I lost a lot of work at one point, the scope has expanded > since I realized how problematic some things really are, but I have an > alternative path that is not so duct tape and straw. As the building climbs, > that foundation is going to kill us all. > > This i not about an architecture change - the architecture is fine. The > implementation is broken and getting worse. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9038) Evaluate Caffeine for LruQueryCache
[ https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Manes updated LUCENE-9038: -- Attachment: CaffeineQueryCache.java > Evaluate Caffeine for LruQueryCache > --- > > Key: LUCENE-9038 > URL: https://issues.apache.org/jira/browse/LUCENE-9038 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ben Manes >Priority: Major > Attachments: CaffeineQueryCache.java > > > [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java] > appears to play a central role in Lucene's performance. There are many > issues discussing its performance, such as LUCENE-7235, LUCENE-7237, > LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's > overhead can be just as much of a benefit as a liability, causing various > workarounds and complexity. > When reviewing the discussions and code, the following issues are concerning: > # The cache is guarded by a single lock for all reads and writes. > # All computations are performed outside of the any locking to avoid > penalizing other callers. This doesn't handle the cache stampedes meaning > that multiple threads may cache miss, compute the value, and try to store it. > That redundant work becomes expensive under load and can be mitigated with ~ > per-key locks. > # The cache queries the entry to see if it's even worth caching. At first > glance one assumes that is so that inexpensive entries don't bang on the lock > or thrash the LRU. However, this is also used to indicate data dependencies > for uncachable items (per JIRA), which perhaps shouldn't be invoking the > cache. > # The cache lookup is skipped if the global lock is held and the value is > computed, but not stored. This means a busy lock reduces performance across > all usages and the cache's effectiveness degrades. This is not counted in the > miss rate, giving a false impression. > # An attempt was made to perform computations asynchronously, due to their > heavy cost on tail latencies. That work was reverted due to test failures and > is being worked on. > # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] > tries to avoid LRU thrashing due to large, infrequently used items being > cached. > # The cache is tightly intertwined with business logic, making it hard to > tease apart core algorithms and data structures from the usage scenarios. > It seems that more and more items skip being cached because of concurrency > and hit rate performance, causing special case fixes based on knowledge of > the external code flows. Since the developers are experts on search, not > caching, it seems justified to evaluate if an off-the-shelf library would be > more helpful in terms of developer time, code complexity, and performance. > Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] > in SOLR-8241 and SOLR-13817. > The proposal is to replace the internals {{LruQueryCache}} so that external > usages are not affected in terms of the API. However, like in {{SolrCache}}, > a difference is that Caffeine only bounds by either the number of entries or > an accumulated size (e.g. bytes), but not both constraints. This likely is an > acceptable divergence in how the configuration is honored. > cc [~ab], [~dsmiley] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9038) Evaluate Caffeine for LruQueryCache
[ https://issues.apache.org/jira/browse/LUCENE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Manes updated LUCENE-9038: -- Attachment: (was: CaffeineQueryCache.java) > Evaluate Caffeine for LruQueryCache > --- > > Key: LUCENE-9038 > URL: https://issues.apache.org/jira/browse/LUCENE-9038 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Ben Manes >Priority: Major > Attachments: CaffeineQueryCache.java > > > [LRUQueryCache|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java] > appears to play a central role in Lucene's performance. There are many > issues discussing its performance, such as LUCENE-7235, LUCENE-7237, > LUCENE-8027, LUCENE-8213, and LUCENE-9002. It appears that the cache's > overhead can be just as much of a benefit as a liability, causing various > workarounds and complexity. > When reviewing the discussions and code, the following issues are concerning: > # The cache is guarded by a single lock for all reads and writes. > # All computations are performed outside of the any locking to avoid > penalizing other callers. This doesn't handle the cache stampedes meaning > that multiple threads may cache miss, compute the value, and try to store it. > That redundant work becomes expensive under load and can be mitigated with ~ > per-key locks. > # The cache queries the entry to see if it's even worth caching. At first > glance one assumes that is so that inexpensive entries don't bang on the lock > or thrash the LRU. However, this is also used to indicate data dependencies > for uncachable items (per JIRA), which perhaps shouldn't be invoking the > cache. > # The cache lookup is skipped if the global lock is held and the value is > computed, but not stored. This means a busy lock reduces performance across > all usages and the cache's effectiveness degrades. This is not counted in the > miss rate, giving a false impression. > # An attempt was made to perform computations asynchronously, due to their > heavy cost on tail latencies. That work was reverted due to test failures and > is being worked on. > # An [in-progress change|https://github.com/apache/lucene-solr/pull/940] > tries to avoid LRU thrashing due to large, infrequently used items being > cached. > # The cache is tightly intertwined with business logic, making it hard to > tease apart core algorithms and data structures from the usage scenarios. > It seems that more and more items skip being cached because of concurrency > and hit rate performance, causing special case fixes based on knowledge of > the external code flows. Since the developers are experts on search, not > caching, it seems justified to evaluate if an off-the-shelf library would be > more helpful in terms of developer time, code complexity, and performance. > Solr has already introduced [Caffeine|https://github.com/ben-manes/caffeine] > in SOLR-8241 and SOLR-13817. > The proposal is to replace the internals {{LruQueryCache}} so that external > usages are not affected in terms of the API. However, like in {{SolrCache}}, > a difference is that Caffeine only bounds by either the number of entries or > an accumulated size (e.g. bytes), but not both constraints. This likely is an > acceptable divergence in how the configuration is honored. > cc [~ab], [~dsmiley] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9036) ExitableDirectoryReader to interrupt DocValues as well
[ https://issues.apache.org/jira/browse/LUCENE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated LUCENE-9036: - Attachment: LUCENE-9036.patch Status: Patch Available (was: Patch Available) > ExitableDirectoryReader to interrupt DocValues as well > -- > > Key: LUCENE-9036 > URL: https://issues.apache.org/jira/browse/LUCENE-9036 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mikhail Khludnev >Priority: Major > Attachments: LUCENE-9036.patch, LUCENE-9036.patch > > > This allows making AnalyticsComponent and json.facet sensitive to time > allowed. > Does it make sense? Is it enough to check on DV creation, i.e. per field/segment, > or is it worth checking every Nth doc? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
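To make the "every Nth doc" option concrete, here is a minimal sketch, assuming a wrapper around {{NumericDocValues}} that polls Lucene's {{QueryTimeout}} at a sampled interval; the wrapper class and the interval of 128 are hypothetical and not taken from the attached patch.
{code:java}
import java.io.IOException;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.index.QueryTimeout;

// Hypothetical sketch: poll the timeout on a sampled interval so the clock
// check adds only a counter increment per advance, not a syscall per doc.
class TimeoutCheckingNumericDocValues extends NumericDocValues {
  private static final int CHECK_INTERVAL = 128; // sample rate is a guess, not from the patch
  private final NumericDocValues in;
  private final QueryTimeout timeout;
  private int calls;

  TimeoutCheckingNumericDocValues(NumericDocValues in, QueryTimeout timeout) {
    this.in = in;
    this.timeout = timeout;
  }

  private void checkTimeoutSampled() {
    if (++calls % CHECK_INTERVAL == 0 && timeout.shouldExit()) {
      // The real patch would throw ExitableDirectoryReader.ExitingReaderException here.
      throw new RuntimeException("Time allowed exceeded while iterating doc values");
    }
  }

  @Override public long longValue() throws IOException { return in.longValue(); }
  @Override public boolean advanceExact(int target) throws IOException { checkTimeoutSampled(); return in.advanceExact(target); }
  @Override public int docID() { return in.docID(); }
  @Override public int nextDoc() throws IOException { checkTimeoutSampled(); return in.nextDoc(); }
  @Override public int advance(int target) throws IOException { checkTimeoutSampled(); return in.advance(target); }
  @Override public long cost() { return in.cost(); }
}
{code}
Checking only at DocValues creation would be cheaper, but a single long segment iteration could then run far past the deadline; sampling bounds that overrun at a small fixed cost per call.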
[jira] [Updated] (LUCENE-9036) ExitableDirectoryReader to interrupt DocValues as well
[ https://issues.apache.org/jira/browse/LUCENE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated LUCENE-9036: - Attachment: LUCENE-9036.patch Status: Patch Available (was: Patch Available) > ExitableDirectoryReader to interrupt DocValues as well > -- > > Key: LUCENE-9036 > URL: https://issues.apache.org/jira/browse/LUCENE-9036 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mikhail Khludnev >Priority: Major > Attachments: LUCENE-9036.patch, LUCENE-9036.patch, LUCENE-9036.patch > > > This allows making AnalyticsComponent and json.facet sensitive to time > allowed. > Does it make sense? Is it enough to check on DV creation, i.e. per field/segment, > or is it worth checking every Nth doc? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9036) ExitableDirectoryReader to interrupt DocValues as well
[ https://issues.apache.org/jira/browse/LUCENE-9036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969906#comment-16969906 ] Mikhail Khludnev commented on LUCENE-9036: -- I decided to postpone Analytics component coverage; it turns out it fails with an NPE on timeout. I filed a dedicated issue. > ExitableDirectoryReader to interrupt DocValues as well > -- > > Key: LUCENE-9036 > URL: https://issues.apache.org/jira/browse/LUCENE-9036 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mikhail Khludnev >Priority: Major > Attachments: LUCENE-9036.patch, LUCENE-9036.patch, LUCENE-9036.patch > > > This allows making AnalyticsComponent and json.facet sensitive to time > allowed. > Does it make sense? Is it enough to check on DV creation, i.e. per field/segment, > or is it worth checking every Nth doc? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (LUCENE-9039) Make Analytics component aware of timeAllowed
Mikhail Khludnev created LUCENE-9039: Summary: Make Analytics component aware of timeAllowed Key: LUCENE-9039 URL: https://issues.apache.org/jira/browse/LUCENE-9039 Project: Lucene - Core Issue Type: Sub-task Reporter: Mikhail Khludnev -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Moved] (SOLR-13904) Make Analytics component aware of timeAllowed
[ https://issues.apache.org/jira/browse/SOLR-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev moved LUCENE-9039 to SOLR-13904: - Key: SOLR-13904 (was: LUCENE-9039) Lucene Fields: (was: New) Project: Solr (was: Lucene - Core) Security: Public > Make Analytics component aware of timeAllowed > - > > Key: SOLR-13904 > URL: https://issues.apache.org/jira/browse/SOLR-13904 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mikhail Khludnev >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9039) Make Analytics component aware of timeAllowed
[ https://issues.apache.org/jira/browse/LUCENE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated LUCENE-9039: - Parent: (was: LUCENE-9036) Issue Type: Improvement (was: Sub-task) > Make Analytics component aware of timeAllowed > - > > Key: LUCENE-9039 > URL: https://issues.apache.org/jira/browse/LUCENE-9039 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mikhail Khludnev >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13904) Make Analytics component aware of timeAllowed
[ https://issues.apache.org/jira/browse/SOLR-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev updated SOLR-13904: Attachment: SOLR-13904.patch > Make Analytics component aware of timeAllowed > - > > Key: SOLR-13904 > URL: https://issues.apache.org/jira/browse/SOLR-13904 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mikhail Khludnev >Priority: Major > Attachments: SOLR-13904.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13904) Make Analytics component aware of timeAllowed
[ https://issues.apache.org/jira/browse/SOLR-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969913#comment-16969913 ] Mikhail Khludnev commented on SOLR-13904: - The attached patch allows interrupting long analytics computations, but now the timeout causes a 500 and an NPE somewhere around {{AnalyticsShardResponseWriter.write()}} > Make Analytics component aware of timeAllowed > - > > Key: SOLR-13904 > URL: https://issues.apache.org/jira/browse/SOLR-13904 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mikhail Khludnev >Priority: Major > Attachments: SOLR-13904.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
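For reference, the usual Solr pattern for {{timeAllowed}} is to catch the timeout exception and flag partial results rather than let a half-built response reach the writer. A minimal sketch follows, assuming a hypothetical {{runAnalytics()}} placeholder for the component's collection phase; it is not the attached patch.
{code:java}
import org.apache.lucene.index.ExitableDirectoryReader;
import org.apache.solr.response.SolrQueryResponse;

// Hypothetical illustration: catch the timeout, flag partial results in the
// response header, and return whatever was computed, instead of letting the
// response writer NPE on a half-built result.
public class PartialResultsSketch {

  // Placeholder for the analytics collection phase, which may throw
  // ExitingReaderException when the time allowed is exceeded.
  void runAnalytics() {
    // ... iterate doc values, compute aggregates ...
  }

  public void collectWithTimeout(SolrQueryResponse rsp) {
    try {
      runAnalytics();
    } catch (ExitableDirectoryReader.ExitingReaderException e) {
      // Mark the response as partial so clients can tell it was cut short.
      rsp.getResponseHeader().add("partialResults", Boolean.TRUE);
    }
  }
}
{code}
The {{partialResults}} header is the same flag regular searches set when {{timeAllowed}} expires, so clients can detect truncated results uniformly.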