[jira] [Resolved] (LUCENE-9105) UniformSplit postings format should detect corrupted index
[ https://issues.apache.org/jira/browse/LUCENE-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Roustant resolved LUCENE-9105.
Fix Version/s: 8.5
Resolution: Fixed

> UniformSplit postings format should detect corrupted index
> --
>
> Key: LUCENE-9105
> URL: https://issues.apache.org/jira/browse/LUCENE-9105
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Bruno Roustant
> Assignee: Bruno Roustant
> Priority: Major
> Fix For: 8.5
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> BlockTree postings format has some checks when reading index metadata to
> detect index corruption. UniformSplit should have the same. Additionally
> UniformSplit has assertions in BlockReader that should be runtime checks to
> also detect index corruption (this case has been encountered in production
> environment).

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
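The distinction this issue relies on, an assertion versus a runtime check, can be sketched as follows. This is only a minimal illustration and not the UniformSplit code: `BlockHeaderChecks`, `CorruptBlockException`, `linesCount` and `maxLines` are hypothetical names, and the exception is an unchecked stand-in so that the sketch compiles without Lucene on the classpath.

```java
// Hypothetical stand-in for an index-corruption exception; not a Lucene class.
class CorruptBlockException extends RuntimeException {
  CorruptBlockException(String message) {
    super(message);
  }
}

class BlockHeaderChecks {
  // Assertion-based validation: the check is skipped unless the JVM runs with -ea,
  // so a corrupted value would pass through silently in production.
  static int linesCountAsserted(int linesCount, int maxLines) {
    assert linesCount > 0 && linesCount <= maxLines : "linesCount=" + linesCount;
    return linesCount;
  }

  // Runtime validation: always executed, so corruption is reported when the value is read.
  static int linesCountChecked(int linesCount, int maxLines) {
    if (linesCount <= 0 || linesCount > maxLines) {
      throw new CorruptBlockException(
          "Illegal number of lines in block: " + linesCount + " (max " + maxLines + ")");
    }
    return linesCount;
  }
}
```

With assertions disabled, `linesCountAsserted(-1, 10)` returns `-1` unchanged, whereas `linesCountChecked(-1, 10)` always throws.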
[jira] [Commented] (SOLR-13839) MaxScore is returned as NAN when group.query doesn't match any docs
[ https://issues.apache.org/jira/browse/SOLR-13839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005995#comment-17005995 ]

Guna Sekhar Dora commented on SOLR-13839:

[~munendrasn] [~ichattopadhyaya] Added a fix for this at [https://github.com/apache/lucene-solr/pull/1132]. Can you please have a look at it?

> MaxScore is returned as NAN when group.query doesn't match any docs
> ---
>
> Key: SOLR-13839
> URL: https://issues.apache.org/jira/browse/SOLR-13839
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: search
> Reporter: Munendra S N
> Priority: Minor
> Attachments: SOLR-13839.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When the main query matches some products but group.query doesn't match any
> docs then maxScore=NaN would be returned in the response.
> * This happens only in standalone/single shard mode
> * score needs to be fetched in the response to encounter this issue
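One generic way a NaN maximum can arise is when a "no score seen yet" sentinel leaks out of an aggregation over an empty result. The sketch below illustrates that pattern and one possible guard. It is an assumption for illustration only: `MaxScoreSketch` and its methods are hypothetical, this is not Solr's grouping code, and reporting `0.0` is just one possible choice, not necessarily what the linked pull request does.

```java
class MaxScoreSketch {
  // Aggregates a maximum, using NaN as the "no score seen yet" sentinel.
  // With zero matching documents the sentinel itself is returned.
  static float rawMaxScore(float[] scores) {
    float max = Float.NaN;
    for (float score : scores) {
      if (Float.isNaN(max) || score > max) {
        max = score;
      }
    }
    return max;
  }

  // Guarded variant: maps the sentinel to 0.0 before it reaches the response.
  static float reportedMaxScore(float[] scores) {
    float max = rawMaxScore(scores);
    return Float.isNaN(max) ? 0.0f : max;
  }
}
```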
[GitHub] [lucene-solr] asfgit closed pull request #1106: LUCENE-9106: UniformSplit postings format allows extension of block/line serializers.
asfgit closed pull request #1106: LUCENE-9106: UniformSplit postings format allows extension of block/line serializers.
URL: https://github.com/apache/lucene-solr/pull/1106

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
[jira] [Commented] (LUCENE-9106) UniformSplit postings format should allow extension of block/line serializers.
[ https://issues.apache.org/jira/browse/LUCENE-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006008#comment-17006008 ]

ASF subversion and git services commented on LUCENE-9106:

Commit 1851779ddbfd8ed3148b5d20114bcf2b3651459d in lucene-solr's branch refs/heads/master from Bruno Roustant
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1851779 ]

LUCENE-9106: UniformSplit postings format allows extension of block/line serializers.

Closes #1106

> UniformSplit postings format should allow extension of block/line serializers.
> --
>
> Key: LUCENE-9106
> URL: https://issues.apache.org/jira/browse/LUCENE-9106
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Bruno Roustant
> Priority: Minor
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently UniformSplit postings format has static read methods for block /
> line / header, so it is not possible to extend them to slightly change the
> format. Introducing non-static serializers will make it possible to
> extend the format easily.
[jira] [Commented] (LUCENE-9106) UniformSplit postings format should allow extension of block/line serializers.
[ https://issues.apache.org/jira/browse/LUCENE-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006044#comment-17006044 ]

ASF subversion and git services commented on LUCENE-9106:

Commit a97271fc521672bebfe629c7e09cd7fd2aca52d5 in lucene-solr's branch refs/heads/branch_8x from Bruno Roustant
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a97271f ]

LUCENE-9106: UniformSplit postings format allows extension of block/line serializers.

> UniformSplit postings format should allow extension of block/line serializers.
> --
>
> Key: LUCENE-9106
> URL: https://issues.apache.org/jira/browse/LUCENE-9106
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Bruno Roustant
> Priority: Minor
> Time Spent: 40m
> Remaining Estimate: 0h
[jira] [Resolved] (LUCENE-9106) UniformSplit postings format should allow extension of block/line serializers.
[ https://issues.apache.org/jira/browse/LUCENE-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Roustant resolved LUCENE-9106.
Fix Version/s: 8.5
Assignee: Bruno Roustant
Resolution: Fixed

> UniformSplit postings format should allow extension of block/line serializers.
> --
>
> Key: LUCENE-9106
> URL: https://issues.apache.org/jira/browse/LUCENE-9106
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Bruno Roustant
> Assignee: Bruno Roustant
> Priority: Minor
> Fix For: 8.5
>
> Time Spent: 40m
> Remaining Estimate: 0h
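The motivation in LUCENE-9106, that static read methods cannot be overridden while instance-level serializers can, might be sketched like this. The class names (`HeaderSerializer`, `MarkedHeaderSerializer`, `SerializerDemo`) and the one-byte marker are hypothetical and chosen only for illustration; the actual UniformSplit serializers have their own classes and signatures.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical serializer with instance (non-static) methods, so subclasses can override them.
class HeaderSerializer {
  void write(DataOutputStream out, int linesCount) throws IOException {
    out.writeInt(linesCount);
  }

  int read(DataInputStream in) throws IOException {
    return in.readInt();
  }
}

// Hypothetical extension: prefixes the header with a one-byte marker, changing the format slightly.
class MarkedHeaderSerializer extends HeaderSerializer {
  static final int MARKER = 0x55;

  @Override
  void write(DataOutputStream out, int linesCount) throws IOException {
    out.writeByte(MARKER);
    super.write(out, linesCount);
  }

  @Override
  int read(DataInputStream in) throws IOException {
    int marker = in.readUnsignedByte();
    if (marker != MARKER) {
      throw new IOException("Unexpected header marker: " + marker);
    }
    return super.read(in);
  }
}

class SerializerDemo {
  // Serializes a value with the given serializer and returns the encoded bytes.
  static byte[] encode(HeaderSerializer serializer, int linesCount) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      serializer.write(new DataOutputStream(bytes), linesCount);
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Writes then reads a value; the caller decides which format variant is used.
  static int roundTrip(HeaderSerializer serializer, int linesCount) {
    try {
      byte[] encoded = encode(serializer, linesCount);
      return serializer.read(new DataInputStream(new ByteArrayInputStream(encoded)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The calling code in `SerializerDemo` stays the same for both variants, which is the property a static `read` method cannot offer.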
[jira] [Commented] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006076#comment-17006076 ]

Markus Jelsma commented on LUCENE-9112:

Hello [~sarowe],

I first spotted the issue with a Dutch and an English sample using those ancient OpenNLP models from SourceForge. I just trained new English and Dutch models based on 250k-line CoNLL-U data sets and tried again to see if the splitting behaviour is still there. I had to adjust the test only slightly but the splitting problem is still there, and in my local tests the problem persists in Dutch too. At some seemingly arbitrary point further in the text, a 'random' term is being split.

I then tried the fresh models using OpenNLP's SentenceDetector and TokenizerME tools but I cannot reproduce the problem on the command line using these tools.

Issue #1 is fixed using the new models though.

I am quite new to custom Tokenizer implementations and certainly those extending SegmentingTokenizerBase. What do you think?

Thanks,
Markus

> OpenNLP tokenizer is fooled by text containing spurious punctuation
> ---
>
> Key: LUCENE-9112
> URL: https://issues.apache.org/jira/browse/LUCENE-9112
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Affects Versions: master (9.0)
> Reporter: Markus Jelsma
> Priority: Major
> Labels: opennlp
> Fix For: master (9.0)
>
> Attachments: LUCENE-9112-unittest.patch
>
>
> The OpenNLP tokenizer shows weird behaviour when text contains spurious
> punctuation such as having triple dots trailing a sentence...
> # the first dot becomes part of the token, so 'sentence.' becomes the
> token
> # much further down the text, a seemingly unrelated token is then suddenly
> split up, in my example (see attached unit test) the name 'Baron' is split
> into 'Baro' and 'n', this is the real problem
> The problems never seem to occur when using small texts in unit tests but
> they certainly do in real world examples. Depending on how many 'spurious' dots,
> a completely different term can become split, or the same term in just a
> different location.
> I am not too sure if this is actually a problem in the Lucene code, but it is
> a problem and I have a Lucene unit test proving the problem.
[jira] [Commented] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006089#comment-17006089 ]

Markus Jelsma commented on LUCENE-9112:

I now believe it is a problem in the Lucene code, namely it being fooled by a punctuation mark and then something is mishandled with the internal buffer in SegmentingTokenizerBase. The buffer is 1024 and the point where my term is being split is exactly at the 1024th character in the String. Simply increasing BUFFERMAX 'solves' the problem I have. But I don't know where the underlying problem really is.

> OpenNLP tokenizer is fooled by text containing spurious punctuation
> ---
>
> Key: LUCENE-9112
> URL: https://issues.apache.org/jira/browse/LUCENE-9112
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Affects Versions: master (9.0)
> Reporter: Markus Jelsma
> Priority: Major
> Labels: opennlp
> Fix For: master (9.0)
>
> Attachments: LUCENE-9112-unittest.patch
[jira] [Updated] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated LUCENE-9112:
Attachment: LUCENE-9112-unittest.patch

> OpenNLP tokenizer is fooled by text containing spurious punctuation
> ---
>
> Key: LUCENE-9112
> URL: https://issues.apache.org/jira/browse/LUCENE-9112
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Affects Versions: master (9.0)
> Reporter: Markus Jelsma
> Priority: Major
> Labels: opennlp
> Fix For: master (9.0)
>
> Attachments: LUCENE-9112-unittest.patch, LUCENE-9112-unittest.patch
[jira] [Comment Edited] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006089#comment-17006089 ]

Markus Jelsma edited comment on LUCENE-9112 at 12/31/19 1:22 PM:

I now believe it is a problem in the Lucene code, namely -it being fooled by a punctuation mark and then something- is mishandled with the internal buffer in SegmentingTokenizerBase. The buffer is 1024 and the point where my term is being split is exactly at the 1024th character in the String. Simply increasing BUFFERMAX 'solves' the problem I have. But I don't know where the underlying problem really is.

edit: I adjusted the text so it no longer needs spurious punctuation marks for it to split a term. It always splits at the 1024th character; it is just that in some cases that character is already whitespace.

was (Author: markus17):
I now believe it is a problem in the Lucene code, namely it being fooled by a punctuation mark and then something is mishandled with the internal buffer in SegmentingTokenizerBase. The buffer is 1024 and the point where my term is being split is exactly at the 1024th character in the String. Simply increasing BUFFERMAX 'solves' the problem I have. But I don't know where the underlying problem really is.

> OpenNLP tokenizer is fooled by text containing spurious punctuation
> ---
>
> Key: LUCENE-9112
> URL: https://issues.apache.org/jira/browse/LUCENE-9112
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Affects Versions: master (9.0)
> Reporter: Markus Jelsma
> Priority: Major
> Labels: opennlp
> Fix For: master (9.0)
>
> Attachments: LUCENE-9112-unittest.patch, LUCENE-9112-unittest.patch
[jira] [Updated] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated LUCENE-9112:
Attachment: en-token.bin en-sent.bin

> OpenNLP tokenizer is fooled by text containing spurious punctuation
> ---
>
> Key: LUCENE-9112
> URL: https://issues.apache.org/jira/browse/LUCENE-9112
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Affects Versions: master (9.0)
> Reporter: Markus Jelsma
> Priority: Major
> Labels: opennlp
> Fix For: master (9.0)
>
> Attachments: LUCENE-9112-unittest.patch, LUCENE-9112-unittest.patch,
> en-sent.bin, en-token.bin
[jira] [Created] (SOLR-14156) SolrJ support for json.queries
Mikhail Khludnev created SOLR-14156:

Summary: SolrJ support for json.queries
Key: SOLR-14156
URL: https://issues.apache.org/jira/browse/SOLR-14156
Project: Solr
Issue Type: Sub-task
Reporter: Mikhail Khludnev

After SOLR-12490 is done, SolrJ support and documentation can be provided as well.
[jira] [Commented] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006105#comment-17006105 ]

Markus Jelsma commented on LUCENE-9112:

There it is:

{code}
usableLength = findSafeEnd();
if (usableLength < 0)
  usableLength = length; /*
                          * more than IOBUFFER of text without breaks,
                          * gonna possibly truncate tokens
                          */
{code}

The text I send to be analyzed no longer has newlines, or any character that is found by findSafeEnd().

> OpenNLP tokenizer is fooled by text containing spurious punctuation
> ---
>
> Key: LUCENE-9112
> URL: https://issues.apache.org/jira/browse/LUCENE-9112
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/analysis
> Affects Versions: master (9.0)
> Reporter: Markus Jelsma
> Priority: Major
> Labels: opennlp
> Fix For: master (9.0)
>
> Attachments: LUCENE-9112-unittest.patch, LUCENE-9112-unittest.patch,
> en-sent.bin, en-token.bin
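The effect of that fallback can be reproduced in isolation. The sketch below is a simplified model, not the SegmentingTokenizerBase code: `BufferSplitSketch`, its `findSafeEnd` (which here treats only '\n' as a safe break) and `chunks` are hypothetical, and the buffer size is a parameter so that a small value can stand in for 1024.

```java
import java.util.ArrayList;
import java.util.List;

class BufferSplitSketch {
  // Simplified stand-in: length of the window up to and including its last newline,
  // or -1 if the window contains no newline.
  static int findSafeEnd(String text, int start, int length) {
    for (int i = start + length - 1; i >= start; i--) {
      if (text.charAt(i) == '\n') {
        return i - start + 1;
      }
    }
    return -1;
  }

  // Cuts the text into the pieces a fixed-size buffer would hand to the segmenter.
  static List<String> chunks(String text, int bufferMax) {
    if (bufferMax <= 0) {
      throw new IllegalArgumentException("bufferMax must be positive");
    }
    List<String> result = new ArrayList<>();
    int pos = 0;
    while (pos < text.length()) {
      int length = Math.min(bufferMax, text.length() - pos);
      int usableLength = length;
      boolean moreTextFollows = pos + length < text.length();
      if (moreTextFollows) {
        usableLength = findSafeEnd(text, pos, length);
        if (usableLength < 0) {
          // No safe break in the window: use the whole buffer, possibly cutting a token.
          usableLength = length;
        }
      }
      result.add(text.substring(pos, pos + usableLength));
      pos += usableLength;
    }
    return result;
  }
}
```

With a buffer of 9 characters, `chunks("aaaa Baron bbbb", 9)` cuts inside `Baron`, while `chunks("aaaa\nBaron", 9)` ends the first chunk at the newline and keeps `Baron` whole. This matches the observation that the split lands exactly at the buffer boundary when the text has no line breaks.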
[GitHub] [lucene-solr] epugh commented on a change in pull request #1033: SOLR-13965: Use Plugin to add new expressions to GraphHandler
epugh commented on a change in pull request #1033: SOLR-13965: Use Plugin to add new expressions to GraphHandler
URL: https://github.com/apache/lucene-solr/pull/1033#discussion_r362241759

## File path: solr/core/src/java/org/apache/solr/handler/GraphHandler.java ##

@@ -92,24 +104,29 @@ public void inform(SolrCore core) {
 }
 // This pulls all the overrides and additions from the config
+List pluginInfos = core.getSolrConfig().getPluginInfos(Expressible.class.getName());
+
+// Check deprecated approach.
 Object functionMappingsObj = initArgs.get("streamFunctions");
 if(null != functionMappingsObj){
+ log.warn("solrconfig.xml: is deprecated for adding additional streaming functions to GraphHandler.");
 NamedList functionMappings = (NamedList)functionMappingsObj;
 for(Entry functionMapping : functionMappings) {
 String key = functionMapping.getKey();
 PluginInfo pluginInfo = new PluginInfo(key, Collections.singletonMap("class", functionMapping.getValue()));
-
-if (pluginInfo.pkgName == null) {
- Class clazz = core.getResourceLoader().findClass((String) functionMapping.getValue(),
- Expressible.class);
- streamFactory.withFunctionName(key, clazz);
-} else {
- StreamHandler.ExpressibleHolder holder = new StreamHandler.ExpressibleHolder(pluginInfo, core, SolrConfig.classVsSolrPluginInfo.get(Expressible.class));
- streamFactory.withFunctionName(key, () -> holder.getClazz());
-}
-
+pluginInfos.add(pluginInfo);
 }
+}
+for (PluginInfo pluginInfo : pluginInfos) {
+ if (pluginInfo.pkgName != null) {
+ExpressibleHolder holder = new ExpressibleHolder(pluginInfo, core, SolrConfig.classVsSolrPluginInfo.get(Expressible.class));
+streamFactory.withFunctionName(pluginInfo.name,
+() -> holder.getClazz());
+ } else {
+Class clazz = core.getMemClassLoader().findClass(pluginInfo.className, Expressible.class);

Review comment:
I'll be honest @cpoerschke I am mostly copying what is in StreamHandler, as this includes the new plugin framework support. StreamHandler uses the `core.getMemClassLoader`, which when I click into seems to be all about plugins ;-). My whole goal in this PR was to get StreamHandler and GraphHandler to be much more similar in how they work.
[jira] [Commented] (SOLR-14152) Upgrade Derby dependency in DIH
[ https://issues.apache.org/jira/browse/SOLR-14152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006181#comment-17006181 ]

Alexandre Rafalovitch commented on SOLR-14152:

Are you sure Derby is actually used anywhere? My (forever postponed) work on SOLR-10312 seems to indicate that it may not actually be referenced anymore. But I did not dig any further.

> Upgrade Derby dependency in DIH
> ---
>
> Key: SOLR-14152
> URL: https://issues.apache.org/jira/browse/SOLR-14152
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - DataImportHandler
> Reporter: Jan Høydahl
> Priority: Major
>
> Upgrade Derby to 10.15.1.3 (used in DIH example)
[jira] [Created] (SOLR-14157) "name" parameter missing from Collection Backup/Restore RefGuide
David Eric Pugh created SOLR-14157:

Summary: "name" parameter missing from Collection Backup/Restore RefGuide
Key: SOLR-14157
URL: https://issues.apache.org/jira/browse/SOLR-14157
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: documentation
Reporter: David Eric Pugh

The refguide documentation on backup and restore is missing the required parameter "name", though the parameter does appear in the examples.
[GitHub] [lucene-solr] epugh opened a new pull request #1133: SOLR-14157-backup-restore-docs-missing-parameter
epugh opened a new pull request #1133: SOLR-14157-backup-restore-docs-missing-parameter
URL: https://github.com/apache/lucene-solr/pull/1133

# Description

The `name` parameter is mandatory on BACKUP and RESTORE commands, but isn't listed, though it is in the example URLs...

# Solution

Added the parameter.

# Tests

n/a, though I manually tested the commands to confirm it is needed, and reviewed the Java code.

# Checklist

Please review the following and check all that apply:

- [ X] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [ X] I have created a Jira issue and added the issue ID to my pull request title.
- [ X] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [ X] I have developed this patch against the `master` branch.
- [ X] I have run `ant precommit` and the appropriate test suite.
- [ ] I have added tests for my changes.
- [ X] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
[jira] [Resolved] (SOLR-13844) Remove recovering shard term with corresponding core shard term.
[ https://issues.apache.org/jira/browse/SOLR-13844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Houston Putman resolved SOLR-13844.
Fix Version/s: 8.4
Resolution: Fixed

Thanks for pointing this out Cassandra!

> Remove recovering shard term with corresponding core shard term.
>
>
> Key: SOLR-13844
> URL: https://issues.apache.org/jira/browse/SOLR-13844
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: master (9.0)
> Reporter: Houston Putman
> Assignee: Cao Manh Dat
> Priority: Minor
> Fix For: master (9.0), 8.4
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Currently, if a recovering replica (Solr 7.3+) is deleted, the term for that
> core in the shard's terms in ZK is removed. However, the {{_recovering}}
> term is not removed along with it. This can create clutter and confusion in the
> shard terms ZK node.
[jira] [Commented] (SOLR-14154) Return correct isolation level when retrieving it from the SQL Connection
[ https://issues.apache.org/jira/browse/SOLR-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006230#comment-17006230 ]

Kevin Risden commented on SOLR-14154:

[~vercani] can you provide a PR for this? Should be a simple change. Mostly it should just implement get/set here [1] while following the Connection javadocs [2].

I am curious about what SQL client is running into this. For reference, the database metadata itself shows no support for transactions [3] and [4].

[1] https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/io/sql/ConnectionImpl.java#L188
[2] https://docs.oracle.com/javase/8/docs/api/java/sql/Connection.html#getTransactionIsolation--
[3] https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/io/sql/DatabaseMetaDataImpl.java#L672
[4] https://docs.oracle.com/javase/8/docs/api/java/sql/DatabaseMetaData.html#supportsTransactionIsolationLevel-int-

> Return correct isolation level when retrieving it from the SQL Connection
> -
>
> Key: SOLR-14154
> URL: https://issues.apache.org/jira/browse/SOLR-14154
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Parallel SQL
> Affects Versions: 8.4
> Reporter: Nick Vercammen
> Priority: Minor
> Fix For: master (9.0)
>
>
> When calling getTransactionIsolation() on the Sql.ConnectionImpl, an
> UnsupportedException is thrown. It would be better to return TRANSACTION_NONE
> so clients can determine for themselves that transactions are not supported
> without receiving an exception.
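The behaviour requested in the issue can be sketched without implementing the full `java.sql.Connection` interface. `NoTransactionConnectionSketch` below is a hypothetical, heavily reduced stand-in and not Solr's `ConnectionImpl`; the only real API it relies on is the `Connection.TRANSACTION_NONE` constant. On the real interface the getter is declared with `throws SQLException`, which is omitted here for brevity.

```java
import java.sql.Connection;

// Hypothetical, reduced stand-in for a JDBC connection that has no transaction support.
class NoTransactionConnectionSketch {
  // Reports "transactions are not supported" instead of throwing an exception.
  int getTransactionIsolation() {
    return Connection.TRANSACTION_NONE;
  }

  // How a client might interpret the reported isolation level.
  static boolean supportsTransactions(int isolationLevel) {
    return isolationLevel != Connection.TRANSACTION_NONE;
  }
}
```

A client can then branch on the returned level, for example skipping commit and rollback calls when `supportsTransactions` is false, rather than wrapping the getter in exception handling.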
[jira] [Updated] (SOLR-14154) Return correct isolation level when retrieving it from the SQL Connection
[ https://issues.apache.org/jira/browse/SOLR-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Risden updated SOLR-14154:
Fix Version/s: (was: master (9.0))

> Return correct isolation level when retrieving it from the SQL Connection
> -
>
> Key: SOLR-14154
> URL: https://issues.apache.org/jira/browse/SOLR-14154
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Parallel SQL
> Affects Versions: 8.4
> Reporter: Nick Vercammen
> Priority: Minor
[jira] [Commented] (SOLR-14152) Upgrade Derby dependency in DIH
[ https://issues.apache.org/jira/browse/SOLR-14152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006239#comment-17006239 ]

Jan Høydahl commented on SOLR-14152:

Even better - just remove it :) Thought it was used in DIH tests.

> Upgrade Derby dependency in DIH
> ---
>
> Key: SOLR-14152
> URL: https://issues.apache.org/jira/browse/SOLR-14152
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - DataImportHandler
> Reporter: Jan Høydahl
> Priority: Major
[jira] [Commented] (SOLR-13199) NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
[ https://issues.apache.org/jira/browse/SOLR-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006270#comment-17006270 ]

Andrew Engelbrecht commented on SOLR-13199:
-------------------------------------------

In the master branch, the error is no longer a null pointer exception:

{code:json}
{
  "responseHeader":{
    "status":400,
    "QTime":1,
    "params":{
      "q":"*:*",
      "fl":"[child parentFilter=ge]"}},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","org.apache.solr.common.SolrException"],
    "msg":"Parent filter should not be sent when the schema is nested",
    "code":400}}
{code}

{code:java}
2020-01-01 00:32:03.201 ERROR (qtp1845623216-18) [ x:films] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Parent filter should not be sent when the schema is nested
	at org.apache.solr.response.transform.ChildDocTransformerFactory.createChildDocTransformer(ChildDocTransformerFactory.java:105)
	at org.apache.solr.response.transform.ChildDocTransformerFactory.create(ChildDocTransformerFactory.java:75)
	at org.apache.solr.search.SolrReturnFields.add(SolrReturnFields.java:301)
	at org.apache.solr.search.SolrReturnFields.parseFieldList(SolrReturnFields.java:145)
	at org.apache.solr.search.SolrReturnFields.<init>(SolrReturnFields.java:123)
	at org.apache.solr.search.SolrReturnFields.<init>(SolrReturnFields.java:99)
	at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:140)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:302)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:208)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:2582)
	...
{code}

> NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
> --------------------------------------------------------------------------
>
> Key: SOLR-13199
> URL: https://issues.apache.org/jira/browse/SOLR-13199
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: master (9.0)
> Environment: h1.
Steps to reproduce > * Use a Linux machine. > * Build commit {{ea2c8ba}} of Solr as described in the section below. > * Build the films collection as described below. > * Start the server using the command {{./bin/solr start -f -p 8983 -s > /tmp/home}} > * Request the URL given in the bug description. > h1. Compiling the server > {noformat} > git clone https://github.com/apache/lucene-solr > cd lucene-solr > git checkout ea2c8ba > ant compile > cd solr > ant server > {noformat} > h1. Building the collection > We followed [Exercise > 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from > the [Solr > Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html]. The > attached file ({{home.zip}}) gives the contents of folder {{/tmp/home}} that > you will obtain by following the steps below: > {noformat} > mkdir -p /tmp/home > echo '' > > /tmp/home/solr.xml > {noformat} > In one terminal start a Solr instance in foreground: > {noformat} > ./bin/solr start -f -p 8983 -s /tmp/home > {noformat} > In another terminal, create a collection of movies, with no shards and no > replication, and initialize it: > {noformat} > bin/solr create -c films > curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": > {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' > http://localhost:8983/solr/films/schema > curl -X POST -H 'Content-type:application/json' --data-binary > '{"add-copy-field" : {"source":"*","dest":"_text_"}}' > http://localhost:8983/solr/films/schema > ./bin/post -c films example/films/films.json > {noformat} >Reporter: Johannes Kloos >Priority: Minor > Labels: diffblue, newdev > Attachments: home.zip > > > Requesting the following URL causes Solr to return an HTTP 500 error response: > {noformat} > http://localhost:8983/solr/films/select?fl=[child%20parentFilter=ge]&q=*:* > {noformat} > The error response seems to be caused by the following uncaught exception: > {noformat} > 
java.lang.NullPointerException > at > org.apache.solr.response.transform.ChildDocTransformer.transform(ChildDocTransformer.java:92) > at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:103) > at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:1) > at > org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:184) > at > org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:136) > at > org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386) > at > org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292) >
[jira] [Commented] (LUCENE-9093) Unified highlighter with word separator never gives context to the left
[ https://issues.apache.org/jira/browse/LUCENE-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006312#comment-17006312 ] ASF subversion and git services commented on LUCENE-9093: - Commit 4c9cc2cefd7f3593c4b4e1e5a087e3d206298989 in lucene-solr's branch refs/heads/master from Nándor Mátravölgyi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4c9cc2c ] LUCENE-9093: UnifiedHighlighter LengthGoalBreakIterator frag align Matches in passages should be centered better on average. Closes #1123 > Unified highlighter with word separator never gives context to the left > --- > > Key: LUCENE-9093 > URL: https://issues.apache.org/jira/browse/LUCENE-9093 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/highlighter >Reporter: Tim Retout >Priority: Major > Attachments: LUCENE-9093.patch > > Time Spent: 8h 10m > Remaining Estimate: 0h > > When using the unified highlighter with hl.bs.type=WORD, I am not able to get > context to the left of the matches returned; only words to the right of each > match are shown. I see this behaviour on both Solr 6.4 and Solr 7.1. > Without context to the left of a match, the highlighted snippets are much > less useful for understanding where the match appears in a document. > As an example, using the techproducts data with Solr 7.1, given a search for > "apple", highlighting the "features" field: > http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.bs.type=WORD&hl.fragsize=30&hl.method=unified > I see this snippet: > "Apple Lossless, H.264 video" > Note that "Apple" is anchored to the left. 
Compare with the original > highlighter: > http://localhost:8983/solr/techproducts/select?hl.fl=features&hl=on&q=apple&hl.fragsize=30 > And the match has context either side: > ", Audible, Apple Lossless, H.264 video" > (To complicate this, in general I am not sure that the unified highlighter is > respecting the hl.fragsize parameter, although [SOLR-9935] suggests support > was added. I included the hl.fragsize param in the unified URL too, but it's > making no difference unless set to 0.) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
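The difference the reporter describes (a passage anchored at the match versus a passage with context on both sides) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `PassageWindowSketch` and its methods are hypothetical, it works on raw character offsets and ignores word boundaries, and it is not the logic of Lucene's LengthGoalBreakIterator or of the committed fix.

```java
// Hypothetical illustration only; not Lucene's UnifiedHighlighter code.
class PassageWindowSketch {

    // Left-anchored window: the passage starts at the match and only grows
    // to the right, so no context before the match is ever shown.
    static String leftAnchored(String text, int matchStart, int fragSize) {
        int end = Math.min(text.length(), matchStart + fragSize);
        return text.substring(matchStart, end);
    }

    // Centered window: whatever budget is left after the match itself is
    // split so that roughly half of it goes to the left of the match.
    static String centered(String text, int matchStart, int matchEnd, int fragSize) {
        int matchLen = matchEnd - matchStart;
        int spare = Math.max(0, fragSize - matchLen);
        int start = Math.max(0, matchStart - spare / 2);
        int end = Math.min(text.length(), start + Math.max(fragSize, matchLen));
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        // Sample text adapted from the snippet quoted in the issue.
        String text = "Audible, Apple Lossless, H.264 video";
        int matchStart = text.indexOf("Apple");
        int matchEnd = matchStart + "Apple".length();

        System.out.println(leftAnchored(text, matchStart, 30));
        System.out.println(centered(text, matchStart, matchEnd, 30));
    }
}
```

A real break iterator would additionally snap both ends of the window to word or sentence boundaries, which this sketch deliberately leaves out.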
[GitHub] [lucene-solr] dsmiley closed pull request #1123: LUCENE-9093: Unified highlighter with word separator never gives context to the left
dsmiley closed pull request #1123: LUCENE-9093: Unified highlighter with word separator never gives context to the left URL: https://github.com/apache/lucene-solr/pull/1123
[jira] [Commented] (LUCENE-9093) Unified highlighter with word separator never gives context to the left
[ https://issues.apache.org/jira/browse/LUCENE-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006316#comment-17006316 ] ASF subversion and git services commented on LUCENE-9093: - Commit 5874b9c7933233712da14c5a5b9bb4f916eb77f8 in lucene-solr's branch refs/heads/branch_8x from Nándor Mátravölgyi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5874b9c ] LUCENE-9093: UnifiedHighlighter LengthGoalBreakIterator frag align Matches in passages should be centered better on average. Closes #1123 (cherry picked from commit 4c9cc2cefd7f3593c4b4e1e5a087e3d206298989)
[jira] [Updated] (LUCENE-9093) Unified highlighter with word separator never gives context to the left
[ https://issues.apache.org/jira/browse/LUCENE-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-9093: - Fix Version/s: 8.4 Assignee: David Smiley Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for contributing Nándor Mátravölgyi! The UnifiedHighlighter is even better now.
[jira] [Updated] (LUCENE-9093) Unified highlighter with word separator never gives context to the left
[ https://issues.apache.org/jira/browse/LUCENE-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-9093: - Fix Version/s: (was: 8.4) 8.5
[jira] [Commented] (SOLR-14154) Return correct isolation level when retrieving it from the SQL Connection
[ https://issues.apache.org/jira/browse/SOLR-14154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006330#comment-17006330 ]

Nick Vercammen commented on SOLR-14154:
---------------------------------------

Sure... setting up the development environment is more work than making the change ;) Do I make PRs for each version I want this to be backported to?

I'm trying to write a Solr driver for [Metabase|https://www.metabase.com/] and the JDBC route seemed the way to go for me. Unfortunately, Metabase creates a pooled connection which uses the getTransactionIsolation() method and crashes:

{noformat}
01-01 08:09:56 WARN resourcepool.BasicResourcePool :: com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@188861ed -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (30). Last acquisition attempt exception:
java.lang.UnsupportedOperationException
	at org.apache.solr.client.solrj.io.sql.ConnectionImpl.getTransactionIsolation(ConnectionImpl.java:189)
	at com.mchange.v2.c3p0.impl.NewPooledConnection.<init>(NewPooledConnection.java:120)
	at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:181)
	at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:147)
	at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:202)
	at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1176)
	at com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1163)
	at com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
	at com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1908)
	at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
{noformat}

> Return correct isolation level when retrieving it from the SQL Connection
> -------------------------------------------------------------------------
>
> Key: SOLR-14154
> URL: https://issues.apache.org/jira/browse/SOLR-14154
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Parallel SQL
> Affects Versions: 8.4
> Reporter: Nick Vercammen
> Priority: Minor
>
> When calling getTransactionIsolation() on the Sql.ConnectionImpl, an
> UnsupportedOperationException is thrown. It would be better to return
> TRANSACTION_NONE so clients can determine for themselves that it is not
> supported, without receiving an exception.
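The behaviour requested in SOLR-14154 can be illustrated from the client's side with a small Java sketch. Everything here is an assumption made for illustration: `IsolationLevelSketch`, `stubConnection` and `supportsTransactions` are hypothetical names, and the stub built with java.lang.reflect.Proxy merely stands in for a driver connection; it is not Solr's ConnectionImpl and not the patch attached to the issue. Only `java.sql.Connection.getTransactionIsolation()` and the constant `Connection.TRANSACTION_NONE` are real JDBC API.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical illustration only; not Solr's ConnectionImpl.
class IsolationLevelSketch {

    // Builds a stub Connection that answers only getTransactionIsolation().
    // With throwOnIsolation=true it mimics the behaviour reported in the issue;
    // with false it mimics the proposed behaviour (return TRANSACTION_NONE).
    static Connection stubConnection(boolean throwOnIsolation) {
        return (Connection) Proxy.newProxyInstance(
            IsolationLevelSketch.class.getClassLoader(),
            new Class<?>[] {Connection.class},
            (proxy, method, args) -> {
                if (method.getName().equals("getTransactionIsolation")) {
                    if (throwOnIsolation) {
                        throw new UnsupportedOperationException();
                    }
                    return Connection.TRANSACTION_NONE;
                }
                // Every other method is out of scope for this sketch.
                throw new UnsupportedOperationException(method.getName());
            });
    }

    // A client (for example a connection pool) can branch on the constant
    // instead of having to catch an exception.
    static boolean supportsTransactions(Connection connection) throws SQLException {
        return connection.getTransactionIsolation() != Connection.TRANSACTION_NONE;
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(supportsTransactions(stubConnection(false)));
        try {
            supportsTransactions(stubConnection(true));
        } catch (UnsupportedOperationException e) {
            System.out.println("isolation level query failed: " + e);
        }
    }
}
```

With the first stub, the caller learns that transactions are unsupported through an ordinary return value; with the second, the unchecked exception escapes into the caller, which is the situation the c3p0 stack trace above shows.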