[jira] [Commented] (LUCENE-9536) Optimize OrdinalMap when one segment contains all distinct values?
[ https://issues.apache.org/jira/browse/LUCENE-9536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229052#comment-17229052 ]

ASF subversion and git services commented on LUCENE-9536:
---------------------------------------------------------

Commit 849a28e539bb1a33074d1bda32685c3ad67fd374 in lucene-solr's branch refs/heads/master from Julie Tibshirani
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=849a28e ]

LUCENE-9536: Correct the OrdinalMap optimization. (#2070)

Previously we only checked that the first segment's ordinal deltas were all zero. This didn't account for some rare cases where the segment's ordinals matched the global ones, but it did not contain all terms. This can happen when using a FilteredTermsEnum, for example when merging a segment with deletions.

> Optimize OrdinalMap when one segment contains all distinct values?
> ------------------------------------------------------------------
>
>                 Key: LUCENE-9536
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9536
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Julie Tibshirani
>            Priority: Minor
>             Fix For: 8.8
>
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> For doc values that are not too high cardinality, it seems common to have
> some large segments that contain all distinct values (plus many small
> segments who are missing some values). In this case, we could check if the
> first segment ords map perfectly to global ords and if so store
> `globalOrdDeltas` and `firstSegments` as `LongValues.ZEROES`. This could save
> a small amount of space.
> I don’t think it would help a huge amount, especially since the optimization
> might only kick in with small/ medium cardinalities, which don’t create huge
> `OrdinalMap` instances anyways? But it is simple and seemed worth mentioning.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
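The case described in the commit message above can be illustrated with a small sketch. It is not Lucene's OrdinalMap code: the class and method names are hypothetical, and sorted String arrays stand in for the segment and global term dictionaries. It only shows why "all deltas are zero" is a weaker condition than "the first segment contains every term".

```java
import java.util.Arrays;

/** Simplified illustration only; this is not Lucene's OrdinalMap code. */
final class OrdinalMapIdentityCheck {

  /**
   * Returns true when every term of the first segment has the same ordinal
   * in the segment as in the global (merged, sorted) term list, i.e. all
   * "global ord minus segment ord" deltas are zero.
   */
  static boolean deltasAllZero(String[] firstSegmentTerms, String[] globalTerms) {
    for (int segmentOrd = 0; segmentOrd < firstSegmentTerms.length; segmentOrd++) {
      int globalOrd = Arrays.binarySearch(globalTerms, firstSegmentTerms[segmentOrd]);
      if (globalOrd != segmentOrd) {
        return false;
      }
    }
    return true;
  }

  /** Zero deltas are not enough: the segment must also hold every global term. */
  static boolean canUseIdentityMapping(String[] firstSegmentTerms, String[] globalTerms) {
    return firstSegmentTerms.length == globalTerms.length
        && deltasAllZero(firstSegmentTerms, globalTerms);
  }

  public static void main(String[] args) {
    String[] global = {"apple", "banana", "cherry"};
    String[] missingLastTerm = {"apple", "banana"};
    System.out.println(deltasAllZero(missingLastTerm, global));
    System.out.println(canUseIdentityMapping(missingLastTerm, global));
    System.out.println(canUseIdentityMapping(global, global));
  }
}
```

A segment that holds only a prefix of the global terms passes the delta check but fails the combined check, which appears to be the situation the commit message describes for segments seen through a FilteredTermsEnum.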
[GitHub] [lucene-solr] jpountz merged pull request #2070: LUCENE-9536: Correct the OrdinalMap optimization.
jpountz merged pull request #2070: URL: https://github.com/apache/lucene-solr/pull/2070 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] anshumg commented on a change in pull request #2010: SOLR-12182: Don't persist base_url in ZK as the scheme is variable, compute from node_name instead
anshumg commented on a change in pull request #2010: URL: https://github.com/apache/lucene-solr/pull/2010#discussion_r520351889 ## File path: solr/solrj/src/java/org/apache/solr/common/cloud/UrlScheme.java ## @@ -0,0 +1,260 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.solr.common.cloud; + +import java.io.IOException; +import java.lang.invoke.MethodHandles; +import java.net.URLEncoder; +import java.nio.charset.StandardCharsets; +import java.util.Map; +import java.util.Objects; +import java.util.SortedSet; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.ConcurrentMap; + +import org.apache.commons.lang.StringUtils; +import org.apache.solr.common.util.Utils; +import org.apache.zookeeper.KeeperException; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.solr.common.cloud.ZkStateReader.URL_SCHEME; + +/** + * Singleton access to global urlScheme, which although is stored in ZK as a cluster property + * really should be treated like a static global that is set at initialization and not altered after. 
+ * + * Client applications should not use this class directly; it is only included in SolrJ because Replica + * and ZkNodeProps depend on it. + */ +public enum UrlScheme implements LiveNodesListener, ClusterPropertiesListener { + INSTANCE; + + private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + public static final String HTTP = "http"; + public static final String HTTPS = "https"; + public static final String HTTPS_PORT_PROP = "solr.jetty.https.port"; + public static final String USE_LIVENODES_URL_SCHEME = "ext.useLiveNodesUrlScheme"; + + private volatile String urlScheme = HTTP; + private volatile boolean useLiveNodesUrlScheme = false; + private volatile SortedSet liveNodes = null; + private volatile SolrZkClient zkClient = null; + + private final ConcurrentMap nodeSchemeCache = new ConcurrentHashMap<>(); + + /** + * Called during ZkController initialization to set the urlScheme based on cluster properties. + * @param client The SolrZkClient needed to read cluster properties from ZK. + * @throws IOException If a connection or other I/O related error occurs while reading from ZK. 
+ */ + public void initFromClusterProps(final SolrZkClient client) throws IOException { +this.zkClient = client; + +// Have to go directly to the cluster props b/c this needs to happen before ZkStateReader does its thing +ClusterProperties clusterProps = new ClusterProperties(client); +this.useLiveNodesUrlScheme = + "true".equals(clusterProps.getClusterProperty(UrlScheme.USE_LIVENODES_URL_SCHEME, "false")); +setUrlSchemeFromClusterProps(clusterProps.getClusterProperties()); + } + + private void setUrlSchemeFromClusterProps(Map props) { +// Set the global urlScheme from cluster prop or if that is not set, look at the urlScheme sys prop +final String scheme = (String)props.get(ZkStateReader.URL_SCHEME); +if (StringUtils.isNotEmpty(scheme)) { + // track the urlScheme in a global so we can use it during ZK read / write operations for cluster state objects + this.urlScheme = HTTPS.equals(scheme) ? HTTPS : HTTP; +} else { + String urlSchemeFromSysProp = System.getProperty(URL_SCHEME, HTTP); + if (HTTPS.equals(urlSchemeFromSysProp)) { +log.warn("Cluster property 'urlScheme' not set but system property is set to 'https'. " + +"You should set the cluster property and restart all nodes for consistency."); Review comment: Should we instead mention that Solr doesn't support partial TLS enabled clusters? ## File path: solr/solrj/src/java/org/apache/solr/common/cloud/UrlScheme.java ## @@ -0,0 +1,260 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain
[jira] [Commented] (LUCENE-9536) Optimize OrdinalMap when one segment contains all distinct values?
[ https://issues.apache.org/jira/browse/LUCENE-9536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229057#comment-17229057 ]

ASF subversion and git services commented on LUCENE-9536:
---------------------------------------------------------

Commit 36ca8595a9fe09697e402de0864bbcdc52f9a8c4 in lucene-solr's branch refs/heads/branch_8x from Julie Tibshirani
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=36ca859 ]

LUCENE-9536: Correct the OrdinalMap optimization. (#2070)

Previously we only checked that the first segment's ordinal deltas were all zero. This didn't account for some rare cases where the segment's ordinals matched the global ones, but it did not contain all terms. This can happen when using a FilteredTermsEnum, for example when merging a segment with deletions.
[jira] [Commented] (LUCENE-9322) Discussing a unified vectors format API
[ https://issues.apache.org/jira/browse/LUCENE-9322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229058#comment-17229058 ] ASF subversion and git services commented on LUCENE-9322: - Commit 514c363f1d82b801234b16ef16804f08da86dc7a in lucene-solr's branch refs/heads/master from Adrien Grand [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=514c363 ] LUCENE-9322: Move Solr to Lucene90Codec. And drop configurability of Lucene87Codec since it shouldn't be used for writing anymore. > Discussing a unified vectors format API > --- > > Key: LUCENE-9322 > URL: https://issues.apache.org/jira/browse/LUCENE-9322 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Julie Tibshirani >Priority: Major > Fix For: master (9.0) > > Time Spent: 7h 20m > Remaining Estimate: 0h > > Two different approximate nearest neighbor approaches are currently being > developed, one based on HNSW (LUCENE-9004) and another based on coarse > quantization ([#LUCENE-9136]). Each prototype proposes to add a new format to > handle vectors. In LUCENE-9136 we discussed the possibility of a unified API > that could support both approaches. The two ANN strategies give different > trade-offs in terms of speed, memory, and complexity, and it’s likely that > we’ll want to support both. Vector search is also an active research area, > and it would be great to be able to prototype and incorporate new approaches > without introducing more formats. > To me it seems like a good time to begin discussing a unified API. The > prototype for coarse quantization > ([https://github.com/apache/lucene-solr/pull/1314]) could be ready to commit > soon (this depends on everyone's feedback of course). The approach is simple > and shows solid search performance, as seen > [here|https://github.com/apache/lucene-solr/pull/1314#issuecomment-608645326]. > I think this API discussion is an important step in moving that > implementation forward. 
> The goals of the API would be
> # Support for storing and retrieving individual float vectors.
> # Support for approximate nearest neighbor search -- given a query vector,
> return the indexed vectors that are closest to it.
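The two goals listed above can be read as a very small contract. The sketch below is hypothetical and is not the API under discussion in LUCENE-9322: the class name, method names, and in-memory storage are all assumptions, and the search is an exact brute-force scan rather than an approximate method such as HNSW or coarse quantization. It only shows the shape of "store and retrieve a vector" plus "return the closest vectors to a query".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical in-memory stand-in; not a Lucene format or API. */
final class InMemoryVectors {
  private final List<float[]> vectors = new ArrayList<>();

  /** Stores a copy of the vector and returns its id (insertion order). */
  int add(float[] vector) {
    vectors.add(vector.clone());
    return vectors.size() - 1;
  }

  /** Retrieves a copy of the vector stored under the given id. */
  float[] get(int id) {
    return vectors.get(id).clone();
  }

  /** Exact search: ids of the k stored vectors closest to the query. */
  int[] nearest(float[] query, int k) {
    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < vectors.size(); i++) {
      ids.add(i);
    }
    ids.sort(Comparator.comparingDouble(id -> squaredDistance(query, vectors.get(id))));
    int n = Math.min(k, ids.size());
    int[] result = new int[n];
    for (int i = 0; i < n; i++) {
      result[i] = ids.get(i);
    }
    return result;
  }

  /** Squared Euclidean distance; assumes both vectors have the same length. */
  private static double squaredDistance(float[] a, float[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }

  public static void main(String[] args) {
    InMemoryVectors store = new InMemoryVectors();
    store.add(new float[] {0f, 0f});
    store.add(new float[] {1f, 1f});
    store.add(new float[] {5f, 5f});
    for (int id : store.nearest(new float[] {0.9f, 0.9f}, 2)) {
      System.out.println(id);
    }
  }
}
```

An approximate implementation would keep the same two operations but trade exactness of `nearest` for speed and memory, which is the trade-off the issue description attributes to the two ANN strategies.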
[jira] [Commented] (SOLR-14749) Provide a clean API for cluster-level event processing
[ https://issues.apache.org/jira/browse/SOLR-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229059#comment-17229059 ]

ASF subversion and git services commented on SOLR-14749:
--------------------------------------------------------

Commit bac43093265c56996d35f2d5d9f93c4323a7b7e5 in lucene-solr's branch refs/heads/master from Adrien Grand
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=bac4309 ]

SOLR-14749: Use h2 instead of h3 so that the javadoc tool doesn't complain about out-or-sequence headers.

> Provide a clean API for cluster-level event processing
> ------------------------------------------------------
>
>                 Key: SOLR-14749
>                 URL: https://issues.apache.org/jira/browse/SOLR-14749
>             Project: Solr
>          Issue Type: Improvement
>          Components: AutoScaling
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>              Labels: clean-api
>             Fix For: master (9.0)
>
>          Time Spent: 22h
>  Remaining Estimate: 0h
>
> This is a companion issue to SOLR-14613 and it aims at providing a clean,
> strongly typed API for the functionality formerly known as "triggers" - that
> is, a component for generating cluster-level events corresponding to changes
> in the cluster state, and a pluggable API for processing these events.
> The 8x triggers have been removed so this functionality is currently missing
> in 9.0. However, this functionality is crucial for implementing the automatic
> collection repair and re-balancing as the cluster state changes (nodes going
> down / up, becoming overloaded / unused / decommissioned, etc).
> For this reason we need this API and a default implementation of triggers
> that at least can perform automatic collection repair (maintaining the
> desired replication factor in presence of live node changes).
> As before, the actual changes to the collections will be executed using
> existing CollectionAdmin API, which in turn may use the placement plugins
> from SOLR-14613.
> h3. Division of responsibility
> * built-in Solr components (non-pluggable):
> ** cluster state monitoring and event generation,
> ** simple scheduler to periodically generate scheduled events
> * plugins:
> ** automatic collection repair on {{nodeLost}} events (provided by default)
> ** re-balancing of replicas (periodic or on {{nodeAdded}} events)
> ** reporting (eg. requesting additional node provisioning)
> ** scheduled maintenance (eg. removing inactive shards after split)
> h3. Other considerations
> These plugins (unlike the placement plugins) need to execute on one
> designated node in the cluster. Currently the easiest way to implement this
> is to run them on the Overseer leader node.
[GitHub] [lucene-solr] dweiss commented on a change in pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
dweiss commented on a change in pull request #2068: URL: https://github.com/apache/lucene-solr/pull/2068#discussion_r520386010 ## File path: lucene/packaging/build.gradle ## @@ -32,7 +32,9 @@ def includeInBinaries = project(":lucene").subprojects.findAll {subproject -> ":lucene:packaging", ":lucene:documentation", // Exclude parent container project of analysis modules (no artifacts). -":lucene:analysis" +":lucene:analysis", +// Exclude native module, which requires manual copying and enabling +":lucene:native" Review comment: This reference needs an updated path - it'll probably fail when building the whole project. ## File path: lucene/misc/src/java/org/apache/lucene/store/NativeUnixDirectory.java ## @@ -47,10 +47,10 @@ * * To use this you must compile * NativePosixUtil.cpp (exposes Linux-specific APIs through - * JNI) for your platform, by running ant - * build-native-unix, and then putting the resulting - * libNativePosixUtil.so (from - * lucene/build/native) onto your dynamic + * JNI) for your platform, by running + * ./gradlew build -Pbuild.native=true, and then putting the resulting Review comment: I removed the flag and forgot to update the docs here. ## File path: lucene/misc/native/build.gradle ## @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/* + * This gets separated out from misc module into a native module due to incompatibility between cpp-library and java-library plugins. + * For details, please see https://github.com/gradle/gradle-native/issues/352#issuecomment-461724948 + */ +import org.apache.tools.ant.taskdefs.condition.Os + +description = 'Module for native code' + +apply plugin: 'cpp-library' + +library { + baseName = 'NativePosixUtil' Review comment: Sure. Maybe LuceneNativeIO though so that it's clear what dll it is? ## File path: lucene/misc/native/build.gradle ## @@ -0,0 +1,69 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/* + * This gets separated out from misc module into a native module due to incompatibility between cpp-library and java-library plugins. 
+ * For details, please see https://github.com/gradle/gradle-native/issues/352#issuecomment-461724948 + */ +import org.apache.tools.ant.taskdefs.condition.Os + +description = 'Module for native code' + +apply plugin: 'cpp-library' + +library { + baseName = 'NativePosixUtil' + + // Native build for Windows platform will be added in later stage + targetMachines = [ + machines.linux.x86_64, + machines.macOS.x86_64, + machines.windows.x86_64 + ] + + // Point at platform-specific sources. Other platforms will be ignored + // (plugin won't find the toolchain). + if (Os.isFamily(Os.FAMILY_WINDOWS)) { +source.from file("${projectDir}/src/main/windows") + } else if (Os.isFamily(Os.FAMILY_UNIX) || Os.isFamily(Os.FAMILY_MAC)) { +source.from file("${projectDir}/src/main/posix") + } +} + +tasks.withType(CppCompile).configureEach { + def javaHome = rootProject.ext.runtimeJava.getInstallationDirectory().getAsFile().getPath() + + // Assume standard openjdk layout. This means only one architecture-specific include folder + // is present. + systemIncludes.from file("${javaHome}/include") + + for (def path : [ + file("${javaHome}/include/win32"), + file("${javaHome}/include/darwin"), + fi
[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229071#comment-17229071 ]

Shawn Heisey commented on SOLR-14788:
-------------------------------------

Mark, I have something of extremely narrow focus that I would like to bring up. I mentioned it briefly on SOLR-7191 ... now that we are seeing real work towards a major overhaul, I think it's worth mentioning again. I don't know if anyone is going to agree with me. :)

One of the problems noted with the overseer is that it is very easy for its work queue to accrue so many items that the zookeeper max packet length (default about one megabyte) is exceeded.

Increasing the folder depth of messages in the overseer work queue by one level will allow the default ZK packet size to work successfully with over one billion items in the queue. It would be accomplished by creating folders inside the queue and limiting the number of items in each folder to 32768. 32768 folders that each hold 32768 messages is over one billion total messages, and reading it should fit inside the default ZK packet length.

Naturally such a new method of overseer operation should require a config option to enable, or there would be backward-compatibility issues with existing releases. Or maybe there's a way to detect the Solr version of all nodes in the cloud and automatically enable it if they are all new enough.

It is arguable that situations which accumulate so many overseer work items are themselves something that should be prevented. Maybe your rewrite of the overseer takes care of that ... I haven't looked. If a work queue is still part of the new overseer design, then I think it should be modified as I described above. Or we could hope that ZOOKEEPER-1162 is taken seriously and fixed.

I wish I had a clue about the code that makes SolrCloud go. I don't know where to begin looking.

> Solr: The Next Big Thing > > > Key: SOLR-14788 > URL: https://issues.apache.org/jira/browse/SOLR-14788 > Project: Solr > Issue Type: Task >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Critical > > h3. > [!https://www.unicode.org/consortium/aacimg/1F46E.png!|https://www.unicode.org/consortium/adopted-characters.html#b1F46E]{color:#00875a}*The > Policeman is on duty!*{color} > {quote}_{color:#de350b}*When The Policeman is on duty, sit back, relax, and > have some fun. Try to make some progress. Don't stress too much about the > impact of your changes or maintaining stability and performance and > correctness so much. Until the end of phase 1, I've got your back. I have a > variety of tools and contraptions I have been building over the years and I > will continue training them on this branch. I will review your changes and > peer out across the land and course correct where needed. As Mike D will be > thinking, "Sounds like a bottleneck Mark." And indeed it will be to some > extent. Which is why once stage one is completed, I will flip The Policeman > to off duty. When off duty, I'm always* {color:#de350b}*occasionally*{color} > *down for some vigilante justice, but I won't be walking the beat, all that > stuff about sit back and relax goes out the window.*{color}_ > {quote} > > I have stolen this title from Ishan or Noble and Ishan. > This issue is meant to capture the work of a small team that is forming to > push Solr and SolrCloud to the next phase. > I have kicked off the work with an effort to create a very fast and solid > base. That work is not 100% done, but it's ready to join the fight. > Tim Potter has started giving me a tremendous hand in finishing up. Ishan and > Noble have already contributed support and testing and have plans for > additional work to shore up some of our current shortcomings. > Others have expressed an interest in helping and hopefully they will pop up > here as well. 
> Let's organize and discuss our efforts here and in various sub issues.
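The two-level queue layout proposed in Shawn Heisey's comment on SOLR-14788 comes down to simple arithmetic on a sequence number. The sketch below is only an illustration under assumptions: the folder and node naming ("f-", "qn-") and the class itself are hypothetical, and nothing here talks to ZooKeeper. The figure of 32768 items per folder is taken from the comment.

```java
import java.util.Locale;

/** Hypothetical path layout for a two-level work queue; not Solr code. */
final class TwoLevelQueuePath {
  static final int ITEMS_PER_FOLDER = 32768; // figure from the comment
  static final int MAX_FOLDERS = 32768;      // figure from the comment

  /** Total number of items addressable with one extra folder level. */
  static long capacity() {
    return (long) ITEMS_PER_FOLDER * MAX_FOLDERS;
  }

  /** Maps a queue sequence number to a folder and an item slot inside it. */
  static String pathFor(long sequence) {
    if (sequence < 0 || sequence >= capacity()) {
      throw new IllegalArgumentException("sequence out of range: " + sequence);
    }
    long folder = sequence / ITEMS_PER_FOLDER;
    long item = sequence % ITEMS_PER_FOLDER;
    return String.format(Locale.ROOT, "/overseer/queue/f-%05d/qn-%05d", folder, item);
  }

  public static void main(String[] args) {
    System.out.println(capacity());
    System.out.println(pathFor(0));
    System.out.println(pathFor(32768));
    System.out.println(pathFor(capacity() - 1));
  }
}
```

With this layout no single folder lists more than 32768 children, while the total capacity is 32768 * 32768 = 1,073,741,824 items, which matches the "over one billion" figure in the comment.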
[GitHub] [lucene-solr] jpountz commented on a change in pull request #2051: LUCENE-9594 Add linear function for FeatureField
jpountz commented on a change in pull request #2051:
URL: https://github.com/apache/lucene-solr/pull/2051#discussion_r520408605

## File path: lucene/core/src/java/org/apache/lucene/document/FeatureField.java ##
@@ -66,8 +66,11 @@
  * 2^-8 = 0.00390625.
  *
  * Given a scoring factor {@code S > 0} and its weight {@code w > 0}, there
- * are three ways that S can be turned into a score:
+ * are four ways that S can be turned into a score:
  *
+ * {@link #newLinearQuery w * S}. This is the simplest function
+ * where no transformation is applied on the feature value, and
+ * the feature value itself multiplied by weight defines the score.

Review comment:
"the simplest" might suggest that this would be a good one to start with, when in fact I think this is the most expert function in my opinion given that it expects the feature to already be encoded in the index in a way that makes sense for scoring. Maybe javadocs should better convey that this is expert functionality?
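A small numeric sketch may help show why the reviewer calls the linear function "expert". The code below does not use Lucene: the class and method names are hypothetical, and the saturation form w * S / (S + k) is added here for contrast only and is not part of the quoted diff. The linear form w * S passes the stored feature value straight through, so the score is only as well-behaved as the encoded value.

```java
/** Plain-arithmetic illustration; not the FeatureField implementation. */
final class FeatureScoreSketch {

  /** Linear form from the quoted javadoc: score = w * S. */
  static float linear(float weight, float featureValue) {
    return weight * featureValue;
  }

  /** Saturation form, added for contrast: score = w * S / (S + k). */
  static float saturation(float weight, float featureValue, float pivot) {
    return weight * featureValue / (featureValue + pivot);
  }

  public static void main(String[] args) {
    float weight = 2f;
    float pivot = 3f;
    for (float s : new float[] {3f, 30f, 300f}) {
      System.out.println(
          "S=" + s + " linear=" + linear(weight, s)
              + " saturation=" + saturation(weight, s, pivot));
    }
  }
}
```

The linear score grows without bound as S grows, while the saturation score stays below the weight, so a raw feature such as a page-view count would dominate other clauses under the linear form unless it was scaled before indexing.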
[jira] [Updated] (LUCENE-9581) Clarify discardCompoundToken behavior in the JapaneseTokenizer
[ https://issues.apache.org/jira/browse/LUCENE-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Ferenczi updated LUCENE-9581:
---------------------------------
    Attachment: LUCENE-9581.patch
        Status: Open  (was: Open)

Here's another patch that restores the default value for discardCompoundToken to true in master. I'll commit soon and will change back the default value when backporting to the 8x branch.

> Clarify discardCompoundToken behavior in the JapaneseTokenizer
> --------------------------------------------------------------
>
>                 Key: LUCENE-9581
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9581
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Jim Ferenczi
>            Priority: Minor
>         Attachments: LUCENE-9581.patch, LUCENE-9581.patch, LUCENE-9581.patch
>
>
> At first sight, the discardCompoundToken option added in LUCENE-9123 seems
> redundant with the NORMAL mode of the Japanese tokenizer. When set to true,
> the current behavior is to disable the decomposition for compounds, that's
> exactly what the NORMAL mode does.
> So I wonder if the right semantic of the option would be to keep only the
> decomposition of the compound or if it's really needed. If the goal is to
> make the output compatible with a graph token filter, the current workaround
> to set the mode to NORMAL should be enough.
> That's consistent with the mode that should be used to preserve positions in
> the index since we don't handle position length on the indexing side.
> Am I missing something regarding the new option ? Is there a compelling case
> where it differs from the NORMAL mode ?
[jira] [Resolved] (SOLR-14683) Review the metrics API to ensure consistent placeholders for missing values
[ https://issues.apache.org/jira/browse/SOLR-14683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki resolved SOLR-14683.
-------------------------------------
    Resolution: Fixed

> Review the metrics API to ensure consistent placeholders for missing values
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-14683
>                 URL: https://issues.apache.org/jira/browse/SOLR-14683
>             Project: Solr
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: master (9.0)
>
>         Attachments: SOLR-14683.patch, SOLR-14683.patch
>
>
> Spin-off from SOLR-14657. Some gauges can legitimately be missing or in an
> unknown state at some points in time, eg. during SolrCore startup or shutdown.
> Currently the API returns placeholders with either impossible values for
> numeric gauges (such as index size -1) or empty maps / strings for other
> non-numeric gauges.
> [~hossman] noticed that the values for these placeholders may be misleading,
> depending on how the user treats them - if the client has no special logic to
> treat them as "missing values" it may erroneously treat them as valid data.
> E.g. numeric values of -1 or 0 may severely skew averages and produce
> misleading peaks / valleys in metrics histories.
> On the other hand returning a literal {{null}} value instead of the expected
> number may also cause unexpected client issues - although in this case it's
> clearer that there's actually no data available, so long-term this may be a
> better strategy than returning impossible values, even if it means that the
> client should learn to handle {{null}} values appropriately.
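The trade-off described in the SOLR-14683 issue text can be shown with a few lines of client-side arithmetic. This is a sketch under assumptions, not Solr code: the class and method names are hypothetical, and the samples stand for successive readings of a numeric gauge such as index size, with one reading taken while the core was starting up.

```java
import java.util.Arrays;
import java.util.Objects;

/** Hypothetical metrics client; shows how a placeholder affects an average. */
final class MissingGaugeAverages {

  /** Treats every sample as valid data, including a -1 placeholder. */
  static double naiveAverage(Long... samples) {
    return Arrays.stream(samples).mapToLong(Long::longValue).average().orElse(Double.NaN);
  }

  /** Skips null samples, i.e. treats null as "no data available". */
  static double nullAwareAverage(Long... samples) {
    return Arrays.stream(samples)
        .filter(Objects::nonNull)
        .mapToLong(Long::longValue)
        .average()
        .orElse(Double.NaN);
  }

  public static void main(String[] args) {
    // Placeholder -1 silently drags the average down.
    System.out.println(naiveAverage(100L, 100L, -1L, 100L));
    // A null forces the client to decide; skipping it keeps the average honest.
    System.out.println(nullAwareAverage(100L, 100L, null, 100L));
  }
}
```

Passing a null sample to `naiveAverage` would throw a NullPointerException, which is the kind of "unexpected client issue" the description mentions; the difference is that the failure is visible, whereas the -1 placeholder produces a plausible but wrong number.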
[jira] [Commented] (SOLR-14683) Review the metrics API to ensure consistent placeholders for missing values
[ https://issues.apache.org/jira/browse/SOLR-14683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229124#comment-17229124 ]

ASF subversion and git services commented on SOLR-14683:
--------------------------------------------------------

Commit 7ec17376bef9ca4dd932f57a966b4614ab949855 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7ec1737 ]

SOLR-14683: Metrics API should ensure consistent placeholders for missing values.
[jira] [Updated] (SOLR-14683) Review the metrics API to ensure consistent placeholders for missing values
[ https://issues.apache.org/jira/browse/SOLR-14683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki updated SOLR-14683:
------------------------------------
    Fix Version/s: master (9.0)
[GitHub] [lucene-solr] sigram commented on a change in pull request #2065: SOLR-14977 : ContainerPlugins should be configurable
sigram commented on a change in pull request #2065: URL: https://github.com/apache/lucene-solr/pull/2065#discussion_r520483853 ## File path: solr/core/src/java/org/apache/solr/api/ConfigurablePlugin.java ## @@ -0,0 +1,31 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.api; + +/**Implement this interface if your plugin needs to accept some configuration + * + * @param the configuration Object type + */ +public interface ConfigurablePlugin { + + /**This is invoked soon after the Object is initialized + * + * @param cfg value deserialized from JSON + */ + void initConfig(T cfg); Review comment: Maybe `configure(T cfg)`? the current name looks awkward. 
## File path: solr/core/src/java/org/apache/solr/api/AnnotatedApi.java ## @@ -222,7 +223,8 @@ public void call(SolrQueryRequest req, SolrQueryResponse rsp) { final String command; final MethodHandle method; final Object obj; -ObjectMapper mapper = SolrJacksonAnnotationInspector.createObjectMapper(); +ObjectMapper mapper = SolrJacksonAnnotationInspector.createObjectMapper() Review comment: `ObjectMapper` is a relatively heavy object, we should not create new instances in every class that needs it - maybe put a common instance somewhere in Utils? ## File path: solr/core/src/test/org/apache/solr/handler/TestContainerPlugin.java ## @@ -312,6 +340,45 @@ public void testApiFromPackage() throws Exception { } } + public static class CC1 extends CC { + + } + public static class CC2 extends CC1 { + + } + public static class CC implements ConfigurablePlugin { +private CConfig cfg; + + + +@Override +public void initConfig(CConfig cfg) { + this.cfg = cfg; + +} + +@EndPoint(method = GET, +path = "/hello/plugin", +permission = PermissionNameProvider.Name.READ_PERM) +public void m2(SolrQueryRequest req, SolrQueryResponse rsp) { + rsp.add("config", cfg); +} + + } + + public static class CConfig extends PluginMeta { Review comment: This example may be confusing because in general case configuration classes don't have to subclass `PluginMeta`. I propose removing this subclassing here to make it clear that's the case. ## File path: solr/core/src/java/org/apache/solr/api/ContainerPluginsRegistry.java ## @@ -360,6 +376,14 @@ public void init() throws Exception { } else { throw new RuntimeException("Must have a no-arg constructor or CoreContainer constructor "); } + if (instance instanceof ConfigurablePlugin) { +Class c = getConfigClass((ConfigurablePlugin) instance); +if(c != null) { Review comment: Whitespace. 
## File path: solr/core/src/java/org/apache/solr/api/ContainerPluginsRegistry.java ## @@ -114,6 +118,16 @@ public synchronized ApiInfo getPlugin(String name) { return currentPlugins.get(name); } + static class PluginMetaHolder { +private final Map original; +private final PluginMeta meta; Review comment: Does `PluginMeta` still need a separate `pathPrefix`? I think this could become a config property. ## File path: solr/core/src/java/org/apache/solr/api/ContainerPluginsRegistry.java ## @@ -349,7 +365,7 @@ public ApiInfo(PluginMeta info, List errs) { } } -@SuppressWarnings({"rawtypes"}) +@SuppressWarnings({"rawtypes","unchecked"}) Review comment: Whitespace. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
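For readers following the review, here is a rough sketch of the shape two of the comments above point toward: a `configure(T cfg)` method name and a configuration class that does not subclass `PluginMeta`. Everything here is an assumption for illustration: the interface is assumed to be generic in `T` (as the `initConfig(T cfg)` signature suggests), and `HelloConfig`, `HelloPlugin` and `PluginInit` are hypothetical names, not classes from the PR. The real registry deserializes the config from JSON; the sketch just takes an already-built object.

```java
// Hypothetical stand-in for the interface under review, using the suggested method name.
interface ConfigurablePlugin<T> {
    void configure(T cfg);
}

// A plain configuration class; note it does not extend PluginMeta.
final class HelloConfig {
    final String greeting;

    HelloConfig(String greeting) {
        this.greeting = greeting;
    }
}

// A plugin that accepts its configuration after construction.
final class HelloPlugin implements ConfigurablePlugin<HelloConfig> {
    private HelloConfig cfg;

    @Override
    public void configure(HelloConfig cfg) {
        this.cfg = cfg;
    }

    String hello() {
        return cfg == null ? "unconfigured" : cfg.greeting;
    }
}

final class PluginInit {
    // Mirrors the `instanceof ConfigurablePlugin` check in the diff:
    // only plugins that opt in receive a configuration object.
    @SuppressWarnings("unchecked")
    static <T> void configureIfSupported(Object instance, T cfg) {
        if (instance instanceof ConfigurablePlugin) {
            ((ConfigurablePlugin<T>) instance).configure(cfg);
        }
    }
}
```

The unchecked cast is the price of the runtime `instanceof` check, which may be why the diff adds `"unchecked"` to the `@SuppressWarnings` annotation.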
[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
alessandrobenedetti commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r520509933 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/response/transform/LTRFeatureLoggerTransformerFactory.java ## @@ -210,50 +216,59 @@ public void setContext(ResultContext context) { } // Setup LTRScoringQuery - scoringQuery = SolrQueryRequestContextUtils.getScoringQuery(req); - docsWereNotReranked = (scoringQuery == null); - String featureStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); - if (docsWereNotReranked || (featureStoreName != null && (!featureStoreName.equals(scoringQuery.getScoringModel().getFeatureStoreName() { -// if store is set in the transformer we should overwrite the logger - -final ManagedFeatureStore fr = ManagedFeatureStore.getManagedFeatureStore(req.getCore()); - -final FeatureStore store = fr.getFeatureStore(featureStoreName); -featureStoreName = store.getName(); // if featureStoreName was null before this gets actual name - -try { - final LoggingModel lm = new LoggingModel(loggingModelName, - featureStoreName, store.getFeatures()); - - scoringQuery = new LTRScoringQuery(lm, - LTRQParserPlugin.extractEFIParams(localparams), - true, - threadManager); // request feature weights to be created for all features - -}catch (final Exception e) { - throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, - "retrieving the feature store "+featureStoreName, e); -} - } + rerankingQueries = SolrQueryRequestContextUtils.getScoringQueries(req); - if (scoringQuery.getOriginalQuery() == null) { -scoringQuery.setOriginalQuery(context.getQuery()); + docsWereNotReranked = (rerankingQueries == null || rerankingQueries.length == 0); + if (docsWereNotReranked) { +rerankingQueries = new LTRScoringQuery[]{null}; } - if (scoringQuery.getFeatureLogger() == null){ -scoringQuery.setFeatureLogger( SolrQueryRequestContextUtils.getFeatureLogger(req) ); - } - scoringQuery.setRequest(req); - - featureLogger = 
scoringQuery.getFeatureLogger(); + modelWeights = new LTRScoringQuery.ModelWeight[rerankingQueries.length]; + String featureStoreName = SolrQueryRequestContextUtils.getFvStoreName(req); + for (int i = 0; i < rerankingQueries.length; i++) { +LTRScoringQuery scoringQuery = rerankingQueries[i]; +if ((scoringQuery == null || !(scoringQuery instanceof OriginalRankingLTRScoringQuery)) && (docsWereNotReranked || (featureStoreName != null && !featureStoreName.equals(scoringQuery.getScoringModel().getFeatureStoreName() { Review comment: OK, I have done an extensive refactor of this bit following your scenario guidelines, and I believe the code is much more readable now. I added a few tests as well. Thank you very much for the insight; that part is now extremely clear. Before resolving this discussion: while working on this, another consideration came up. Currently, when we use the feature logger transformer, all features are extracted (even the ones not used by the reranking model, if any). Are we sure we want this behavior? If we logged only the features actually used by the model, we could reuse the cached feature vector for logging as well. I am wondering why I would want to log, for a document, all the features in a featureStore, potentially including features used only by other models. This could actually lead to confusion. If you agree, I will create a separate Jira for that and implement a solution soon, to avoid forgetting it and having to context switch later. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
alessandrobenedetti commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r520513604 ## File path: solr/solr-ref-guide/src/learning-to-rank.adoc ## @@ -247,6 +254,81 @@ The output XML will include feature values as a comma-separated list, resembling }} +=== Running a Rerank Query Interleaving Two Models + +To rerank the results of a query, interleaving two models (myModelA, myModelB) add the `rq` parameter to your search, passing two models in input, for example: + +[source,text] +http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModelA model=myModelB reRankDocs=100}&fl=id,score + +To obtain the model that interleaving picked for a search result, computed during reranking, add `[interleaving]` to the `fl` parameter, for example: + +[source,text] +http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModelA model=myModelB reRankDocs=100}&fl=id,score,[interleaving] + +The output XML will include the model picked for each search result, resembling the output shown here: + +[source,json] + +{ + "responseHeader":{ +"status":0, +"QTime":0, +"params":{ + "q":"test", + "fl":"id,score,[interleaving]", + "rq":"{!ltr model=myModelA model=myModelB reRankDocs=100}"}}, + "response":{"numFound":2,"start":0,"maxScore":1.0005897,"docs":[ + { +"id":"GB18030TEST", +"score":1.0005897, +"[interleaving]":"myModelB"}, + { +"id":"UTF8TEST", +"score":0.79656565, +"[interleaving]":"myModelA"}] + }} + + +=== Running a Rerank Query Interleaving a model with the original ranking +When approaching Search Quality Evaluation with interleaving it may be useful to compare a model with the original ranking. 
+To rerank the results of a query, interleaving a model with the original ranking, add the `rq` parameter to your search, with a model in input and activating the original ranking interleaving, for example: + + +[source,text] +http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModel model=_OriginalRanking_ reRankDocs=100}&fl=id,score Review comment: Just fixed that, it's not critical as the order is not important, but if it's more readable, let's go for it! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jimczi merged pull request #964: LUCENE-9023: GlobalOrdinalsWithScore should not compute occurrences when the provided min is 1
jimczi merged pull request #964: URL: https://github.com/apache/lucene-solr/pull/964 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
alessandrobenedetti commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r520515596 ## File path: solr/solr-ref-guide/src/learning-to-rank.adoc ## @@ -247,6 +254,81 @@ The output XML will include feature values as a comma-separated list, resembling }} +=== Running a Rerank Query Interleaving Two Models + +To rerank the results of a query, interleaving two models (myModelA, myModelB) add the `rq` parameter to your search, passing two models in input, for example: + +[source,text] +http://localhost:8983/solr/techproducts/query?q=test&rq={!ltr model=myModelA model=myModelB reRankDocs=100}&fl=id,score + +To obtain the model that interleaving picked for a search result, computed during reranking, add `[interleaving]` to the `fl` parameter, for example: Review comment: 1 is what is currently implemented and it aligns with the TeamDraft Interleaving papers and evaluation methods. Your observation is interesting though, but to implement that we should invent a new type of Interleaving algorithm that will do that when interleaving the results and will evaluate the user clicks accordingly later on. Your observation on the features to log applies as well. So far no change is needed in this regard in my opinion. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9023) GlobalOrdinalsWithScore should not compute occurrences when the provided min is 1
[ https://issues.apache.org/jira/browse/LUCENE-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229165#comment-17229165 ] ASF subversion and git services commented on LUCENE-9023: - Commit 36f6359fe4b337ab37e97f63fb036a41b5c14b68 in lucene-solr's branch refs/heads/master from Jim Ferenczi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=36f6359 ] LUCENE-9023: GlobalOrdinalsWithScore should not compute occurrences when the provided min is 1 (#964) > GlobalOrdinalsWithScore should not compute occurrences when the provided min > is 1 > - > > Key: LUCENE-9023 > URL: https://issues.apache.org/jira/browse/LUCENE-9023 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Jim Ferenczi >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > This is a continuation of https://issues.apache.org/jira/browse/LUCENE-9022 > Today the GlobalOrdinalsWithScore collector and query checks the number of > matching docs per parent if the provided min is greater than 0. However we > should also not compute the occurrences of children when min is equals to 1 > since this is the minimum requirement for a document to match. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
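The reasoning in this issue can be written down as a small predicate. This is only a sketch under stated assumptions, not the Lucene implementation: the class and method names are hypothetical, and the handling of `max` is an assumption added for completeness, since the issue text only discusses `min`. A parent is collected only when at least one child matched, so a minimum of 0 or 1 adds no constraint and per-parent occurrence counts are unnecessary.

```java
// Hypothetical helper; not part of Lucene's join module.
final class OccurrenceTracking {

    // Assumed convention: Integer.MAX_VALUE means "no upper bound".
    static final int UNBOUNDED = Integer.MAX_VALUE;

    // Occurrence counts are needed only when min or max can actually reject a parent.
    static boolean needsOccurrenceCounts(int min, int max) {
        return min > 1 || max < UNBOUNDED;
    }
}
```

Under this sketch, a call with `min` of 0 or 1 and no upper bound can take the cheaper path that skips counting, while `min` of 2 or any finite `max` still requires the counts.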
[jira] [Commented] (LUCENE-9023) GlobalOrdinalsWithScore should not compute occurrences when the provided min is 1
[ https://issues.apache.org/jira/browse/LUCENE-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229173#comment-17229173 ] ASF subversion and git services commented on LUCENE-9023: - Commit 61068a4b9948adc2d3649d1dbfbe2538990024a6 in lucene-solr's branch refs/heads/branch_8x from Jim Ferenczi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=61068a4 ] LUCENE-9023: GlobalOrdinalsWithScore should not compute occurrences when the provided min is 1 (#964) > GlobalOrdinalsWithScore should not compute occurrences when the provided min > is 1 > - > > Key: LUCENE-9023 > URL: https://issues.apache.org/jira/browse/LUCENE-9023 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Jim Ferenczi >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > This is a continuation of https://issues.apache.org/jira/browse/LUCENE-9022 > Today the GlobalOrdinalsWithScore collector and query checks the number of > matching docs per parent if the provided min is greater than 0. However we > should also not compute the occurrences of children when min is equals to 1 > since this is the minimum requirement for a document to match. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mocobeta merged pull request #2064: LUCENE-9600: Clean up package name conflicts between misc and core modules
mocobeta merged pull request #2064: URL: https://github.com/apache/lucene-solr/pull/2064 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9600) Clean up package name conflicts for misc module
[ https://issues.apache.org/jira/browse/LUCENE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229213#comment-17229213 ] ASF subversion and git services commented on LUCENE-9600: - Commit d1110394e9c963c999b261b8ac6dac1df518628d in lucene-solr's branch refs/heads/master from Tomoko Uchida [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d111039 ] LUCENE-9600: Clean up package name conflicts between misc and core modules (#2064) > Clean up package name conflicts for misc module > --- > > Key: LUCENE-9600 > URL: https://issues.apache.org/jira/browse/LUCENE-9600 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Time Spent: 1h 50m > Remaining Estimate: 0h > > misc module shares the package names o.a.l.document, o.a.l.index, > o.a.l.search, o.a.l.store, and o.a.l.util with lucene-core. Those should be > moved under o.a.l.misc (or some classed should be moved to core?). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9600) Clean up package name conflicts for misc module
[ https://issues.apache.org/jira/browse/LUCENE-9600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomoko Uchida resolved LUCENE-9600. --- Fix Version/s: master (9.0) Resolution: Fixed > Clean up package name conflicts for misc module > --- > > Key: LUCENE-9600 > URL: https://issues.apache.org/jira/browse/LUCENE-9600 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/misc >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Minor > Fix For: master (9.0) > > Time Spent: 1h 50m > Remaining Estimate: 0h > > misc module shares the package names o.a.l.document, o.a.l.index, > o.a.l.search, o.a.l.store, and o.a.l.util with lucene-core. Those should be > moved under o.a.l.misc (or some classed should be moved to core?). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229218#comment-17229218 ] Tomoko Uchida commented on LUCENE-9499: --- I think we've resolved all split packages in Lucene (except for test-framework). Maybe we will be able to migrate package.html files into package-info.java? > Clean up package name conflicts between modules (split packages) > > > Key: LUCENE-9499 > URL: https://issues.apache.org/jira/browse/LUCENE-9499 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Major > Fix For: master (9.0) > > > We have lots of package name conflicts (shared package names) between modules > in the source tree. It is not only annoying for devs/users but also indeed > bad practice since Java 9 (according to my understanding), and we already > have some problems with Javadocs due to these split packages as some of us > would know. Also split packages make migrating to the Java 9 module system > impossible. > This is the placeholder to fix all package name conflicts in Lucene. > See the dev list thread for more background. > > [https://lists.apache.org/thread.html/r6496963e89a5e0615e53206429b6843cc5d3e923a2045cc7b7a1eb03%40%3Cdev.lucene.apache.org%3E] > Modules that need to be fixed / cleaned up: > - analyzers-common (LUCENE-9317) > - analyzers-icu (LUCENE-9558) > - backward-codecs (LUCENE-9318) > - sandbox (LUCENE-9319) > - misc (LUCENE-9600) > - (test-framework: this can be excluded for the moment) > Also lucene-core will be heavily affected (some classes have to be moved into > {{core}}, or the visibility of some classes and methods in {{core}} has to be > relaxed). > Probably most work would be done in a parallel manner, but conflicts can > happen. If someone wants to help out, please open an issue before working and > share your thoughts with me and others. 
> I set "Fix version" to 9.0 - means once we make a commit on here, this will > be a blocker for release 9.0.0. (I don't think the changes should be > delivered across two major releases; all changes have to be out at once in a > major release.) If there are any objections or concerns, please leave > comments. For now I have no idea about the total volume of changes or > technical obstacles that have to be handled. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] muse-dev[bot] commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
muse-dev[bot] commented on a change in pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r520578586 ## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/interleaving/algorithms/TeamDraftInterleaving.java ## @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.solr.ltr.interleaving.algorithms; + +import java.util.ArrayList; +import java.util.HashSet; +import java.util.LinkedHashSet; +import java.util.Random; +import java.util.Set; + +import org.apache.lucene.search.ScoreDoc; +import org.apache.solr.ltr.interleaving.Interleaving; +import org.apache.solr.ltr.interleaving.InterleavingResult; + +/** + * Interleaving was introduced the first time by Joachims in [1, 2]. + * Team Draft Interleaving is among the most successful and used interleaving approaches[3]. + * Team Draft Interleaving implements a method similar to the way in which captains select their players in team-matches. + * Team Draft Interleaving produces a fair distribution of ranking models’ elements in the final interleaved list. + * "Team draft interleaving" has also proved to overcome an issue of the "Balanced interleaving" approach, in determining the winning model[4]. 
+ * + * [1] T. Joachims. Optimizing search engines using clickthrough data. KDD (2002) + * [2] T. Joachims. Evaluating retrieval performance using clickthrough data. In J. Franke, G. Nakhaeizadeh, and I. Renz, editors, + * Text Mining, pages 79–96. Physica/Springer (2003) + * [3] F. Radlinski, M. Kurup, and T. Joachims. How does clickthrough data reflect retrieval quality? In CIKM, pages 43–52. ACM Press (2008) + * [4] O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue. + * Large-scale validation and analysis of interleaved search evaluation. ACM TOIS, 30(1):1–41, Feb. (2012) + */ +public class TeamDraftInterleaving implements Interleaving { + public static Random RANDOM; + + static { +// We try to make things reproducible in the context of our tests by initializing the random instance +// based on the current seed +String seed = System.getProperty("tests.seed"); +if (seed == null) { + RANDOM = new Random(); +} else { + RANDOM = new Random(seed.hashCode()); Review comment: *PREDICTABLE_RANDOM:* This random generator (java.util.Random) is predictable [(details)](https://find-sec-bugs.github.io/bugs.htm#PREDICTABLE_RANDOM) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
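As background for the Javadoc quoted above, the following is a minimal sketch of the general team-draft idea: the two rankings take turns like captains picking players, the team with fewer picks chooses next, and a coin flip breaks ties. It is not the code from this PR. The class `TeamDraftSketch`, the use of string document ids, and the "A"/"B" labels are all assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration of team-draft interleaving; not the PR's TeamDraftInterleaving.
final class TeamDraftSketch {

    // Returns the interleaved documents in order, each mapped to the model that picked it.
    static Map<String, String> interleave(List<String> rankA, List<String> rankB, Random random) {
        Map<String, String> picked = new LinkedHashMap<>();
        int teamA = 0;
        int teamB = 0;
        int ia = 0;
        int ib = 0;
        while (true) {
            // Advance each ranking past documents the other team has already taken.
            while (ia < rankA.size() && picked.containsKey(rankA.get(ia))) ia++;
            while (ib < rankB.size() && picked.containsKey(rankB.get(ib))) ib++;
            boolean aHasDocs = ia < rankA.size();
            boolean bHasDocs = ib < rankB.size();
            if (!aHasDocs && !bHasDocs) {
                break;
            }
            boolean aPicks;
            if (!bHasDocs) {
                aPicks = true;
            } else if (!aHasDocs) {
                aPicks = false;
            } else if (teamA != teamB) {
                aPicks = teamA < teamB;        // the smaller team picks next
            } else {
                aPicks = random.nextBoolean(); // coin flip when teams are level
            }
            if (aPicks) {
                picked.put(rankA.get(ia), "A");
                teamA++;
            } else {
                picked.put(rankB.get(ib), "B");
                teamB++;
            }
        }
        return picked;
    }
}
```

Passing a seeded `Random` makes a run reproducible, which is the same motivation the PR gives for seeding from `tests.seed`. The predictability flagged by the bot matters for security-sensitive randomness and is arguably not a concern for a coin flip like this one.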
[jira] [Commented] (LUCENE-9590) Add javadoc for Lucene86PointsFormat class
[ https://issues.apache.org/jira/browse/LUCENE-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229228#comment-17229228 ] Adrien Grand commented on LUCENE-9590: -- Wow you've been working hard on this! I wonder what is the best way to deal with all the images as we generally avoid checking in too much binary content. [~uschindler] Do you have an opinion on this? > Add javadoc for Lucene86PointsFormat class > --- > > Key: LUCENE-9590 > URL: https://issues.apache.org/jira/browse/LUCENE-9590 > Project: Lucene - Core > Issue Type: Wish > Components: core/codecs >Reporter: Lu Xugang >Priority: Minor > Attachments: 1.png > > > I would like to add javadoc for Lucene86PointsFormat class, it is really > helpful for source reader to understand the data structure with point value, > is anyone doing this or plan? > The attachment list part of the data structure (filled with color means it > has sub data structure) > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229227#comment-17229227 ] David Smiley commented on SOLR-14788: - Shawn, others (namely Ilan), are working on making the Overseer go away ( SOLR-14927 ), which I think is a far better investment of time instead of making the Overseer more scalable. > Solr: The Next Big Thing > > > Key: SOLR-14788 > URL: https://issues.apache.org/jira/browse/SOLR-14788 > Project: Solr > Issue Type: Task >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Critical > > h3. > [!https://www.unicode.org/consortium/aacimg/1F46E.png!|https://www.unicode.org/consortium/adopted-characters.html#b1F46E]{color:#00875a}*The > Policeman is on duty!*{color} > {quote}_{color:#de350b}*When The Policeman is on duty, sit back, relax, and > have some fun. Try to make some progress. Don't stress too much about the > impact of your changes or maintaining stability and performance and > correctness so much. Until the end of phase 1, I've got your back. I have a > variety of tools and contraptions I have been building over the years and I > will continue training them on this branch. I will review your changes and > peer out across the land and course correct where needed. As Mike D will be > thinking, "Sounds like a bottleneck Mark." And indeed it will be to some > extent. Which is why once stage one is completed, I will flip The Policeman > to off duty. When off duty, I'm always* {color:#de350b}*occasionally*{color} > *down for some vigilante justice, but I won't be walking the beat, all that > stuff about sit back and relax goes out the window.*{color}_ > {quote} > > I have stolen this title from Ishan or Noble and Ishan. > This issue is meant to capture the work of a small team that is forming to > push Solr and SolrCloud to the next phase. > I have kicked off the work with an effort to create a very fast and solid > base. 
That work is not 100% done, but it's ready to join the fight. > Tim Potter has started giving me a tremendous hand in finishing up. Ishan and > Noble have already contributed support and testing and have plans for > additional work to shore up some of our current shortcomings. > Others have expressed an interest in helping and hopefully they will pop up > here as well. > Let's organize and discuss our efforts here and in various sub issues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14993) Unable to download zookeeper files of 1byte in size
[ https://issues.apache.org/jira/browse/SOLR-14993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229230#comment-17229230 ] Erick Erickson commented on SOLR-14993: --- What's the use-case here? And do you have any idea why? Or steps to reproduce? Or even a test program? > Unable to download zookeeper files of 1byte in size > --- > > Key: SOLR-14993 > URL: https://issues.apache.org/jira/browse/SOLR-14993 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud, SolrJ >Affects Versions: 8.5.1 >Reporter: Allen Sooredoo >Priority: Minor > > When downloading a file from Zookeeper using the Solrj client, files of size > 1 byte are ignored. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9590) Add javadoc for Lucene86PointsFormat class
[ https://issues.apache.org/jira/browse/LUCENE-9590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229241#comment-17229241 ] Uwe Schindler commented on LUCENE-9590: --- A while back I added some formulas and images for the NumericRangeQuery. They are still alive in the Solr Javadocs: https://github.com/apache/lucene-solr/tree/master/solr/core/src/java/org/apache/solr/legacy/doc-files The trick is to use a subfolder "doc-files". The javadoc tool copies the files automatically. You reference them like this: https://github.com/apache/lucene-solr/blob/d1110394e9c963c999b261b8ac6dac1df518628d/solr/core/src/java/org/apache/solr/legacy/LegacyNumericRangeQuery.java#L128 About the size: use highly compressed and transparent (!!) PNG files. An alternative (much better) is SVG. Would this be an option? Browsers support SVG out of the box nowadays with HTML5. > Add javadoc for Lucene86PointsFormat class > --- > > Key: LUCENE-9590 > URL: https://issues.apache.org/jira/browse/LUCENE-9590 > Project: Lucene - Core > Issue Type: Wish > Components: core/codecs >Reporter: Lu Xugang >Priority: Minor > Attachments: 1.png > > > I would like to add javadoc for Lucene86PointsFormat class, it is really > helpful for source reader to understand the data structure with point value, > is anyone doing this or plan? > The attachment list part of the data structure (filled with color means it > has sub data structure) > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229246#comment-17229246 ] Mark Robert Miller edited comment on SOLR-14788 at 11/10/20, 2:21 PM: -- I’m not sold on that yet. Because the overseer is implemented too poorly for the system design, there is a bias that it should be removed, but a CAS approach is actually going to have a lot of trouble competing, for lots of reasons. Getting rid of the overseer perhaps solves a bad overseer impl. It doesn’t solve the problems the overseer was originally introduced to solve. was (Author: markrmiller): I’m not sold that yet. Because the overseer is implemented to poorly for the system design, there is a bias that it should be removed, but a CAS approach is actually going to have a lot of trouble competing for lots of reasons. Getting rid of the overseer solves a bad overseer impl perhaps. It doesn’t solve the problems the overseer was originally moved towards for. > Solr: The Next Big Thing > > > Key: SOLR-14788 > URL: https://issues.apache.org/jira/browse/SOLR-14788 > Project: Solr > Issue Type: Task >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Critical > > h3. > [!https://www.unicode.org/consortium/aacimg/1F46E.png!|https://www.unicode.org/consortium/adopted-characters.html#b1F46E]{color:#00875a}*The > Policeman is on duty!*{color} > {quote}_{color:#de350b}*When The Policeman is on duty, sit back, relax, and > have some fun. Try to make some progress. Don't stress too much about the > impact of your changes or maintaining stability and performance and > correctness so much. Until the end of phase 1, I've got your back. I have a > variety of tools and contraptions I have been building over the years and I > will continue training them on this branch. I will review your changes and > peer out across the land and course correct where needed. As Mike D will be > thinking, "Sounds like a bottleneck Mark." 
And indeed it will be to some > extent. Which is why once stage one is completed, I will flip The Policeman > to off duty. When off duty, I'm always* {color:#de350b}*occasionally*{color} > *down for some vigilante justice, but I won't be walking the beat, all that > stuff about sit back and relax goes out the window.*{color}_ > {quote} > > I have stolen this title from Ishan or Noble and Ishan. > This issue is meant to capture the work of a small team that is forming to > push Solr and SolrCloud to the next phase. > I have kicked off the work with an effort to create a very fast and solid > base. That work is not 100% done, but it's ready to join the fight. > Tim Potter has started giving me a tremendous hand in finishing up. Ishan and > Noble have already contributed support and testing and have plans for > additional work to shore up some of our current shortcomings. > Others have expressed an interest in helping and hopefully they will pop up > here as well. > Let's organize and discuss our efforts here and in various sub issues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14927) Remove Overseer
[ https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229252#comment-17229252 ] Mark Robert Miller commented on SOLR-14927: --- It’s the bad impl (due to tech debt and a variety of reasons), not the design. You are right that ZooKeeper already owns the state, and that is why our overseer is so silly. The solution is not to embrace ZK more; that’s actually the non-scalable solution. The Overseer actually has the advantage for state updates; the CAS approach with ZK as the state owner is the non-scalable approach. The disadvantage is actually what’s stated as the advantage: with ZooKeeper owning the state, state updates are not scalable. The approach will not compete well with an overseer approach in multiple areas, including cluster scalability. > Remove Overseer > --- > > Key: SOLR-14927 > URL: https://issues.apache.org/jira/browse/SOLR-14927 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Reporter: Ilan Ginzburg >Assignee: Ilan Ginzburg >Priority: Major > Labels: cluster, collection-api, overseer, solrcloud, zookeeper > > This Jira is intended to capture sub jiras on the path to remove the Overseer > component from SolrCloud and move to all nodes being able to do the work > currently done by Overseer. > See detailed description in [this > doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/]. 
> Copying (edited) from the above doc: > The motivations for removing Overseer include: > * Mono-threaded state change is slow and doesn’t scale, > * Communication between cluster nodes and the Overseer uses ZooKeeper as a > queueing mechanism, which is not a good idea, > * Nodes talking to Overseer (then Overseer talking to itself) via ZooKeeper > is inefficient and adds latency, > * Collection API scalability is poor, because not only does a single node > process commands for all Collections, but it also depends on the mono-threaded > state change queue consumption, > * The code supporting Overseer in SolrCloud is complex (election, queue > management, recovery etc). > The general idea is that there’s already a central point in the SolrCloud > cluster and it’s ZooKeeper. It might not be necessary to have a second > central point (Overseer) because nodes can interact directly with ZooKeeper > and synchronize more efficiently by optimistic locking using “conditional > updates” (a.k.a. compare and swap or CAS). >
[GitHub] [lucene-solr] mocobeta opened a new pull request #2072: LUCENE-9499: migrate package.html files into package-info.java
mocobeta opened a new pull request #2072: URL: https://github.com/apache/lucene-solr/pull/2072 We should be able to migrate old-style package.html files into package-info.java, since we have no split packages in lucene (except for test-framework). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
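For readers unfamiliar with the migration mentioned above, a package-info.java file might look roughly like the sketch below. The package name and description text are placeholders made up for illustration; the actual files changed in the pull request are not reproduced here.

```java
/**
 * Hypothetical package description, moved here from a former {@code package.html}.
 *
 * <p>The text that used to sit inside the body of package.html becomes this Javadoc
 * comment, placed directly above the package declaration in a file named
 * {@code package-info.java}. Ordinary HTML markup such as paragraphs and lists keeps
 * working.
 */
package org.example.search;
```

One such file per package is expected by the javadoc tool, which is presumably why the pull request notes that the migration depends on having no split packages.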
[jira] [Comment Edited] (SOLR-14927) Remove Overseer
[ https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229252#comment-17229252 ] Mark Robert Miller edited comment on SOLR-14927 at 11/10/20, 2:31 PM: -- It’s the bad impl that limits the overseer (due to tech debt and a variety of reasons), not the design. You are right that ZooKeeper already owns the state, and that is why our overseer is so silly. The solution is not to embrace ZK more; that’s actually the non-scalable solution and the root of a lot of our base instability. The Overseer actually has the advantage for state updates; the CAS approach with ZK as the state owner is the non-scalable approach. If you compare our impl, anything beats it; if you compare the design, CAS updates of state.json are a few steps back. The disadvantage of this approach is actually what’s stated as the advantage: with ZooKeeper owning the state, state updates and cluster information distribution are not scalable. This approach will not compete well with the overseer approach in multiple areas, including cluster scalability. I have a branch you can compare against when some code is ready. This approach will have a hard time making it. was (Author: markrmiller): It’s the bad impl that limits the overseer (due to tech debt and variety of reasons), not the design. You are right that the zookeeper already owns the state, and that is why our overseer is so silly. The solution is not to embrace zk more, that’s actually the non scalable solution. The Overseer actually has the advantage for state updates, the cas approach with zk as the state owner is actually the non scalable approach. The disadvantage is actually what’s stated as the advantage. Zookeeper owning the state, state updates are not scalable. The approach will not compete well with ab overseer approach in multiple areas, including cluster scalability. 
> Remove Overseer > --- > > Key: SOLR-14927 > URL: https://issues.apache.org/jira/browse/SOLR-14927 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Reporter: Ilan Ginzburg >Assignee: Ilan Ginzburg >Priority: Major > Labels: cluster, collection-api, overseer, solrcloud, zookeeper > > This Jira is intended to capture sub jiras on the path to remove the Overseer > component from SolrCloud and move to all nodes being able to do the work > currently done by Overseer. > See detailed description in [this > doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/]. > Copying (edited) from the above doc: > The motivations for removing Overseer include: > * Mono-threaded state change is slow and doesn’t scale, > * Communication between cluster nodes and the Overseer uses ZooKeeper as a > queueing mechanism, which is not a good idea, > * Nodes talking to Overseer (then Overseer talking to itself) via ZooKeeper > is inefficient and adds latency, > * Collection API scalability is poor, because not only does a single node > process commands for all Collections, but it also depends on the mono-threaded > state change queue consumption, > * The code supporting Overseer in SolrCloud is complex (election, queue > management, recovery etc). > The general idea is that there’s already a central point in the SolrCloud > cluster and it’s ZooKeeper. It might not be necessary to have a second > central point (Overseer) because nodes can interact directly with ZooKeeper > and synchronize more efficiently by optimistic locking using “conditional > updates” (a.k.a. compare and swap or CAS). >
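To make the "conditional updates" idea in the issue description concrete, here is a minimal sketch of optimistic locking with compare and swap. It uses an in-memory stand-in for a versioned node and deliberately does not use the ZooKeeper client API; the class and method names (VersionedNode, setIfVersion, update) are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** In-memory stand-in for a versioned state node; an assumption, not the ZooKeeper API. */
final class VersionedNode {
  /** Immutable snapshot of the data together with the version it was written at. */
  static final class Snapshot {
    final String data;
    final int version;

    Snapshot(String data, int version) {
      this.data = data;
      this.version = version;
    }
  }

  private final AtomicReference<Snapshot> current;

  VersionedNode(String initialData) {
    current = new AtomicReference<>(new Snapshot(initialData, 0));
  }

  Snapshot read() {
    return current.get();
  }

  /** Conditional update: succeeds only if nobody has written since expectedVersion was read. */
  boolean setIfVersion(String newData, int expectedVersion) {
    Snapshot seen = current.get();
    if (seen.version != expectedVersion) {
      return false; // stale version, the caller must re-read and retry
    }
    return current.compareAndSet(seen, new Snapshot(newData, expectedVersion + 1));
  }

  /** Optimistic-locking loop: read, compute, attempt the write, retry on conflict. */
  int update(UnaryOperator<String> change) {
    int attempts = 0;
    while (true) {
      attempts++;
      Snapshot seen = read();
      if (setIfVersion(change.apply(seen.data), seen.version)) {
        return attempts; // number of round trips this update needed
      }
    }
  }
}
```

In this sketch every failed attempt costs another read and another write attempt. With many writers contending for the same node, that retry cost is one plausible reading of the scalability concern raised in the comment above, while the issue description argues the opposite trade-off.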
[GitHub] [lucene-solr] mayya-sharipova commented on pull request #2051: LUCENE-9594 Add linear function for FeatureField
mayya-sharipova commented on pull request #2051: URL: https://github.com/apache/lucene-solr/pull/2051#issuecomment-724738715 @jpountz Thank you for the feedback, it makes sense. Addressed in 2dfa5511
[GitHub] [lucene-solr] jpountz opened a new pull request #2073: LUCENE-9602: Add backward-compatibility tests for indices created with BEST_COMPRESSION.
jpountz opened a new pull request #2073: URL: https://github.com/apache/lucene-solr/pull/2073
[jira] [Commented] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229255#comment-17229255 ] Tomoko Uchida commented on LUCENE-9499: --- [https://github.com/apache/lucene-solr/pull/2072] removes the package.html files and adds corresponding package-info.java files instead. > Clean up package name conflicts between modules (split packages) > > > Key: LUCENE-9499 > URL: https://issues.apache.org/jira/browse/LUCENE-9499 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Major > Fix For: master (9.0) > > Time Spent: 10m > Remaining Estimate: 0h > > We have lots of package name conflicts (shared package names) between modules > in the source tree. It is not only annoying for devs/users but also indeed > bad practice since Java 9 (according to my understanding), and we already > have some problems with Javadocs due to these split packages, as some of us > would know. Also split packages make migrating to the Java 9 module system > impossible. > This is the placeholder to fix all package name conflicts in Lucene. > See the dev list thread for more background. > > [https://lists.apache.org/thread.html/r6496963e89a5e0615e53206429b6843cc5d3e923a2045cc7b7a1eb03%40%3Cdev.lucene.apache.org%3E] > Modules that need to be fixed / cleaned up: > - analyzers-common (LUCENE-9317) > - analyzers-icu (LUCENE-9558) > - backward-codecs (LUCENE-9318) > - sandbox (LUCENE-9319) > - misc (LUCENE-9600) > - (test-framework: this can be excluded for the moment) > Also lucene-core will be heavily affected (some classes have to be moved into > {{core}}, or the visibility of some classes and methods in {{core}} has to be > relaxed). > Probably most work would be done in a parallel manner, but conflicts can > happen. If someone wants to help out, please open an issue before working and > share your thoughts with me and others. 
> I set "Fix version" to 9.0, which means that once we make a commit here, this will > be a blocker for release 9.0.0. (I don't think the changes should be > delivered across two major releases; all changes have to be out at once in a > major release.) If there are any objections or concerns, please leave > comments. For now I have no idea about the total volume of changes or > technical obstacles that have to be handled.
[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229257#comment-17229257 ] Mark Robert Miller commented on SOLR-14788: --- Yeah, I looked at that approach, I’m familiar with the idea. That you say it’s a far better investment of time tells me you are a bit more confident in that direction than you likely have information or reason to be. > Solr: The Next Big Thing > > > Key: SOLR-14788 > URL: https://issues.apache.org/jira/browse/SOLR-14788 > Project: Solr > Issue Type: Task >Reporter: Mark Robert Miller >Assignee: Mark Robert Miller >Priority: Critical > > h3. > [!https://www.unicode.org/consortium/aacimg/1F46E.png!|https://www.unicode.org/consortium/adopted-characters.html#b1F46E]{color:#00875a}*The > Policeman is on duty!*{color} > {quote}_{color:#de350b}*When The Policeman is on duty, sit back, relax, and > have some fun. Try to make some progress. Don't stress too much about the > impact of your changes or maintaining stability and performance and > correctness so much. Until the end of phase 1, I've got your back. I have a > variety of tools and contraptions I have been building over the years and I > will continue training them on this branch. I will review your changes and > peer out across the land and course correct where needed. As Mike D will be > thinking, "Sounds like a bottleneck Mark." And indeed it will be to some > extent. Which is why once stage one is completed, I will flip The Policeman > to off duty. When off duty, I'm always* {color:#de350b}*occasionally*{color} > *down for some vigilante justice, but I won't be walking the beat, all that > stuff about sit back and relax goes out the window.*{color}_ > {quote} > > I have stolen this title from Ishan or Noble and Ishan. > This issue is meant to capture the work of a small team that is forming to > push Solr and SolrCloud to the next phase. 
> I have kicked off the work with an effort to create a very fast and solid > base. That work is not 100% done, but it's ready to join the fight. > Tim Potter has started giving me a tremendous hand in finishing up. Ishan and > Noble have already contributed support and testing and have plans for > additional work to shore up some of our current shortcomings. > Others have expressed an interest in helping and hopefully they will pop up > here as well. > Let's organize and discuss our efforts here and in various sub issues. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14986) Restrict the properties possible to define with "property.name=value" when creating a collection
[ https://issues.apache.org/jira/browse/SOLR-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229266#comment-17229266 ] Gus Heck commented on SOLR-14986: - Yeah, it seems to me that any property specified in the create command that would conflict with the actual properties of the create command should just fail with a message about overlapping properties. > Restrict the properties possible to define with "property.name=value" when > creating a collection > > > Key: SOLR-14986 > URL: https://issues.apache.org/jira/browse/SOLR-14986 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Erick Erickson >Assignee: Erick Erickson >Priority: Major > > This came to light when I was looking at two user-list questions where people > try to manually define core.properties to define _replicas_ in SolrCloud. > There are two related issues: > 1> You can do things like "action=CREATE&name=eoe&property.collection=blivet" > which results in an opaque error about "could not create replica." I > propose we return a better error here like "property.collection should not be > specified when creating a collection". What do people think about the rest of > the auto-created properties on collection creation? > coreNodeName > collection.configName > name > numShards > shard > collection > replicaType > "name" seems to be OK to change, although I don't see any place where anyone can > actually see it afterwards > 2> Change the ref guide to steer people away from attempting to manually > create a core.properties file to define cores/replicas in SolrCloud. There's > no warning on the "defining-core-properties.adoc" for instance. Additionally > there should be some kind of message on the collections API documentation > about not trying to set the properties in <1> on the CREATE command. > <2> used to actually work (apparently) with legacyCloud... 
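The fail-fast behavior proposed in the comment and in item 1 of the description could be sketched as below. This is a standalone illustration, not Solr code: the class CreatePropertyCheck and its methods are hypothetical, and the reserved set simply copies the auto-created property names listed in the issue, leaving out "name" because the description says it seems OK to change.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical validator for CREATE request parameters; not part of Solr. */
final class CreatePropertyCheck {
  /** Auto-created property names from the issue; "name" is left out on purpose. */
  static final Set<String> RESERVED = Set.of(
      "coreNodeName", "collection.configName", "numShards", "shard", "collection", "replicaType");

  static final String PREFIX = "property.";

  private CreatePropertyCheck() {}

  /** Returns, in sorted order, the "property.<name>" parameters that hit a reserved name. */
  static List<String> conflicts(Map<String, String> requestParams) {
    List<String> found = new ArrayList<>();
    for (String key : requestParams.keySet()) {
      if (key.startsWith(PREFIX) && RESERVED.contains(key.substring(PREFIX.length()))) {
        found.add(key);
      }
    }
    Collections.sort(found);
    return found;
  }

  /** Fails with a descriptive message instead of an opaque downstream error. */
  static void validate(Map<String, String> requestParams) {
    List<String> bad = conflicts(requestParams);
    if (!bad.isEmpty()) {
      throw new IllegalArgumentException(
          bad + " should not be specified when creating a collection");
    }
  }
}
```

With the request from the description, "action=CREATE&name=eoe&property.collection=blivet", validate would reject the call up front and name property.collection in the message, rather than letting it surface later as "could not create replica".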
[GitHub] [lucene-solr] mocobeta merged pull request #2072: LUCENE-9499: migrate package.html files into package-info.java
mocobeta merged pull request #2072: URL: https://github.com/apache/lucene-solr/pull/2072
[jira] [Commented] (LUCENE-9499) Clean up package name conflicts between modules (split packages)
[ https://issues.apache.org/jira/browse/LUCENE-9499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229272#comment-17229272 ] ASF subversion and git services commented on LUCENE-9499: - Commit 426a9c25c241e174f64f522f87de2fe169a452ca in lucene-solr's branch refs/heads/master from Tomoko Uchida [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=426a9c2 ] LUCENE-9499: migrate package.html files into package-info.java (#2072) > Clean up package name conflicts between modules (split packages) > > > Key: LUCENE-9499 > URL: https://issues.apache.org/jira/browse/LUCENE-9499 > Project: Lucene - Core > Issue Type: Improvement >Affects Versions: master (9.0) >Reporter: Tomoko Uchida >Assignee: Tomoko Uchida >Priority: Major > Fix For: master (9.0) > > Time Spent: 20m > Remaining Estimate: 0h > > We have lots of package name conflicts (shared package names) between modules > in the source tree. It is not only annoying for devs/users but also indeed > bad practice since Java 9 (according to my understanding), and we already > have some problems with Javadocs due to these split packages, as some of us > would know. Also split packages make migrating to the Java 9 module system > impossible. > This is the placeholder to fix all package name conflicts in Lucene. > See the dev list thread for more background. > > [https://lists.apache.org/thread.html/r6496963e89a5e0615e53206429b6843cc5d3e923a2045cc7b7a1eb03%40%3Cdev.lucene.apache.org%3E] > Modules that need to be fixed / cleaned up: > - analyzers-common (LUCENE-9317) > - analyzers-icu (LUCENE-9558) > - backward-codecs (LUCENE-9318) > - sandbox (LUCENE-9319) > - misc (LUCENE-9600) > - (test-framework: this can be excluded for the moment) > Also lucene-core will be heavily affected (some classes have to be moved into > {{core}}, or the visibility of some classes and methods in {{core}} has to be > relaxed). 
> Probably most work would be done in a parallel manner, but conflicts can > happen. If someone wants to help out, please open an issue before working and > share your thoughts with me and others. > I set "Fix version" to 9.0, which means that once we make a commit here, this will > be a blocker for release 9.0.0. (I don't think the changes should be > delivered across two major releases; all changes have to be out at once in a > major release.) If there are any objections or concerns, please leave > comments. For now I have no idea about the total volume of changes or > technical obstacles that have to be handled.
[jira] [Resolved] (SOLR-11767) Please create SolrCloud Helm Chart or Controller for Kubernetes
[ https://issues.apache.org/jira/browse/SOLR-11767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman resolved SOLR-11767. --- Resolution: Done The [Solr Operator|https://github.com/bloomberg/solr-operator] is being adopted by the Apache Lucene project. > Please create SolrCloud Helm Chart or Controller for Kubernetes > --- > > Key: SOLR-11767 > URL: https://issues.apache.org/jira/browse/SOLR-11767 > Project: Solr > Issue Type: Improvement > Components: AutoScaling >Affects Versions: 7.1 > Environment: Azure AKS, On-Prem Kubernetes 1.8 >Reporter: Rodney Aaron Stainback >Priority: Major > Original Estimate: 168h > Remaining Estimate: 168h > > Please create a highly available auto-scaling Kubernetes Helm Chart or > Controller/Custom Resource for easy deployment of SolrCloud in Kubernetes in > any environment. Thanks.
[GitHub] [lucene-solr] HoustonPutman merged pull request #2020: SOLR-14949: Ability to customize Solr Docker build
HoustonPutman merged pull request #2020: URL: https://github.com/apache/lucene-solr/pull/2020 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14949) Ability to customize docker image name/base image
[ https://issues.apache.org/jira/browse/SOLR-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229313#comment-17229313 ] ASF subversion and git services commented on SOLR-14949: Commit 212b0f8657029b31979b12da9330ec0613c5b271 in lucene-solr's branch refs/heads/master from Houston Putman [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=212b0f8 ] SOLR-14949: Ability to customize Solr Docker build (#2020) Also added a gradlew helpDocker page. > Ability to customize docker image name/base image > - > > Key: SOLR-14949 > URL: https://issues.apache.org/jira/browse/SOLR-14949 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Docker >Affects Versions: master (9.0) >Reporter: Houston Putman >Assignee: Houston Putman >Priority: Major > Time Spent: 1h 40m > Remaining Estimate: 0h > > The current docker build will generate an image with the name > {{apache/solr:}}. If users want to build custom images and push them > to their own docker orgs, then this should be more customizable. > The following inputs should be customizable in the first pass at least: > * Docker Image Repo - default "apache/solr" > * Docker Image Tag - default to the project version > * Docker Image Name (This will set the entire thing, overriding the previous > two options) - Defaults to ":" > * Base Docker Image (This is the docker image that Solr Builds itself on top > of) - Defaults to "openjdk:11-jre-slim" > All will be optional.
[jira] [Resolved] (SOLR-14949) Ability to customize docker image name/base image
[ https://issues.apache.org/jira/browse/SOLR-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman resolved SOLR-14949. --- Fix Version/s: master (9.0) Resolution: Fixed > Ability to customize docker image name/base image > - > > Key: SOLR-14949 > URL: https://issues.apache.org/jira/browse/SOLR-14949 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Docker >Affects Versions: master (9.0) >Reporter: Houston Putman >Assignee: Houston Putman >Priority: Major > Fix For: master (9.0) > > Time Spent: 1h 40m > Remaining Estimate: 0h > > The current docker build will generate an image with the name > {{apache/solr:}}. If users want to build custom images and push them > to their own docker orgs, then this should be more customizable. > The following inputs should be customizable in the first pass at least: > * Docker Image Repo - default "apache/solr" > * Docker Image Tag - default to the project version > * Docker Image Name (This will set the entire thing, overriding the previous > two options) - Defaults to ":" > * Base Docker Image (This is the docker image that Solr Builds itself on top > of) - Defaults to "openjdk:11-jre-slim" > All will be optional.
[jira] [Updated] (SOLR-14949) Ability to customize docker image name/base image
[ https://issues.apache.org/jira/browse/SOLR-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Houston Putman updated SOLR-14949: -- Description: The current docker build will generate an image with the name {{apache/solr:}}. If users want to build custom images and push them to their own docker orgs, then this should be more customizable. The following inputs should be customizable in the first pass at least: * Docker Image Repo - default "apache/solr" * Docker Image Tag - default to the project version * Docker Image Name (This will set the entire thing, overriding the previous two options) - Defaults to ":" * Base Docker Image (This is the docker image that Solr Builds itself on top of) - Defaults to "openjdk:11-jre-slim" * Github URL ("github.com" or a mirror for github releases. This allows for building the solr docker image behind a firewall that does not have access to github.com) All will be optional. was: The current docker build will generate an image with the name {{apache/solr:}}. If users want to build custom images and push them to their own docker orgs, then this should be more customizable. The following inputs should be customizable in the first pass at least: * Docker Image Repo - default "apache/solr" * Docker Image Tag - default to the project version * Docker Image Name (This will set the entire thing, overriding the previous two options) - Defaults to ":" * Base Docker Image (This is the docker image that Solr Builds itself on top of) - Defaults to "openjdk:11-jre-slim" All will be optional. > Ability to customize docker image name/base image > - > > Key: SOLR-14949 > URL: https://issues.apache.org/jira/browse/SOLR-14949 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. 
Issues are Public) > Components: Docker >Affects Versions: master (9.0) >Reporter: Houston Putman >Assignee: Houston Putman >Priority: Major > Fix For: master (9.0) > > Time Spent: 1h 40m > Remaining Estimate: 0h > > The current docker build will generate an image with the name > {{apache/solr:}}. If users want to build custom images and push them > to their own docker orgs, then this should be more customizable. > The following inputs should be customizable in the first pass at least: > * Docker Image Repo - default "apache/solr" > * Docker Image Tag - default to the project version > * Docker Image Name (This will set the entire thing, overriding the previous > two options) - Defaults to ":" > * Base Docker Image (This is the docker image that Solr Builds itself on top > of) - Defaults to "openjdk:11-jre-slim" > * Github URL ("github.com" or a mirror for github releases. This allows for > building the solr docker image behind a firewall that does not have access to > github.com) > All will be optional.
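The precedence among the image naming inputs listed in the SOLR-14949 description can be sketched as follows. This is only an illustration, not the Gradle build logic from the commit: the class name, method name, and parameters are hypothetical, and it assumes that the default image name is the repo and tag joined by a colon (the placeholders in the quoted default ":" appear to have been lost in the archive).

```java
/**
 * Hypothetical helper, not part of the Solr build: resolves the effective
 * Docker image name from the optional inputs described in SOLR-14949.
 */
final class DockerImageNameSketch {
  // Default repo as stated in the issue description.
  static final String DEFAULT_REPO = "apache/solr";

  /**
   * An explicit image name overrides repo and tag. Otherwise the name is
   * assumed to be "repo:tag", with the tag defaulting to the project version.
   */
  static String resolve(String imageName, String imageRepo, String imageTag, String projectVersion) {
    if (imageName != null && !imageName.isEmpty()) {
      return imageName;
    }
    String repo = (imageRepo == null || imageRepo.isEmpty()) ? DEFAULT_REPO : imageRepo;
    String tag = (imageTag == null || imageTag.isEmpty()) ? projectVersion : imageTag;
    return repo + ":" + tag;
  }

  public static void main(String[] args) {
    // No overrides: default repo plus project version.
    System.out.println(resolve(null, null, null, "9.0.0-SNAPSHOT"));
    // Custom repo and tag.
    System.out.println(resolve(null, "myorg/solr", "test", "9.0.0-SNAPSHOT"));
    // A full image name wins over repo and tag.
    System.out.println(resolve("registry.local/custom:1", "myorg/solr", "test", "9.0.0-SNAPSHOT"));
  }
}
```

All inputs stay optional in this sketch, matching the "All will be optional" note in the description.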
[jira] [Commented] (SOLR-14927) Remove Overseer
[ https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229336#comment-17229336 ] Ilan Ginzburg commented on SOLR-14927: -- I see three types of (cluster state) updates done by Overseer and a different behavior for each by moving to CAS (Compare and Swap): # State change for a single collection done directly by the Collection API command execution # Replicas advertising themselves as having been created by nodes # Replica up/down updates for multiple collection when nodes go up/down. I believe CAS should be a relatively easy win for case 1. There is likely low contention on updates to the corresponding {{state.json}}. Case 2 (as [~thelabdude] [commented)|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/edit?disco=HGxplOw] might be a tricky one. There will no longer be batching of the updates as Overseer currently does. Each replica will try to do the update to ZK directly. We could then introduce batching by SolrCloud node for large collections. We might also split the collection Zookeeper state into a collection level state ({{state.json}}) and per shard sub states ({{state-__.json}} for example) so that replicas of different shards do not compete (nor watch...) state changes related to other shards. Case 3 has to be addressed in a slightly different way: make replica state (in its collection's {{state.json}} or its shard state if one eventually exists) not depend on (i.e. change with) its node state, but rather check node state and replica state to determine the "real" replica state: replica considered up if replica advertised as up and node up as well. > Remove Overseer > --- > > Key: SOLR-14927 > URL: https://issues.apache.org/jira/browse/SOLR-14927 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. 
Issues are Public) > Components: SolrCloud >Reporter: Ilan Ginzburg >Assignee: Ilan Ginzburg >Priority: Major > Labels: cluster, collection-api, overseer, solrcloud, zookeeper > > This Jira is intended to capture sub jiras on the path to remove the Overseer > component from SolrCloud and move to all nodes being able to do the work > currently done by Overseer. > See detailed description in [this > doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/]. > Copying (edited) from the above doc: > The motivation for removing Overseer include: > * Mono threaded state change is slow and doesn’t scale, > * Communication between cluster nodes and the Overseer use Zookeeper as a > queueing mechanism, this is not a good idea, > * Nodes talking to Overseer (then Overseer talking to itself) via Zookeeper > is inefficient and adds latency, > * Collection API scalability is poor, because not only a single node > processes commands for all Collections, but it also depends on the mono > threaded state change queue consumption, > * The code supporting Overseer in SolrCloud is complex (election, queue > management, recovery etc). > The general idea is that there’s already a central point in the SolrCloud > cluster and it’s Zookeeper. It might not be necessary to have a second > central point (Overseer) because nodes can interact directly with Zookeeper > and synchronize more efficiently by optimistic locking using “conditional > updates” (a.k.a compare and swap or CAS). >
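The "conditional update" idea in the comment above can be illustrated with a minimal sketch. It does not use ZooKeeper or any Solr code: StateNode is a hypothetical in-memory stand-in for a single state.json znode that carries a version, and update() is an assumed retry loop in which a writer reads the current version, computes the new content, and writes only if nobody else has written in the meantime.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of optimistic locking; not Solr or ZooKeeper code. */
final class CasStateSketch {

  /** Immutable snapshot: content plus version, analogous to a znode's data and version. */
  static final class Versioned {
    final String data;
    final int version;

    Versioned(String data, int version) {
      this.data = data;
      this.version = version;
    }
  }

  /** In-memory stand-in for one state.json node. */
  static final class StateNode {
    private final AtomicReference<Versioned> ref =
        new AtomicReference<>(new Versioned("", 0));

    Versioned read() {
      return ref.get();
    }

    /** Conditional update: succeeds only if expectedVersion is still the current version. */
    boolean compareAndSet(int expectedVersion, String newData) {
      Versioned current = ref.get();
      if (current.version != expectedVersion) {
        return false; // someone else updated the node first
      }
      return ref.compareAndSet(current, new Versioned(newData, expectedVersion + 1));
    }
  }

  /** Read, modify, conditionally write; retry on conflict. Returns the number of attempts. */
  static int update(StateNode node, UnaryOperator<String> change) {
    int attempts = 0;
    while (true) {
      attempts++;
      Versioned seen = node.read();
      if (node.compareAndSet(seen.version, change.apply(seen.data))) {
        return attempts;
      }
    }
  }

  public static void main(String[] args) {
    StateNode node = new StateNode();
    Versioned stale = node.read();
    update(node, s -> s + "replica1;");
    // A writer still holding the old version is rejected and has to re-read.
    System.out.println(node.compareAndSet(stale.version, "overwrite"));
    System.out.println(node.read().data + " v" + node.read().version);
  }
}
```

With low contention (case 1 in the comment) the loop almost always succeeds on the first attempt; under heavy contention (case 2) every conflicting writer retries, which is why batching or per-shard state files are suggested there.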
[GitHub] [lucene-solr] HoustonPutman opened a new pull request #2074: SOLR-14949: Adding githubUrl option for docker build.
HoustonPutman opened a new pull request #2074: URL: https://github.com/apache/lucene-solr/pull/2074 https://issues.apache.org/jira/browse/SOLR-14949 Missed one option for customizing the building of docker images.
[jira] [Comment Edited] (SOLR-14927) Remove Overseer
[ https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229336#comment-17229336 ] Ilan Ginzburg edited comment on SOLR-14927 at 11/10/20, 4:37 PM: - I see three types of (cluster state) updates done by Overseer and a different behavior for each by moving to CAS (Compare and Swap): # State change for a single collection done directly by the Collection API command execution # Replicas advertising themselves as having been created by nodes # Replica up/down state updates for multiple collections when nodes go up/down. I believe CAS should be a relatively easy win for case 1. There is likely low contention on updates to the corresponding {{state.json}}. Case 2 (as [~thelabdude] [commented)|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/edit?disco=HGxplOw] might be a tricky one. There will no longer be batching of the updates as Overseer currently does. Each replica will try to do the update to ZK directly. We could then introduce batching by SolrCloud node for large collections. We might also split the collection Zookeeper state into a collection level state ({{state.json}}) and per shard sub states ({{state-__.json}} for example) so that replicas of different shards do not compete with (nor watch...) state changes related to other shards. Case 3 has to be addressed in a slightly different way: make replica state (in its collection's {{state.json}} or its shard state if one eventually exists) not depend on (i.e. change with) its node state, but rather check node state and replica state independently to determine the "real" replica state: replica considered up if replica advertised as up and node up as well. 
was (Author: murblanc): I see three types of (cluster state) updates done by Overseer and a different behavior for each by moving to CAS (Compare and Swap): # State change for a single collection done directly by the Collection API command execution # Replicas advertising themselves as having been created by nodes # Replica up/down updates for multiple collection when nodes go up/down. I believe CAS should be a relatively easy win for case 1. There is likely low contention on updates to the corresponding {{state.json}}. Case 2 (as [~thelabdude] [commented)|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/edit?disco=HGxplOw] might be a tricky one. There will no longer be batching of the updates as Overseer currently does. Each replica will try to do the update to ZK directly. We could then introduce batching by SolrCloud node for large collections. We might also split the collection Zookeeper state into a collection level state ({{state.json}}) and per shard sub states ({{state-__.json}} for example) so that replicas of different shards do not compete (nor watch...) state changes related to other shards. Case 3 has to be addressed in a slightly different way: make replica state (in its collection's {{state.json}} or its shard state if one eventually exists) not depend on (i.e. change with) its node state, but rather check node state and replica state to determine the "real" replica state: replica considered up if replica advertised as up and node up as well. > Remove Overseer > --- > > Key: SOLR-14927 > URL: https://issues.apache.org/jira/browse/SOLR-14927 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. 
Issues are Public) > Components: SolrCloud >Reporter: Ilan Ginzburg >Assignee: Ilan Ginzburg >Priority: Major > Labels: cluster, collection-api, overseer, solrcloud, zookeeper > > This Jira is intended to capture sub jiras on the path to remove the Overseer > component from SolrCloud and move to all nodes being able to do the work > currently done by Overseer. > See detailed description in [this > doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/]. > Copying (edited) from the above doc: > The motivation for removing Overseer include: > * Mono threaded state change is slow and doesn’t scale, > * Communication between cluster nodes and the Overseer use Zookeeper as a > queueing mechanism, this is not a good idea, > * Nodes talking to Overseer (then Overseer talking to itself) via Zookeeper > is inefficient and adds latency, > * Collection API scalability is poor, because not only a single node > processes commands for all Collections, but it also depends on the mono > threaded state change queue consumption, > * The code supporting Overseer in SolrCloud is complex (election, queue > management, recovery etc). > The general idea is that there’s already a central point in the SolrCloud > cluster and it’s Zookeeper. It might n
[GitHub] [lucene-solr] gerlowskija commented on pull request #2056: SOLR-14971: Handle atomic-removes on uncommitted docs
gerlowskija commented on pull request #2056: URL: https://github.com/apache/lucene-solr/pull/2056#issuecomment-724828898 Hey @munendrasn new code should address your concerns: - handling for the string values that come through when requests are made using XML. - handling for the "add-distinct" class of operations. - additional tests to cover the two cases above. I ended up _not_ going with the approach you suggested above of using `toNativeType()` to convert the whole list of "original" values, primarily for the performance concern I mentioned above. I'm not in love with the code I've got now though - the type checks are pretty ugly. So maybe I'll flip on this after a bit more consideration.
[jira] [Commented] (SOLR-14683) Review the metrics API to ensure consistent placeholders for missing values
[ https://issues.apache.org/jira/browse/SOLR-14683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229359#comment-17229359 ] ASF subversion and git services commented on SOLR-14683: Commit 863a388fe71b209f142b7ed991294caf837a20bd in lucene-solr's branch refs/heads/master from Andrzej Bialecki [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=863a388 ] SOLR-14683: Move the CHANGES.txt entry to the right place. Fix wrong type of null value. > Review the metrics API to ensure consistent placeholders for missing values > --- > > Key: SOLR-14683 > URL: https://issues.apache.org/jira/browse/SOLR-14683 > Project: Solr > Issue Type: Improvement > Components: metrics >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Major > Fix For: master (9.0) > > Attachments: SOLR-14683.patch, SOLR-14683.patch > > > Spin-off from SOLR-14657. Some gauges can legitimately be missing or in an > unknown state at some points in time, eg. during SolrCore startup or shutdown. > Currently the API returns placeholders with either impossible values for > numeric gauges (such as index size -1) or empty maps / strings for other > non-numeric gauges. > [~hossman] noticed that the values for these placeholders may be misleading, > depending on how the user treats them - if the client has no special logic to > treat them as "missing values" it may erroneously treat them as valid data. > E.g. numeric values of -1 or 0 may severely skew averages and produce > misleading peaks / valleys in metrics histories. > On the other hand returning a literal {{null}} value instead of the expected > number may also cause unexpected client issues - although in this case it's > clearer that there's actually no data available, so long-term this may be a > better strategy than returning impossible values, even if it means that the > client should learn to handle {{null}} values appropriately. 
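The concern about placeholder values skewing client-side statistics can be shown with a small sketch. It does not use the Solr metrics API: the class and method names are hypothetical, and the sample values are made up for illustration. One series uses -1 as the "missing" placeholder, the other uses null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.OptionalDouble;

/** Hypothetical client-side illustration; not part of the Solr metrics API. */
final class MissingGaugeSketch {

  /** A client with no special handling treats every sample as valid data. */
  static OptionalDouble naiveAverage(List<Long> samples) {
    return samples.stream().mapToLong(Long::longValue).average();
  }

  /** A client that understands null as "no data" skips those samples. */
  static OptionalDouble averageIgnoringMissing(List<Long> samples) {
    return samples.stream().filter(Objects::nonNull).mapToLong(Long::longValue).average();
  }

  public static void main(String[] args) {
    // Illustrative index-size samples; the third one was taken while the core was starting up.
    List<Long> withSentinel = Arrays.asList(100L, 100L, -1L, 100L);
    List<Long> withNull = Arrays.asList(100L, 100L, null, 100L);

    // The -1 placeholder silently drags the average down.
    System.out.println(naiveAverage(withSentinel).getAsDouble()); // 74.75
    // The null sample is visibly missing and can be skipped.
    System.out.println(averageIgnoringMissing(withNull).getAsDouble()); // 100.0
  }
}
```

Note that passing the null-bearing series to naiveAverage would throw a NullPointerException, which is the "unexpected client issue" trade-off mentioned in the description: the failure is loud rather than silent.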
[GitHub] [lucene-solr] alessandrobenedetti commented on a change in pull request #1571: SOLR-14560: Interleaving for Learning To Rank
alessandrobenedetti commented on a change in pull request #1571:
URL: https://github.com/apache/lucene-solr/pull/1571#discussion_r520756153

## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/interleaving/TeamDraftInterleaving.java ##
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.solr.ltr.interleaving;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.LinkedHashSet;
+import java.util.Random;
+import java.util.Set;
+
+import org.apache.lucene.search.ScoreDoc;
+
+/**
+ * Interleaving was introduced the first time by Joachims in [1, 2].
+ * Team Draft Interleaving is among the most successful and used interleaving approaches[3].
+ * Here the authors implement a method similar to the way in which captains select their players in team-matches.
+ * Team Draft Interleaving produces a fair distribution of ranking models’ elements in the final interleaved list.
+ * It has also proved to overcome an issue of the previous implemented approach, Balanced interleaving, in determining the winning model[4].
+ *
+ * [1] T. Joachims. Optimizing search engines using clickthrough data. KDD (2002)
+ * [2] T. Joachims. Evaluating retrieval performance using clickthrough data. In J. Franke, G. Nakhaeizadeh, and I. Renz, editors,
+ * Text Mining, pages 79–96. Physica/Springer (2003)
+ * [3] F. Radlinski, M. Kurup, and T. Joachims. How does clickthrough data reflect retrieval quality?
+ * In CIKM, pages 43–52. ACM Press (2008)
+ * [4] O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue.
+ * Large-scale validation and analysis of interleaved search evaluation. ACM TOIS, 30(1):1–41, Feb. (2012)
+ */
+public class TeamDraftInterleaving implements Interleaving{
+  public static Random RANDOM;
+
+  static {
+    // We try to make things reproducible in the context of our tests by initializing the random instance
+    // based on the current seed
+    String seed = System.getProperty("tests.seed");
+    if (seed == null) {
+      RANDOM = new Random();
+    } else {
+      RANDOM = new Random(seed.hashCode());
+    }
+  }
+
+  /**
+   * Team Draft Interleaving considers two ranking models: modelA and modelB.
+   * For a given query, each model returns its ranked list of documents La = (a1,a2,...) and Lb = (b1, b2, ...).
+   * The algorithm creates a unique ranked list I = (i1, i2, ...).
+   * This list is created by interleaving elements from the two lists la and lb as described by Chapelle et al.[1].
+   * Each element Ij is labelled TeamA if it is selected from La and TeamB if it is selected from Lb.
+   *
+   * [1] O. Chapelle, T. Joachims, F. Radlinski, and Y. Yue.
+   * Large-scale validation and analysis of interleaved search evaluation. ACM TOIS, 30(1):1–41, Feb. (2012)
+   *
+   * Assumptions:
+   * - rerankedA and rerankedB has the same length.
+   * They contains the same search results, ranked differently by two ranking models
+   * - each reranked list can not contain the same search result more than once.
+   *
+   * @param rerankedA a ranked list of search results produced by a ranking model A
+   * @param rerankedB a ranked list of search results produced by a ranking model B
+   * @return the interleaved ranking list
+   */
+  public InterleavingResult interleave(ScoreDoc[] rerankedA, ScoreDoc[] rerankedB) {
+    LinkedHashSet interleavedResults = new LinkedHashSet<>();
+    ScoreDoc[] interleavedResultArray = new ScoreDoc[rerankedA.length];
+    ArrayList> interleavingPicks = new ArrayList<>(2);
+    Set teamA = new HashSet<>();
+    Set teamB = new HashSet<>();
+    int topN = rerankedA.length;
+    int indexA = 0, indexB = 0;
+
+    while (interleavedResults.size() < topN && indexA < rerankedA.length && indexB < rerankedB.length) {
+      if(teamA.size() interleaved, int index, ScoreDoc[] reranked) {
+    boolean foundElementToAdd = false;
+    while (index < reranked.length && !foundElementToAdd) {
+      ScoreDoc elementToCheck = reranked[index];
+      if (interleaved.contains(elementToCheck)) {

Review comment:
You are right, ScoreDoc.equals alone may be less obvious. The reason I originally did that was to support sharding, but to be hon
[GitHub] [lucene-solr] alessandrobenedetti commented on pull request #1571: SOLR-14560: Interleaving for Learning To Rank
alessandrobenedetti commented on pull request #1571: URL: https://github.com/apache/lucene-solr/pull/1571#issuecomment-724864522 @cpoerschke I think I have finished implementing all the required changes and addressing the discussions. Let me know when you can, no rush, and I think we can move this to the commit phase :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
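The drafting scheme described in the Javadoc quoted in the review of pull request #1571 can be illustrated with a minimal sketch. This is not the code from the pull request: the class `TeamDraftSketch`, its `Result` holder, and the use of plain `int` document ids instead of `ScoreDoc` are assumptions made for illustration. It relies on the same stated assumptions as the Javadoc (both lists have the same length, contain the same documents, and have no duplicates). The team with fewer picks drafts next, and ties are broken by a coin flip.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Illustrative sketch only; this is not the code from pull request #1571. */
final class TeamDraftSketch {

  /** Hypothetical result holder: the interleaved ids plus the ids drafted by each team. */
  static final class Result {
    final List<Integer> interleaved = new ArrayList<>();
    final Set<Integer> teamA = new HashSet<>();
    final Set<Integer> teamB = new HashSet<>();
  }

  /** Interleaves two rankings of the same document ids. */
  static Result interleave(int[] rankedA, int[] rankedB, Random random) {
    Result result = new Result();
    Set<Integer> seen = new HashSet<>();
    int indexA = 0;
    int indexB = 0;
    while (seen.size() < rankedA.length) {
      // The team with fewer picks goes next; a coin flip decides ties.
      boolean pickA = result.teamA.size() < result.teamB.size()
          || (result.teamA.size() == result.teamB.size() && random.nextBoolean());
      int[] source = pickA ? rankedA : rankedB;
      int index = pickA ? indexA : indexB;
      // Skip documents that have already been drafted by either team.
      while (index < source.length && seen.contains(source[index])) {
        index++;
      }
      if (index == source.length) {
        break; // only reachable if the stated assumptions are violated
      }
      int doc = source[index];
      seen.add(doc);
      result.interleaved.add(doc);
      if (pickA) {
        result.teamA.add(doc);
        indexA = index + 1;
      } else {
        result.teamB.add(doc);
        indexB = index + 1;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Result r = interleave(new int[] {1, 2, 3, 4}, new int[] {4, 3, 2, 1}, new Random(42));
    System.out.println(r.interleaved + " A=" + r.teamA + " B=" + r.teamB);
  }
}
```

Passing the `Random` in as a parameter, rather than holding it in a static field, is a design choice of this sketch that keeps runs reproducible without a system property.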
[jira] [Commented] (LUCENE-9583) How should we expose VectorValues.RandomAccess?
[ https://issues.apache.org/jira/browse/LUCENE-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229418#comment-17229418 ] Julie Tibshirani commented on LUCENE-9583: -- It's great that the 'search' method is now on the main VectorValues interface. > By "wrong message" I mean that we require two implementations where only one > is needed. It will be difficult to optimize one type of access without > hurting the other so I'd lean toward a single pattern. > So I would propose revisiting this once LUCENE-9004 lands. To me there's still an open question around whether the public interface should support both access patterns (random and iterator-based), and if not which one should be chosen. Perhaps we could revisit this issue once the first ANN implementation is completed? I think it will be easier to understand the options from a code-level, and easier for us to contribute/ assess refactoring ideas. > How should we expose VectorValues.RandomAccess? > --- > > Key: LUCENE-9583 > URL: https://issues.apache.org/jira/browse/LUCENE-9583 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael Sokolov >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > In the newly-added {{VectorValues}} API, we have a {{RandomAccess}} > sub-interface. [~jtibshirani] pointed out this is not needed by some > vector-indexing strategies which can operate solely using a forward-iterator > (it is needed by HNSW), and so in the interest of simplifying the public API > we should not expose this internal detail (which by the way surfaces internal > ordinals that are somewhat uninteresting outside the random access API). > I looked into how to move this inside the HNSW-specific code and remembered > that we do also currently make use of the RA API when merging vector fields > over sorted indexes. 
Without it, we would need to load all vectors into RAM > while flushing/merging, as we currently do in > {{BinaryDocValuesWriter.BinaryDVs}}. I wonder if it's worth paying this cost > for the simpler API. > Another thing I noticed while reviewing this is that I moved the KNN > {{search(float[] target, int topK, int fanout)}} method from {{VectorValues}} > to {{VectorValues.RandomAccess}}. This I think we could move back, and > handle the HNSW requirements for search elsewhere. I wonder if that would > alleviate the major concern here? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] asfgit closed pull request #915: SOLR-13661: Package management APIs, Package loading, Package store
asfgit closed pull request #915: URL: https://github.com/apache/lucene-solr/pull/915
[GitHub] [lucene-solr] asfgit closed pull request #942: SOLR-13834: ZkController#getSolrCloudManager() now uses the same ZkStateReader
asfgit closed pull request #942: URL: https://github.com/apache/lucene-solr/pull/942
[jira] [Commented] (SOLR-14949) Ability to customize docker image name/base image
[ https://issues.apache.org/jira/browse/SOLR-14949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229427#comment-17229427 ] ASF subversion and git services commented on SOLR-14949: Commit d65041359ea1e2b2db623d73c22a7ffd95fc88d1 in lucene-solr's branch refs/heads/master from Houston Putman [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d650413 ] SOLR-14949: Adding githubUrl option for docker build. (#2074) > Ability to customize docker image name/base image > - > > Key: SOLR-14949 > URL: https://issues.apache.org/jira/browse/SOLR-14949 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Docker >Affects Versions: master (9.0) >Reporter: Houston Putman >Assignee: Houston Putman >Priority: Major > Fix For: master (9.0) > > Time Spent: 2h > Remaining Estimate: 0h > > The current docker build will generate an image with the name > {{apache/solr:}}. If users want to build custom images and push them > to their own docker orgs, then this should be more customizable. > The following inputs should be customizable in the first pass at least: > * Docker Image Repo - default "apache/solr" > * Docker Image Tag - default to the project version > * Docker Image Name (This will set the entire thing, overriding the previous > two options) - Defaults to ":" > * Base Docker Image (This is the docker image that Solr Builds itself on top > of) - Defaults to "openjdk:11-jre-slim" > * Github URL ("github.com" or a mirror for github releases. This allows for > building the solr docker image behind a firewall that does not have access to > github.com) > All will be optional. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] HoustonPutman merged pull request #2074: SOLR-14949: Adding githubUrl option for docker build.
HoustonPutman merged pull request #2074: URL: https://github.com/apache/lucene-solr/pull/2074
[jira] [Assigned] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter reassigned SOLR-14614: - Assignee: Timothy Potter > Add Simplified Aggregation Interface to Streaming Expression > > > Key: SOLR-14614 > URL: https://issues.apache.org/jira/browse/SOLR-14614 > Project: Solr > Issue Type: Improvement > Components: query, query parsers, streaming expressions >Affects Versions: 7.7.2, 8.4.1 >Reporter: Aroop >Assignee: Timothy Potter >Priority: Major > > For the Data Analytics use cases the standard use case is: > # Find a pattern > # Then Aggregate by certain dimensions > # Then compute metrics (like count, sum, avg) > # Sort by a dimension or metric > # look at top-n > This functionality has been available over many different interfaces in the > past on solr, but only streaming expressions have the ability to deliver > results in a scalable, performant and stable manner for systems that have > large data to the tune of Big data systems. > However, one barrier to entry is the query interface, not being simple enough > in streaming expressions. 
> To give an example of how involved the corresponding streaming expression can
> get when it has to work on large-scale systems: {color:#4c9aff} _find top 10
> cities where someone named Alex works with the respective counts_{color}
> {code:java}
> qt=/stream&aggregationMode=facet&expr=
> select( top( rollup(sort(by%3D"city+asc",
>+plist(
>
> select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),
>
> select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
> )),
> +over%3D"city",+sum(Nj3bXa)),
> +n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
> +city,+sum(Nj3bXa)+as+Nj3bXa)
> {code}
> This is a query on an alias with 2 collections behind it representing 2 data
> partitions, which is a requirement of sorts in big data systems. This is one
> of the only ways to get information from billions of records in a matter of
> seconds. This is awesome in terms of capability and performance.
> But one can see how involved this syntax can be in the current scheme, and it is
> a barrier to entry for new adopters.
>
> This Jira is to track the work of creating a simplified analytics endpoint
> augmenting streaming expressions.
> A starting proposal is to have the endpoint accept these query parameters:
> {code:java}
> /analytics?action=aggregate&q=*:*&fq=name:alex&dimensions=city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
> This is equivalent to the SQL that an analyst would write:
> {code:java}
> select city, count(*) from collection where name = 'alex'
> group by city order by count(*) desc limit 10;{code}
> On the Solr side this would get translated to the best possible streaming
> expression using *rollups, top, sort, plist* etc., but all done transparently
> to the user.
> Here's to making the power of streaming expressions simpler to use for all.
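What the streaming expression in SOLR-14614 computes (per-partition facet counts, rolled up by city and cut to the top n) can be sketched in memory as follows. This is only an illustration of the aggregation semantics, not Solr code: the class `RollupTopNSketch` and the representation of each partition's facet result as a `Map<String, Long>` are assumptions.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Illustrative sketch: sum per-partition counts per key, then keep the n largest totals. */
final class RollupTopNSketch {

  static List<Map.Entry<String, Long>> topN(List<Map<String, Long>> perPartitionCounts, int n) {
    // "rollup ... sum": add up the counts for the same city across partitions.
    Map<String, Long> totals = new HashMap<>();
    for (Map<String, Long> counts : perPartitionCounts) {
      counts.forEach((city, count) -> totals.merge(city, count, Long::sum));
    }
    // "top n ... sort desc": order by total, largest first; ties are broken by city name.
    return totals.entrySet().stream()
        .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
            .thenComparing(Map.Entry.comparingByKey()))
        .limit(n)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Map<String, Long> collection1 = new HashMap<>();
    collection1.put("Austin", 3L);
    collection1.put("Boston", 1L);
    Map<String, Long> collection2 = new HashMap<>();
    collection2.put("Austin", 2L);
    collection2.put("Chicago", 4L);
    System.out.println(topN(List.of(collection1, collection2), 10));
  }
}
```

The tie-break on city name is an addition of this sketch to make the output deterministic; the issue does not specify one.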
[jira] [Commented] (SOLR-14927) Remove Overseer
[ https://issues.apache.org/jira/browse/SOLR-14927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229452#comment-17229452 ]

David Smiley commented on SOLR-14927:
-------------------------------------

+1 I like this plan most of all because SolrCloud becomes simpler -- ZK + Overseer is fundamentally more complex than ZK alone. This is your point at the end of the issue description; I just wanted to amplify it. Perhaps it's more scalable as well, but that's secondary to me.

> Remove Overseer
> ---------------
>
> Key: SOLR-14927
> URL: https://issues.apache.org/jira/browse/SOLR-14927
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Reporter: Ilan Ginzburg
> Assignee: Ilan Ginzburg
> Priority: Major
> Labels: cluster, collection-api, overseer, solrcloud, zookeeper
>
> This Jira is intended to capture sub-Jiras on the path to removing the Overseer
> component from SolrCloud and moving to all nodes being able to do the work
> currently done by Overseer.
> See the detailed description in [this
> doc|https://docs.google.com/document/d/1u4QHsIHuIxlglIW6hekYlXGNOP0HjLGVX5N6inkj6Ok/].
> Copying (edited) from the above doc:
> The motivations for removing Overseer include:
> * Mono-threaded state change is slow and doesn’t scale,
> * Communication between cluster nodes and the Overseer uses Zookeeper as a
> queueing mechanism, which is not a good idea,
> * Nodes talking to Overseer (then Overseer talking to itself) via Zookeeper
> is inefficient and adds latency,
> * Collection API scalability is poor, because not only does a single node
> process commands for all Collections, it also depends on the mono-threaded
> state change queue consumption,
> * The code supporting Overseer in SolrCloud is complex (election, queue
> management, recovery etc).
> The general idea is that there’s already a central point in the SolrCloud
> cluster and it’s Zookeeper. It might not be necessary to have a second
> central point (Overseer) because nodes can interact directly with Zookeeper
> and synchronize more efficiently by optimistic locking using “conditional
> updates” (a.k.a. compare-and-swap or CAS).
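The "conditional update" idea mentioned in the SOLR-14927 description can be sketched with an in-memory stand-in. This is an assumption-laden illustration, not the ZooKeeper API and not code from the design doc: `Node`, `Versioned`, and `writeIfUnchanged` are hypothetical names. Each writer reads a snapshot, computes the new state, and writes only if nobody else has written since; on conflict it simply retries.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Illustrative sketch of optimistic locking via compare-and-swap; not the ZooKeeper API. */
final class ConditionalUpdateSketch {

  /** Immutable snapshot of a node: its data and the version it was written at. */
  static final class Versioned {
    final String data;
    final int version;

    Versioned(String data, int version) {
      this.data = data;
      this.version = version;
    }
  }

  /** Hypothetical in-memory stand-in for a single node in a coordination service. */
  static final class Node {
    private final AtomicReference<Versioned> ref;

    Node(String initialData) {
      ref = new AtomicReference<>(new Versioned(initialData, 0));
    }

    Versioned read() {
      return ref.get();
    }

    /** Writes newData only if the node is still at the snapshot the caller read. */
    boolean writeIfUnchanged(Versioned expected, String newData) {
      return ref.compareAndSet(expected, new Versioned(newData, expected.version + 1));
    }
  }

  /** Read, compute, conditionally write, and retry on conflict. Returns the version written. */
  static int update(Node node, UnaryOperator<String> change) {
    while (true) {
      Versioned current = node.read();
      if (node.writeIfUnchanged(current, change.apply(current.data))) {
        return current.version + 1;
      }
      // Another writer got in first; loop to re-read and re-apply the change.
    }
  }

  public static void main(String[] args) {
    Node node = new Node("0");
    int version = update(node, s -> String.valueOf(Integer.parseInt(s) + 1));
    System.out.println("data=" + node.read().data + " version=" + version);
  }
}
```

No writer ever blocks another here, which is the property the description contrasts with a single queue-consuming Overseer.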
[jira] [Commented] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229525#comment-17229525 ] Varun Thacker commented on SOLR-14614: -- nice!
[jira] [Updated] (LUCENE-9599) Disable sort optimization in comparators on index sort
[ https://issues.apache.org/jira/browse/LUCENE-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova updated LUCENE-9599: Summary: Disable sort optimization in comparators on index sort (was: Make comparators aware of index sorting) > Disable sort optimization in comparators on index sort > -- > > Key: LUCENE-9599 > URL: https://issues.apache.org/jira/browse/LUCENE-9599 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > > LUCENE-9280 introduced an ability for comparators to skip non-competitive > documents. But currently comparators are not aware of index sorting, and are > not able to early terminate when search sort is equal to index sort. > Currently, if search sort is equal to index sort, we have an early > termination in TopFieldCollector. As we work to rely on comparators to > provide skipping functionality, we would like to move this termination > functionality on index sort from > TopFieldCollector to comparators. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (LUCENE-9599) Disable sort optimization in comparators on index sort
[ https://issues.apache.org/jira/browse/LUCENE-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova updated LUCENE-9599: Description: LUCENE-9280 introduced an ability for comparators to skip non-competitive documents. But currently comparators are not aware of index sorting, and can run sort optimization even when search sort is congruent with the index sort. As the early termination in this case is already handled in TopFieldCollector, we need to disable sort optimization in comparators. was: LUCENE-9280 introduced an ability for comparators to skip non-competitive documents. But currently comparators are not aware of index sorting, and are not able to early terminate when search sort is equal to index sort. Currently, if search sort is equal to index sort, we have an early termination in TopFieldCollector. As we work to rely on comparators to provide skipping functionality, we would like to move this termination functionality on index sort from TopFieldCollector to comparators. > Disable sort optimization in comparators on index sort > -- > > Key: LUCENE-9599 > URL: https://issues.apache.org/jira/browse/LUCENE-9599 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > > LUCENE-9280 introduced an ability for comparators to skip non-competitive > documents. But currently comparators are not aware of index sorting, and can > run sort optimization even when search sort is congruent with the index sort. > As the early termination in this case is already handled in > TopFieldCollector, we need to disable sort optimization in comparators. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] mayya-sharipova opened a new pull request #2075: LUCENE-9599 Disable sort optim on index sort
mayya-sharipova opened a new pull request #2075: URL: https://github.com/apache/lucene-solr/pull/2075

Disable sort optimization in comparators on index sort.

Currently, if the search sort is equal to or a part of the index sort, we have an early termination in TopFieldCollector. But comparators are not aware of the index sort, and may run sort optimization even if the search sort is congruent with the index sort.

This patch:
- makes leaf comparators aware that the search sort is congruent with the index sort.
- disables sort optimization in comparators in this case.
- removes the private MultiComparatorLeafCollector class, since the only class that extended it was TopFieldLeafCollector, which now incorporates the logic of the deleted class.

Relates to #1351
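The congruence check that pull request #2075 relies on can be sketched as follows. This is a hedged illustration rather than Lucene code: the class `SortCongruenceSketch` is hypothetical, sort keys are modelled as plain strings such as `"date:desc"`, and "a part of the index sort" is assumed here to mean a leading prefix of it.

```java
import java.util.List;

/** Illustrative sketch: decide whether comparators should run their own skipping logic. */
final class SortCongruenceSketch {

  /** True if the search sort equals the index sort or is a leading prefix of it (an assumption). */
  static boolean isCongruentWithIndexSort(List<String> searchSort, List<String> indexSort) {
    if (indexSort == null || searchSort.isEmpty() || searchSort.size() > indexSort.size()) {
      return false;
    }
    return searchSort.equals(indexSort.subList(0, searchSort.size()));
  }

  /**
   * When the sorts are congruent the collector can already terminate early,
   * so the comparator's own sort optimization is switched off.
   */
  static boolean comparatorSortOptimizationEnabled(List<String> searchSort, List<String> indexSort) {
    return !isCongruentWithIndexSort(searchSort, indexSort);
  }

  public static void main(String[] args) {
    List<String> indexSort = List.of("date:desc", "id:asc");
    System.out.println(comparatorSortOptimizationEnabled(List.of("date:desc"), indexSort));
    System.out.println(comparatorSortOptimizationEnabled(List.of("id:asc"), indexSort));
  }
}
```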
[GitHub] [lucene-solr] mayya-sharipova closed pull request #2063: LUCENE-9599 Make comparator aware of index sorting
mayya-sharipova closed pull request #2063: URL: https://github.com/apache/lucene-solr/pull/2063
[GitHub] [lucene-solr] mayya-sharipova commented on pull request #2063: LUCENE-9599 Make comparator aware of index sorting
mayya-sharipova commented on pull request #2063: URL: https://github.com/apache/lucene-solr/pull/2063#issuecomment-724995524 @msokolov @jimczi Thank you for the review. As per @jimczi's latest comment, it makes more sense to disable sort optimization on index sort in comparators than to move this logic from `TopFieldCollector` to comparators. I am therefore closing this PR, and I have opened a new, much smaller [PR](https://github.com/apache/lucene-solr/pull/2075) that will implement this plan. Sorry for the trouble.
[GitHub] [lucene-solr] mayya-sharipova merged pull request #2051: LUCENE-9594 Add linear function for FeatureField
mayya-sharipova merged pull request #2051: URL: https://github.com/apache/lucene-solr/pull/2051
[jira] [Commented] (LUCENE-9594) Linear function for FeatureField
[ https://issues.apache.org/jira/browse/LUCENE-9594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229545#comment-17229545 ] ASF subversion and git services commented on LUCENE-9594: - Commit 5897d14fe4f5dd785a561b18c939ea31186c4971 in lucene-solr's branch refs/heads/master from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5897d14 ] LUCENE-9594 Add linear function for FeatureField This adds a linear function and newLinearQuery for FeatureField > Linear function for FeatureField > > > Key: LUCENE-9594 > URL: https://issues.apache.org/jira/browse/LUCENE-9594 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently FeatureField supports only 3 functions: log, saturation and > sigmoid. > It is useful for certain cases to have a linear function. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
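For orientation, the four scoring functions mentioned in LUCENE-9594 can be written out as plain arithmetic. This is a sketch that does not use the Lucene API; the formulas are the ones I understand the `FeatureField` documentation to describe (with `w` a weight, `S` the stored feature value, and `a`, `k` the function parameters), and should be checked against that documentation. The class and method names are hypothetical.

```java
/** Illustrative arithmetic only; this does not call Lucene's FeatureField. */
final class FeatureFunctionsSketch {

  /** Assumed form: w * log(a + S). */
  static double log(double w, double a, double s) {
    return w * Math.log(a + s);
  }

  /** Assumed form: w * S / (S + k). */
  static double saturation(double w, double k, double s) {
    return w * s / (s + k);
  }

  /** Assumed form: w * S^a / (S^a + k^a). */
  static double sigmoid(double w, double k, double a, double s) {
    double sa = Math.pow(s, a);
    return w * sa / (sa + Math.pow(k, a));
  }

  /** Assumed form of the newly added function: w * S. */
  static double linear(double w, double s) {
    return w * s;
  }

  public static void main(String[] args) {
    double s = 4.0;
    System.out.println("log=" + log(1.0, 1.0, s));
    System.out.println("saturation=" + saturation(1.0, 4.0, s));
    System.out.println("sigmoid=" + sigmoid(1.0, 4.0, 2.0, s));
    System.out.println("linear=" + linear(1.0, s));
  }
}
```

Unlike the other three, the linear form is unbounded, which is presumably why it suits only "certain cases", as the issue puts it.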
[GitHub] [lucene-solr] asfgit closed pull request #1650: SOLR-14021: Deprecate HDFS support in 8.6
asfgit closed pull request #1650: URL: https://github.com/apache/lucene-solr/pull/1650
[jira] [Commented] (LUCENE-9594) Linear function for FeatureField
[ https://issues.apache.org/jira/browse/LUCENE-9594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229552#comment-17229552 ] ASF subversion and git services commented on LUCENE-9594: - Commit e76a25ebc72520ee3cbbeca31df82dd3a3a31048 in lucene-solr's branch refs/heads/branch_8x from Mayya Sharipova [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e76a25e ] LUCENE-9594 Add linear function for FeatureField This adds a linear function and newLinearQuery for FeatureField
[GitHub] [lucene-solr] asfgit closed pull request #1638: SOLR-14066: Deprecate DIH
asfgit closed pull request #1638: URL: https://github.com/apache/lucene-solr/pull/1638
[jira] [Resolved] (LUCENE-9594) Linear function for FeatureField
[ https://issues.apache.org/jira/browse/LUCENE-9594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova resolved LUCENE-9594. - Fix Version/s: 8.8 Resolution: Fixed > Linear function for FeatureField > > > Key: LUCENE-9594 > URL: https://issues.apache.org/jira/browse/LUCENE-9594 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Mayya Sharipova >Priority: Minor > Fix For: 8.8 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Currently FeatureField supports only 3 functions: log, saturation and > sigmoid. > It is useful for certain cases to have a linear function. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] asfgit closed pull request #1812: SOLR-14799: JWT authentication plugin only requires sub claim when pr…
asfgit closed pull request #1812: URL: https://github.com/apache/lucene-solr/pull/1812
[GitHub] [lucene-solr] asfgit closed pull request #1420: package store PUT should be idempotent
asfgit closed pull request #1420: URL: https://github.com/apache/lucene-solr/pull/1420
[GitHub] [lucene-solr] asfgit closed pull request #1661: SOLR-14404: test fix
asfgit closed pull request #1661: URL: https://github.com/apache/lucene-solr/pull/1661
[GitHub] [lucene-solr] anshumg commented on pull request #2010: SOLR-12182: Don't persist base_url in ZK as the scheme is variable, compute from node_name instead
anshumg commented on pull request #2010:
URL: https://github.com/apache/lucene-solr/pull/2010#issuecomment-725022361

We should restrict this PR to just handling the removal of the persisted base_url in ZK. Users who intend to enable or disable TLS for their cluster should plan for downtime, as guaranteeing a safe rolling restart during such a transition is not possible right now. I think we're better off not attempting a best-effort rolling restart here, and removing the live_nodes part of the code change to keep things simple.
[jira] [Commented] (SOLR-14991) tag and remove obsolete branches
[ https://issues.apache.org/jira/browse/SOLR-14991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229565#comment-17229565 ]

Erick Erickson commented on SOLR-14991:
---------------------------------------

Done, except for a few questions. The following branches didn't have a Jira I could check against. Can the identified people:
1> tag and remove the branch or
2> tell me to tag and remove the branch, I've got it down to a science or
3> tell me that they should be left alone.

[~noble.paul]
remotes/origin/jira-14151-revert
remotes/origin/jira/V2Request

[~caomanhdat]
remotes/origin/jira/http2

[~danmuzi] or maybe [~rmuir]
remotes/origin/revert-776-remove_icu_dependency
Maybe this is LUCENE-8912 pull 776? In which case the JIRA's closed and I'll tag/remove.

I've pinged Mark Miller about this one on a separate channel.
remotes/origin/starburst

Why these specific people? Well, since I couldn't go to a Jira, I looked at the commit history of the branch and they were mentioned.

These branches have been tagged/removed:
remotes/origin/SOLR-11795
remotes/origin/jira/LUCENE-8738
remotes/origin/jira/LUCENE-9312
remotes/origin/jira/LUCENE-9438
remotes/origin/jira/SOLR-13229
remotes/origin/jira/SOLR-13462
remotes/origin/jira/SOLR-13661-reverted
remotes/origin/jira/SOLR-13661_2
remotes/origin/jira/SOLR-13793
remotes/origin/jira/SOLR-13822_backup
remotes/origin/jira/SOLR-13834
remotes/origin/jira/SOLR-14354-revert
remotes/origin/jira/SOLR-14383
remotes/origin/jira/lucene-5438-nrt-replication
remotes/origin/jira/solr-12259
remotes/origin/jira/solr-13472
remotes/origin/jira/solr-13619
remotes/origin/jira/solr-13662
remotes/origin/jira/solr-13662-2
remotes/origin/jira/solr-13662-3
remotes/origin/jira/solr-13662-3_tmp
remotes/origin/jira/solr-13662-fixes
remotes/origin/jira/solr-13662-updated
remotes/origin/jira/solr-13718-8x
remotes/origin/jira/solr-13971
remotes/origin/jira/solr-13978
remotes/origin/jira/solr-14021
remotes/origin/jira/solr-14022
remotes/origin/jira/solr-14025
remotes/origin/jira/solr-14066-master
remotes/origin/jira/solr-14071
remotes/origin/jira/solr-14151-revert
remotes/origin/jira/solr-14151-revert-2
remotes/origin/jira/solr-14151-revert-8x
remotes/origin/jira/solr-14151-revert-8x-2
remotes/origin/jira/solr-14158
remotes/origin/jira/solr-14599
remotes/origin/jira/solr-14599_1
remotes/origin/jira/solr-14603
remotes/origin/jira/solr-14603-8x
remotes/origin/jira/solr-14616
remotes/origin/jira/solr-14799
remotes/origin/jira/solr-14914
remotes/origin/jira/solr14398
remotes/origin/jira/solr14404_fix
remotes/origin/lucene-6835
remotes/origin/lucene-6997
remotes/origin/lucene-7015
remotes/origin/solr-13131
remotes/origin/solr-6733

> tag and remove obsolete branches
>
> Key: SOLR-14991
> URL: https://issues.apache.org/jira/browse/SOLR-14991
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
>
> I'm going to gradually work through the branches, tagging and removing
> 1> anything with a Jira name that's fixed
> 2> anything that I'm certain will never be fixed (e.g. the various gradle
> build branches)
> So the changes will still be available, they just won't pollute the branch list.
> I'll list the branches here, all the tags will be
> history/branches/lucene-solr/
>
> This specifically will _not_ include
> 1> any release, e.g. branch_8_4
> 2> anything I'm unsure about. People who've created branches should expect
> some pings about this.
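The comment does not say which commands were used to tag and remove each branch, so the following is only a guess at what such a procedure might look like. It takes a remote branch name as listed above and returns the git commands that would create a tag under history/branches/lucene-solr/ (the prefix given in the issue description), push that tag, and then delete the branch on the remote. The function name and the assumption that the tag path simply appends the branch name to the prefix are mine. Nothing is executed; the commands are only built.

```python
from typing import List

# Tag namespace taken from the issue description; appending the branch
# name to it is an assumption made for this sketch.
TAG_PREFIX = "history/branches/lucene-solr/"

def archive_commands(remote_branch: str, remote: str = "origin") -> List[List[str]]:
    """Build (but do not run) git commands that tag and then delete a remote branch.

    remote_branch is expected in the form printed by `git branch -a`,
    e.g. "remotes/origin/jira/solr-14799".
    """
    prefix = f"remotes/{remote}/"
    if not remote_branch.startswith(prefix):
        raise ValueError(f"expected a branch under {prefix!r}: {remote_branch!r}")
    branch = remote_branch[len(prefix):]          # e.g. "jira/solr-14799"
    tag = TAG_PREFIX + branch
    return [
        ["git", "tag", tag, f"{remote}/{branch}"],    # tag the branch tip locally
        ["git", "push", remote, f"refs/tags/{tag}"],  # publish the tag
        ["git", "push", remote, "--delete", branch],  # remove the remote branch
    ]
```

Tagging before deleting keeps the commits reachable, which matches the stated goal that the changes remain available without cluttering the branch list.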
[GitHub] [lucene-solr] noblepaul commented on a change in pull request #2065: SOLR-14977 : ContainerPlugins should be configurable
noblepaul commented on a change in pull request #2065:
URL: https://github.com/apache/lucene-solr/pull/2065#discussion_r520936123

## File path: solr/core/src/java/org/apache/solr/api/ContainerPluginsRegistry.java ##

@@ -114,6 +118,16 @@ public synchronized ApiInfo getPlugin(String name) {
     return currentPlugins.get(name);
   }

+  static class PluginMetaHolder {
+    private final Map original;
+    private final PluginMeta meta;

Review comment:
It's a standard property that is part of the plugin definition.
[jira] [Created] (SOLR-14996) Facet incorrect counts when FQ exclusion applied with collapsing
Yevhen Tienkaiev created SOLR-14996:
------------------------------------

Summary: Facet incorrect counts when FQ exclusion applied with collapsing
Key: SOLR-14996
URL: https://issues.apache.org/jira/browse/SOLR-14996
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: faceting
Affects Versions: 8.6.3
Reporter: Yevhen Tienkaiev

*numFound* is not consistent with what is displayed in facets when collapsing is used together with an FQ with a tag.

Here is an example query:
{code}
curl --location --request GET 'http://localhost:8981/solr/test/select?facet.field={!ex=selected}job_type&facet=on&fq={!collapse%20field=user_id}&fq={!tag=selected}job_type:thinker&q=*:*&rows=0'
{code}
The result is:
{code}
{
  "responseHeader": {
    "zkConnected": true,
    "status": 0,
    "QTime": 15,
    "params": {
      "q": "*:*",
      "facet.field": "{!ex=selected}job_type",
      "fq": [
        "{!collapse field=user_id}",
        "{!tag=selected}job_type:thinker"
      ],
      "rows": "0",
      "facet": "on"
    }
  },
  "response": {
    "numFound": 850,
    "start": 0,
    "maxScore": 1.0,
    "numFoundExact": true,
    "docs": []
  },
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "job_type": [
        "runner", 220,
        "developer", 202,
        "digger", 202,
        "thinker", 195,
        "ninja", 181
      ]
    },
    "facet_ranges": {},
    "facet_intervals": {},
    "facet_heatmaps": {}
  }
}
{code}
As you can see, there is an FQ with
{code}
{!tag=selected}job_type:thinker
{code}
and facets with
{code}
{!ex=selected}job_type
{code}
but in the results I see 195 for *thinker*, while *numFound* is 850.

Expected:
*thinker* 195, *numFound* is 195
*or*
*thinker* 850, *numFound* is 850

You can use this simple project to reproduce the issue:
https://github.com/Hronom/solr-cloud-basic-auth/tree/main/solr-cloud-playground-collapsing
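To make the reported request easier to reproduce from a script, here is a small sketch that builds the same select URL with properly encoded local params and turns the flat facet list from the response into a mapping, so the *thinker* count can be compared against *numFound*. The host, port, collection and field names are the ones from the report; the helper names are made up for this example, and nothing here contacts Solr.

```python
from typing import Dict, List, Union
from urllib.parse import urlencode

def build_select_url(base: str = "http://localhost:8981/solr/test/select") -> str:
    """Build the query from the report: collapse on user_id, a tagged
    filter on job_type, and a facet on job_type that excludes that tag."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "on"),
        ("facet.field", "{!ex=selected}job_type"),   # facet ignores filters tagged "selected"
        ("fq", "{!collapse field=user_id}"),         # one document per user_id
        ("fq", "{!tag=selected}job_type:thinker"),   # tagged filter
    ]
    return base + "?" + urlencode(params)

def facet_counts(flat: List[Union[str, int]]) -> Dict[str, int]:
    """Convert Solr's flat [term, count, term, count, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))

# Values copied from the response in the report.
reported = ["runner", 220, "developer", 202, "digger", 202, "thinker", 195, "ninja", 181]
counts = facet_counts(reported)
num_found = 850
mismatch = counts["thinker"] != num_found   # the discrepancy the issue describes
```

Using urlencode avoids hand-escaping the braces and the space inside the collapse local params, which is easy to get wrong when the URL is typed directly into curl.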
[jira] [Updated] (SOLR-14996) Facet incorrect counts when FQ exclusion applied with collapsing
[ https://issues.apache.org/jira/browse/SOLR-14996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yevhen Tienkaiev updated SOLR-14996:
-----------------------------------

Description:
*numFound* is not consistent with what is displayed in facets with exclusion when collapsing is used together with an FQ with a tag.

Here is an example query:
{code}
curl --location --request GET 'http://localhost:8981/solr/test/select?facet.field={!ex=selected}job_type&facet=on&fq={!collapse%20field=user_id}&fq={!tag=selected}job_type:thinker&q=*:*&rows=0'
{code}
The result is:
{code}
{
  "responseHeader": {
    "zkConnected": true,
    "status": 0,
    "QTime": 15,
    "params": {
      "q": "*:*",
      "facet.field": "{!ex=selected}job_type",
      "fq": [
        "{!collapse field=user_id}",
        "{!tag=selected}job_type:thinker"
      ],
      "rows": "0",
      "facet": "on"
    }
  },
  "response": {
    "numFound": 850,
    "start": 0,
    "maxScore": 1.0,
    "numFoundExact": true,
    "docs": []
  },
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "job_type": [
        "runner", 220,
        "developer", 202,
        "digger", 202,
        "thinker", 195,
        "ninja", 181
      ]
    },
    "facet_ranges": {},
    "facet_intervals": {},
    "facet_heatmaps": {}
  }
}
{code}
As you can see, there is an FQ with
{code}
{!tag=selected}job_type:thinker
{code}
and facets with
{code}
{!ex=selected}job_type
{code}
In the results I see 195 for *thinker*, but *numFound* is 850.

Expected:
*thinker* 195, *numFound* is 195
*or*
*thinker* 850, *numFound* is 850

You can use this simple project to reproduce the issue:
https://github.com/Hronom/solr-cloud-basic-auth/tree/main/solr-cloud-playground-collapsing

> Facet incorrect counts when FQ exclusion applied with collapsing
>
> Key: SOLR-14996
> URL: https://issues.apache.org/jira/browse/SOLR-14996
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: faceting
> Affects Versions: 8.6.3
> Reporter: Yevhen Tienkaiev
> Priority: Critical
[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
zacharymorn commented on a change in pull request #2068:
URL: https://github.com/apache/lucene-solr/pull/2068#discussion_r521068324

## File path: lucene/misc/native/build.gradle ##

@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * This gets separated out from misc module into a native module due to incompatibility between cpp-library and java-library plugins.
+ * For details, please see https://github.com/gradle/gradle-native/issues/352#issuecomment-461724948
+ */
+import org.apache.tools.ant.taskdefs.condition.Os
+
+description = 'Module for native code'
+
+apply plugin: 'cpp-library'
+
+library {
+  baseName = 'NativePosixUtil'
+
+  // Native build for Windows platform will be added in later stage
+  targetMachines = [
+      machines.linux.x86_64,
+      machines.macOS.x86_64,
+      machines.windows.x86_64
+  ]
+
+  // Point at platform-specific sources. Other platforms will be ignored
+  // (plugin won't find the toolchain).
+  if (Os.isFamily(Os.FAMILY_WINDOWS)) {
+    source.from file("${projectDir}/src/main/windows")
+  } else if (Os.isFamily(Os.FAMILY_UNIX) || Os.isFamily(Os.FAMILY_MAC)) {
+    source.from file("${projectDir}/src/main/posix")
+  }
+}
+
+tasks.withType(CppCompile).configureEach {
+  def javaHome = rootProject.ext.runtimeJava.getInstallationDirectory().getAsFile().getPath()
+
+  // Assume standard openjdk layout. This means only one architecture-specific include folder
+  // is present.
+  systemIncludes.from file("${javaHome}/include")
+
+  for (def path : [
+      file("${javaHome}/include/win32"),
+      file("${javaHome}/include/darwin"),
+      file("${javaHome}/include/linux"),
+      file("${javaHome}/include/solaris")]) {
+    if (path.exists()) {
+      systemIncludes.from path
+    }
+  }
+
+  compilerArgs.add '-fPIC'

Review comment:
Ok cool. I've updated the comment here to reflect the new way of building the code.
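The include-path logic in the quoted build.gradle can be restated outside Gradle as a quick way to check which JNI header folders a given JDK would contribute. This is only a sketch mirroring the quoted loop, under the same assumption the build file states (a standard OpenJDK layout with a single platform-specific folder under include); the function name is hypothetical.

```python
from pathlib import Path
from typing import List

# Platform-specific subfolders probed by the quoted build.gradle.
PLATFORM_DIRS = ("win32", "darwin", "linux", "solaris")

def jni_include_dirs(java_home: str) -> List[Path]:
    """Return the JNI include directories for a JDK installation.

    Like the Gradle loop, the top-level include folder is always added,
    and a platform folder is added only if it exists on disk.
    """
    include = Path(java_home) / "include"
    dirs = [include]
    dirs += [include / name for name in PLATFORM_DIRS if (include / name).exists()]
    return dirs
```

On a typical Linux JDK this would be expected to yield include and include/linux, which are the two folders a JNI compile usually needs for jni.h and jni_md.h.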
[GitHub] [lucene-solr] zacharymorn commented on a change in pull request #2068: LUCENE-8982: Separate out native code to another module to allow cpp build with gradle
zacharymorn commented on a change in pull request #2068:
URL: https://github.com/apache/lucene-solr/pull/2068#discussion_r521068544

## File path: lucene/misc/src/java/org/apache/lucene/store/NativeUnixDirectory.java ##

@@ -47,10 +47,10 @@
  *
  * To use this you must compile
  * NativePosixUtil.cpp (exposes Linux-specific APIs through
- * JNI) for your platform, by running ant
- * build-native-unix, and then putting the resulting
- * libNativePosixUtil.so (from
- * lucene/build/native) onto your dynamic
+ * JNI) for your platform, by running
+ * ./gradlew build -Pbuild.native=true, and then putting the resulting

Review comment:
Updated.

## File path: lucene/packaging/build.gradle ##

@@ -32,7 +32,9 @@ def includeInBinaries = project(":lucene").subprojects.findAll {subproject ->
     ":lucene:packaging",
     ":lucene:documentation",
     // Exclude parent container project of analysis modules (no artifacts).
-    ":lucene:analysis"
+    ":lucene:analysis",
+    // Exclude native module, which requires manual copying and enabling
+    ":lucene:native"

Review comment:
It did fail. I've updated the reference here and confirmed it's able to build from the directory root.

## File path: lucene/misc/native/build.gradle ##

@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * This gets separated out from misc module into a native module due to incompatibility between cpp-library and java-library plugins.
+ * For details, please see https://github.com/gradle/gradle-native/issues/352#issuecomment-461724948
+ */
+import org.apache.tools.ant.taskdefs.condition.Os
+
+description = 'Module for native code'
+
+apply plugin: 'cpp-library'
+
+library {
+  baseName = 'NativePosixUtil'

Review comment:
Done.
[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229715#comment-17229715 ]

Noble Paul commented on SOLR-14788:
-----------------------------------

[~dsmiley] [~ilan] [~markrmiller]

Optimizing overseer is not enough. Eliminating a lot of work that is being done by overseer is the way to go. If there are 100K messages in the queue already, it's already a lost cause. There is no point in trying to optimize it. I'm not saying that we should eliminate overseer altogether. Let's start chipping away at a lot of the responsibilities performed by overseer today.

> Solr: The Next Big Thing
>
> Key: SOLR-14788
> URL: https://issues.apache.org/jira/browse/SOLR-14788
> Project: Solr
> Issue Type: Task
> Reporter: Mark Robert Miller
> Assignee: Mark Robert Miller
> Priority: Critical
>
> h3.
> [!https://www.unicode.org/consortium/aacimg/1F46E.png!|https://www.unicode.org/consortium/adopted-characters.html#b1F46E]{color:#00875a}*The
> Policeman is on duty!*{color}
> {quote}_{color:#de350b}*When The Policeman is on duty, sit back, relax, and
> have some fun. Try to make some progress. Don't stress too much about the
> impact of your changes or maintaining stability and performance and
> correctness so much. Until the end of phase 1, I've got your back. I have a
> variety of tools and contraptions I have been building over the years and I
> will continue training them on this branch. I will review your changes and
> peer out across the land and course correct where needed. As Mike D will be
> thinking, "Sounds like a bottleneck Mark." And indeed it will be to some
> extent. Which is why once stage one is completed, I will flip The Policeman
> to off duty. When off duty, I'm always* {color:#de350b}*occasionally*{color}
> *down for some vigilante justice, but I won't be walking the beat, all that
> stuff about sit back and relax goes out the window.*{color}_
> {quote}
>
> I have stolen this title from Ishan or Noble and Ishan.
> This issue is meant to capture the work of a small team that is forming to
> push Solr and SolrCloud to the next phase.
> I have kicked off the work with an effort to create a very fast and solid
> base. That work is not 100% done, but it's ready to join the fight.
> Tim Potter has started giving me a tremendous hand in finishing up. Ishan and
> Noble have already contributed support and testing and have plans for
> additional work to shore up some of our current shortcomings.
> Others have expressed an interest in helping and hopefully they will pop up
> here as well.
> Let's organize and discuss our efforts here and in various sub issues.
[jira] [Comment Edited] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229715#comment-17229715 ]

Noble Paul edited comment on SOLR-14788 at 11/11/20, 4:39 AM:
--

[~dsmiley] [~ilan] [~markrmiller] Optimizing overseer is necessary but not sufficient. Eliminating a lot of work that is being done by overseer is the way to go. If there are already 100K messages in the queue, it's a lost cause. There is no point in trying to optimize it. I'm not saying that we should eliminate overseer altogether. Let's start chipping away at a lot of the responsibilities performed by overseer today.

was (Author: noble.paul):
[~dsmiley] [~ilan] [~markrmiller] Optimizing overseer is not enough. Eliminating a lot of work that is being done by overseer is the way to go. If there are 100K message sin the queue already it's already a lost cause. There is no point in trying to optimize it. I'm not saying that we should eliminate overseer altogether. Let's start chipping away a lot of the responsibilities performed by overseer today
[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229741#comment-17229741 ]

Mark Robert Miller commented on SOLR-14788:
---

Most of the work done by the overseer is useless. You can’t optimize the current impl and be satisfied, that is for sure, but I think it’s pretty easy to beat the non-overseer paths I’ve seen described. You have to first recognize that while the overseer does lots of work, it should do very little. You can’t be limited by the current impl in your thinking either. The fact that there is a single overseer is a pretty simple detail with a decent impl. In the end, the overseer concept is about cluster communication. You have to describe how your solution nails cluster communication. So far, I haven’t seen anything I’d ditch my approach for unless I’m shown via demonstration.
[jira] [Comment Edited] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229741#comment-17229741 ]

Mark Robert Miller edited comment on SOLR-14788 at 11/11/20, 5:30 AM:
--

Most of the work done by the overseer is useless. You can’t optimize the current impl and be satisfied, that is for sure, but I think it’s pretty easy to beat the non-overseer paths I’ve seen described. You have to first recognize that while the overseer does lots of work, it should do very little. You can’t be limited by the current impl in your thinking either. The fact that there is a single overseer is a pretty simple detail with a decent impl. In the end, the overseer concept is about cluster communication. You have to describe how your solution nails cluster communication. So far, I haven’t seen anything I’d ditch my approach for unless I’m shown via demonstration.

was (Author: markrmiller):
Most of the work done by the overseer is useless. You can’t optimize the current impl and be satisfied that is for sure, but i think it’s pretty easy to beat the non overseer paths ive seen described. You have to first recognize that while the overseer does lots of work it should do very little. You can’t be limited by the current impl in your thinking either. The fact there is a single overseer is a pretty simple detail with a decent impl. In the end, the overseer concept is about cluster communication. You have to describe how your solution nails cluster communication, so far, I havnt seen anything I’d ditch my approach first unless I’m show via demonstration.
[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229758#comment-17229758 ]

Noble Paul commented on SOLR-14788:
---

Thanks [~markrmiller] Any details on how we can eliminate a lot of the work done by overseer? (your SolrSeer design choices)
[jira] [Commented] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229764#comment-17229764 ]

Tomas Eduardo Fernandez Lobbe commented on SOLR-14940:
--

I believe the test added to {{TestPullReplicaErrorHandling}} is causing the failures described in https://issues.apache.org/jira/browse/SOLR-14992. The failures happen in the tests running immediately after {{testCloseHooksDeletedOnReconnect}}, for example:
{noformat}
[junit4] 2> 37296 INFO (TEST-TestPullReplicaErrorHandling.testCloseHooksDeletedOnReconnect-seed#[64BCF377B675F037]) [ ] o.a.s.c.MiniSolrCloudCluster Expired zookeeper session 72103195153465373 from node http://127.0.0.1:64010/solr
[junit4] 2> 37296 INFO (TEST-TestPullReplicaErrorHandling.testCloseHooksDeletedOnReconnect-seed#[64BCF377B675F037]) [ ] o.a.s.c.SolrCloudTestCase waitForState (pull_replica_error_handling_test_close_hooks_deleted_on_reconnect): Expecting node to reconnect
[junit4] 2> 37296 WARN (NIOWorkerThread-1) [ ] o.a.z.s.NIOServerCnxn Unexpected exception
[junit4] 2> => EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /127.0.0.1:64510, session = 0x1002979564b001d
[junit4] 2>     at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:163)
[junit4] 2> org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /127.0.0.1:64510, session = 0x1002979564b001d
[junit4] 2>     at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:163) ~[zookeeper-3.6.2.jar:3.6.2]
[junit4] 2>     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:326) [zookeeper-3.6.2.jar:3.6.2]
[junit4] 2>     at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522) [zookeeper-3.6.2.jar:3.6.2]
[junit4] 2>     at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154) [zookeeper-3.6.2.jar:3.6.2]
[junit4] 2>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
[junit4] 2>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
[junit4] 2>     at java.lang.Thread.run(Thread.java:832) [?:?]
[junit4] 2> 37299 INFO (zkCallback-214-thread-1) [ ] o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (4) -> (3)
[junit4] 2> 37299 INFO (zkCallback-184-thread-3) [ ] o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (4) -> (3)
[junit4] 2> 37299 INFO (zkCallback-154-thread-1) [ ] o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (4) -> (3)
[junit4] 2> 37299 INFO (zkCallback-100-thread-1) [ ] o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (4) -> (3)
[junit4] 2> 37304 INFO (TEST-TestPullReplicaErrorHandling.testCloseHooksDeletedOnReconnect-seed#[64BCF377B675F037]) [ ] o.a.s.c.TestPullReplicaErrorHandling tearDown deleting collection
[junit4] 2> 37306 INFO (SocketProxy-Acceptor-64086) [ ] o.a.s.c.s.c.SocketProxy accepted Socket[addr=/127.0.0.1,port=64511,localport=64086], receiveBufferSize: 65536
[junit4] 2> 37306 INFO (SocketProxy-Acceptor-64086) [ ] o.a.s.c.s.c.SocketProxy proxy connection Socket[addr=/127.0.0.1,port=64102,localport=64512], receiveBufferSize=65536
[junit4] 2> 37308 INFO (qtp1774678962-277) [n:127.0.0.1:64086_solr ] o.a.s.h.a.CollectionsHandler Invoked Collection Action :delete with params name=pull_replica_error_handling_test_close_hooks_deleted_on_reconnect&action=DELETE&wt=javabin&version=2 and sendToOCPQueue=true
[junit4] 2> 37319 INFO (OverseerThreadFactory-238-thread-1-processing-n:127.0.0.1:64086_solr) [n:127.0.0.1:64086_solr ] o.a.s.c.a.c.OverseerCollectionMessageHandler Executing Collection Cmd=action=UNLOAD&deleteInstanceDir=true&deleteDataDir=true&deleteMetricsHistory=true, asyncId=null
[junit4] 2> 37321 INFO (SocketProxy-Acceptor-64180) [ ] o.a.s.c.s.c.SocketProxy accepted Socket[addr=/127.0.0.1,port=64513,localport=64180], receiveBufferSize: 65536
[junit4] 2> 37322 INFO (SocketProxy-Acceptor-64180) [ ] o.a.s.c.s.c.SocketProxy proxy connection Socket[addr=/127.0.0.1,port=64194,localport=64514], receiveBufferSize=65536
[junit4] 2> 37324 INFO (qtp420303283-332) [n:127.0.0.1:64180_solr x:pull_replica_error_handling_test_close_hooks_deleted_on_reconnect_shard1_replica_n1 ] o.a.s.m.SolrMetricManager Closing metric reporters for registry=solr.core.pull_replica_error_handling_test_close_hooks_deleted_on_reconnect.shard1.replica_n1 tag=null
[junit4] 2> 37324 INFO (qtp420303283-332) [n
[jira] [Reopened] (SOLR-14940) ReplicationHandler memory leak through SolrCore.closeHooks
[ https://issues.apache.org/jira/browse/SOLR-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tomas Eduardo Fernandez Lobbe reopened SOLR-14940:
--

> ReplicationHandler memory leak through SolrCore.closeHooks
> --
>
> Key: SOLR-14940
> URL: https://issues.apache.org/jira/browse/SOLR-14940
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: replication (java)
> Environment: Solr Cloud Cluster on v.8.6.2 configured as 3 TLOG nodes with 2 cores in each JVM.
> Reporter: Anver Sotnikov
> Assignee: Mike Drob
> Priority: Major
> Fix For: master (9.0), 8.8
>
> Attachments: Actual references to hooks that in turn hold references to ReplicationHandlers.png, Memory Analyzer SolrCore.closeHooks .png
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> We are experiencing a memory leak in a Solr Cloud cluster configured as 3 TLOG nodes.
> The Leader does not seem to be affected, while the Followers are.
>
> Looking at a memory dump, we noticed that SolrCore holds lots of references to ReplicationHandler through anonymous inner classes in SolrCore.closeHooks, which in turn hold ReplicationHandlers.
> ReplicationHandler registers hooks as anonymous inner classes in SolrCore.closeHooks through ReplicationHandler.inform() -> ReplicationHandler.registerCloseHook().
>
> Whenever ZkController.stopReplicationFromLeader is called, it shuts down the ReplicationHandler (ReplicationHandler.shutdown()), BUT the reference to the ReplicationHandler stays in SolrCore.closeHooks. Once replication is started again on the same SolrCore, a new ReplicationHandler is created and registered in closeHooks.
>
> It looks like there are a few scenarios in which replication is stopped and restarted on the same core, and in our TLOG setup it shows up quite often.
>
> Potential solutions:
> # Allow unregistering SolrCore.closeHooks so it can be used from ReplicationHandler.shutdown
> # Hack but easier - break the link between ReplicationHandler close hooks and full ReplicationHandler object so ReplicationHandler can be GCed even when hooks are still registered in SolrCore.closeHooks
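
As a rough illustration of the leak described in this issue and of the first potential solution, here is a minimal, self-contained Java sketch. It is not Solr code: `SketchCore` and `SketchHandler` are hypothetical stand-ins for SolrCore and ReplicationHandler, and `removeCloseHook` is an assumed method name rather than a confirmed Solr API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for SolrCore: it only tracks close hooks. */
final class SketchCore {
  private final List<Runnable> closeHooks = new ArrayList<>();

  void addCloseHook(Runnable hook) {
    closeHooks.add(hook);
  }

  /** Assumed addition (solution 1): let a caller unregister its own hook. */
  boolean removeCloseHook(Runnable hook) {
    return closeHooks.remove(hook);
  }

  int closeHookCount() {
    return closeHooks.size();
  }
}

/** Hypothetical stand-in for ReplicationHandler. */
final class SketchHandler {
  private final SketchCore core;
  private final Runnable closeHook;

  SketchHandler(SketchCore core) {
    this.core = core;
    // The method reference captures 'this', so the core keeps this handler
    // reachable for as long as the hook stays registered.
    this.closeHook = this::releaseResources;
    core.addCloseHook(closeHook);
  }

  /** Shuts the handler down; optionally removes its hook from the core. */
  void shutdown(boolean unregister) {
    releaseResources();
    if (unregister) {
      core.removeCloseHook(closeHook);
    }
  }

  private void releaseResources() {
    // Nothing to release in this sketch.
  }
}

final class CloseHookLeakSketch {
  /** Simulates stopping and restarting replication on the same core. */
  static int hooksAfterRestarts(int restarts, boolean unregister) {
    SketchCore core = new SketchCore();
    SketchHandler handler = new SketchHandler(core);
    for (int i = 0; i < restarts; i++) {
      handler.shutdown(unregister);
      handler = new SketchHandler(core);
    }
    return core.closeHookCount();
  }

  public static void main(String[] args) {
    System.out.println("without unregistering: " + hooksAfterRestarts(5, false));
    System.out.println("with unregistering:    " + hooksAfterRestarts(5, true));
  }
}
```

In this model, five restarts without unregistering leave six hooks (and therefore six handlers) reachable from the core, while unregistering in `shutdown` keeps the count at one. The second potential solution would instead keep the hooks registered but make them not reference the handler.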
[jira] [Commented] (SOLR-14992) TestPullReplicaErrorHandling.testCantConnectToPullReplica Failures
[ https://issues.apache.org/jira/browse/SOLR-14992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17229765#comment-17229765 ]

Tomas Eduardo Fernandez Lobbe commented on SOLR-14992:
--

Maybe related to SOLR-14940. It's strange that it only fails in 8.x

> TestPullReplicaErrorHandling.testCantConnectToPullReplica Failures
> --
>
> Key: SOLR-14992
> URL: https://issues.apache.org/jira/browse/SOLR-14992
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Tomas Eduardo Fernandez Lobbe
> Priority: Minor
>
> I've noticed this test started failing very frequently with an error like:
> {noformat}
> Error Message:
> Error from server at http://127.0.0.1:39037/solr: Cannot create collection pull_replica_error_handling_test_cant_connect_to_pull_replica. Value of maxShardsPerNode is 1, and the number of nodes currently live or live and part of your createNodeSet is 3. This allows a maximum of 3 to be created. Value of numShards is 2, value of nrtReplicas is 1, value of tlogReplicas is 0 and value of pullReplicas is 1. This requires 4 shards to be created (higher than the allowed number)
> Stack Trace:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:39037/solr: Cannot create collection pull_replica_error_handling_test_cant_connect_to_pull_replica. Value of maxShardsPerNode is 1, and the number of nodes currently live or live and part of your createNodeSet is 3. This allows a maximum of 3 to be created. Value of numShards is 2, value of nrtReplicas is 1, value of tlogReplicas is 0 and value of pullReplicas is 1. This requires 4 shards to be created (higher than the allowed number)
>     at __randomizedtesting.SeedInfo.seed([3D670DC4BEABD958:3550EB0C6505ADD6]:0)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:681)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:266)
>     at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
>     at org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:369)
>     at org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:297)
>     at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1173)
>     at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:934)
>     at org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:866)
>     at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:214)
>     at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:231)
>     at org.apache.solr.cloud.TestPullReplicaErrorHandling.testCantConnectToPullReplica(TestPullReplicaErrorHandling.java:149)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
>     at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
>     at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
>     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>     at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
>     at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>     at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>     at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
>     at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
>     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>     at com.carrotsearch.randomizedtesting.ThreadLeakControl$Statemen
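
The numbers in the error message above can be checked by hand: 2 shards times (1 NRT + 0 TLOG + 1 PULL) replicas gives 4 replicas (the message calls them "shards"), while maxShardsPerNode = 1 on 3 live nodes allows only 3. The short Java sketch below reproduces that arithmetic. `ReplicaCapacitySketch` and its methods are hypothetical helpers written for this illustration, not Solr's actual placement code.

```java
/** Hypothetical helper that mirrors the arithmetic in the error message. */
final class ReplicaCapacitySketch {

  /** Replicas the create request asks for. */
  static int requestedReplicas(int numShards, int nrtReplicas, int tlogReplicas, int pullReplicas) {
    return numShards * (nrtReplicas + tlogReplicas + pullReplicas);
  }

  /** Replicas the cluster can host under the per-node limit. */
  static int allowedReplicas(int maxShardsPerNode, int liveNodes) {
    return maxShardsPerNode * liveNodes;
  }

  /** True when the request fits within the allowed number. */
  static boolean fits(int requested, int allowed) {
    return requested <= allowed;
  }

  public static void main(String[] args) {
    // Values taken from the error message: numShards=2, nrt=1, tlog=0, pull=1.
    int requested = requestedReplicas(2, 1, 0, 1);
    // maxShardsPerNode=1 with 3 live nodes, as reported.
    int allowedWithThreeNodes = allowedReplicas(1, 3);
    // Same limit if a fourth node were live.
    int allowedWithFourNodes = allowedReplicas(1, 4);

    System.out.println("requested = " + requested);
    System.out.println("fits on 3 nodes: " + fits(requested, allowedWithThreeNodes));
    System.out.println("fits on 4 nodes: " + fits(requested, allowedWithFourNodes));
  }
}
```

Under these numbers the request only fits when a fourth node is live. That would be consistent with the `(4) -> (3)` live-node updates logged in the SOLR-14940 comment, although the messages here do not confirm that a missing node is the cause.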
[jira] [Created] (LUCENE-9603) Remove redundant fieldType.stored() check
Shintaro Murakami created LUCENE-9603:
-

Summary: Remove redundant fieldType.stored() check
Key: LUCENE-9603
URL: https://issues.apache.org/jira/browse/LUCENE-9603
Project: Lucene - Core
Issue Type: Improvement
Components: core/index
Reporter: Shintaro Murakami

There is no need to check fieldType.stored() here because it has already been checked:
https://github.com/apache/lucene-solr/blob/5897d14fe4f5dd785a561b18c939ea31186c4971/lucene/core/src/java/org/apache/lucene/index/IndexingChain.java#L587
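
To make the kind of cleanup proposed in LUCENE-9603 concrete, here is a small Java sketch of a redundant check and its simplified form. It assumes the repeated `stored()` test sits inside a block already guarded by the same condition; that shape is an assumption for illustration, and `StoredFlag` and `RedundantCheckSketch` are hypothetical names, not the actual IndexingChain code.

```java
/** Hypothetical stand-in for a field type; only the stored() flag matters here. */
interface StoredFlag {
  boolean stored();
}

final class RedundantCheckSketch {

  /** Before: the inner test repeats the outer one and can never be false. */
  static boolean writesStoredFieldBefore(StoredFlag fieldType) {
    if (fieldType.stored()) {
      if (fieldType.stored()) { // redundant: already known to be true here
        return true;
      }
    }
    return false;
  }

  /** After: the same behavior with the repeated test removed. */
  static boolean writesStoredFieldAfter(StoredFlag fieldType) {
    if (fieldType.stored()) {
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    StoredFlag storedField = () -> true;
    StoredFlag unstoredField = () -> false;
    System.out.println(writesStoredFieldBefore(storedField) == writesStoredFieldAfter(storedField));
    System.out.println(writesStoredFieldBefore(unstoredField) == writesStoredFieldAfter(unstoredField));
  }
}
```

The two methods are equivalent only as long as `stored()` has no side effects and returns the same value on both calls, which is the usual expectation for a simple flag accessor.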