[jira] [Commented] (SOLR-14779) Solr collections gets wiped on restart
[ https://issues.apache.org/jira/browse/SOLR-14779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185680#comment-17185680 ]

Antonio Dinis commented on SOLR-14779:
--------------------------------------

Thank you Erick, I will try the community website.

> Solr collections gets wiped on restart
> --------------------------------------
>
> Key: SOLR-14779
> URL: https://issues.apache.org/jira/browse/SOLR-14779
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 8.3
> Reporter: Antonio Dinis
> Priority: Major
>
> Hello,
> We have a 3 node Solr cluster (ensemble) with apache-zookeeper 3.5.5.
> It works fine until we need to restart one of the nodes. Then all the content
> of the collection gets deleted.
> This is a production environment, and every time there is a restart or a
> crash in one of the services/servers we lose lots of time restoring the
> collection and work.
> This is the way we start the nodes:
> su - ipls004p -c "/applis/24374-iplsp-00/IPLS/solr-8.3.0/bin/solr start
> -cloud -p 8987 -h s01vl9918254 -s
> /applis/24374-iplsp-00/IPLS/solr-8.3.0/cloud/node1/solr -z
> s01vl9918254:2181,s01vl9918256:2181,s01vl9918258:2181 -force"
> This is the zoo.cfg:
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> # do not use /tmp for storage, /tmp here is just
> # example sakes.
> dataDir=/applis/24374-iplsp-00/IPLS/apache-zookeeper-3.5.5-bin/temp
> # the port at which the clients will connect
> clientPort=2181
> # the maximum number of client connections.
> # increase this if you need to handle more clients
> #maxClientCnxns=60
> #
> # Be sure to read the maintenance section of the
> # administrator guide before turning on autopurge.
> #
> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> #
> # The number of snapshots to retain in dataDir
> #autopurge.snapRetainCount=3
> # Purge task interval in hours
> # Set to "0" to disable auto purge feature
> #autopurge.purgeInterval=1
> 4lw.commands.whitelist=mntr,conf,ruok
> server.1=s01vl9918256:3889:3888
> server.2=s01vl9918258:3889:3888
> server.3=s01vl9918254:3889:3888
> #server.4=s01vl9918255:3889:3888
>
>
> Thanks in advance

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] atris commented on a change in pull request #1785: Update Circuit Breaker configured as a standard plugin
atris commented on a change in pull request #1785:
URL: https://github.com/apache/lucene-solr/pull/1785#discussion_r478243445

## File path: solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java ##

@@ -121,19 +135,41 @@ public static String toErrorMessage(List circuitBreakerList) {
    *
    * Any default circuit breakers should be registered here.
    */
-  public static CircuitBreakerManager build(SolrConfig solrConfig) {
-    CircuitBreakerManager circuitBreakerManager = new CircuitBreakerManager(solrConfig.useCircuitBreakers);
-
-    // Install the default circuit breakers
-    CircuitBreaker memoryCircuitBreaker = new MemoryCircuitBreaker(solrConfig);
-    CircuitBreaker cpuCircuitBreaker = new CPUCircuitBreaker(solrConfig);
+  @SuppressWarnings({"rawtypes"})

Review comment:
Unfortunately, that is the way NamedArgs are structured in PluginInfo -- we need to fix that before we remove this warning. I will follow up.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] atris merged pull request #1785: Update Circuit Breaker configured as a standard plugin
atris merged pull request #1785:
URL: https://github.com/apache/lucene-solr/pull/1785
[jira] [Resolved] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
[ https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Atri Sharma resolved SOLR-14588.
-------------------------------
Resolution: Fixed

Fixed configuration in [https://github.com/apache/lucene-solr/commit/6a7da3cd508f5b1e445df08aa2f1fa926d586e99]

> Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
> ------------------------------------------------------------------
>
> Key: SOLR-14588
> URL: https://issues.apache.org/jira/browse/SOLR-14588
> Project: Solr
> Issue Type: Improvement
> Reporter: Atri Sharma
> Assignee: Atri Sharma
> Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Time Spent: 13h 50m
> Remaining Estimate: 0h
>
> This Jira tracks addition of circuit breakers in the search path and
> implements JVM based circuit breaker which rejects incoming search requests
> if the JVM heap usage exceeds a defined percentage.
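The behaviour described in SOLR-14588 (reject incoming search requests once heap usage exceeds a configured percentage) can be illustrated with a minimal sketch. This is not Solr's implementation: `HeapBreaker`, `handle_search`, the injected `used_heap_fn` callback and the 503 status are all assumptions made for illustration.

```python
class HeapBreaker:
    """Hypothetical breaker: trips when used heap exceeds a percentage of max heap."""

    def __init__(self, threshold_pct, max_heap_bytes, used_heap_fn):
        if not 0 < threshold_pct <= 100:
            raise ValueError("threshold_pct must be in (0, 100]")
        # Absolute limit in bytes derived from the configured percentage.
        self.limit_bytes = max_heap_bytes * threshold_pct / 100.0
        # Callback that reports current heap usage; injected so it can be faked.
        self.used_heap_fn = used_heap_fn

    def is_tripped(self):
        return self.used_heap_fn() > self.limit_bytes


def handle_search(breaker, run_query):
    """Check the breaker before doing any query work (status codes are assumed)."""
    if breaker.is_tripped():
        return 503, "circuit breaker tripped: heap usage above threshold"
    return 200, run_query()
```

The check happens before the query runs, so a tripped breaker costs only one usage reading rather than a partially executed search.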
[jira] [Resolved] (SOLR-6930) Provide "Circuit Breakers" For Expensive Solr Queries
[ https://issues.apache.org/jira/browse/SOLR-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Atri Sharma resolved SOLR-6930.
------------------------------
Resolution: Fixed

> Provide "Circuit Breakers" For Expensive Solr Queries
> -----------------------------------------------------
>
> Key: SOLR-6930
> URL: https://issues.apache.org/jira/browse/SOLR-6930
> Project: Solr
> Issue Type: Improvement
> Components: search
> Reporter: Mike Drob
> Priority: Major
> Attachments: SOLR-6930.patch, SOLR-6930.patch, SOLR-6930.patch,
> SOLR-6930.patch, SOLR-6930.patch, SOLR-6930.patch
>
>
> Ref:
> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html
> ES currently allows operators to configure "circuit breakers" to preemptively
> fail queries that are estimated too large rather than allowing an OOM
> Exception to happen. We might be able to do the same thing.
[GitHub] [lucene-solr] janhoy commented on pull request #1769: SOLR-11245: Absorb the docker-solr repo.
janhoy commented on pull request #1769:
URL: https://github.com/apache/lucene-solr/pull/1769#issuecomment-681865016

My first attempt at this gave some bash errors:

```
./gradlew assemble

> Configure project :solr:docker
readlink: illegal option -- f
usage: readlink [-n] [file ...]
/Users/janhoy/git/lucene-solr/solr/docker/tests/cases/create_core_exec/test.sh: line 18: ./../../shared.sh: No such file or directory

FAILURE: Build failed with an exception.
```

I recognize this from the docker-solr repo: the test scripts use some commands that do not work on macOS. So I added the workaround of putting GNU variants of these tools in my path, but then I still could not make the assemble task run:

```
> Configure project :solr:docker
Test /Users/janhoy/git/lucene-solr/solr/docker/tests/cases/create_core_exec apache/solr:9.0.0-SNAPSHOT
Cleaning up left-over containers from previous runs
Running test_apache_solr_9.0.0_SNAPSHOT
Unable to find image 'apache/solr:9.0.0-SNAPSHOT' locally
docker: Error response from daemon: pull access denied for apache/solr, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.

FAILURE: Build failed with an exception.
```

I had to uncomment the `task test()` from `docker/build.gradle` and then my build succeeded.
[GitHub] [lucene-solr] janhoy commented on pull request #1362: SOLR-13768 Remove requiresub parameter in JWTAuthPlugin
janhoy commented on pull request #1362:
URL: https://github.com/apache/lucene-solr/pull/1362#issuecomment-681924972

Perhaps this can be more dynamic, i.e. we require whatever claim is configured for `principalClaim` in the config. This defaults to `sub`, but if someone configures e.g. `principalClaim: userid` then we should no longer require the `sub` claim. So I claim that we don't need an extra config option. Solr needs a principal ID anyway, so we have to require *something*.
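The suggestion above (require whichever claim `principalClaim` names, defaulting to `sub`) can be sketched roughly as follows. This is only an illustration of the proposed rule, not JWTAuthPlugin code: the dict-shaped `config`, the already-decoded `claims` dict and both function names are hypothetical.

```python
DEFAULT_PRINCIPAL_CLAIM = "sub"


def required_claim(config):
    """The one claim that must be present: whatever principalClaim names."""
    return config.get("principalClaim", DEFAULT_PRINCIPAL_CLAIM)


def principal_from(claims, config):
    """Return the principal ID from decoded JWT claims, or fail if it is absent."""
    name = required_claim(config)
    if name not in claims:
        raise ValueError("JWT is missing required claim '%s'" % name)
    return str(claims[name])
```

With `{"principalClaim": "userid"}` a token carrying only `userid` is accepted, while the default configuration still insists on `sub`, so no separate on/off option is needed.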
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185855#comment-17185855 ]

Gus Heck commented on SOLR-14726:
---------------------------------

Can we make it a goal that the user be **completely** unaware of what mode (cloud or not) they are using in the initial contact? That's deployment stuff and nothing they should even think about on first contact. I think they should run "tutorial1.sh" or {{bin/solr -e tutorial1}} and then pull up a page in their web browser to see it worked. Cloud or non-cloud can be used behind the scenes as current or future maintainers see fit.

An adapted version of my comments on slack:

There are various things to learn about Solr... I might order them thus for what I (IMHO) consider optimal pedagogy:
 # First Contact: A cushy easy intro that stands up Solr, throws data in for them, and lets the user query it either in the UI or via curl as suits them (different people have different styles)
 # Basic search concepts: inverted indexes, tokenization, a query syntax, sort vs relevancy scoring.
 # How to get data in (because without data whatever), and the need to be able to re-index
 # How to deploy Solr in a basically competent fashion for light duty use in low security environments
 # Features such as facets, highlighting, analysis options etc. This section should be an a la carte menu into the ref guide, as by this point they are becoming more advanced.
 # Hardening and scaling Solr, and otherwise making it production ready

For the first 3 you really don't want the user to see any of #4 and it really doesn't matter if it's cloud or not so long as the person trying to learn doesn't see whichever it is. I think bin/solr -e accomplishes that with #1, and we basically don't do a good job of teaching #3 (in the ref guide). When you get to #4 I can't imagine which cases you would want to have them start with non-cloud Solr, and have a closing section on non-cloud and the trade-offs of using it. #5 should be a la carte anyway, and we do have a fairly coherent section for #6

> Streamline getting started experience
> -------------------------------------
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
> Issue Type: Task
> Reporter: Ishan Chattopadhyaya
> Assignee: Alexandre Rafalovitch
> Priority: Major
> Labels: newdev
> Attachments: yasa-http.png
>
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and easy to follow. Also, it should reflect our
> best practices, that should also be followed in production. I have following
> suggestions:
> # Make it less verbose. It is too long. On my laptop, it required 35 page
> downs button presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that
> it can be easily executed from Google Colaboratory. Here's an example:
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial
> can be executed from those tools. Except for starting Solr, all other steps
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.),
> configsets from Solr's distribution (instead let the examples refer to them
> from github)
> # Remove the post tool from Solr, curl should suffice.
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185857#comment-17185857 ]

Gus Heck commented on SOLR-14726:
---------------------------------

One caveat to what I just said is that cloud vs non-cloud does somewhat matter for "getting data in" WRT which SolrJ classes one might use.
[jira] [Comment Edited] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185855#comment-17185855 ]

Gus Heck edited comment on SOLR-14726 at 8/27/20, 2:00 PM:
-----------------------------------------------------------

Can we make it a goal that the user be **completely** unaware of what mode (cloud or not) they are using in the initial contact? That's deployment stuff and nothing they should even think about on first contact. I think they should run "tutorial1.sh" or {{bin/solr -e tutorial1}} and then pull up a page in their web browser to see it worked. Cloud or non-cloud can be used behind the scenes as current or future maintainers see fit.

An adapted version of my comments on slack:

There are various things to learn about Solr... I might order them thus for what I (IMHO) consider optimal pedagogy:
 # First Contact: A cushy easy intro that stands up Solr, throws data in for them, and lets the user query it either in the UI or via curl as suits them (different people have different styles)
 # Basic search concepts: inverted indexes, tokenization, a query syntax, sort vs relevancy scoring.
 # How to get data in (because without data whatever), and the need to be able to re-index
 # How to deploy Solr in a basically competent fashion for light duty use in low security environments
 # Features such as facets, highlighting, analysis options etc. This section should be an a la carte menu into the ref guide, as by this point they are becoming more advanced.
 # Hardening and scaling Solr, and otherwise making it production ready

For the first 3 you really don't want the user to see any of #4 and it really doesn't matter if it's cloud or not so long as the person trying to learn doesn't see whichever it is. I think bin/solr -e accomplishes that with #1, and we basically don't do a good job of teaching #3 (in the ref guide). When you get to #4 I can't imagine which cases you would want to have them start with non-cloud Solr, though that section should have a closing section on non-cloud and the trade-offs of using it. #5 should be a la carte anyway, and we do have a fairly coherent section for #6
[jira] [Created] (SOLR-14782) QueryElevationComponent does not handle escaped query terms
Thomas Schmiereck created SOLR-14782:
-------------------------------------

Summary: QueryElevationComponent does not handle escaped query terms
Key: SOLR-14782
URL: https://issues.apache.org/jira/browse/SOLR-14782
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: query parsers
Affects Versions: 8.2
Reporter: Thomas Schmiereck

h1. Description

If the elevate.xml contains an entry with spaces:

<query text="aaa bbb"><doc id="core2docId2" />

and the Solr query term is escaped:

{{?q=aaa\+bbb}}

the Solr search itself handles this correctly, but the elevate component "QueryElevationComponent" does not unescape the query term before the lookup in the elevate.xml. The result is that the entry is not elevated.

An equally valid (not escaped) query like:

{{?q=aaa%20bbb}}

works.

h1. Technical Notes

see: org.apache.solr.handler.component.QueryElevationComponent.MapElevationProvider#getElevationForQuery
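To make the SOLR-14782 report concrete, here is a small sketch of the mismatch and of unescaping before the lookup. It only models the behaviour described in the issue; the `ELEVATIONS` dict stands in for a parsed elevate.xml, and `unescape_query` and `elevation_for` are hypothetical helpers, not Solr code. Note that in a URL `+` decodes to a space, so `?q=aaa\+bbb` reaches the server as `aaa\ bbb`.

```python
from urllib.parse import parse_qs

# Hypothetical stand-in for the parsed elevate.xml: query text -> elevated doc ids.
ELEVATIONS = {"aaa bbb": ["core2docId2"]}


def unescape_query(q):
    """Drop each backslash and keep the character it escapes."""
    out = []
    chars = iter(q)
    for ch in chars:
        if ch == "\\":
            escaped = next(chars, None)  # a trailing lone backslash is dropped
            if escaped is not None:
                out.append(escaped)
        else:
            out.append(ch)
    return "".join(out)


def elevation_for(q, unescape=True):
    """Look up elevated docs, optionally unescaping the query text first."""
    key = unescape_query(q) if unescape else q
    return ELEVATIONS.get(key, [])


escaped = parse_qs("q=aaa\\+bbb")["q"][0]   # the escaped form from the report
plain = parse_qs("q=aaa%20bbb")["q"][0]     # the unescaped form that works
```

Looking up `escaped` with `unescape=False` misses the entry, which mirrors the reported behaviour; with unescaping, both forms resolve to the same elevation entry.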
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185873#comment-17185873 ]

Gus Heck commented on SOLR-14726:
---------------------------------

Oh and there's that elephant just outside the doorway (i.e. not in scope for this ticket)... the lack of user friendly documentation for lucene itself :)
[jira] [Created] (LUCENE-9485) Smoke tester could abort early if Solr port 8983 is not available
Jan Høydahl created LUCENE-9485:
--------------------------------

Summary: Smoke tester could abort early if Solr port 8983 is not available
Key: LUCENE-9485
URL: https://issues.apache.org/jira/browse/LUCENE-9485
Project: Lucene - Core
Issue Type: Improvement
Reporter: Jan Høydahl
Assignee: Jan Høydahl

Just a small improvement: if you have something running on port 8983, the smoke tester will not detect that until all the Lucene tests have run, and you waste time :)
[GitHub] [lucene-solr] janhoy opened a new pull request #1792: LUCENE-9485: Check early if Solr port 8983 is available
janhoy opened a new pull request #1792:
URL: https://github.com/apache/lucene-solr/pull/1792

Simple port check before tests start
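A port check of the kind described could look roughly like the sketch below. It is not taken from the pull request: `port_is_free` is a hypothetical helper, and probing `127.0.0.1` with a one-second timeout is an assumption.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when a listener accepted the connection.
        return probe.connect_ex((host, port)) != 0
```

A smoke test could call `port_is_free(8983)` once at startup and abort with a clear message if it returns False, instead of failing only after the long-running tests have finished.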
[GitHub] [lucene-solr] msokolov commented on pull request #1789: LUCENE-9484: Allow sorting an index after the fact
msokolov commented on pull request #1789:
URL: https://github.com/apache/lucene-solr/pull/1789#issuecomment-681978792

It's nice that this change is so straightforward. It makes me realize I don't know what happens today if we specify an index Sort and then open an existing index that is not sorted. Do we throw an error? Should we instead provide a sorted view on the index that can be used to rewrite it?
[GitHub] [lucene-solr] s1monw commented on pull request #1789: LUCENE-9484: Allow sorting an index after the fact
s1monw commented on pull request #1789:
URL: https://github.com/apache/lucene-solr/pull/1789#issuecomment-681983796

> It's nice that this change is so straightforward. It makes me realize I don't know what happens today if we specify an index Sort and then open an existing index that is not sorted. Do we throw an error? Should we instead provide a sorted view on the index that can be used to rewrite it?

Yes, we fail if you do that. I don't think we should do any magic here. Rewriting can be very costly and take lots of space. I think failing is the right thing to do.
[jira] [Commented] (SOLR-14726) Streamline getting started experience
[ https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185889#comment-17185889 ]

Alexandre Rafalovitch commented on SOLR-14726:
----------------------------------------------

I absolutely agree on First Contact being very smooth and hopefully showing the power of Solr very quickly (which means a good example dataset in the box). To me, however, this means standalone start, as it just requires fewer moving parts and things to explain (e.g. Collection vs Core in Admin UI).

At the same time, it is important to recognize that users who are new to Solr (or any complex product) may make easy mistakes very early on. If we don't give them an equally easy way to troubleshoot, they may never get to step 4. That again points away from cloud for the initial contact because, for example, if we ask them to run a config API command to update the schema definition, they could see whether that managed-schema has been rewritten or not. In the cloud, it is much harder. These checkpoints are quite important, I feel, and I am always annoyed by tutorials that just give you step after step, so you don't know if you mistyped something at step 2 or step 10.

Also, step 6 is mostly outside of this ticket's scope, apart from solving the issue that the default configuration is currently so hard to read that people take it into production, however many warnings we put in the middle of a thousand-line file.
[jira] [Created] (LUCENE-9486) Explore using preset dictionaries with LZ4 for stored fields
Adrien Grand created LUCENE-9486:
---------------------------------

Summary: Explore using preset dictionaries with LZ4 for stored fields
Key: LUCENE-9486
URL: https://issues.apache.org/jira/browse/LUCENE-9486
Project: Lucene - Core
Issue Type: Improvement
Reporter: Adrien Grand

Follow-up of LUCENE-9447: using preset dictionaries with DEFLATE provided very significant gains. Adding support for preset dictionaries with LZ4 would be easy so let's give it a try?
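For readers unfamiliar with preset dictionaries, the idea can be shown with DEFLATE through Python's standard `zlib` module, which accepts a `zdict` argument. This is a generic illustration, not Lucene's stored-fields code, and it does not cover LZ4 (not available in the standard library); the sample `PRESET` and `DOC` bytes and the helper names are made up.

```python
import zlib

# Made-up dictionary holding byte strings that small documents tend to share.
PRESET = b'{"title": "", "author": "", "year": , "genre": "", "language": ""}'
DOC = (b'{"title": "Solaris", "author": "Lem", "year": 1961, '
       b'"genre": "scifi", "language": "pl"}')


def deflate(data, zdict=b""):
    """Compress with DEFLATE, optionally primed with a preset dictionary."""
    comp = zlib.compressobj(level=6, zdict=zdict) if zdict else zlib.compressobj(level=6)
    return comp.compress(data) + comp.flush()


def inflate(blob, zdict=b""):
    """Decompress; the same dictionary must be supplied if one was used."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


plain = deflate(DOC)
primed = deflate(DOC, PRESET)
```

Because the compressor can reference strings in the dictionary from the very first byte, short blocks that share structure with it usually compress noticeably better, which is the effect LUCENE-9447 exploited; the reader must hold the identical dictionary to decompress.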
[GitHub] [lucene-solr] jpountz opened a new pull request #1793: LUCENE-9486: Use preset dictionaries with LZ4 for BEST_SPEED.
jpountz opened a new pull request #1793:
URL: https://github.com/apache/lucene-solr/pull/1793

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (SOLR-14779) Solr collections gets wiped on restart
[ https://issues.apache.org/jira/browse/SOLR-14779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185896#comment-17185896 ]

Antonio Dinis commented on SOLR-14779:
--
Didn't manage to get any help from the community

> Solr collections gets wiped on restart
> ---
>
> Key: SOLR-14779
> URL: https://issues.apache.org/jira/browse/SOLR-14779
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 8.3
> Reporter: Antonio Dinis
> Priority: Major
>
> Hello,
> We have a 3 node Solr cluster (ensemble) with apache-zookeeper 3.5.5.
> It works fine until we need to restart one of the nodes. Then all the content of the collection gets deleted.
> This is a production environment, and every time there is a restart or a crash in one of the services/servers we lose lots of time restoring the collection and work.
> This is the way we start the nodes:
> su - ipls004p -c "/applis/24374-iplsp-00/IPLS/solr-8.3.0/bin/solr start -cloud -p 8987 -h s01vl9918254 -s /applis/24374-iplsp-00/IPLS/solr-8.3.0/cloud/node1/solr -z s01vl9918254:2181,s01vl9918256:2181,s01vl9918258:2181 -force"
> This is the zoo.cfg:
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> # do not use /tmp for storage, /tmp here is just
> # example sakes.
> dataDir=/applis/24374-iplsp-00/IPLS/apache-zookeeper-3.5.5-bin/temp
> # the port at which the clients will connect
> clientPort=2181
> # the maximum number of client connections.
> # increase this if you need to handle more clients
> #maxClientCnxns=60
> #
> # Be sure to read the maintenance section of the
> # administrator guide before turning on autopurge.
> #
> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> #
> # The number of snapshots to retain in dataDir
> #autopurge.snapRetainCount=3
> # Purge task interval in hours
> # Set to "0" to disable auto purge feature
> #autopurge.purgeInterval=1
> 4lw.commands.whitelist=mntr,conf,ruok
> server.1=s01vl9918256:3889:3888
> server.2=s01vl9918258:3889:3888
> server.3=s01vl9918254:3889:3888
> #server.4=s01vl9918255:3889:3888
>
> Thanks in advance
[jira] [Resolved] (LUCENE-4045) Switch Maven test runner from maven-surefire-plugin to com.carrotsearch.randomizedtesting:junit4-maven-plugin
[ https://issues.apache.org/jira/browse/LUCENE-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-4045. - Resolution: Invalid No longer applicable (master). > Switch Maven test runner from maven-surefire-plugin to > com.carrotsearch.randomizedtesting:junit4-maven-plugin > - > > Key: LUCENE-4045 > URL: https://issues.apache.org/jira/browse/LUCENE-4045 > Project: Lucene - Core > Issue Type: Improvement > Components: general/test >Affects Versions: 4.0-ALPHA >Reporter: Steven Rowe >Assignee: Dawid Weiss >Priority: Minor > > {{com.carrotsearch.randomizedtesting:junit4-maven-plugin}} can be used to run > all Lucene/Solr tests under Maven, providing faster execution through load > balancing, along with all the other goodies CS.RT brings. (Not to mention it > would make testing under Maven much more like testing under Ant.) > From [a post Dawid Weiss made on the maven-dev mailing list in > January|http://mail-archives.apache.org/mod_mbox/maven-dev/201201.mbox/%3ccam21rt8kdxowwmy3uq+fqvn979fq3mbm_gveqwoh226mh_z...@mail.gmail.com%3E]: > {quote} > http://labs.carrotsearch.com/randomizedtesting.html > [...] > Load balancing is just part of what the project is about [...] 
> Maven integration can be seen as an integration test (with scarce > documentation yet) here: > https://github.com/carrotsearch/randomizedtesting/blob/master/integration-maven/junit4-maven-plugin-tests/src/it/01-basic-test/pom.xml > I've used it in another project, so a cleaner example of use from within a > POM is here (you should disable surefire or your tests will run twice): > https://github.com/carrotsearch/hppc/blob/master/hppc-core/pom.xml#L217 > {quote} > And from [a post Dawid made to the lucene-dev mailing list in > April|http://mail-archives.apache.org/mod_mbox/lucene-dev/201204.mbox/%3CCAM21Rt-Rc_Z6X04fznvsbK-d8APiUuYCMC-kvg7i=hzthaq...@mail.gmail.com%3E]: > {quote} > I didn't mention it but there is actually an equivalent of task as a > maven plugin... it basically redirects to the ant-plugin but has a maven-like > facade for passing the basic set of properties. I don't think it makes such a > big difference for maven build - we can stick to surefire. Let me know if > you'd like to try that other plugin though -- an example of a maven pom using > it is here: > https://github.com/carrotsearch/randomizedtesting/blob/master/examples/maven/pom.xml > {quote} > The CS.RT maven plugin requires Maven v3.0.2+; I asked Dawid whether Maven > 2.2.1 could be supported, and in private emails to me, he replied: > {quote} > I looked at it but it seems I need to stick to Maven 3 -- there are APIs for > filtering artefacts that seem to be available in 3.x only (copied and pasted > from surefire and maven core). If you want to dig the code is on github, I > won't have the time to look into it in the near future (first short vacation, > then a backlog of crap to deal with). > {quote} > {quote} > I admit I don't have enough Maven powers to actually think of a way to use > either surefire or another plugin (depending on a sysproperty or something). > This could be a fallback for folks who really need maven 2.x. 
> {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5283) Fail the build if ant test didn't execute any tests (everything filtered out).
[ https://issues.apache.org/jira/browse/LUCENE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-5283. - Resolution: Fixed > Fail the build if ant test didn't execute any tests (everything filtered out). > -- > > Key: LUCENE-5283 > URL: https://issues.apache.org/jira/browse/LUCENE-5283 > Project: Lucene - Core > Issue Type: Wish >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Trivial > Fix For: 6.0, 4.6 > > Attachments: LUCENE-5283-permgen.patch, LUCENE-5283.patch, > LUCENE-5283.patch, LUCENE-5283.patch, LUCENE-5283.patch > > > This should be an optional setting that defaults to 'false' (the build > proceeds). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-9126) Javadoc linting options silently swallow documentation errors
[ https://issues.apache.org/jira/browse/LUCENE-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss resolved LUCENE-9126.
-
Resolution: Workaround

> Javadoc linting options silently swallow documentation errors
> -
>
> Key: LUCENE-9126
> URL: https://issues.apache.org/jira/browse/LUCENE-9126
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Priority: Minor
> Fix For: master (9.0)
>
> I tried to compile javadocs in gradle and I couldn't do it... The output was full of errors.
> I eventually narrowed the problem down to lint options – how they are interpreted and parsed just doesn't make any sense to me. Try this:
> {code}
> # Examples below use plain javadoc from Java 11.
> cd lucene/core
> {code}
> This emulates what we have in Ant (this is roughly the options Ant emits):
> {code}
> javadoc -d build\output -encoding "UTF-8" -sourcepath src\java -subpackages org -quiet -Xdoclint:all -Xdoclint:-missing -Xdoclint:-accessibility
> => no errors.
> {code}
> Now rerun it with this syntax:
> {code}
> javadoc -d build\output -encoding "UTF-8" -sourcepath src\java -subpackages org -quiet -Xdoclint:all,-missing,-accessibility
> => 100 errors, 5 warnings
> {code}
> This time javadoc displays errors about undefined tags (unknown tag: lucene.experimental), HTML warnings (warning: empty tag), etc.
> Let's add our custom tags and add overview file:
> {code}
> javadoc -overview "src/java/overview.html" -tag "lucene.experimental:a:xxx" -tag "lucene.internal:a:xxx" -tag "lucene.spi:t:xxx" -d build\output -encoding "UTF-8" -sourcepath src\java -subpackages org -quiet -Xdoclint:all,-missing,-accessibility
> => 100 errors, 5 warnings
> => still HTML warnings
> {code}
> Let's get rid of html linting:
> {code}
> javadoc -overview "src/java/overview.html" -tag "lucene.experimental:a:xxx" -tag "lucene.internal:a:xxx" -tag "lucene.spi:t:xxx" -d build\output -encoding "UTF-8" -sourcepath src\java -subpackages org -quiet -Xdoclint:all,-missing,-accessibility,-html
> => 3 errors
> => malformed HTML syntax in overview.html: src\java\overview.html:150: error: bad use of '>' (>)
> {code}
> Finally, let's get rid of syntax linting:
> {code}
> javadoc -overview "src/java/overview.html" -tag "lucene.experimental:a:xxx" -tag "lucene.internal:a:xxx" -tag "lucene.spi:t:xxx" -d build\output -encoding "UTF-8" -sourcepath src\java -subpackages org -quiet -Xdoclint:all,-missing,-accessibility,-html,-syntax
> => passes
> {code}
> There are definitely bugs in our documentation -- look at the extra ">" in the overview file, for example:
> https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/overview.html#L150
> What I can't understand is why the first syntax suppresses pretty much ALL the errors, including missing custom tag definitions. This should work, given what's written in [1]?
> [1] https://docs.oracle.com/en/java/javase/11/tools/javadoc.html
[jira] [Commented] (LUCENE-9126) Javadoc linting options silently swallow documentation errors
[ https://issues.apache.org/jira/browse/LUCENE-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185903#comment-17185903 ]

Dawid Weiss commented on LUCENE-9126:
-
JDK bug is fixed in JDK 15. I'm closing this as there's not much to do from our side.

> Javadoc linting options silently swallow documentation errors
> -
>
> Key: LUCENE-9126
> URL: https://issues.apache.org/jira/browse/LUCENE-9126
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Priority: Minor
> Fix For: master (9.0)
[jira] [Comment Edited] (LUCENE-9486) Explore using preset dictionaries with LZ4 for stored fields
[ https://issues.apache.org/jira/browse/LUCENE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185904#comment-17185904 ]

Adrien Grand edited comment on LUCENE-9486 at 8/27/20, 3:02 PM:

I played with various configurations and ended up with a preset dictionary of 4kB combined with 10 sub blocks of 60kB, which gives interesting results. Here are some benchmarks on the same datasets as LUCENE-9447:

On highly compressible JSON logs:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|304.2|9|5|
|LZ4(60kB)|141.7|7.5|10|
|LZ4(256kB)|105.1|7.5|33|
|LZ4(1MB)|96.5|7.5|115|
|LZ4 with preset dict (new BEST_SPEED)|91.9|7.5|16|
|Deflate with preset dict (new BEST_COMPRESSION)|64.9|14|41|

On enwiki documents:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|558.8|14.5|83|
|LZ4(60kB)|526.2|15|120|
|LZ4(256kB)|523.1|15|323|
|LZ4(1MB)|521.3|15.5|1151|
|LZ4 with preset dict (new BEST_SPEED)|515.2|15|135|
|Deflate with preset dict (new BEST_COMPRESSION)|338.0|35|250|

It makes fetch times a bit slower, which is fair I think given that these fetch times are still way under the cost of a page fault. Indexing remains as fast as today and compression gets respectively 3.3x and 8% better on these datasets.

I also included the results with BEST_COMPRESSION in the above benchmarks to show the trade-off that users are making when going with one versus the other.

was (Author: jpountz):

I played with various configurations and ended up with a preset dictionary of 4kB combined with 10 sub blocks of 60kB, which gives interesting results. Here are some benchmarks on the same datasets as LUCENE-9447:

On highly compressible JSON logs:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|304.2|9|5|
|LZ4(60kB)|141.7|7.5|10|
|LZ4(256kB)|105.1|7.5|33|
|LZ4(1MB)|96.5|7.5|115|
|LZ4 with preset dict (new BEST_SPEED)|91.9|7.5|16|
|Deflate with preset dict (new BEST_SPEED)|64.9|14|41|

On enwiki documents:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|558.8|14.5|83|
|LZ4(60kB)|526.2|15|120|
|LZ4(256kB)|523.1|15|323|
|LZ4(1MB)|521.3|15.5|1151|
|LZ4 with preset dict (new BEST_SPEED)|515.2|15|135|
|Deflate with preset dict (new BEST_SPEED)|338.0|35|250|

It makes fetch times a bit slower, which is fair I think given that these fetch times are still way under the cost of a page fault. Indexing remains as fast as today and compression gets respectively 3.3x and 8% better on these datasets.

I also included the results with BEST_COMPRESSION in the above benchmarks to show the trade-off that users are making when going with one versus the other.

> Explore using preset dictionaries with LZ4 for stored fields
>
> Key: LUCENE-9486
> URL: https://issues.apache.org/jira/browse/LUCENE-9486
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Follow-up of LUCENE-9447: using preset dictionaries with DEFLATE provided very significant gains. Adding support for preset dictionaries with LZ4 would be easy so let's give it a try?
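As a quick sanity check on the "3.3x and 8%" figures quoted in the comment, the sketch below recomputes them from the two BEST_SPEED rows of each table. The sizes are copied from the benchmark tables; the names `SIZES_MB` and `size_gain` are illustrative only and are not part of Lucene.

```python
# Index sizes in MB, copied from the benchmark tables in the comment.
# Dictionary keys and function names are hypothetical, for illustration only.
SIZES_MB = {
    "json_logs": {"current_best_speed": 304.2, "new_best_speed": 91.9},
    "enwiki": {"current_best_speed": 558.8, "new_best_speed": 515.2},
}


def size_gain(current_mb, new_mb):
    """Return (ratio, percent_reduction) of the new index size vs. the current one."""
    ratio = current_mb / new_mb
    percent_reduction = (current_mb - new_mb) / current_mb * 100.0
    return ratio, percent_reduction


if __name__ == "__main__":
    for dataset, sizes in SIZES_MB.items():
        ratio, reduction = size_gain(
            sizes["current_best_speed"], sizes["new_best_speed"]
        )
        print(f"{dataset}: {ratio:.2f}x smaller, {reduction:.1f}% reduction")
```

Under this reading, the JSON logs index is about 3.3 times smaller, and the enwiki index shrinks by a little under 8%, which matches the rounded figures in the comment.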
[jira] [Commented] (LUCENE-9486) Explore using preset dictionaries with LZ4 for stored fields
[ https://issues.apache.org/jira/browse/LUCENE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185904#comment-17185904 ]

Adrien Grand commented on LUCENE-9486:
--
I played with various configurations and ended up with a preset dictionary of 4kB combined with 10 sub blocks of 60kB, which gives interesting results. Here are some benchmarks on the same datasets as LUCENE-9447:

On highly compressible JSON logs:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|304.2|9|5|
|LZ4(60kB)|141.7|7.5|10|
|LZ4(256kB)|105.1|7.5|33|
|LZ4(1MB)|96.5|7.5|115|
|LZ4 with preset dict (new BEST_SPEED)|91.9|7.5|16|
|Deflate with preset dict (new BEST_SPEED)|64.9|14|41|

On enwiki documents:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|558.8|14.5|83|
|LZ4(60kB)|526.2|15|120|
|LZ4(256kB)|523.1|15|323|
|LZ4(1MB)|521.3|15.5|1151|
|LZ4 with preset dict (new BEST_SPEED)|515.2|15|135|
|Deflate with preset dict (new BEST_SPEED)|338.0|35|250|

It makes fetch times a bit slower, which is fair I think given that these fetch times are still way under the cost of a page fault. Indexing remains as fast as today and compression gets respectively 3.3x and 8% better on these datasets.

I also included the results with BEST_COMPRESSION in the above benchmarks to show the trade-off that users are making when going with one versus the other.

> Explore using preset dictionaries with LZ4 for stored fields
>
> Key: LUCENE-9486
> URL: https://issues.apache.org/jira/browse/LUCENE-9486
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Follow-up of LUCENE-9447: using preset dictionaries with DEFLATE provided very significant gains. Adding support for preset dictionaries with LZ4 would be easy so let's give it a try?
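To make the "preset dictionary of 4kB combined with sub blocks of 60kB" idea concrete, here is a minimal sketch. It is not Lucene's implementation: it uses DEFLATE via Python's `zlib` (which supports preset dictionaries through `zdict`) because the standard library has no LZ4, and it assumes the dictionary is simply the first 4kB of the block. The function names are hypothetical.

```python
import zlib
from typing import List, Tuple

DICT_SIZE = 4 * 1024        # 4kB preset dictionary, size taken from the comment
SUB_BLOCK_SIZE = 60 * 1024  # 60kB sub blocks, size taken from the comment


def compress_block(block: bytes) -> Tuple[bytes, List[bytes]]:
    """Split a block into a dictionary plus independently compressed sub blocks.

    Assumption: the dictionary is the head of the block. Every sub block is
    compressed with its own compressor primed with that same dictionary.
    """
    dictionary = block[:DICT_SIZE]
    rest = block[DICT_SIZE:]
    sub_blocks = []
    for start in range(0, len(rest), SUB_BLOCK_SIZE):
        compressor = zlib.compressobj(zdict=dictionary)
        chunk = rest[start:start + SUB_BLOCK_SIZE]
        sub_blocks.append(compressor.compress(chunk) + compressor.flush())
    return dictionary, sub_blocks


def decompress_sub_block(dictionary: bytes, data: bytes) -> bytes:
    """Decompress one sub block; only the dictionary is needed, not its neighbours."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(data) + decompressor.flush()
```

The design point this illustrates is the trade-off in the benchmarks: because each sub block depends only on the small shared dictionary, fetching one document means decompressing the dictionary-sized prefix plus one sub block rather than the whole block, while the dictionary still lets similar documents share context.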
[jira] [Commented] (SOLR-14779) Solr collections gets wiped on restart
[ https://issues.apache.org/jira/browse/SOLR-14779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185908#comment-17185908 ]

Erick Erickson commented on SOLR-14779:
---
Check your spam folder, there are at least two replies already, the first within an hour and a half of your post.

Please do remember what you're asking for here. People are contributing their (or their company's) time to help with your problems _for free_. I suggested a couple of things to check, and your e-mail did not indicate you'd verified either of them.

You may want to review: https://cwiki.apache.org/confluence/display/SOLR/UsingMailingLists

> Solr collections gets wiped on restart
> ---
>
> Key: SOLR-14779
> URL: https://issues.apache.org/jira/browse/SOLR-14779
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 8.3
> Reporter: Antonio Dinis
> Priority: Major
>
> Hello,
> We have a 3 node Solr cluster (ensemble) with apache-zookeeper 3.5.5.
> It works fine until we need to restart one of the nodes. Then all the content of the collection gets deleted.
> This is a production environment, and every time there is a restart or a crash in one of the services/servers we lose lots of time restoring the collection and work.
> This is the way we start the nodes:
> su - ipls004p -c "/applis/24374-iplsp-00/IPLS/solr-8.3.0/bin/solr start -cloud -p 8987 -h s01vl9918254 -s /applis/24374-iplsp-00/IPLS/solr-8.3.0/cloud/node1/solr -z s01vl9918254:2181,s01vl9918256:2181,s01vl9918258:2181 -force"
> This is the zoo.cfg:
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> # do not use /tmp for storage, /tmp here is just
> # example sakes.
> dataDir=/applis/24374-iplsp-00/IPLS/apache-zookeeper-3.5.5-bin/temp
> # the port at which the clients will connect
> clientPort=2181
> # the maximum number of client connections.
> # increase this if you need to handle more clients
> #maxClientCnxns=60
> #
> # Be sure to read the maintenance section of the
> # administrator guide before turning on autopurge.
> #
> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> #
> # The number of snapshots to retain in dataDir
> #autopurge.snapRetainCount=3
> # Purge task interval in hours
> # Set to "0" to disable auto purge feature
> #autopurge.purgeInterval=1
> 4lw.commands.whitelist=mntr,conf,ruok
> server.1=s01vl9918256:3889:3888
> server.2=s01vl9918258:3889:3888
> server.3=s01vl9918254:3889:3888
> #server.4=s01vl9918255:3889:3888
>
> Thanks in advance
[GitHub] [lucene-solr] markrmiller commented on pull request #1781: SOLR-14777: Fix an ignore test that is helpful.
markrmiller commented on pull request #1781:
URL: https://github.com/apache/lucene-solr/pull/1781#issuecomment-682046480

> @markrmiller Is this the right idea for removing ignores?

This is the right idea, though, as Tim says, some of these tests may already be okay due to other changes. But basically: remove the ignores; if the tests don't pass, address the issues; and if a test is much slower than 10 seconds total (not just what JUnit reports), move it to @Nightly.
[jira] [Commented] (LUCENE-9486) Explore using preset dictionaries with LZ4 for stored fields
[ https://issues.apache.org/jira/browse/LUCENE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186038#comment-17186038 ] Robert Muir commented on LUCENE-9486: - +1 > Explore using preset dictionaries with LZ4 for stored fields > > > Key: LUCENE-9486 > URL: https://issues.apache.org/jira/browse/LUCENE-9486 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Adrien Grand >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Follow-up of LUCENE-9447: using preset dictionaries with DEFLATE provided > very significant gains. Adding support for preset dictionaries with LZ4 would > be easy so let's give it a try? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-14783) Remove DIH from 9.0
Alexandre Rafalovitch created SOLR-14783:

Summary: Remove DIH from 9.0
Key: SOLR-14783
URL: https://issues.apache.org/jira/browse/SOLR-14783
Project: Solr
Issue Type: Task
Security Level: Public (Default Security Level. Issues are Public)
Components: contrib - DataImportHandler
Affects Versions: master (9.0)
Reporter: Alexandre Rafalovitch
Assignee: Alexandre Rafalovitch

Now that Data Import Handler (SOLR-14066) has been deprecated in 8.6, it can be removed in the next major version (9).
[GitHub] [lucene-solr] HoustonPutman commented on pull request #1769: SOLR-11245: Absorb the docker-solr repo.
HoustonPutman commented on pull request #1769:
URL: https://github.com/apache/lucene-solr/pull/1769#issuecomment-682185233

@janhoy The tests should hopefully work for you now, though I do have the GNU utils installed, so I'm not sure they will. The tests are no longer run by default in the assemble, so that should work for you now regardless.

I have modified the tests to simplify some of the logic. I also changed the permissions of folders created during the tests, so that running them no longer requires root permissions. This works for me, but it might not work for others. Not confident on that yet.
[jira] [Created] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest
Erick Erickson created SOLR-14784: - Summary: Reproducible failure for DirectUpdateHandlerTest Key: SOLR-14784 URL: https://issues.apache.org/jira/browse/SOLR-14784 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: Tests Affects Versions: master (9.0) Reporter: Erick Erickson This is rather weird. It apparently was introduced by LUCENE-9456, but that seems odd. Although I do note that that push may do some different error handling, perhaps Solr needs to accommodate that. Of course it doesn't necessarily reproduce with other seeds. [~jpountz] do you have any hints? Reproduce 100% with: ./gradlew :solr:core:test --tests "org.apache.solr.update.DirectUpdateHandlerTest" -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false -Ptests.file.encoding=US-ASCI -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest
[ https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14784: -- Attachment: DirectUpdateHandlerTest-fail.txt > Reproducible failure for DirectUpdateHandlerTest > > > Key: SOLR-14784 > URL: https://issues.apache.org/jira/browse/SOLR-14784 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Tests >Affects Versions: master (9.0) >Reporter: Erick Erickson >Priority: Major > Attachments: DirectUpdateHandlerTest-fail.txt > > > This is rather weird. It apparently was introduced by LUCENE-9456, but that > seems odd. Although I do note that that push may do some different error > handling, perhaps Solr needs to accommodate that. > Of course it doesn't necessarily reproduce with other seeds. > [~jpountz] do you have any hints? > Reproduce 100% with: > ./gradlew :solr:core:test --tests > "org.apache.solr.update.DirectUpdateHandlerTest" > -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false > -Ptests.file.encoding=US-ASCI > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest
[ https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14784: -- Attachment: DirectUpdateHandlerTest-success.xml > Reproducible failure for DirectUpdateHandlerTest > > > Key: SOLR-14784 > URL: https://issues.apache.org/jira/browse/SOLR-14784 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: Tests >Affects Versions: master (9.0) >Reporter: Erick Erickson >Priority: Major > Attachments: DirectUpdateHandlerTest-fail.txt, > DirectUpdateHandlerTest-success.xml > > > This is rather weird. It apparently was introduced by LUCENE-9456, but that > seems odd. Although I do note that that push may do some different error > handling, perhaps Solr needs to accommodate that. > Of course it doesn't necessarily reproduce with other seeds. > [~jpountz] do you have any hints? > Reproduce 100% with: > ./gradlew :solr:core:test --tests > "org.apache.solr.update.DirectUpdateHandlerTest" > -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false > -Ptests.file.encoding=US-ASCI > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest
[ https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186120#comment-17186120 ]

Erick Erickson commented on SOLR-14784:
---
Oh crap. Ignore me so far, although this test is failing fairly regularly.

tests.file.encoding=US-ASCI
vs.
tests.file.encoding=US-ASCII

Not 100% reproducible with the second "I".

> Reproducible failure for DirectUpdateHandlerTest
>
> Key: SOLR-14784
> URL: https://issues.apache.org/jira/browse/SOLR-14784
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Tests
> Affects Versions: master (9.0)
> Reporter: Erick Erickson
> Priority: Major
> Attachments: DirectUpdateHandlerTest-fail.txt, DirectUpdateHandlerTest-success.xml
>
> This is rather weird. It apparently was introduced by LUCENE-9456, but that seems odd. Although I do note that that push may do some different error handling, perhaps Solr needs to accommodate that.
> Of course it doesn't necessarily reproduce with other seeds.
> [~jpountz] do you have any hints?
> Reproduce 100% with:
> ./gradlew :solr:core:test --tests "org.apache.solr.update.DirectUpdateHandlerTest" -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false -Ptests.file.encoding=US-ASCI
[jira] [Issue Comment Deleted] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest
[ https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14784: -- Comment: was deleted (was: Oh crap. Ignore me so far, although this test is failing fairly regularly. tests.file.encoding=US-ASCI vs. tests.file.encoding=US-ASCII: not 100% reproducible with the second "I".)
[jira] [Updated] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest
[ https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson updated SOLR-14784: -- Description: This is rather weird. It apparently was introduced by LUCENE-9456, but that seems odd. Although I do note that that push may do some different error handling, perhaps Solr needs to accommodate that. Of course it doesn't necessarily reproduce with other seeds. [~jpountz] do you have any hints? Reproduce 100% with: ./gradlew :solr:core:test --tests "org.apache.solr.update.DirectUpdateHandlerTest" -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false -Ptests.file.encoding=US-ASCII was: This is rather weird. It apparently was introduced by LUCENE-9456, but that seems odd. Although I do note that that push may do some different error handling, perhaps Solr needs to accommodate that. Of course it doesn't necessarily reproduce with other seeds. [~jpountz] do you have any hints? Reproduce 100% with: ./gradlew :solr:core:test --tests "org.apache.solr.update.DirectUpdateHandlerTest" -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false -Ptests.file.encoding=US-ASCI
[jira] [Commented] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest
[ https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186123#comment-17186123 ] Erick Erickson commented on SOLR-14784: --- Here's an interesting bit from the failure case:
[junit4] 2> 28524 INFO (SUITE-DirectUpdateHandlerTest-seed#[81E1DC8C3CA7BF3B]-worker) [ ] o.a.s.m.r.SolrJmxReporter Closing reporter [org.apache.solr.metrics.reporters.SolrJmxReporter@4e97ad89: rootName = null, domain = solr.jetty, service url = null, agent id = null] for registry solr.jetty/com.codahale.metrics.MetricRegistry@7d190268
[junit4] 2> 28528 ERROR (SUITE-DirectUpdateHandlerTest-seed#[81E1DC8C3CA7BF3B]-worker) [ ] o.a.s.c.u.ObjectReleaseTracker
[junit4] 2> => java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 1 open files: {_0.cfs=1}
[junit4] 2>     at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:812)
[junit4] 2> java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 1 open files: {_0.cfs=1}
[junit4] 2>     at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:812) ~[java/:?]
[junit4] 2>     at org.apache.solr.common.util.ObjectReleaseTracker.tryClose(ObjectReleaseTracker.java:85) [java/:?]
[junit4] 2>     at org.apache.solr.SolrTestCaseJ4.teardownTestCases(SolrTestCaseJ4.java:329) [java/:?]
[junit4] 2>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
[junit4] 2>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
[junit4] 2>     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
[junit4] 2>     at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
[junit4] 2>     at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754) [randomizedtesting-runner-2.7.6.jar:?]
[junit4] 2>     at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:905) [randomizedtesting-runner-2.7.6.jar:?]
[junit4] 2>     at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) [randomizedtesting-runner-2.7.6.jar:?]
[junit4] 2>     at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) [randomizedtesting-runner-2.7.6.jar:?]
[junit4] 2>     at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) [java/:?]
[jira] [Commented] (SOLR-14783) Remove DIH from 9.0
[ https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186148#comment-17186148 ] Ishan Chattopadhyaya commented on SOLR-14783: - +1 > Remove DIH from 9.0 > --- > > Key: SOLR-14783 > URL: https://issues.apache.org/jira/browse/SOLR-14783 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - DataImportHandler >Affects Versions: master (9.0) >Reporter: Alexandre Rafalovitch >Assignee: Alexandre Rafalovitch >Priority: Major > > Now that Data Import Handler (SOLR-14066) has been deprecated in 8.6, it can be removed in the next major version (9).
[jira] [Commented] (SOLR-14783) Remove DIH from 9.0
[ https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186157#comment-17186157 ] Alexandre Rafalovitch commented on SOLR-14783: -- DIH depends on two database engines:
* org.hsqldb:hsqldb
* org.apache.derby:derby
HSQLDB also seems to be used by SolrJ. Derby does not seem to be used by anything else, so its references and checksums can be removed too.
[jira] [Commented] (SOLR-14783) Remove DIH from 9.0
[ https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186159#comment-17186159 ] Alexandre Rafalovitch commented on SOLR-14783: -- solr/server/etc/security.policy has DIH permission. Need to test if ripping that out will not break any assumptions.
[jira] [Commented] (SOLR-14783) Remove DIH from 9.0
[ https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186170#comment-17186170 ] Ishan Chattopadhyaya commented on SOLR-14783: - Quoting: "solr/server/etc/security.policy has DIH permission. Need to test if ripping that out will not break any assumptions." Let's leave it in. It is the deregisterDriver permission, which is totally fine to keep; it would be an ugly deal breaker if DIH users had to add it by hand.
[jira] [Commented] (SOLR-14616) Remove CDCR from 9.0
[ https://issues.apache.org/jira/browse/SOLR-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186185#comment-17186185 ] Ishan Chattopadhyaya commented on SOLR-14616: - Based on today's committer meeting and Erick's post on CDCR in the Roadmap thread, I think it is very important to remove CDCR in 9.0. There's no reason to support it in 9.0 only to remove it one version later. Better to take such drastic measures in a major release. I'm planning to revive the effort to remove it in 9.0, and I'm planning to merge the PR soon. > Remove CDCR from 9.0 > > > Key: SOLR-14616 > URL: https://issues.apache.org/jira/browse/SOLR-14616 > Project: Solr > Issue Type: Sub-task >Affects Versions: master (9.0) >Reporter: Ishan Chattopadhyaya >Assignee: Anshum Gupta >Priority: Blocker > Time Spent: 50m > Remaining Estimate: 0h > > This was deprecated in SOLR-14022 and should be removed in 9.0.
[jira] [Assigned] (SOLR-14616) Remove CDCR from 9.0
[ https://issues.apache.org/jira/browse/SOLR-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya reassigned SOLR-14616: --- Assignee: Ishan Chattopadhyaya (was: Anshum Gupta)
[jira] [Updated] (SOLR-14616) Remove CDCR from 9.0
[ https://issues.apache.org/jira/browse/SOLR-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya updated SOLR-14616: Fix Version/s: master (9.0)
[GitHub] [lucene-solr] arafalov opened a new pull request #1794: SOLR-14783: Remove DIH from 9.0
arafalov opened a new pull request #1794: URL: https://github.com/apache/lucene-solr/pull/1794

# Description
DIH was deprecated in 8.6 and marked for removal in 9.0.

# Solution
This removes DIH and the vast majority of its references. (It should probably be squashed on merge; it was split into many commits for ease of review.)

# Checklist
Please review the following and check all that apply:
- [X] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [X] I have created a Jira issue and added the issue ID to my pull request title.
- [X] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [X] I have developed this patch against the `master` branch.
- [X] I have run `ant precommit` and the appropriate test suite.
- [X] I have added/removed documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (SOLR-14776) Precompute the fingerprint during PeerSync
[ https://issues.apache.org/jira/browse/SOLR-14776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cao Manh Dat updated SOLR-14776: Description: Computing the fingerprint can be very costly and take time. The current implementation sends fingerprint requests to multiple replicas and only computes its own fingerprint for comparison once the first response arrives. A very simple but effective improvement is to compute its own fingerprint right after sending the requests to the other replicas. (was: Computing the fingerprint can be very costly and take time.) > Precompute the fingerprint during PeerSync > -- > > Key: SOLR-14776 > URL: https://issues.apache.org/jira/browse/SOLR-14776 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major
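The reordering proposed in this description can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Solr's PeerSync code: `compute_local_fingerprint`, `fetch_remote_fingerprint`, the replica names, and the use of a plain hash as the "fingerprint" are all hypothetical stand-ins. The only point is the ordering: the requests go out first, the local fingerprint is computed while they are in flight, and each response is compared as it arrives.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def compute_local_fingerprint(versions):
    """Hypothetical stand-in for the expensive local fingerprint computation."""
    time.sleep(0.05)  # simulate the cost of the computation
    return hash(tuple(sorted(versions)))


def fetch_remote_fingerprint(replica, versions_by_replica):
    """Hypothetical stand-in for a fingerprint request sent to another replica."""
    time.sleep(0.05)  # simulate network latency
    return replica, hash(tuple(sorted(versions_by_replica[replica])))


def replicas_in_sync(local_versions, versions_by_replica):
    """Return {replica: True/False} depending on whether fingerprints match."""
    workers = max(1, len(versions_by_replica))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # 1. Send the fingerprint requests to the other replicas.
        futures = [
            pool.submit(fetch_remote_fingerprint, replica, versions_by_replica)
            for replica in versions_by_replica
        ]
        # 2. Compute our own fingerprint right away, while the requests are
        #    still in flight, instead of waiting for the first response.
        local = compute_local_fingerprint(local_versions)
        # 3. Compare against each response as it arrives.
        result = {}
        for future in as_completed(futures):
            replica, remote = future.result()
            result[replica] = remote == local
        return result
```

With this ordering the local computation overlaps the network round trips, so in the sketch the total wait is roughly the longer of the two rather than their sum.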
[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time
[ https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186238#comment-17186238 ] ASF subversion and git services commented on SOLR-14684: Commit a93cb7102f02b25e50dfac2353e9c4c2a445b177 in lucene-solr's branch refs/heads/branch_8x from Cao Manh Dat [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a93cb71 ] SOLR-14684: CloudExitableDirectoryReaderTest failing about 25% of the time (#1724) > CloudExitableDirectoryReaderTest failing about 25% of the time > -- > > Key: SOLR-14684 > URL: https://issues.apache.org/jira/browse/SOLR-14684 > Project: Solr > Issue Type: Test > Components: Tests >Affects Versions: master (9.0) >Reporter: Erick Erickson >Priority: Major > Attachments: stdout > > Time Spent: 40m > Remaining Estimate: 0h > > If I beast this on my local machine, it fails (non reproducibly of course) > about 1/4 of the time. Log attached. The test itself hasn't changed in 11 > months or so. > It looks like occasionally the calls throw an error rather than return > partial results with a message: "Time allowed to handle this request > exceeded:[]". > It's been failing very intermittently for a couple of years, but the failure > rate really picked up in the last couple of weeks. IDK whether the failures > prior to the last couple of weeks are the same root cause. > I'll do some spelunking to see if I can pinpoint the commit that made this > happen, but it'll take a while.
[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time
[ https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186239#comment-17186239 ] ASF subversion and git services commented on SOLR-14684: Commit 5f0c9dfbabd99f470823241437b20ada05ee79d7 in lucene-solr's branch refs/heads/branch_8x from Cao Manh Dat [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5f0c9df ] SOLR-14684: Skipping check time exceeded for the first request in a proper way
[jira] [Commented] (SOLR-13973) Deprecate Tika
[ https://issues.apache.org/jira/browse/SOLR-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186244#comment-17186244 ] David Smiley commented on SOLR-13973: - Andrew, your illustration confuses/surprises me a bit. Wouldn't the sync connector simply talk to Tika Server (which returns the text) and then pass the text along to Solr? This adds just one more service to your original setup. You could even embed Tika into the connector if your objective is to keep the service count low. > Deprecate Tika > -- > > Key: SOLR-13973 > URL: https://issues.apache.org/jira/browse/SOLR-13973 > Project: Solr > Issue Type: Improvement >Reporter: Ishan Chattopadhyaya >Priority: Blocker > Fix For: 8.7 > > Time Spent: 10m > Remaining Estimate: 0h > > Solr's primary responsibility should be to focus on search and scalability. > Having to deal with the problems (CVEs) of Velocity, Tika etc. can slow us > down. I propose that we deprecate it going forward. > Tika can be run outside Solr. Going forward, if someone wants to use these, > it should be possible to bring them into third party packages and installed > via package manager. > The plan is just to throw warnings in logs and add deprecation notes in the > reference guide for now. Removal can be done in 9.0.
[jira] [Commented] (SOLR-14783) Remove DIH from 9.0
[ https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186251#comment-17186251 ] Alexandre Rafalovitch commented on SOLR-14783: -- The changes are in the PR, ready for review. > Remove DIH from 9.0 > --- > > Key: SOLR-14783 > URL: https://issues.apache.org/jira/browse/SOLR-14783 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - DataImportHandler >Affects Versions: master (9.0) >Reporter: Alexandre Rafalovitch >Assignee: Alexandre Rafalovitch >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Now that Data Import Handler (SOLR-14066) has been deprecated in 8.6, it can be removed in the next major version (9).
[jira] [Commented] (SOLR-14617) Remove DIH from 9.0
[ https://issues.apache.org/jira/browse/SOLR-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186256#comment-17186256 ] Alexandre Rafalovitch commented on SOLR-14617: -- Accidental duplication. This issue could even be deleted and the other linked as needed. > Remove DIH from 9.0 > --- > > Key: SOLR-14617 > URL: https://issues.apache.org/jira/browse/SOLR-14617 > Project: Solr > Issue Type: Sub-task >Affects Versions: master (9.0) >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Blocker > > Remove DIH from 9.0. This was deprecated in SOLR-14066.
[jira] [Created] (SOLR-14785) Update synonyms by API and reload collection in Solr
Gitterh created SOLR-14785:
--
Summary: Update synonyms by API and reload collection in Solr
Key: SOLR-14785
URL: https://issues.apache.org/jira/browse/SOLR-14785
Project: Solr
Issue Type: Task
Security Level: Public (Default Security Level. Issues are Public)
Components: search
Affects Versions: 8.6.1
Reporter: Gitterh

I am using Solr 8.6.1, started in SolrCloud mode. The field type is
```
{
  "add-field-type" : {
    "name":"articleTitle",
    "positionIncrementGap":100,
    "multiValued":false,
    "class":"solr.TextField",
    "indexAnalyzer":{
      "tokenizer":{ "class":"solr.StandardTokenizerFactory" },
      "filters":[
        { "class":"solr.LowerCaseFilterFactory" },
        { "class":"solr.ManagedStopFilterFactory", "managed":"english" },
        { "class":"solr.ManagedSynonymGraphFilterFactory", "managed":"english" },
        { "class":"solr.FlattenGraphFilterFactory" },
        { "class":"solr.PorterStemFilterFactory" }
      ]
    },
    "queryAnalyzer":{
      "tokenizer":{ "class":"solr.StandardTokenizerFactory" },
      "filters":[
        { "class":"solr.LowerCaseFilterFactory" },
        { "class":"solr.ManagedStopFilterFactory", "managed":"english" },
        { "class":"solr.ManagedSynonymGraphFilterFactory", "managed":"english" },
        { "class":"solr.PorterStemFilterFactory" }
      ]
    }
  }
}
```
After I add a document
```
{ "id": 100, "articleTitle": "Best smartphone" }
```
I update the synonyms list by API
```
curl -X PUT -H 'Content-type:application/json' --data-binary '["iphone", "smartphone"]' "http://localhost:8983/solr/articles/schema/analysis/synonyms/english"
```
and reload the collection by API
```
http://localhost:8983/solr/admin/collections?action=RELOAD&name=articles
```
However, when I search, the document does not show up.
```
http://localhost:8983/solr/articles/select?q=articleTitle:iphone
```
No results are returned. I expected the added document to be returned. It works only if I first update the synonyms list and then add the document to the collection.

How can I configure Solr to find documents by synonyms when the synonyms are changed after the documents were created?
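For anyone trying to reproduce this report, the three HTTP calls described above can be scripted. The following is only a sketch: the helper names are hypothetical, the base URL, collection name, and managed resource name are copied from the report, and the functions merely build the requests without sending them. As general background rather than a diagnosis of this report, filters listed under `indexAnalyzer` run when a document is indexed, so a document indexed before a synonym change keeps the tokens produced at that time.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

SOLR = "http://localhost:8983/solr"  # base URL taken from the report


def synonyms_update_request(collection, managed_name, synonyms):
    """PUT a synonym list to the managed synonyms resource (hypothetical helper)."""
    url = f"{SOLR}/{collection}/schema/analysis/synonyms/{managed_name}"
    return Request(
        url,
        data=json.dumps(synonyms).encode("utf-8"),
        headers={"Content-type": "application/json"},
        method="PUT",
    )


def reload_request(collection):
    """Collections API RELOAD call, as used in the report (hypothetical helper)."""
    query = urlencode({"action": "RELOAD", "name": collection})
    return Request(f"{SOLR}/admin/collections?{query}")


def select_request(collection, field, term):
    """Simple field:term query against /select (hypothetical helper)."""
    query = urlencode({"q": f"{field}:{term}"})
    return Request(f"{SOLR}/{collection}/select?{query}")
```

Each returned object can be passed to `urllib.request.urlopen` against a running node. Running the three calls once before and once after re-posting the document would show whether the order of indexing and synonym update is what changes the result.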