date:20200827

[jira] [Commented] (SOLR-14779) Solr collections gets wiped on restart

2020-08-27 Thread Antonio Dinis (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185680#comment-17185680
 ] 

Antonio Dinis commented on SOLR-14779:
--

Thank you Erick,

I will try the community website.

> Solr collections gets wiped on restart 
> ---
>
> Key: SOLR-14779
> URL: https://issues.apache.org/jira/browse/SOLR-14779
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Affects Versions: 8.3
> Reporter: Antonio Dinis
> Priority: Major
>
> Hello,
> We have a 3 node Solr cluster (ensemble) with apache-zookeeper 3.5.5.
> It works fine until we need to restart one of the nodes. Then all the content 
> of the collection gets deleted.
> This is a production environment, and every time there is a restart or a 
> crash in one of the services/servers we lose lots of time restoring the 
> collection and work. 
> This is the way we start the nodes:
> su - ipls004p -c "/applis/24374-iplsp-00/IPLS/solr-8.3.0/bin/solr start 
> -cloud -p 8987 -h s01vl9918254 -s 
> /applis/24374-iplsp-00/IPLS/solr-8.3.0/cloud/node1/solr -z 
> s01vl9918254:2181,s01vl9918256:2181,s01vl9918258:2181 -force"
> This is the zoo.cfg:
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> # do not use /tmp for storage, /tmp here is just
> # example sakes.
> dataDir=/applis/24374-iplsp-00/IPLS/apache-zookeeper-3.5.5-bin/temp
> # the port at which the clients will connect
> clientPort=2181
> # the maximum number of client connections.
> # increase this if you need to handle more clients
> #maxClientCnxns=60
> #
> # Be sure to read the maintenance section of the
> # administrator guide before turning on autopurge.
> #
> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> #
> # The number of snapshots to retain in dataDir
> #autopurge.snapRetainCount=3
> # Purge task interval in hours
> # Set to "0" to disable auto purge feature
> #autopurge.purgeInterval=1
> 4lw.commands.whitelist=mntr,conf,ruok
> server.1=s01vl9918256:3889:3888
> server.2=s01vl9918258:3889:3888
> server.3=s01vl9918254:3889:3888
> #server.4=s01vl9918255:3889:3888
>  
>  
> Thanks in advance
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[GitHub] [lucene-solr] atris commented on a change in pull request #1785: Update Circuit Breaker configured as a standard plugin

2020-08-27 Thread GitBox



atris commented on a change in pull request #1785:
URL: https://github.com/apache/lucene-solr/pull/1785#discussion_r478243445



##
File path: 
solr/core/src/java/org/apache/solr/util/circuitbreaker/CircuitBreakerManager.java
##
@@ -121,19 +135,41 @@ public static String toErrorMessage(List 
circuitBreakerList) {
*
* Any default circuit breakers should be registered here.
*/
-  public static CircuitBreakerManager build(SolrConfig solrConfig) {
-CircuitBreakerManager circuitBreakerManager = new 
CircuitBreakerManager(solrConfig.useCircuitBreakers);
-
-// Install the default circuit breakers
-CircuitBreaker memoryCircuitBreaker = new MemoryCircuitBreaker(solrConfig);
-CircuitBreaker cpuCircuitBreaker = new CPUCircuitBreaker(solrConfig);
+  @SuppressWarnings({"rawtypes"})

Review comment:
   Unfortunately, that is the way NamedArgs are structured in PluginInfo -- 
we need to fix that before we remove this warning. I will follow up.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [lucene-solr] atris merged pull request #1785: Update Circuit Breaker configured as a standard plugin

2020-08-27 Thread GitBox



atris merged pull request #1785:
URL: https://github.com/apache/lucene-solr/pull/1785


   




[jira] [Resolved] (SOLR-14588) Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker

2020-08-27 Thread Atri Sharma (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Atri Sharma resolved SOLR-14588.

Resolution: Fixed

Fixed configuration in 
[https://github.com/apache/lucene-solr/commit/6a7da3cd508f5b1e445df08aa2f1fa926d586e99]

> Circuit Breakers Infrastructure and Real JVM Based Circuit Breaker
> --
>
> Key: SOLR-14588
> URL: https://issues.apache.org/jira/browse/SOLR-14588
> Project: Solr
> Issue Type: Improvement
> Reporter: Atri Sharma
> Assignee: Atri Sharma
> Priority: Blocker
> Fix For: master (9.0), 8.7
>
> Time Spent: 13h 50m
> Remaining Estimate: 0h
>
> This Jira tracks addition of circuit breakers in the search path and 
> implements JVM based circuit breaker which rejects incoming search requests 
> if the JVM heap usage exceeds a defined percentage.
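
The rule described above can be sketched as follows. This is a minimal Python illustration, not Solr's Java implementation; the class name `HeapUsageBreaker`, the function `handle_search`, and their parameters are hypothetical and chosen only for this example.

```python
class HeapUsageBreaker:
    """Trips when heap usage exceeds a configured percentage of the maximum heap."""

    def __init__(self, threshold_percent):
        if not 0 < threshold_percent <= 100:
            raise ValueError("threshold_percent must be in (0, 100]")
        self.threshold_percent = threshold_percent

    def is_tripped(self, used_bytes, max_bytes):
        # Tripped once usage is strictly above the allowed share of the heap.
        return used_bytes * 100 > max_bytes * self.threshold_percent


def handle_search(breaker, used_bytes, max_bytes, run_query):
    """Reject the request up front if the breaker is tripped, else run it."""
    if breaker.is_tripped(used_bytes, max_bytes):
        raise RuntimeError("Circuit breaker tripped: heap usage above threshold")
    return run_query()
```

The point of checking before running the query is that a rejected request costs almost nothing, whereas a query that runs into an out-of-memory condition can take the whole node down.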




[jira] [Resolved] (SOLR-6930) Provide "Circuit Breakers" For Expensive Solr Queries

2020-08-27 Thread Atri Sharma (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Atri Sharma resolved SOLR-6930.
---
Resolution: Fixed

> Provide "Circuit Breakers" For Expensive Solr Queries
> -
>
> Key: SOLR-6930
> URL: https://issues.apache.org/jira/browse/SOLR-6930
> Project: Solr
> Issue Type: Improvement
> Components: search
> Reporter: Mike Drob
> Priority: Major
> Attachments: SOLR-6930.patch, SOLR-6930.patch, SOLR-6930.patch, 
> SOLR-6930.patch, SOLR-6930.patch, SOLR-6930.patch
>
>
> Ref: 
> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_limiting_memory_usage.html
> ES currently allows operators to configure "circuit breakers" to preemptively 
> fail queries that are estimated too large rather than allowing an OOM 
> Exception to happen. We might be able to do the same thing.




[GitHub] [lucene-solr] janhoy commented on pull request #1769: SOLR-11245: Absorb the docker-solr repo.

2020-08-27 Thread GitBox



janhoy commented on pull request #1769:
URL: https://github.com/apache/lucene-solr/pull/1769#issuecomment-681865016


   My first attempt at this gave some bash errors:
   
   ```
   ./gradlew assemble
   > Configure project :solr:docker
   readlink: illegal option -- f
   usage: readlink [-n] [file ...]
   
/Users/janhoy/git/lucene-solr/solr/docker/tests/cases/create_core_exec/test.sh: 
line 18: ./../../shared.sh: No such file or directory
   
   FAILURE: Build failed with an exception.
   ```
   
   I recognize this from the docker-solr repo: the test scripts use some 
commands that do not work on macOS. So I added the workaround of putting GNU 
variants of these tools in my path, but I still could not make the assemble 
task run:
   
   ```
   > Configure project :solr:docker
   Test /Users/janhoy/git/lucene-solr/solr/docker/tests/cases/create_core_exec 
apache/solr:9.0.0-SNAPSHOT
   Cleaning up left-over containers from previous runs
   Running test_apache_solr_9.0.0_SNAPSHOT
   Unable to find image 'apache/solr:9.0.0-SNAPSHOT' locally
   docker: Error response from daemon: pull access denied for apache/solr, 
repository does not exist or may require 'docker login': denied: requested 
access to the resource is denied.
   See 'docker run --help'.
   
   FAILURE: Build failed with an exception.
   ```
   
   I had to uncomment the `task test()` from `docker/build.gradle` and then my 
build succeeded.
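
On the `readlink: illegal option -- f` error above: the BSD `readlink` shipped with macOS does not accept `-f`. As a rough illustration of a portable way to get the same canonical path (an assumption for this note, not what the test scripts do), Python's `os.path.realpath` resolves symlinks and relative components on both platforms. The helper name `canonical_path` is hypothetical.

```python
import os


def canonical_path(path):
    """Resolve symlinks and relative components, similar to GNU `readlink -f`."""
    return os.path.realpath(path)
```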




[GitHub] [lucene-solr] janhoy commented on pull request #1362: SOLR-13768 Remove requiresub parameter in JWTAuthPlugin

2020-08-27 Thread GitBox



janhoy commented on pull request #1362:
URL: https://github.com/apache/lucene-solr/pull/1362#issuecomment-681924972


   Perhaps this can be more dynamic, i.e. we require whatever claim is 
configured for `principalClaim` in the config. This defaults to `sub`, but if 
someone configures e.g. `principalClaim: userid` then we should no longer 
require the `sub` claim.
   
   So I claim that we don't need an extra config option. Solr needs a principal 
ID anyway, so we have to require *something*.
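
A minimal sketch of that proposal, in Python rather than the plugin's Java, and assuming the config and the decoded token claims are plain dicts. The function names `required_claims` and `extract_principal` are hypothetical, not part of JWTAuthPlugin.

```python
def required_claims(config):
    """Return the set of claims a token must carry, derived from the config."""
    # "principalClaim" defaults to "sub", as described in the comment above.
    return {config.get("principalClaim", "sub")}


def extract_principal(claims, config):
    """Return the principal ID, or raise if the configured claim is missing."""
    missing = required_claims(config) - set(claims)
    if missing:
        raise ValueError(
            "JWT is missing required claim(s): " + ", ".join(sorted(missing))
        )
    return claims[config.get("principalClaim", "sub")]
```

With this shape there is no separate "require sub" switch: the required claim follows `principalClaim`, so a token without `sub` is accepted only when another claim has been configured as the principal.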




[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-27 Thread Gus Heck (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185855#comment-17185855
 ] 

Gus Heck commented on SOLR-14726:
-

Can we make it a goal that the user be **completely** unaware of what mode 
(cloud or not) they are using in the initial contact? That's deployment stuff 
and nothing they should even think about on first contact. I think they should 
run "tutorial1.sh" or {{bin/solr -e tutorial1}} and then pull up a page in 
their web browser to see it worked. Cloud or non-cloud can be used behind the 
scenes as current or future maintainers see fit. An adapted version of my 
comments on slack:

There are various things to learn about solr... I might order them thus for 
what I (IMHO) consider optimal pedagogy:
 # {color:#0747a6}First Contact: A cushy easy intro that stands up solr, throws 
data in for them, and lets the user query it either in the UI or via curl as 
suits them (different people have different styles){color}
 # {color:#0747a6}Basic search concepts: inverted indexes, tokenization, a 
query syntax, sort vs relevancy scoring.{color}
 # {color:#0747a6}How to get data in (because without data whatever), and the 
need to be able to re-index{color}
 # How to deploy solr in a basically competent fashion for light duty use in 
low security environments
 # Features such as facets, highlighting, analysis options etc, this section 
should be an a la carte menu into the ref guide, as by this point they are 
becoming more advanced.
 # Hardening and Scaling solr, and otherwise making it production ready

For the first 3 you really don't want the user to see any of #4 and it really 
doesn't matter if it's cloud or not so long as the person trying to learn 
doesn't see whichever it is. I think bin/solr -e accomplishes that with #1, and 
we basically don't do a good job of teaching #3 (in the ref guide). When you 
get to #4 I can't imagine which cases you would want to have them start with 
non-cloud solr, and have a closing section on non-cloud and the trade-offs of 
using it. #5 should be a la carte anyway, and we do have a fairly coherent 
section for #6 

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
> Issue Type: Task
> Reporter: Ishan Chattopadhyaya
> Assignee: Alexandre Rafalovitch
> Priority: Major
> Labels: newdev
> Attachments: yasa-http.png
>
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and made easy to follow. Also, it should reflect our 
> best practices, which should also be followed in production. I have the 
> following suggestions:
> # Make it less verbose. It is too long. On my laptop, it required 35 Page 
> Down button presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, all other steps 
> should be possible to be carried out from those tools.
> # Use V2 APIs everywhere in the tutorial
> # Remove all example modes, sample data (films, tech products etc.), 
> configsets from Solr's distribution (instead let the examples refer to them 
> from github)
> # Remove the post tool from Solr, curl should suffice.




[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-27 Thread Gus Heck (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185857#comment-17185857
 ] 

Gus Heck commented on SOLR-14726:
-

One caveat to what I just said is that cloud vs non-cloud does somewhat matter 
for "getting data in" WRT which SolrJ classes one might use.


[jira] [Comment Edited] (SOLR-14726) Streamline getting started experience

2020-08-27 Thread Gus Heck (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185855#comment-17185855
 ] 

Gus Heck edited comment on SOLR-14726 at 8/27/20, 2:00 PM:
---

Can we make it a goal that the user be **completely** unaware of what mode 
(cloud or not) they are using in the initial contact? That's deployment stuff 
and nothing they should even think about on first contact. I think they should 
run "tutorial1.sh" or {{bin/solr -e tutorial1}} and then pull up a page in 
their web browser to see it worked. Cloud or non-cloud can be used behind the 
scenes as current or future maintainers see fit. An adapted version of my 
comments on slack:

There are various things to learn about solr... I might order them thus for 
what I (IMHO) consider optimal pedagogy:
 # {color:#0747a6}First Contact: A cushy easy intro that stands up solr, throws 
data in for them, and lets the user query it either in the UI or via curl as 
suits them (different people have different styles){color}
 # {color:#0747a6}Basic search concepts: inverted indexes, tokenization, a 
query syntax, sort vs relevancy scoring.{color}
 # {color:#0747a6}How to get data in (because without data whatever), and the 
need to be able to re-index{color}
 # How to deploy solr in a basically competent fashion for light duty use in 
low security environments
 # Features such as facets, highlighting, analysis options etc, this section 
should be an a la carte menu into the ref guide, as by this point they are 
becoming more advanced.
 # Hardening and Scaling solr, and otherwise making it production ready

For the first 3 you really don't want the user to see any of #4 and it really 
doesn't matter if it's cloud or not so long as the person trying to learn 
doesn't see whichever it is. I think bin/solr -e accomplishes that with #1, and 
we basically don't do a good job of teaching #3 (in the ref guide). When you 
get to #4 I can't imagine which cases you would want to have them start with 
non-cloud solr, though that section should have a closing section on non-cloud 
and the trade-offs of using it. #5 should be a la carte anyway, and we do have 
a fairly coherent section for #6 



[jira] [Created] (SOLR-14782) QueryElevationComponent does not handle escaped query terms

2020-08-27 Thread Thomas Schmiereck (Jira)

Thomas Schmiereck created SOLR-14782:


 Summary: QueryElevationComponent does not handle escaped query 
terms
 Key: SOLR-14782
 URL: https://issues.apache.org/jira/browse/SOLR-14782
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: query parsers
Affects Versions: 8.2
Reporter: Thomas Schmiereck


h1. Description

If elevate.xml contains an entry with spaces:

<query text="aaa bbb"><doc id="core2docId2" /></query>

and the Solr query term is escaped:

{{?q=aaa\+bbb}}

the Solr search itself handles this correctly, but the elevate component 
"QueryElevationComponent" does not unescape the query term before the lookup in 
elevate.xml.

The result is that the entry is not elevated.

An equally valid (unescaped) query like:

{{?q=aaa%20bbb}}

works.
h1. Technical Notes

see:
org.apache.solr.handler.component.QueryElevationComponent.MapElevationProvider#getElevationForQuery
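
The expected behavior can be sketched as below. This is a Python illustration only, not the component's Java code: the `ELEVATIONS` dict stands in for the parsed elevate.xml, and the function names are hypothetical. It assumes that a backslash escapes the single character that follows it, and that `?q=aaa\+bbb` arrives as `aaa\ bbb` after URL decoding.

```python
import re

# Hypothetical stand-in for the parsed elevate.xml: query text -> elevated doc ids.
ELEVATIONS = {"aaa bbb": ["core2docId2"]}

# A backslash followed by any character escapes that character.
_ESCAPED_CHAR = re.compile(r"\\(.)")


def unescape_query(query):
    """Drop the escaping backslashes, keeping the characters they escape."""
    return _ESCAPED_CHAR.sub(r"\1", query)


def elevation_for_query(query):
    """Look up elevated ids using the unescaped query text."""
    return ELEVATIONS.get(unescape_query(query), [])
```

With the unescape step in place, the escaped and the unescaped form of the query resolve to the same elevate.xml entry.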

 




[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-27 Thread Gus Heck (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185873#comment-17185873
 ] 

Gus Heck commented on SOLR-14726:
-

Oh and there's that elephant just outside the doorway (i.e. not in scope for 
this ticket)... the lack of user friendly documentation for lucene itself :)


[jira] [Created] (LUCENE-9485) Smoke tester could abort early if Solr port 8983 is not available

2020-08-27 Thread Jira

Jan Høydahl created LUCENE-9485:
---

 Summary: Smoke tester could abort early if Solr port 8983 is not 
available
 Key: LUCENE-9485
 URL: https://issues.apache.org/jira/browse/LUCENE-9485
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Jan Høydahl
Assignee: Jan Høydahl


Just a small improvement: if something is already running on port 8983, the 
smoke tester will not detect that until all the Lucene tests have run, and you 
waste time :)
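
One way such an early check could look, as a sketch: try to bind the port and treat a failure as "already in use". The helper name `port_is_free` and the choice of a bind test on 127.0.0.1 are assumptions for illustration, not necessarily what the smoke tester does.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if we can bind host:port, i.e. nothing is listening there."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

A caller would run `port_is_free(8983)` before starting the long-running tests and abort with a clear message if it returns False.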




[GitHub] [lucene-solr] janhoy opened a new pull request #1792: LUCENE-9485: Check early if Solr port 8983 is available

2020-08-27 Thread GitBox



janhoy opened a new pull request #1792:
URL: https://github.com/apache/lucene-solr/pull/1792


   Simple port check before tests start




[GitHub] [lucene-solr] msokolov commented on pull request #1789: LUCENE-9484: Allow sorting an index after the fact

2020-08-27 Thread GitBox



msokolov commented on pull request #1789:
URL: https://github.com/apache/lucene-solr/pull/1789#issuecomment-681978792


   It's nice that this change is so straightforward. It makes me realize I 
don't know what happens today if we specify an index Sort and then open an 
existing index that is not sorted. Do we throw an error? Should we instead 
provide a sorted view on the index that can be used to rewrite it?




[GitHub] [lucene-solr] s1monw commented on pull request #1789: LUCENE-9484: Allow sorting an index after the fact

2020-08-27 Thread GitBox



s1monw commented on pull request #1789:
URL: https://github.com/apache/lucene-solr/pull/1789#issuecomment-681983796


   > It's nice that this change is so straightforward. It makes me realize I 
don't know what happens today if we specify an index Sort and then open an 
existing index that is not sorted. Do we throw an error? Should we instead 
provide a sorted view on the index that can be used to rewrite it?
   
   Yes, we fail if you do that. I don't think we should do any magic here. 
Rewriting can be very costly and take lots of space. I think failing is the 
right thing to do.




[jira] [Commented] (SOLR-14726) Streamline getting started experience

2020-08-27 Thread Alexandre Rafalovitch (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185889#comment-17185889
 ] 

Alexandre Rafalovitch commented on SOLR-14726:
--

I absolutely agree that First Contact should be very smooth and should show the 
power of Solr very quickly (which means a good example dataset in the box). To 
me, however, this means a standalone start, as it requires fewer moving parts 
and fewer things to explain (e.g. Collection vs Core in Admin UI).

At the same time, it is important to recognize that users who are new to Solr 
(or any complex product) may make easy mistakes very early on. If we don't give 
them an equally easy way to troubleshoot, they may never get to step 4. That 
again points away from cloud for the initial contact because, for example, if 
we ask them to run a config API command to update the schema definition, they 
can check whether managed-schema has been rewritten. In the cloud, that is much 
harder. I feel these checkpoints are quite important, and I am always annoyed by 
tutorials that just give you step after step, so you don't know whether you 
mistyped something at step 2 or step 10.

Also, step 6 is mostly outside this ticket's scope, apart from solving the 
issue that the default configuration is currently so hard to read that people 
take it into production, however many warnings we put in the middle of a 
thousand-line file.

> Streamline getting started experience
> -
>
> Key: SOLR-14726
> URL: https://issues.apache.org/jira/browse/SOLR-14726
> Project: Solr
>  Issue Type: Task
>Reporter: Ishan Chattopadhyaya
>Assignee: Alexandre Rafalovitch
>Priority: Major
>  Labels: newdev
> Attachments: yasa-http.png
>
>
> The reference guide Solr tutorial is here:
> https://lucene.apache.org/solr/guide/8_6/solr-tutorial.html
> It needs to be simplified and easy to follow. Also, it should reflect our 
> best practices, that should also be followed in production. I have following 
> suggestions:
> # Make it less verbose. It is too long. On my laptop, it required 35 page 
> downs button presses to get to the bottom of the page!
> # First step of the tutorial should be to enable security (basic auth should 
> suffice).
> # {{./bin/solr start -e cloud}} <-- All references of -e should be removed.
> # All references of {{bin/solr post}} to be replaced with {{curl}}
> # Convert all {{bin/solr create}} references to curl of collection creation 
> commands
> # Add docker based startup instructions.
> # Create a Jupyter Notebook version of the entire tutorial, make it so that 
> it can be easily executed from Google Colaboratory. Here's an example: 
> https://twitter.com/TheSearchStack/status/1289703715981496320
> # Provide downloadable Postman and Insomnia files so that the same tutorial 
> can be executed from those tools. Except for starting Solr, it should be 
> possible to carry out all other steps from those tools.
> # Use V2 APIs everywhere in the tutorial.
> # Remove all example modes, sample data (films, tech products etc.) and 
> configsets from Solr's distribution (instead let the examples refer to them 
> on GitHub).
> # Remove the post tool from Solr, curl should suffice.
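
The suggestion to replace bin/solr create with plain HTTP calls could look roughly like the sketch below. It is not the tutorial's text. It assumes a Solr 8.x node on localhost:8983, the 8.x V2 request shape (POST to /api/collections with a "create" command), and a hypothetical collection named "films". The request is only built, not sent, so the snippet runs without a server.

```python
import json
import urllib.request

# Assumption: a local Solr 8.x node; host, port and collection name are illustrative.
SOLR_BASE = "http://localhost:8983"


def build_create_collection_request(name, num_shards=1, replication_factor=1):
    """Build (but do not send) a V2 API request that creates a collection."""
    body = {
        "create": {
            "name": name,
            "numShards": num_shards,
            "replicationFactor": replication_factor,
        }
    }
    return urllib.request.Request(
        url=f"{SOLR_BASE}/api/collections",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_collection_request("films")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running node: urllib.request.urlopen(request)
```

The same JSON body can be passed to curl with -d, which is what the suggestions above are asking the tutorial to show.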



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Created] (LUCENE-9486) Explore using preset dictionaries with LZ4 for stored fields

2020-08-27 Thread Adrien Grand (Jira)

Adrien Grand created LUCENE-9486:


 Summary: Explore using preset dictionaries with LZ4 for stored 
fields
 Key: LUCENE-9486
 URL: https://issues.apache.org/jira/browse/LUCENE-9486
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand


Follow-up of LUCENE-9447: using preset dictionaries with DEFLATE provided very 
significant gains. Adding support for preset dictionaries with LZ4 would be 
easy so let's give it a try?
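
For readers unfamiliar with preset dictionaries, here is a minimal sketch of the idea using DEFLATE through Python's zlib module. It is not Lucene's implementation and says nothing about LZ4 specifics; the sample log lines and the dictionary contents are made up for illustration. The point is only that a small document compresses better when the compressor is primed with bytes that resemble it.

```python
import zlib

# Hypothetical sample data: small JSON log lines that share most of their structure.
dictionary = (
    b'{"timestamp":"2020-08-27T15:00:00Z","level":"INFO","message":"request served"}'
    b'{"timestamp":"2020-08-27T15:01:00Z","level":"WARN","message":"slow request"}'
)
document = b'{"timestamp":"2020-08-27T15:02:00Z","level":"INFO","message":"request served"}'


def deflate(data, zdict=None):
    """Compress data with DEFLATE, optionally priming it with a preset dictionary."""
    if zdict is None:
        compressor = zlib.compressobj()
    else:
        compressor = zlib.compressobj(zdict=zdict)
    return compressor.compress(data) + compressor.flush()


def inflate(data, zdict=None):
    """Decompress data; the same dictionary must be supplied if one was used."""
    if zdict is None:
        decompressor = zlib.decompressobj()
    else:
        decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(data) + decompressor.flush()


plain = deflate(document)
primed = deflate(document, dictionary)

print(f"original: {len(document)} bytes")
print(f"without dictionary: {len(plain)} bytes")
print(f"with dictionary: {len(primed)} bytes")
print("round trip ok:", inflate(primed, dictionary) == document)
```

The reader has to hold the same dictionary to decompress, which is why a stored-fields format would need to keep the dictionary alongside the compressed data.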




[GitHub] [lucene-solr] jpountz opened a new pull request #1793: LUCENE-9486: Use preset dictionaries with LZ4 for BEST_SPEED.

2020-08-27 Thread GitBox



jpountz opened a new pull request #1793:
URL: https://github.com/apache/lucene-solr/pull/1793


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (SOLR-14779) Solr collections gets wiped on restart

2020-08-27 Thread Antonio Dinis (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185896#comment-17185896
 ] 

Antonio Dinis commented on SOLR-14779:
--

Didn't manage to get any help from the community

> Solr collections gets wiped on restart 
> ---
>
> Key: SOLR-14779
> URL: https://issues.apache.org/jira/browse/SOLR-14779
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: 8.3
>Reporter: Antonio Dinis
>Priority: Major
>
> Hello,
> We have a 3 node Solr cluster (ensemble) with apache-zookeeper 3.5.5.
> It works fine until we need to restart one of the nodes. Then all the content 
> of the collection gets deleted.
> This is a production environment, and every time there is a restart or a 
> crash in one of the services/servers we lose a lot of time restoring the 
> collection and our work. 
> This is the way we start the nodes:
> su - ipls004p -c "/applis/24374-iplsp-00/IPLS/solr-8.3.0/bin/solr start 
> -cloud -p 8987 -h s01vl9918254 -s 
> /applis/24374-iplsp-00/IPLS/solr-8.3.0/cloud/node1/solr -z 
> s01vl9918254:2181,s01vl9918256:2181,s01vl9918258:2181 -force"
> This is the zoo.cfg:
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> # do not use /tmp for storage, /tmp here is just
> # example sakes.
> dataDir=/applis/24374-iplsp-00/IPLS/apache-zookeeper-3.5.5-bin/temp
> # the port at which the clients will connect
> clientPort=2181
> # the maximum number of client connections.
> # increase this if you need to handle more clients
> #maxClientCnxns=60
> #
> # Be sure to read the maintenance section of the
> # administrator guide before turning on autopurge.
> #
> # http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
> #
> # The number of snapshots to retain in dataDir
> #autopurge.snapRetainCount=3
> # Purge task interval in hours
> # Set to "0" to disable auto purge feature
> #autopurge.purgeInterval=1
> 4lw.commands.whitelist=mntr,conf,ruok
> server.1=s01vl9918256:3889:3888
> server.2=s01vl9918258:3889:3888
> server.3=s01vl9918254:3889:3888
> #server.4=s01vl9918255:3889:3888
>  
>  
> Thanks in advance
>  
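
One thing that can be checked mechanically from the report is whether the -z connection string passed to bin/solr and the zoo.cfg describe the same ensemble. The sketch below does only that, using the values quoted above. It is an illustrative sanity check, not a diagnosis of the reported data loss, and the helper names are made up.

```python
# Values copied from the report above; the check itself is an illustrative assumption.
ZOO_CFG = """\
clientPort=2181
server.1=s01vl9918256:3889:3888
server.2=s01vl9918258:3889:3888
server.3=s01vl9918254:3889:3888
#server.4=s01vl9918255:3889:3888
"""
ZK_HOST = "s01vl9918254:2181,s01vl9918256:2181,s01vl9918258:2181"


def parse_zoo_cfg(text):
    """Return the client port and the ensemble host names from zoo.cfg text."""
    client_port = None
    servers = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out entries such as server.4
        key, _, value = line.partition("=")
        if key == "clientPort":
            client_port = int(value)
        elif key.startswith("server."):
            servers.add(value.split(":")[0])
    return client_port, servers


def parse_zk_host(zk_host):
    """Split a -z connection string (without chroot) into (host, port) pairs."""
    pairs = set()
    for entry in zk_host.split(","):
        host, _, port = entry.partition(":")
        pairs.add((host, int(port)))
    return pairs


client_port, servers = parse_zoo_cfg(ZOO_CFG)
mismatches = [
    (host, port)
    for host, port in sorted(parse_zk_host(ZK_HOST))
    if host not in servers or port != client_port
]
print("mismatches:", mismatches)
```

For the configuration quoted in this issue the list comes back empty, so the hosts and client port at least agree.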




[jira] [Resolved] (LUCENE-4045) Switch Maven test runner from maven-surefire-plugin to com.carrotsearch.randomizedtesting:junit4-maven-plugin

2020-08-27 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-4045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-4045.
-
Resolution: Invalid

No longer applicable (master).

> Switch Maven test runner from maven-surefire-plugin to 
> com.carrotsearch.randomizedtesting:junit4-maven-plugin
> -
>
> Key: LUCENE-4045
> URL: https://issues.apache.org/jira/browse/LUCENE-4045
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: general/test
>Affects Versions: 4.0-ALPHA
>Reporter: Steven Rowe
>Assignee: Dawid Weiss
>Priority: Minor
>
> {{com.carrotsearch.randomizedtesting:junit4-maven-plugin}} can be used to run 
> all Lucene/Solr tests under Maven, providing faster execution through load 
> balancing, along with all the other goodies CS.RT brings.  (Not to mention it 
> would make testing under Maven much more like testing under Ant.)
> From [a post Dawid Weiss made on the maven-dev mailing list in 
> January|http://mail-archives.apache.org/mod_mbox/maven-dev/201201.mbox/%3ccam21rt8kdxowwmy3uq+fqvn979fq3mbm_gveqwoh226mh_z...@mail.gmail.com%3E]:
> {quote}
> http://labs.carrotsearch.com/randomizedtesting.html
> [...]
> Load balancing is just part of what the project is about [...]
> Maven integration can be seen as an integration test (with scarce 
> documentation yet) here:
> https://github.com/carrotsearch/randomizedtesting/blob/master/integration-maven/junit4-maven-plugin-tests/src/it/01-basic-test/pom.xml
> I've used it in another project, so a cleaner example of use from within a 
> POM is here (you should disable surefire or your tests will run twice):
> https://github.com/carrotsearch/hppc/blob/master/hppc-core/pom.xml#L217
> {quote}
> And from [a post Dawid made to the lucene-dev mailing list in 
> April|http://mail-archives.apache.org/mod_mbox/lucene-dev/201204.mbox/%3CCAM21Rt-Rc_Z6X04fznvsbK-d8APiUuYCMC-kvg7i=hzthaq...@mail.gmail.com%3E]:
> {quote}
> I didn't mention it but there is actually an equivalent of  task as a 
> maven plugin... it basically redirects to the ant-plugin but has a maven-like 
> facade for passing the basic set of properties. I don't think it makes such a 
> big difference for maven build - we can stick to surefire. Let me know if 
> you'd like to try that other plugin though -- an example of a maven pom using 
> it is here:
> https://github.com/carrotsearch/randomizedtesting/blob/master/examples/maven/pom.xml
> {quote}
> The CS.RT maven plugin requires Maven v3.0.2+; I asked Dawid whether Maven 
> 2.2.1 could be supported, and in private emails to me, he replied:
> {quote}
> I looked at it but it seems I need to stick to Maven 3 -- there are APIs for 
> filtering artefacts that seem to be available in 3.x only (copied and pasted 
> from surefire and maven core). If you want to dig the code is on github, I 
> won't have the time to look into it in the near future (first short vacation, 
> then a backlog of crap to deal with).
> {quote}
> {quote}
> I admit I don't have enough Maven powers to actually think of a way to use 
> either surefire or another plugin (depending on a sysproperty or something). 
> This could be a fallback for folks who really need maven 2.x.
> {quote}




[jira] [Resolved] (LUCENE-5283) Fail the build if ant test didn't execute any tests (everything filtered out).

2020-08-27 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-5283.
-
Resolution: Fixed

> Fail the build if ant test didn't execute any tests (everything filtered out).
> --
>
> Key: LUCENE-5283
> URL: https://issues.apache.org/jira/browse/LUCENE-5283
> Project: Lucene - Core
>  Issue Type: Wish
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Trivial
> Fix For: 6.0, 4.6
>
> Attachments: LUCENE-5283-permgen.patch, LUCENE-5283.patch, 
> LUCENE-5283.patch, LUCENE-5283.patch, LUCENE-5283.patch
>
>
> This should be an optional setting that defaults to 'false' (the build 
> proceeds).




[jira] [Resolved] (LUCENE-9126) Javadoc linting options silently swallow documentation errors

2020-08-27 Thread Dawid Weiss (Jira)



 [ 
https://issues.apache.org/jira/browse/LUCENE-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Weiss resolved LUCENE-9126.
-
Resolution: Workaround

> Javadoc linting options silently swallow documentation errors
> -
>
> Key: LUCENE-9126
> URL: https://issues.apache.org/jira/browse/LUCENE-9126
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Dawid Weiss
>Assignee: Dawid Weiss
>Priority: Minor
> Fix For: master (9.0)
>
>
> I tried to compile javadocs in gradle and I couldn't do it... The output was 
> full of errors.
> I eventually narrowed the problem down to lint options – how they are 
> interpreted and parsed just doesn't make any sense to me. Try this:
> {code}
> # Examples below use plain javadoc from Java 11.
> cd lucene/core
> {code}
> This emulates what we have in Ant (this is roughly the options Ant emits):
> {code}
> javadoc -d build\output -encoding "UTF-8" -sourcepath src\java -subpackages 
> org -quiet -Xdoclint:all -Xdoclint:-missing -Xdoclint:-accessibility
> => no errors.
> {code}
> Now rerun it with this syntax:
> {code}
> javadoc -d build\output -encoding "UTF-8" -sourcepath src\java -subpackages 
> org -quiet -Xdoclint:all,-missing,-accessibility
> => 100 errors, 5 warnings
> {code}
> This time javadoc displays errors about undefined tags (unknown tag: 
> lucene.experimental), HTML warnings (warning: empty  tag), etc.
> Let's add our custom tags and add overview file:
> {code}
> javadoc -overview "src/java/overview.html" -tag "lucene.experimental:a:xxx" 
> -tag "lucene.internal:a:xxx" -tag "lucene.spi:t:xxx" -d build\output 
> -encoding "UTF-8" -sourcepath src\java -subpackages org -quiet 
> -Xdoclint:all,-missing,-accessibility
> => 100 errors, 5 warnings
> => still HTML warnings
> {code}
> Let's get rid of html linting:
> {code}
> javadoc -overview "src/java/overview.html" -tag "lucene.experimental:a:xxx" 
> -tag "lucene.internal:a:xxx" -tag "lucene.spi:t:xxx" -d build\output 
> -encoding "UTF-8" -sourcepath src\java -subpackages org -quiet 
> -Xdoclint:all,-missing,-accessibility,-html
> => 3 errors
> => malformed HTML syntax in overview.html: src\java\overview.html:150: error: 
> bad use of '>' (>)
> {code}
> Finally, let's get rid of syntax linting:
> {code}
> javadoc -overview "src/java/overview.html" -tag "lucene.experimental:a:xxx" 
> -tag "lucene.internal:a:xxx" -tag "lucene.spi:t:xxx" -d build\output 
> -encoding "UTF-8" -sourcepath src\java -subpackages org -quiet 
> -Xdoclint:all,-missing,-accessibility,-html,-syntax
> => passes
> {code}
> There are definitely bugs in our documentation -- look at the extra ">" in 
> the overview file, for example:
> https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/overview.html#L150
> What I can't understand is why the first syntax suppresses pretty much ALL 
> the errors, including missing custom tag definitions. This should work, given 
> what's written in [1]?
> [1] https://docs.oracle.com/en/java/javase/11/tools/javadoc.html
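
Given the behaviour described above, a build script that wants the lint groups to take effect would emit a single comma-separated -Xdoclint option rather than several separate ones. The sketch below only assembles the argument list and does not run javadoc. The function names are hypothetical, the flags are the ones quoted in the report, and the paths use forward slashes instead of the Windows separators shown there.

```python
def doclint_option(disabled_groups):
    """Build a single -Xdoclint option: all groups on, the listed ones off."""
    parts = ["all"] + [f"-{group}" for group in disabled_groups]
    return "-Xdoclint:" + ",".join(parts)


def javadoc_command(disabled_groups):
    """Assemble the javadoc invocation from the report as an argument list."""
    return [
        "javadoc",
        "-d", "build/output",
        "-encoding", "UTF-8",
        "-sourcepath", "src/java",
        "-subpackages", "org",
        "-quiet",
        doclint_option(disabled_groups),
    ]


command = javadoc_command(["missing", "accessibility"])
print(" ".join(command))
```

Keeping the option in one place also makes it harder to reintroduce the multi-option form by accident.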




[jira] [Commented] (LUCENE-9126) Javadoc linting options silently swallow documentation errors

2020-08-27 Thread Dawid Weiss (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185903#comment-17185903
 ] 

Dawid Weiss commented on LUCENE-9126:
-

JDK bug is fixed in JDK 15. I'm closing this as there's not much to do from our 
side.





[jira] [Comment Edited] (LUCENE-9486) Explore using preset dictionaries with LZ4 for stored fields

2020-08-27 Thread Adrien Grand (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185904#comment-17185904
 ] 

Adrien Grand edited comment on LUCENE-9486 at 8/27/20, 3:02 PM:


I played with various configurations and ended up with a preset dictionary of 
4kB combined with 10 sub blocks of 60kB, which gives interesting results. Here 
are some benchmarks on the same datasets as LUCENE-9447:

On highly compressible JSON logs:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|304.2|9|5|
|LZ4(60kB)|141.7|7.5|10|
|LZ4(256kB)|105.1|7.5|33|
|LZ4(1MB)|96.5|7.5|115|
|LZ4 with preset dict (new BEST_SPEED)|91.9|7.5|16|
|Deflate with preset dict (new BEST_COMPRESSION)|64.9|14|41|

On enwiki documents:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|558.8|14.5|83|
|LZ4(60kB)|526.2|15|120|
|LZ4(256kB)|523.1|15|323|
|LZ4(1MB)|521.3|15.5|1151|
|LZ4 with preset dict (new BEST_SPEED)|515.2|15|135|
|Deflate with preset dict (new BEST_COMPRESSION)|338.0|35|250|

It makes fetch times a bit slower, which is fair I think given that these fetch 
times are still way under the cost of a page fault. Indexing remains as fast as 
today and compression gets respectively 3.3x and 8% better on these datasets.
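
The "3.3x and 8%" figures can be reproduced from the index sizes in the two tables, comparing the current BEST_SPEED row with the new BEST_SPEED row. A small worked check, with the numbers copied from the tables and nothing else assumed:

```python
# Index sizes in MB, copied from the two tables above.
json_logs = {"current": 304.2, "new": 91.9}
enwiki = {"current": 558.8, "new": 515.2}


def improvement(sizes):
    """Return how many times smaller the new index is than the current one."""
    return sizes["current"] / sizes["new"]


json_ratio = improvement(json_logs)
enwiki_ratio = improvement(enwiki)
print(f"JSON logs: {json_ratio:.1f}x smaller")
print(f"enwiki: {(enwiki_ratio - 1) * 100:.0f}% better")
```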

I also included the results with BEST_COMPRESSION in the above benchmarks to 
show the trade-off that users are making when going with one versus the other.
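
To make the "preset dictionary of 4kB combined with 10 sub blocks of 60kB" layout more concrete, here is a rough sketch of one way such a block could be organised. It is an assumption-laden illustration, not the code in the pull request: DEFLATE from Python's zlib stands in for LZ4, the dictionary is simply taken as the first 4kB of the block, and the sample data is invented. What it shows is that a reader needs the dictionary plus one sub-block to fetch a document, rather than the whole block.

```python
import zlib

DICT_SIZE = 4 * 1024        # preset dictionary size from the comment
SUB_BLOCK_SIZE = 60 * 1024  # sub-block size from the comment
NUM_SUB_BLOCKS = 10


def compress_block(block):
    """Split a block into a dictionary plus sub-blocks compressed against it."""
    dictionary = block[:DICT_SIZE]
    compressed = []
    for start in range(DICT_SIZE, len(block), SUB_BLOCK_SIZE):
        compressor = zlib.compressobj(zdict=dictionary)
        chunk = block[start:start + SUB_BLOCK_SIZE]
        compressed.append(compressor.compress(chunk) + compressor.flush())
    return dictionary, compressed


def read_sub_block(dictionary, compressed, index):
    """Decompress one sub-block without touching the others."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(compressed[index]) + decompressor.flush()


# Hypothetical data: repetitive JSON-like log lines filling one full block.
line = b'{"level":"INFO","message":"request served","status":200}\n'
block_size = DICT_SIZE + NUM_SUB_BLOCKS * SUB_BLOCK_SIZE
block = (line * (block_size // len(line) + 1))[:block_size]

dictionary, sub_blocks = compress_block(block)
third = read_sub_block(dictionary, sub_blocks, 2)
print(len(sub_blocks), "sub-blocks;", len(third), "bytes read from sub-block 2")
```

Under these assumptions, larger sub-blocks would mean more bytes to decompress per fetch, which is consistent with the fetch times growing with block size in the tables.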


was (Author: jpountz):
I played with various configurations and ended up with a preset dictionary of 
4kB combined with 10 sub blocks of 60kB, which gives interesting results. Here 
are some benchmarks on the same datasets as LUCENE-9447:

On highly compressible JSON logs:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|304.2|9|5|
|LZ4(60kB)|141.7|7.5|10|
|LZ4(256kB)|105.1|7.5|33|
|LZ4(1MB)|96.5|7.5|115|
|LZ4 with preset dict (new BEST_SPEED)|91.9|7.5|16|
|Deflate with preset dict (new BEST_SPEED)|64.9|14|41|

On enwiki documents:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|558.8|14.5|83|
|LZ4(60kB)|526.2|15|120|
|LZ4(256kB)|523.1|15|323|
|LZ4(1MB)|521.3|15.5|1151|
|LZ4 with preset dict (new BEST_SPEED)|515.2|15|135|
|Deflate with preset dict (new BEST_SPEED)|338.0|35|250|

It makes fetch times a bit slower, which is fair I think given that these fetch 
times are still way under the cost of a page fault. Indexing remains as fast as 
today and compression gets respectively 3.3x and 8% better on these datasets.

I also included the results with BEST_COMPRESSION in the above benchmarks to 
show the trade-off that users are making when going with one versus the other.

> Explore using preset dictionaries with LZ4 for stored fields
> 
>
> Key: LUCENE-9486
> URL: https://issues.apache.org/jira/browse/LUCENE-9486
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Follow-up of LUCENE-9447: using preset dictionaries with DEFLATE provided 
> very significant gains. Adding support for preset dictionaries with LZ4 would 
> be easy so let's give it a try?




[jira] [Commented] (LUCENE-9486) Explore using preset dictionaries with LZ4 for stored fields

2020-08-27 Thread Adrien Grand (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185904#comment-17185904
 ] 

Adrien Grand commented on LUCENE-9486:
--

I played with various configurations and ended up with a preset dictionary of 
4kB combined with 10 sub blocks of 60kB, which gives interesting results. Here 
are some benchmarks on the same datasets as LUCENE-9447:

On highly compressible JSON logs:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|304.2|9|5|
|LZ4(60kB)|141.7|7.5|10|
|LZ4(256kB)|105.1|7.5|33|
|LZ4(1MB)|96.5|7.5|115|
|LZ4 with preset dict (new BEST_SPEED)|91.9|7.5|16|
|Deflate with preset dict (new BEST_COMPRESSION)|64.9|14|41|

On enwiki documents:

||Method||Index size (MB)||Index time (s)||Avg fetch time (us)||
|LZ4(16kB) (current BEST_SPEED)|558.8|14.5|83|
|LZ4(60kB)|526.2|15|120|
|LZ4(256kB)|523.1|15|323|
|LZ4(1MB)|521.3|15.5|1151|
|LZ4 with preset dict (new BEST_SPEED)|515.2|15|135|
|Deflate with preset dict (new BEST_COMPRESSION)|338.0|35|250|

It makes fetch times a bit slower, which is fair I think given that these fetch 
times are still way under the cost of a page fault. Indexing remains as fast as 
today and compression gets respectively 3.3x and 8% better on these datasets.

I also included the results with BEST_COMPRESSION in the above benchmarks to 
show the trade-off that users are making when going with one versus the other.





[jira] [Commented] (SOLR-14779) Solr collections gets wiped on restart

2020-08-27 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17185908#comment-17185908
 ] 

Erick Erickson commented on SOLR-14779:
---

Check your spam folder; there are at least two replies already, the first within 
an hour and a half of your post.

Please do remember what you're asking for here. People are contributing their 
(or their company's) time to help with your problems _for free_. I suggested a 
couple of things to check, and your e-mail did not indicate you'd verified 
either of them.

You may want to review:

https://cwiki.apache.org/confluence/display/SOLR/UsingMailingLists





[GitHub] [lucene-solr] markrmiller commented on pull request #1781: SOLR-14777: Fix an ignore test that is helpful.

2020-08-27 Thread GitBox



markrmiller commented on pull request #1781:
URL: https://github.com/apache/lucene-solr/pull/1781#issuecomment-682046480


   > @markrmiller Is this the right idea for removing ignores?
   
   This is the right idea, though, as Tim says, some of these tests may already 
be okay due to other changes.
   
   But basically: remove the ignores; if the tests don't pass, address the 
issues; and if a test takes much longer than 10 seconds in total (not just what 
JUnit reports), move it to @Nightly.
   
   




[jira] [Commented] (LUCENE-9486) Explore using preset dictionaries with LZ4 for stored fields

2020-08-27 Thread Robert Muir (Jira)



[ 
https://issues.apache.org/jira/browse/LUCENE-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186038#comment-17186038
 ] 

Robert Muir commented on LUCENE-9486:
-

+1





[jira] [Created] (SOLR-14783) Remove DIH from 9.0

2020-08-27 Thread Alexandre Rafalovitch (Jira)

Alexandre Rafalovitch created SOLR-14783:


 Summary: Remove DIH from 9.0
 Key: SOLR-14783
 URL: https://issues.apache.org/jira/browse/SOLR-14783
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
  Components: contrib - DataImportHandler
Affects Versions: master (9.0)
Reporter: Alexandre Rafalovitch
Assignee: Alexandre Rafalovitch


Now that Data Import Handler (SOLR-14066) has been deprecated in 8.6, it can 
be removed in the next major version (9).




[GitHub] [lucene-solr] HoustonPutman commented on pull request #1769: SOLR-11245: Absorb the docker-solr repo.

2020-08-27 Thread GitBox



HoustonPutman commented on pull request #1769:
URL: https://github.com/apache/lucene-solr/pull/1769#issuecomment-682185233


   @janhoy The tests should hopefully work for you now, though I have the GNU 
utils installed, so I'm not sure they will.
   
   The tests no longer run by default during assemble, so that should work for 
you now regardless.
   
   I modified the tests to simplify some of the logic. I also changed the 
permissions of the folders created during the tests, so that running them no 
longer requires root permissions. This works for me, but it might not work for 
others. I'm not confident about that yet.




[jira] [Created] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest

2020-08-27 Thread Erick Erickson (Jira)

Erick Erickson created SOLR-14784:
-

 Summary: Reproducible failure for DirectUpdateHandlerTest
 Key: SOLR-14784
 URL: https://issues.apache.org/jira/browse/SOLR-14784
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: Tests
Affects Versions: master (9.0)
Reporter: Erick Erickson


This is rather weird. It apparently was introduced by LUCENE-9456, but that 
seems odd. I do note, though, that that push may change some error handling; 
perhaps Solr needs to accommodate that.

Of course it doesn't necessarily reproduce with other seeds.

[~jpountz] do you have any hints?

Reproduce 100% with:

./gradlew :solr:core:test --tests 
"org.apache.solr.update.DirectUpdateHandlerTest" -Ptests.seed=2BE3A8682E5E346D 
-Ptests.multiplier=2 -Ptests.badapples=false -Ptests.file.encoding=US-ASCI

 




[jira] [Updated] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest

2020-08-27 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14784:
--
Attachment: DirectUpdateHandlerTest-fail.txt

> Reproducible failure for DirectUpdateHandlerTest
> 
>
> Key: SOLR-14784
> URL: https://issues.apache.org/jira/browse/SOLR-14784
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: DirectUpdateHandlerTest-fail.txt
>
>
> This is rather weird. It apparently was introduced by LUCENE-9456, but that 
> seems odd. Although I do note that that push may do some different error 
> handling, perhaps Solr needs to accommodate that.
> Of course it doesn't necessarily reproduce with other seeds.
> [~jpountz] do you have any hints?
> Reproduce 100% with:
> ./gradlew :solr:core:test --tests 
> "org.apache.solr.update.DirectUpdateHandlerTest" 
> -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false 
> -Ptests.file.encoding=US-ASCI
>  




[jira] [Updated] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest

2020-08-27 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14784:
--
Attachment: DirectUpdateHandlerTest-success.xml

> Reproducible failure for DirectUpdateHandlerTest
> 
>
> Key: SOLR-14784
> URL: https://issues.apache.org/jira/browse/SOLR-14784
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: DirectUpdateHandlerTest-fail.txt, 
> DirectUpdateHandlerTest-success.xml
>
>
> This is rather weird. It apparently was introduced by LUCENE-9456, but that 
> seems odd. Although I do note that that push may do some different error 
> handling, perhaps Solr needs to accommodate that.
> Of course it doesn't necessarily reproduce with other seeds.
> [~jpountz] do you have any hints?
> Reproduce 100% with:
> ./gradlew :solr:core:test --tests 
> "org.apache.solr.update.DirectUpdateHandlerTest" 
> -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false 
> -Ptests.file.encoding=US-ASCI
>  




[jira] [Commented] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest

2020-08-27 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186120#comment-17186120
 ] 

Erick Erickson commented on SOLR-14784:
---

Oh crap. Ignore me so far, although this test is failing fairly regularly.

tests.file.encoding=US-ASCI 

.vs.

tests.file.encoding=US-ASCII

 

not reproducible 100% with the second "I".

 

> Reproducible failure for DirectUpdateHandlerTest
> 
>
> Key: SOLR-14784
> URL: https://issues.apache.org/jira/browse/SOLR-14784
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: DirectUpdateHandlerTest-fail.txt, 
> DirectUpdateHandlerTest-success.xml
>
>
> This is rather weird. It apparently was introduced by LUCENE-9456, but that 
> seems odd. Although I do note that that push may do some different error 
> handling, perhaps Solr needs to accommodate that.
> Of course it doesn't necessarily reproduce with other seeds.
> [~jpountz] do you have any hints?
> Reproduce 100% with:
> ./gradlew :solr:core:test --tests 
> "org.apache.solr.update.DirectUpdateHandlerTest" 
> -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false 
> -Ptests.file.encoding=US-ASCI
>  




[jira] [Issue Comment Deleted] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest

2020-08-27 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14784:
--
Comment: was deleted

(was: Oh crap. Ignore me so far, although this test is failing fairly regularly.

tests.file.encoding=US-ASCI 

.vs.

tests.file.encoding=US-ASCII

 

not reproducible 100% with the second "I".

 )

> Reproducible failure for DirectUpdateHandlerTest
> 
>
> Key: SOLR-14784
> URL: https://issues.apache.org/jira/browse/SOLR-14784
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: DirectUpdateHandlerTest-fail.txt, 
> DirectUpdateHandlerTest-success.xml
>
>
> This is rather weird. It apparently was introduced by LUCENE-9456, but that 
> seems odd. Although I do note that that push may do some different error 
> handling, perhaps Solr needs to accommodate that.
> Of course it doesn't necessarily reproduce with other seeds.
> [~jpountz] do you have any hints?
> Reproduce 100% with:
> ./gradlew :solr:core:test --tests 
> "org.apache.solr.update.DirectUpdateHandlerTest" 
> -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false 
> -Ptests.file.encoding=US-ASCI
>  




[jira] [Updated] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest

2020-08-27 Thread Erick Erickson (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-14784:
--
Description: 
This is rather weird. It apparently was introduced by LUCENE-9456, but that 
seems odd. Although I do note that that push may do some different error 
handling, perhaps Solr needs to accommodate that.

Of course it doesn't necessarily reproduce with other seeds.

[~jpountz] do you have any hints?

Reproduce 100% with:

./gradlew :solr:core:test --tests 
"org.apache.solr.update.DirectUpdateHandlerTest" -Ptests.seed=2BE3A8682E5E346D 
-Ptests.multiplier=2 -Ptests.badapples=false -Ptests.file.encoding=US-ASCII

 

  was:
This is rather weird. It apparently was introduced by LUCENE-9456, but that 
seems odd. Although I do note that that push may do some different error 
handling, perhaps Solr needs to accommodate that.

Of course it doesn't necessarily reproduce with other seeds.

[~jpountz] do you have any hints?

Reproduce 100% with:

./gradlew :solr:core:test --tests 
"org.apache.solr.update.DirectUpdateHandlerTest" -Ptests.seed=2BE3A8682E5E346D 
-Ptests.multiplier=2 -Ptests.badapples=false -Ptests.file.encoding=US-ASCI

 


> Reproducible failure for DirectUpdateHandlerTest
> 
>
> Key: SOLR-14784
> URL: https://issues.apache.org/jira/browse/SOLR-14784
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: DirectUpdateHandlerTest-fail.txt, 
> DirectUpdateHandlerTest-success.xml
>
>
> This is rather weird. It apparently was introduced by LUCENE-9456, but that 
> seems odd. Although I do note that that push may do some different error 
> handling, perhaps Solr needs to accommodate that.
> Of course it doesn't necessarily reproduce with other seeds.
> [~jpountz] do you have any hints?
> Reproduce 100% with:
> ./gradlew :solr:core:test --tests 
> "org.apache.solr.update.DirectUpdateHandlerTest" 
> -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false 
> -Ptests.file.encoding=US-ASCII
>  




[jira] [Commented] (SOLR-14784) Reproducible failure for DirectUpdateHandlerTest

2020-08-27 Thread Erick Erickson (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186123#comment-17186123
 ] 

Erick Erickson commented on SOLR-14784:
---

Here's an interesting bit from the failure case:

 

[junit4] 2> 28524 INFO (SUITE-DirectUpdateHandlerTest-seed#[81E1DC8C3CA7BF3B]-worker) [ ] o.a.s.m.r.SolrJmxReporter Closing reporter [org.apache.solr.metrics.reporters.SolrJmxReporter@4e97ad89: rootName = null, domain = solr.jetty, service url = null, agent id = null] for registry solr.jetty/com.codahale.metrics.MetricRegistry@7d190268
[junit4] 2> 28528 ERROR (SUITE-DirectUpdateHandlerTest-seed#[81E1DC8C3CA7BF3B]-worker) [ ] o.a.s.c.u.ObjectReleaseTracker
[junit4] 2> => java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 1 open files: {_0.cfs=1}
[junit4] 2> at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:812)
[junit4] 2> java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 1 open files: {_0.cfs=1}
[junit4] 2> at org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:812) ~[java/:?]
[junit4] 2> at org.apache.solr.common.util.ObjectReleaseTracker.tryClose(ObjectReleaseTracker.java:85) [java/:?]
[junit4] 2> at org.apache.solr.SolrTestCaseJ4.teardownTestCases(SolrTestCaseJ4.java:329) [java/:?]
[junit4] 2> at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
[junit4] 2> at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
[junit4] 2> at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
[junit4] 2> at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
[junit4] 2> at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1754) [randomizedtesting-runner-2.7.6.jar:?]
[junit4] 2> at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:905) [randomizedtesting-runner-2.7.6.jar:?]
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) [randomizedtesting-runner-2.7.6.jar:?]
[junit4] 2> at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57) [randomizedtesting-runner-2.7.6.jar:?]
[junit4] 2> at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45) [java/:?]

> Reproducible failure for DirectUpdateHandlerTest
> 
>
> Key: SOLR-14784
> URL: https://issues.apache.org/jira/browse/SOLR-14784
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: DirectUpdateHandlerTest-fail.txt, 
> DirectUpdateHandlerTest-success.xml
>
>
> This is rather weird. It apparently was introduced by LUCENE-9456, but that 
> seems odd. Although I do note that that push may do some different error 
> handling, perhaps Solr needs to accommodate that.
> Of course it doesn't necessarily reproduce with other seeds.
> [~jpountz] do you have any hints?
> Reproduce 100% with:
> ./gradlew :solr:core:test --tests 
> "org.apache.solr.update.DirectUpdateHandlerTest" 
> -Ptests.seed=2BE3A8682E5E346D -Ptests.multiplier=2 -Ptests.badapples=false 
> -Ptests.file.encoding=US-ASCII
>  




[jira] [Commented] (SOLR-14783) Remove DIH from 9.0

2020-08-27 Thread Ishan Chattopadhyaya (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186148#comment-17186148
 ] 

Ishan Chattopadhyaya commented on SOLR-14783:
-

+1

> Remove DIH from 9.0
> ---
>
> Key: SOLR-14783
> URL: https://issues.apache.org/jira/browse/SOLR-14783
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Affects Versions: master (9.0)
>Reporter: Alexandre Rafalovitch
>Assignee: Alexandre Rafalovitch
>Priority: Major
>
> Now that Data Import Handler (SOLR-14066) has been deprecated in 8.6, it can 
> be removed in the next major version (9).




[jira] [Commented] (SOLR-14783) Remove DIH from 9.0

2020-08-27 Thread Alexandre Rafalovitch (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186157#comment-17186157
 ] 

Alexandre Rafalovitch commented on SOLR-14783:
--

DIH depends on two database engines:
 * org.hsqldb:hsqldb
 * org.apache.derby:derby

 

HSQLDB also seems to be used by SolrJ.

Derby does not seem to be used by anything else, so its references and 
checksums can be removed too.

> Remove DIH from 9.0
> ---
>
> Key: SOLR-14783
> URL: https://issues.apache.org/jira/browse/SOLR-14783
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Affects Versions: master (9.0)
>Reporter: Alexandre Rafalovitch
>Assignee: Alexandre Rafalovitch
>Priority: Major
>
> Now that Data Import Handler (SOLR-14066) has been deprecated in 8.6, it can 
> be removed in the next major version (9).




[jira] [Commented] (SOLR-14783) Remove DIH from 9.0

2020-08-27 Thread Alexandre Rafalovitch (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186159#comment-17186159
 ] 

Alexandre Rafalovitch commented on SOLR-14783:
--

solr/server/etc/security.policy has DIH permission. Need to test if ripping 
that out will not break any assumptions.

> Remove DIH from 9.0
> ---
>
> Key: SOLR-14783
> URL: https://issues.apache.org/jira/browse/SOLR-14783
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Affects Versions: master (9.0)
>Reporter: Alexandre Rafalovitch
>Assignee: Alexandre Rafalovitch
>Priority: Major
>
> Now that Data Import Handler (SOLR-14066) has been deprecated in 8.6, it can 
> be removed in the next major version (9).




[jira] [Commented] (SOLR-14783) Remove DIH from 9.0

2020-08-27 Thread Ishan Chattopadhyaya (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186170#comment-17186170
 ] 

Ishan Chattopadhyaya commented on SOLR-14783:
-

bq. solr/server/etc/security.policy has DIH permission. Need to test if ripping 
that out will not break any assumptions.

Let's leave it in. It is the deregisterDriver permission, which is totally fine 
to keep; it would be an ugly deal breaker if we were to have DIH users add it 
by hand.

> Remove DIH from 9.0
> ---
>
> Key: SOLR-14783
> URL: https://issues.apache.org/jira/browse/SOLR-14783
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Affects Versions: master (9.0)
>Reporter: Alexandre Rafalovitch
>Assignee: Alexandre Rafalovitch
>Priority: Major
>
> Now that Data Import Handler (SOLR-14066) has been deprecated in 8.6, it can 
> be removed in the next major version (9).




[jira] [Commented] (SOLR-14616) Remove CDCR from 9.0

2020-08-27 Thread Ishan Chattopadhyaya (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186185#comment-17186185
 ] 

Ishan Chattopadhyaya commented on SOLR-14616:
-

Based on today's committer meeting and Erick's post on CDCR in the Roadmap 
thread, I think it is very important to remove CDCR in 9.0. There's no reason 
to support it in 9.0 only to remove it one version later. Better to take such 
drastic measures in a major release. I'm planning to revive the effort to 
remove it in 9.0, and I'm planning to merge the PR soon.

> Remove CDCR from 9.0
> 
>
> Key: SOLR-14616
> URL: https://issues.apache.org/jira/browse/SOLR-14616
> Project: Solr
>  Issue Type: Sub-task
>Affects Versions: master (9.0)
>Reporter: Ishan Chattopadhyaya
>Assignee: Anshum Gupta
>Priority: Blocker
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This was deprecated in SOLR-14022 and should be removed in 9.0.




[jira] [Assigned] (SOLR-14616) Remove CDCR from 9.0

2020-08-27 Thread Ishan Chattopadhyaya (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya reassigned SOLR-14616:
---

Assignee: Ishan Chattopadhyaya  (was: Anshum Gupta)

> Remove CDCR from 9.0
> 
>
> Key: SOLR-14616
> URL: https://issues.apache.org/jira/browse/SOLR-14616
> Project: Solr
>  Issue Type: Sub-task
>Affects Versions: master (9.0)
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This was deprecated in SOLR-14022 and should be removed in 9.0.




[jira] [Updated] (SOLR-14616) Remove CDCR from 9.0

2020-08-27 Thread Ishan Chattopadhyaya (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ishan Chattopadhyaya updated SOLR-14616:

Fix Version/s: master (9.0)

> Remove CDCR from 9.0
> 
>
> Key: SOLR-14616
> URL: https://issues.apache.org/jira/browse/SOLR-14616
> Project: Solr
>  Issue Type: Sub-task
>Affects Versions: master (9.0)
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
> Fix For: master (9.0)
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This was deprecated in SOLR-14022 and should be removed in 9.0.




[GitHub] [lucene-solr] arafalov opened a new pull request #1794: SOLR-14783: Remove DIH from 9.0

2020-08-27 Thread GitBox



arafalov opened a new pull request #1794:
URL: https://github.com/apache/lucene-solr/pull/1794


   # Description
   
   DIH was deprecated in 8.6 and marked for removal in 9.0.
   
   # Solution
   
   This removes DIH and the vast majority of its references.
   (It should probably be squashed on merge; it was split into many commits for 
ease of review.)
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [X] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [X] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [X] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [X] I have developed this patch against the `master` branch.
   - [X] I have run `ant precommit` and the appropriate test suite.
   - [X] I have added/removed documentation for the [Ref 
Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) 
(for Solr changes only).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (SOLR-14776) Precompute the fingerprint during PeerSync

2020-08-27 Thread Cao Manh Dat (Jira)



 [ 
https://issues.apache.org/jira/browse/SOLR-14776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-14776:

Description: Computing a fingerprint can be very costly and take time. The 
current implementation sends fingerprint requests to multiple replicas and 
computes its own fingerprint for comparison only when the first response 
arrives. A very simple but effective improvement is to compute its own 
fingerprint right after sending the requests to the other replicas.  (was: Computing 
fingerprint can very costly and take time. )
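
The idea in this description is to overlap the local fingerprint computation with the requests that are already in flight. A minimal sketch of that ordering in Python, not Solr's actual PeerSync code: `compute_fingerprint`, `fetch_remote_fingerprint` and `peer_sync` are hypothetical names, and a hash over a list of version numbers stands in for the real fingerprint.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor, as_completed


def compute_fingerprint(versions):
    """Stand-in for the costly fingerprint: an order-independent hash of versions."""
    digest = hashlib.sha256()
    for version in sorted(versions):
        digest.update(f"{version},".encode("ascii"))
    return digest.hexdigest()


def fetch_remote_fingerprint(replica_versions):
    """Hypothetical remote call; here it simply hashes the replica's versions."""
    return compute_fingerprint(replica_versions)


def peer_sync(local_versions, replicas):
    """Return the sorted names of replicas whose fingerprint matches the local one.

    `replicas` maps a replica name to its list of versions.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(replicas))) as pool:
        # 1. Send every fingerprint request first; submit() does not block.
        pending = {
            pool.submit(fetch_remote_fingerprint, versions): name
            for name, versions in replicas.items()
        }
        # 2. Compute the local fingerprint while the requests are in flight,
        #    rather than waiting for the first response to arrive.
        local = compute_fingerprint(local_versions)
        # 3. Compare each response with the precomputed local value.
        return sorted(
            pending[future]
            for future in as_completed(pending)
            if future.result() == local
        )
```

With this ordering the wait for the slowest of the two activities (remote round trip or local computation) dominates, instead of their sum.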

> Precompute the fingerprint during PeerSync
> --
>
> Key: SOLR-14776
> URL: https://issues.apache.org/jira/browse/SOLR-14776
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
>Priority: Major
>
> Computing a fingerprint can be very costly and take time. The current 
> implementation sends fingerprint requests to multiple replicas and computes 
> its own fingerprint for comparison only when the first response arrives. A 
> very simple but effective improvement is to compute its own fingerprint 
> right after sending the requests to the other replicas.




[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time

2020-08-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186238#comment-17186238
 ] 

ASF subversion and git services commented on SOLR-14684:


Commit a93cb7102f02b25e50dfac2353e9c4c2a445b177 in lucene-solr's branch 
refs/heads/branch_8x from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a93cb71 ]

SOLR-14684: CloudExitableDirectoryReaderTest failing about 25% of the time 
(#1724)



> CloudExitableDirectoryReaderTest failing about 25% of the time
> --
>
> Key: SOLR-14684
> URL: https://issues.apache.org/jira/browse/SOLR-14684
> Project: Solr
>  Issue Type: Test
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: stdout
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If I beast this on my local machine, it fails (non reproducibly of course) 
> about 1/4 of the time. Log attached. The test itself hasn't changed in 11 
> months or so.
> It looks like occasionally the calls throw an error rather than return 
> partial results with a message: "Time allowed to handle this request 
> exceeded:[]".
> It's been failing very intermittently for a couple of years, but the failure 
> rate really picked up in the last couple of weeks. IDK whether the failures 
> prior to the last couple of weeks are the same root cause.
> I'll do some spelunking to see if I can pinpoint the commit that made this 
> happen, but it'll take a while.




[jira] [Commented] (SOLR-14684) CloudExitableDirectoryReaderTest failing about 25% of the time

2020-08-27 Thread ASF subversion and git services (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186239#comment-17186239
 ] 

ASF subversion and git services commented on SOLR-14684:


Commit 5f0c9dfbabd99f470823241437b20ada05ee79d7 in lucene-solr's branch 
refs/heads/branch_8x from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5f0c9df ]

SOLR-14684: Skipping check time exceeded for the first request in a proper way


> CloudExitableDirectoryReaderTest failing about 25% of the time
> --
>
> Key: SOLR-14684
> URL: https://issues.apache.org/jira/browse/SOLR-14684
> Project: Solr
>  Issue Type: Test
>  Components: Tests
>Affects Versions: master (9.0)
>Reporter: Erick Erickson
>Priority: Major
> Attachments: stdout
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If I beast this on my local machine, it fails (non reproducibly of course) 
> about 1/4 of the time. Log attached. The test itself hasn't changed in 11 
> months or so.
> It looks like occasionally the calls throw an error rather than return 
> partial results with a message: "Time allowed to handle this request 
> exceeded:[]".
> It's been failing very intermittently for a couple of years, but the failure 
> rate really picked up in the last couple of weeks. IDK whether the failures 
> prior to the last couple of weeks are the same root cause.
> I'll do some spelunking to see if I can pinpoint the commit that made this 
> happen, but it'll take a while.




[jira] [Commented] (SOLR-13973) Deprecate Tika

2020-08-27 Thread David Smiley (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186244#comment-17186244
 ] 

David Smiley commented on SOLR-13973:
-

Andrew, your illustration confuses/surprises me a bit.  Wouldn't the sync 
connector simply talk to Tika Server (which returns the text) to then pass 
along to Solr?  This adds just one more service to your original setup.  You 
could even embed Tika into the connector if your objective is to keep the 
service count low.
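
The flow suggested here (the connector sends the raw file to Tika Server, gets text back, and passes that text to Solr) could look roughly like the following Python sketch. The host names, the collection name `docs`, the field name `content_txt` and all function names are assumptions made for illustration; only Tika Server's `PUT /tika` endpoint and Solr's JSON update handler are relied on.

```python
import json
import urllib.request

TIKA_URL = "http://localhost:9998/tika"  # assumed Tika Server address
SOLR_UPDATE_URL = "http://localhost:8983/solr/docs/update?commit=true"  # assumed collection


def tika_request(raw_bytes):
    """Build the request that asks Tika Server for the plain text of a file."""
    return urllib.request.Request(
        TIKA_URL,
        data=raw_bytes,
        headers={"Accept": "text/plain"},
        method="PUT",
    )


def solr_request(doc_id, text):
    """Build the request that indexes the extracted text as one Solr document."""
    body = json.dumps([{"id": doc_id, "content_txt": text}]).encode("utf-8")
    return urllib.request.Request(
        SOLR_UPDATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_document(doc_id, raw_bytes, send=urllib.request.urlopen):
    """Extract text with Tika, then pass it along to Solr; returns Solr's HTTP status."""
    with send(tika_request(raw_bytes)) as response:
        text = response.read().decode("utf-8")
    with send(solr_request(doc_id, text)) as response:
        return response.status
```

The `send` parameter exists only so the flow can be exercised without running services; a real connector would also need error handling and batching.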

> Deprecate Tika
> --
>
> Key: SOLR-13973
> URL: https://issues.apache.org/jira/browse/SOLR-13973
> Project: Solr
>  Issue Type: Improvement
>Reporter: Ishan Chattopadhyaya
>Priority: Blocker
> Fix For: 8.7
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Solr's primary responsibility should be to focus on search and scalability. 
> Having to deal with the problems (CVEs) of Velocity, Tika etc. can slow us 
> down. I propose that we deprecate it going forward.
> Tika can be run outside Solr. Going forward, if someone wants to use these, 
> it should be possible to bring them into third-party packages and install 
> them via the package manager.
> The plan is just to throw warnings in logs and add deprecation notes in the 
> reference guide for now. Removal can be done in 9.0.




[jira] [Commented] (SOLR-14783) Remove DIH from 9.0

2020-08-27 Thread Alexandre Rafalovitch (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186251#comment-17186251
 ] 

Alexandre Rafalovitch commented on SOLR-14783:
--

The changes are in PR, ready for the review.

> Remove DIH from 9.0
> ---
>
> Key: SOLR-14783
> URL: https://issues.apache.org/jira/browse/SOLR-14783
> Project: Solr
>  Issue Type: Task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - DataImportHandler
>Affects Versions: master (9.0)
>Reporter: Alexandre Rafalovitch
>Assignee: Alexandre Rafalovitch
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Now that Data Import Handler (SOLR-14066) has been deprecated in 8.6, it can 
> be removed in the next major version (9).




[jira] [Commented] (SOLR-14617) Remove DIH from 9.0

2020-08-27 Thread Alexandre Rafalovitch (Jira)



[ 
https://issues.apache.org/jira/browse/SOLR-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186256#comment-17186256
 ] 

Alexandre Rafalovitch commented on SOLR-14617:
--

Accidental duplication. This issue could even be deleted and the other linked 
as needed.

> Remove DIH from 9.0
> ---
>
> Key: SOLR-14617
> URL: https://issues.apache.org/jira/browse/SOLR-14617
> Project: Solr
>  Issue Type: Sub-task
>Affects Versions: master (9.0)
>Reporter: Ishan Chattopadhyaya
>Assignee: Ishan Chattopadhyaya
>Priority: Blocker
>
> Remove DIH from 9.0. This was deprecated in SOLR-14066.




[jira] [Created] (SOLR-14785) Update synonyms by API and reload collection in Solr

2020-08-27 Thread Gitterh (Jira)

Gitterh created SOLR-14785:
--

 Summary: Update synonyms by API and reload collection in Solr
 Key: SOLR-14785
 URL: https://issues.apache.org/jira/browse/SOLR-14785
 Project: Solr
  Issue Type: Task
  Security Level: Public (Default Security Level. Issues are Public)
  Components: search
Affects Versions: 8.6.1
Reporter: Gitterh


I am using Solr 8.6.1, started in solrcloud mode.
The field type is
```
{
  "add-field-type": {
    "name": "articleTitle",
    "positionIncrementGap": 100,
    "multiValued": false,
    "class": "solr.TextField",
    "indexAnalyzer": {
      "tokenizer": { "class": "solr.StandardTokenizerFactory" },
      "filters": [
        { "class": "solr.LowerCaseFilterFactory" },
        { "class": "solr.ManagedStopFilterFactory", "managed": "english" },
        { "class": "solr.ManagedSynonymGraphFilterFactory", "managed": "english" },
        { "class": "solr.FlattenGraphFilterFactory" },
        { "class": "solr.PorterStemFilterFactory" }
      ]
    },
    "queryAnalyzer": {
      "tokenizer": { "class": "solr.StandardTokenizerFactory" },
      "filters": [
        { "class": "solr.LowerCaseFilterFactory" },
        { "class": "solr.ManagedStopFilterFactory", "managed": "english" },
        { "class": "solr.ManagedSynonymGraphFilterFactory", "managed": "english" },
        { "class": "solr.PorterStemFilterFactory" }
      ]
    }
  }
}
```
After I add a document
```
{
 "id": 100,
 "articleTitle": "Best smartphone"
} 
```
I update the synonyms list by API 
```
curl -X PUT -H 'Content-type:application/json' \
  --data-binary '["iphone", "smartphone"]' \
  "http://localhost:8983/solr/articles/schema/analysis/synonyms/english"
```
and reload the collection by API
```
http://localhost:8983/solr/admin/collections?action=RELOAD&name=articles
```

However, when I search, the document does not show up.
```
http://localhost:8983/solr/articles/select?q=articleTitle:iphone
```
No results are returned. I expected the added document to be returned.

It works only if I first update the synonyms list and then add the 
document to the collection.

How can I configure Solr to find documents by synonyms when the synonyms are 
changed after the documents are created?
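
For anyone trying to reproduce the sequence described above (update the managed synonyms, reload the collection, then query), the three calls can be built as in the following Python sketch. The host, port, collection and resource names come from the report; the helper names are made up for illustration, and the requests are only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request

BASE = "http://localhost:8983/solr"  # host and port as given in the report


def synonyms_request(collection, words, resource="english"):
    """Build the PUT request that adds a synonym list to a managed resource."""
    url = f"{BASE}/{collection}/schema/analysis/synonyms/{resource}"
    return urllib.request.Request(
        url,
        data=json.dumps(words).encode("utf-8"),
        headers={"Content-type": "application/json"},
        method="PUT",
    )


def reload_url(collection):
    """URL of the Collections API RELOAD call."""
    query = urllib.parse.urlencode({"action": "RELOAD", "name": collection})
    return f"{BASE}/admin/collections?{query}"


def select_url(collection, field, term):
    """URL of a simple field query; the colon in q is percent-encoded."""
    query = urllib.parse.urlencode({"q": f"{field}:{term}"})
    return f"{BASE}/{collection}/select?{query}"


# Against a running node the calls would be sent in this order:
#   urllib.request.urlopen(synonyms_request("articles", ["iphone", "smartphone"]))
#   urllib.request.urlopen(reload_url("articles"))
#   urllib.request.urlopen(select_url("articles", "articleTitle", "iphone"))
```

Note that filters in the `indexAnalyzer` run only when a document is indexed, so index-time synonyms added later do not apply to documents that were already in the collection until those documents are reindexed.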



