Re: How to maintain fast query speed during heavy indexing?
There are two issues:

1> autowarming on the replicas

2> Until https://issues.apache.org/jira/browse/SOLR-11982 (Solr 7.4,
unreleased), requests would go to the leaders along with the PULL and
TLOG replicas. Since the leaders were busily indexing, the entire
query would suffer speed-wise.

So what I'd do is see if you can apply the patch there and adjust your
autowarming. Solr 7.4 will be out in the not-too-distant future,
perhaps over the summer. No real schedule has been agreed on, though.

Best,
Erick

On Mon, May 21, 2018 at 9:23 PM, Nguyen Nguyen wrote:
> Hello everyone,
>
> I'm running a SolrCloud cluster of 5 nodes with 5 shards and 3 replicas
> per shard. I usually see spikes in query latency during high-indexing
> periods, and I would like stable query response times even then. I
> recently upgraded to Solr 7.3 and am running with 2 TLOG replicas and 1
> PULL replica. Using a small maxWriteMBPerSec for replication and querying
> only PULL replicas during indexing, I'm still seeing long query times for
> some queries (although not as often as before the change).
>
> My first question is: is it possible to control replication of
> non-leaders as in a master/slave configuration (e.g. disablepoll,
> fetchindex)? That way, I could disable replication on the followers until
> committing is completed on the leaders, while sending query requests only
> to the followers (or just the PULL replica). Then, when data is committed
> on the leaders, I would send query requests back to only the leaders and
> tell the followers to start fetching the newly updated index.
>
> If manual replication control isn't possible, I'm planning to have
> duplicate collections and use an alias to switch between the two
> collections at different times. For example: while the 'collection1'
> collection is being indexed, the alias 'search' would point to the
> 'collection2' collection to serve query requests. Once indexing is
> completed on 'collection1', the 'search' alias would point to
> 'collection1', and 'collection2' would be updated to be in sync with
> 'collection1'. The cycle repeats for the next indexing cycle. My question
> for this method is whether there is any existing way to sync one
> collection to another, so that I don't have to send the same update
> requests to both collections.
>
> Also wondering if there are other, better methods everyone is using?
>
> Thanks much!
>
> Cheers,
>
> -Nguyen
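[Archive note: for readers of this thread, SOLR-11982 is the issue that adds a `shards.preference` query parameter (expected in Solr 7.4). A hedged sketch of how routing queries to PULL replicas might look once it lands — host, collection name, and the exact parameter value are assumptions taken from the JIRA, so verify against the 7.4 docs:]

```python
from urllib.parse import urlencode

# Hypothetical host and collection; shards.preference comes from SOLR-11982
# (Solr 7.4) and is not available in the 7.3 cluster discussed above.
base = "http://localhost:8983/solr/collection1/select"
params = {
    "q": "*:*",
    # Ask the cluster to serve this query from PULL replicas, leaving the
    # busy leaders (and TLOG replicas) alone.
    "shards.preference": "replica.type:PULL",
}
url = base + "?" + urlencode(params)
print(url)
```

Until 7.4, the workarounds in the thread (autowarming tuning, querying PULL replicas directly, or the alias-swap scheme) remain the practical options.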
Solr Dates TimeZone
Hi,

Is it possible to configure Solr with a timezone other than GMT? Is it
possible to configure the Solr Admin UI to display dates in a timezone
other than GMT? What is the best way to store a birth date in Solr? We use
the TrieDate type.

Thanks!
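[Archive note: Solr stores dates internally in UTC using the canonical ISO-8601 form with a trailing "Z"; displaying in another timezone is a client-side concern. A minimal sketch of preparing a birth date for indexing — the date value is illustrative:]

```python
from datetime import datetime, timezone

# A birth date is really a calendar date; pinning it to midnight UTC and
# using Solr's canonical ISO-8601 form ("Z" suffix) avoids timezone drift
# between indexing and querying.
birth = datetime(1985, 4, 12, tzinfo=timezone.utc)  # hypothetical value
solr_date = birth.strftime("%Y-%m-%dT%H:%M:%SZ")
print(solr_date)  # 1985-04-12T00:00:00Z
```

The client would then convert back to local time for display, rather than asking Solr to store a non-UTC value.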
Zookeeper 3.4.12 with Solr 6.6.2?
Is anybody running Zookeeper 3.4.12 with Solr 6.6.2? Is that a recommended
combination? Not recommended?

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)
Re: Atomic update error with JSON handler
Hi,

Firstly, I had already tried the request body enclosed in [...] without
success. It turns out that was not the only issue: the path was not right
for the atomic updates.

On the v2 API:
localhost:8983/v2/c/testnode/update/json/commands?commit=true  -- succeeds
localhost:8983/v2/c/testnode/update/json?commit=true           -- fails
localhost:8983/v2/c/testnode/update?commit=true                -- fails

On the old API:
localhost:8983/solr/testnode/update/json?commit=true           -- succeeds
localhost:8983/solr/testnode/update/json/docs?commit=true      -- fails

Some insight into what caused my confusion: on the "Updating Parts of
Documents" page of the Solr Guide
(https://lucene.apache.org/solr/guide/7_3/updating-parts-of-documents.html#example-updating-part-of-a-document),
it is not emphasized that for an atomic JSON update to work you must use
the command endpoint. Furthermore, the example JSONs in that section do
not have the actual command level (add/delete/commit) as shown on a
previous page
(https://lucene.apache.org/solr/guide/7_3/uploading-data-with-index-handlers.html#sending-json-update-commands).
Either boldly stating that atomic updates are commands, or showing
complete JSON requests as examples, would have been much clearer. It also
surprised me that the command endpoint accepts the "list of documents"
format, which the guide does not mention (at the second link provided).

Thank you for pointing me in the right direction!
Nandor

On Tue, May 22, 2018 at 8:14 AM, Yasufumi Mizoguchi wrote:
> Hi,
>
> At least, it is better to enclose your JSON body in '[ ]', I think.
>
> Following is the result I tried using curl.
>
> $ curl -XPOST "localhost:8983/solr/test_core/update/json?commit=true" \
>     --data-binary '{"id":"test1","title":{"set":"Solr Rocks"}}'
> {
>   "responseHeader":{
>     "status":400,
>     "QTime":18},
>   "error":{
>     "metadata":[
>       "error-class","org.apache.solr.common.SolrException",
>       "root-error-class","org.apache.solr.common.SolrException"],
>     "msg":"Unknown command 'id' at [5]",
>     "code":400}}
>
> $ curl -XPOST "localhost:8983/solr/test_core/update/json?commit=true" \
>     --data-binary '[{"id":"test1","title":{"set":"Solr Rocks"}}]'
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":250}}
>
> Thanks,
> Yasufumi
>
> On Tue, May 22, 2018 at 1:26, Nándor Mátravölgyi wrote:
>> Hi,
>>
>> I'm trying to build a simple document search core with SolrCloud. I've
>> run into an issue when trying to partially update documents (aka atomic
>> updates). It appears to be a bug, because the semantically same request
>> succeeds in XML format while it fails as JSON.
>>
>> The body of the XML request [tags mangled by the archive; reconstructed]:
>> <add><doc><field name="id">test1</field>
>> <field name="title" update="set">Solr Rocks</field></doc></add>
>>
>> The body of the JSON request:
>> {"id":"test1","title":{"set":"Solr Rocks"}}
>>
>> I'm using the requests library in Python 3 to send the update request.
>> Sending the XML request with the following code works as expected:
>> r = requests.post('http://localhost:8983/v2/c/testnode/update/xml?commit=true',
>>     headers={'Content-type': 'application/xml'}, data=xml)
>>
>> Sending the JSON request with either of the following returns a
>> SolrException:
>> r = requests.post('http://localhost:8983/v2/c/testnode/update/json?commit=true',
>>     headers={'Content-type': 'application/json'}, data=json)
>> r = requests.post('http://localhost:8983/solr/testnode/update/json/docs?commit=true',
>>     headers={'Content-type': 'application/json'}, data=json)
>>
>> Using the same lines of code to send a JSON request that is not an
>> atomic update works as expected.
>> Such a JSON request body looks like:
>> {"id":"test1","title":"Solr Rocks"}
>>
>> The error message in the response is: ERROR: [doc=test1] unknown field
>> 'title.set'
>> Here is the log of the exception: https://pastebin.com/raw/VJe5hR25
>>
>> Depending on which API I send the request to, the logs are identical
>> except on lines 27 and 28.
>> This is with v2:
>> at org.apache.solr.handler.UpdateRequestHandlerApi$1.call(UpdateRequestHandlerApi.java:48)
>> at org.apache.solr.api.V2HttpCall.execute(V2HttpCall.java:325)
>> and this is with the other:
>> at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>> at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>>
>> I'm using Solr 7.3.1 and I believe I am doing everything according to
>> the documentation:
>> https://lucene.apache.org/solr/guide/7_3/updating-parts-of-documents.html#atomic-updates
>> The solrconfig.xml and managed-schema files are fairly simple; they
>> mostly contain code snippets from the examples:
>> https://pastebin.com/199JJkp0
>> https://pastebin.com/Dp1YK46k
>>
>> This could be a bug, or I can't fathom what I'm missing. Can anyone
>> help me out?
>> Thanks,
>> Nandor
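[Archive note: the endpoint matrix from this thread boils down to two different JSON shapes. A small sketch contrasting them — the host/collection names (`testnode`) are the thread's own test setup, not a general recommendation:]

```python
import json

# Plain-document body: a single JSON object, accepted by the document
# endpoints (e.g. /solr/<collection>/update/json/docs).
plain_doc = json.dumps({"id": "test1", "title": "Solr Rocks"})

# Atomic-update body: a JSON *list* of documents whose field values are
# modifier objects ({"set": ...}). Per the thread above, this must go to a
# command endpoint such as /solr/testnode/update/json (old API) or
# /v2/c/testnode/update/json/commands (v2 API).
atomic = json.dumps([{"id": "test1", "title": {"set": "Solr Rocks"}}])

print(plain_doc)
print(atomic)
```

Sending the atomic shape to a document endpoint makes Solr treat `{"set": ...}` as a nested field, which is exactly the `unknown field 'title.set'` error reported above.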
Re: Zookeeper 3.4.12 with Solr 6.6.2?
We have ZooKeeper 3.4.10 and have *tested* Solr 6.6.2 against it at a
functional level. So far it works. We have not done any stress/load
testing, but we would have to do that prior to release.

On Tue, May 22, 2018 at 9:44 AM, Walter Underwood wrote:
> Is anybody running Zookeeper 3.4.12 with Solr 6.6.2? Is that a
> recommended combination? Not recommended?
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
deletebyQuery vs deletebyId
Hi,

I have a quick question about deleteByQuery vs deleteById. When using
deleteByQuery, if the query is id:123, is that the same as deleteById in
terms of performance?

Thanks,
Jay
Re: deletebyQuery vs deletebyId
On 5/22/2018 6:35 PM, Jay Potharaju wrote:
> I have a quick question about deleteByQuery vs deleteById. When using
> deleteByQuery, if the query is id:123, is that the same as deleteById in
> terms of performance?

If there is absolutely nothing else happening to update the index, the
difference between the two would probably be outside normal human
perception of time -- I think you'd only be able to see the difference by
measuring it with software, and you might need something that can show
time units below one millisecond. On a query that matches a lot of
documents, the difference might be more pronounced, but likely still
pretty small.

The issue with DBQ, which I already explained to you on another mailing
list thread, is that DBQ can interact badly with other operations,
segment merges in particular. The delete itself won't take very long, but
the simple fact that DBQ was used might result in a noticeable pause in
your indexing operations.

http://lucene.472066.n3.nabble.com/Async-exceptions-during-distributed-update-td4388725.html#a4388787

As mentioned there, the pauses don't happen with id-based delete.

Thanks,
Shawn
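[Archive note: for concreteness, the two delete flavors discussed here correspond to two JSON update commands. A small sketch of the request bodies one would POST to `/update` — the id value is illustrative:]

```python
import json

# deleteById: names the unique key directly; Solr resolves the document
# without running a query, and it does not trigger the merge-related
# pauses discussed above.
delete_by_id = json.dumps({"delete": {"id": "123"}})

# deleteByQuery: Solr must execute the query first, which is what can
# interact badly with concurrent indexing and segment merges.
delete_by_query = json.dumps({"delete": {"query": "id:123"}})

print(delete_by_id)
print(delete_by_query)
```

So even when the query is just `id:123`, the two commands take different code paths inside Solr, which is why Shawn's advice is to prefer id-based deletes during heavy indexing.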
Is it possible to index documents without storing their content?
Dear community,

Is it possible to index documents (e.g. PDF, Word, ...) for full-text
search without storing their content (payload) inside the Solr server?

Thanking you in advance for your help.

BR
Tom
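[Archive note: this is answerable with plain Solr schema basics — indexing and storing are independent per field. A minimal managed-schema sketch, with illustrative field names, that keeps the extracted text searchable without retaining the payload:]

```xml
<!-- Full-text searchable but not retrievable: indexed, not stored.
     Only the inverted index is kept; the original text is discarded. -->
<field name="content" type="text_general" indexed="true" stored="false"/>

<!-- Keep a small stored field for result display, e.g. the title. -->
<field name="title" type="text_general" indexed="true" stored="true"/>
```

Note the trade-off: features that need the original text, such as highlighting or returning the field in results, will not work on a non-stored field.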
Re: Question regarding TLS version for solr
Hi Christopher/Shawn,

Thank you for replying. But I checked the Java version Solr is using, and
it is already version 1.8.

@Christopher, can you let me know what steps you followed for TLS
authentication on Solr version 7.3.0?

Thanks & Regards,
Anchal Sharma
e-Pricer Development ES Team
Mobile: +9871290248

-----Christopher Schultz wrote: -----
To: solr-user@lucene.apache.org
From: Christopher Schultz
Date: 05/17/2018 06:29PM
Subject: Re: Question regarding TLS version for solr

Shawn,

On 5/17/18 4:23 AM, Shawn Heisey wrote:
> On 5/17/2018 1:53 AM, Anchal Sharma2 wrote:
>> We are using Solr version 5.3.0 and have been trying to enable
>> security on our Solr. We followed the steps mentioned on the site
>> https://lucene.apache.org/solr/guide/6_6/enabling-ssl.html. But by
>> default it picks TLS version 1.0, which is causing an issue, as our
>> application uses TLSv1.2. We tried online resources but could not
>> find anything regarding TLS enablement for Solr.
>>
>> It will be a huge help if anyone can provide some suggestions as to
>> how we can enable TLSv1.2 for Solr.
>
> The choice of ciphers and encryption protocols is mostly made by Java.
> The servlet container might influence it as well. The only servlet
> container supported since Solr 5.0 is the Jetty that is bundled in the
> Solr download.
>
> TLS 1.2 was added in Java 7, and it became the default in Java 8. If
> you can install the latest version of Java 8 and make sure that it has
> the policy files for unlimited crypto strength installed, support for
> TLS 1.2 might happen automatically.

There is no "default" TLS version for either the client or the server:
the two endpoints always negotiate the highest mutual version they both
support. The key agreement, authentication, and cipher suites are the
items that are negotiated during the handshake.

> Solr 5.3.0 is running a fairly old version of Jetty -- 9.2.11.
> Information for 9.2.x versions is hard to find, so although I think it
> probably CAN do TLS 1.2 if the Java version supports it, I can't be
> absolutely sure. You'll need to upgrade Solr to get an upgraded Jetty.

I would be shocked if Jetty ships with its own crypto libraries; it
should be using JSSE.

Anchal,

Java 1.7 or later is an absolute requirement if you want to use TLSv1.2
(and you SHOULD want to use it). I have recently spent a lot of time
getting Solr 7.3.0 running with TLS mutual-authentication, but I haven't
worked with the 5.3.x line. I can tell you what I've done for my
version, but it may need some adjustments for yours.

-chris
Re: How to maintain fast query speed during heavy indexing?
Great info! Thanks, Erick!

Cheers,
Nguyen

On Tue, May 22, 2018 at 5:45 AM Erick Erickson wrote:
> There are two issues:
>
> 1> autowarming on the replicas
>
> 2> Until https://issues.apache.org/jira/browse/SOLR-11982 (Solr 7.4,
> unreleased), requests would go to the leaders along with the PULL and
> TLOG replicas. Since the leaders were busily indexing, the entire
> query would suffer speed-wise.
>
> So what I'd do is see if you can apply the patch there and adjust your
> autowarming. Solr 7.4 will be out in the not-too-distant future,
> perhaps over the summer. No real schedule has been agreed on, though.
>
> Best,
> Erick
>
> On Mon, May 21, 2018 at 9:23 PM, Nguyen Nguyen wrote:
>> Hello everyone,
>>
>> I'm running a SolrCloud cluster of 5 nodes with 5 shards and 3 replicas
>> per shard. I usually see spikes in query performance during high
>> indexing periods. I would like to have stable query response times even
>> during high indexing periods. I recently upgraded to Solr 7.3 and am
>> running with 2 TLOG replicas and 1 PULL replica. Using a small
>> maxWriteMBPerSec for replication and querying only PULL replicas during
>> indexing, I'm still seeing long query times for some queries (although
>> not as often as before the change).
>>
>> My first question is: is it possible to control replication of
>> non-leaders as in a master/slave configuration (e.g. disablepoll,
>> fetchindex)? That way, I could disable replication on the followers
>> until committing is completed on the leaders, while sending query
>> requests only to the followers (or just the PULL replica). Then, when
>> data is committed on the leaders, I would send query requests back to
>> only the leaders and tell the followers to start fetching the newly
>> updated index.
>>
>> If manual replication control isn't possible, I'm planning to have
>> duplicate collections and use an alias to switch between the two
>> collections at different times. For example: while the 'collection1'
>> collection is being indexed, the alias 'search' would point to the
>> 'collection2' collection to serve query requests. Once indexing is
>> completed on 'collection1', the 'search' alias would point to
>> 'collection1', and 'collection2' would be updated to be in sync with
>> 'collection1'. The cycle repeats for the next indexing cycle. My
>> question for this method is whether there is any existing way to sync
>> one collection to another, so that I don't have to send the same update
>> requests to both collections.
>>
>> Also wondering if there are other, better methods everyone is using?
>>
>> Thanks much!
>>
>> Cheers,
>>
>> -Nguyen