new data structure for some fields
Hello all, I am facing a requirement where an id p1 is associated with some category_ids c1, c2, c3, c4, each paired with an integer b1, b2, b3, b4. We need to sort Solr query results on b1/b2/b3/b4 depending on the given category_id. Right now we map the category_ids into a multi-valued attribute, [c1,c2,c3,c4], and query against it. But now we also need to find which integer (b1, b2, b3, ...) is associated with a given category and sort the whole result set on it. Sorry for any typos. Regards Abhishek
Re: new data structure for some fields
Hi Binoy, thanks for the reply. By sort I mean sorting the result set on the integer value given for that category. For any document, say id P1, the associated categories are c1, c2, c3, c4 (a multi-valued field). In the new implementation a number is similarly associated with each category, say c1---b1, c2---b2, c3---b3, c4---b4. Now when we query Solr for the ids that have c1 among their categories (q=category_id:c1), I want the results of that query sorted by the number (b) associated with c1 throughout the result set. The number of associations is usually less than 20 (an id can't be mapped to more than 20 category_ids). On Mon, Dec 21, 2015 at 3:59 PM, Binoy Dalal wrote: > When you say sort, do you mean search on the basis of category and > integers? Or score the docs based on their category and integer values? > > Also, for any given document, how many categories or integers are > associated with it? > > -- > Regards, > Binoy Dalal
Re: new data structure for some fields
Hi Binoy, that will not work: category and integer are a one-to-one mapping, so if category_id is multi-valued the integer field would have to be multi-valued too. You need some mechanism that identifies which integer to pick for the category_id used in the search; only then can you sort on it. On Mon, Dec 21, 2015 at 5:27 PM, Binoy Dalal wrote: > Small edit: > The sort parameter in the solrconfig goes in the request handler > declaration that you're using. So if it's select, put it in the <lst name="defaults"> list. > > On Mon, 21 Dec 2015, 17:21 Binoy Dalal wrote: > > > OK. You will only be able to sort based on the integers if the integer > > field is single-valued, i.e. only one integer is associated with one > > category id. > > > > To do this you've to use the sort parameter. > > You can either specify it in your solrconfig.xml like so: > > <str name="sort">integer asc</str> > > Field name followed by the order - asc/desc > > > > Or you can specify it along with your query by appending it like so: > > /select?q=query&sort=integer%20asc > > > > If you want to apply these sorting rules for all docs, then specify the > > sorting in your solrconfig. If you only want it for a certain subset then > > apply the parameter from code at the app level. > > -- > Regards, > Binoy Dalal
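A sketch of one way to get that per-category sort, assuming Solr dynamic fields and an "int" field type in the schema (the rank_* field names below are hypothetical, not from the thread): keep the multi-valued category_id field for matching, additionally index each category's integer in its own single-valued dynamic field, and sort on the field that corresponds to the queried category.

    schema.xml:
        <dynamicField name="rank_*" type="int" indexed="true" stored="true"/>

    Document P1 (JSON update format):
        { "id": "P1",
          "category_id": ["c1", "c2", "c3", "c4"],
          "rank_c1": 10, "rank_c2": 7, "rank_c3": 42, "rank_c4": 3 }

    Query for category c1, sorted by its paired integer:
        /select?q=category_id:c1&sort=rank_c1%20asc

Because each rank_* field is single-valued, the sort is well-defined, which is the constraint Binoy raised; the cost is one extra field per category, and that stays bounded here since an id maps to at most 20 category_ids.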
Need a group custom function (field collapsing)
Hi all, We are running Solr 5.2.1. A new requirement is that we need to order the data based on an algorithm that must run over the results obtained from the query. The best fit seems to be group.field, group.main, and group.func, where group.func calls a custom function that runs the algorithm part. My doubt is where the custom function needs to go, i.e. in which file? I found an article related to this, https://dzone.com/articles/how-write-custom-solr, but it doesn't explain where to put the code or in which file. Regards, Abhishek
Re: Need a group custom function (field collapsing)
Any update on this? On Mon, Mar 14, 2016 at 4:06 PM, Abhishek Mishra wrote: > Hi all, We are running Solr 5.2.1. A new requirement is that we need to order the data based on an algorithm that must run over the results obtained from the query. The best fit seems to be group.field, group.main, and group.func, where group.func calls a custom function that runs the algorithm part. My doubt is where the custom function needs to go, i.e. in which file? I found an article related to this, https://dzone.com/articles/how-write-custom-solr, but it doesn't explain where to put the code or in which file. > > Regards, > Abhishek
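For the record, the standard plugin route (a sketch under the assumption that the algorithm can be expressed as a function query; the class and package names are hypothetical, not from the thread): write a class extending org.apache.solr.search.ValueSourceParser, compile it against solr-core 5.2.1, package it as a jar, drop the jar into the core's lib/ directory (or a <lib> path in solrconfig.xml), and register it under a name in solrconfig.xml.

    // MyAlgoParser.java - hypothetical example plugin
    import org.apache.lucene.queries.function.ValueSource;
    import org.apache.lucene.queries.function.valuesource.FloatFieldSource;
    import org.apache.solr.search.FunctionQParser;
    import org.apache.solr.search.SyntaxError;
    import org.apache.solr.search.ValueSourceParser;

    public class MyAlgoParser extends ValueSourceParser {
      @Override
      public ValueSource parse(FunctionQParser fp) throws SyntaxError {
        // Parse one field-name argument from the function call.
        String field = fp.parseArg();
        // Placeholder: return the raw field values as the grouping key.
        // The real algorithm would live in a custom ValueSource returned here.
        return new FloatFieldSource(field);
      }
    }

Registered in solrconfig.xml:

    <valueSourceParser name="myalgo" class="com.example.MyAlgoParser"/>

and then used in the query as group=true&group.func=myalgo(some_field)&group.main=true.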
edismax parsing confusion
Hi all, I am running a Solr query with these parameters:

bf: "sum(product(new_popularity,100),if(exists(third_price),50,0))"
qf: "test_product^5 category_path_tf^4 product_id gender"
q: "handbags between rs150 and rs 400"
defType: "edismax"

The parsed query for q is below. Note that the term "and" disappears and the rs150/rs clauses become required (+):

(+(DisjunctionMaxQuery((category_path_tf:handbags^4.0 | gender:handbag | test_product:handbag^5.0 | product_id:handbags)) DisjunctionMaxQuery((category_path_tf:between^4.0 | gender:between | test_product:between^5.0 | product_id:between)) +DisjunctionMaxQuery((category_path_tf:rs150^4.0 | gender:rs150 | test_product:rs150^5.0 | product_id:rs150)) +DisjunctionMaxQuery((category_path_tf:rs^4.0 | gender:rs | test_product:rs^5.0 | product_id:rs)) DisjunctionMaxQuery((category_path_tf:400^4.0 | gender:400 | test_product:400^5.0 | product_id:400))) DisjunctionMaxQuery(("":"handbags between rs150 ? rs 400")) (DisjunctionMaxQuery(("":"handbags between")) DisjunctionMaxQuery(("":"between rs150")) DisjunctionMaxQuery(("":"rs 400"))) (DisjunctionMaxQuery(("":"handbags between rs150")) DisjunctionMaxQuery(("":"between rs150")) DisjunctionMaxQuery(("":"rs150 ? rs")) DisjunctionMaxQuery(("":"? rs 400"))) FunctionQuery(sum(product(float(new_popularity),const(100)),if(exists(float(third_price)),const(50),const(0)/no_coord

But with the dismax parser it works as expected ("and" is kept as an ordinary term and no clause is required):

(+(DisjunctionMaxQuery((category_path_tf:handbags^4.0 | gender:handbag | test_product:handbag^5.0 | product_id:handbags)) DisjunctionMaxQuery((category_path_tf:between^4.0 | gender:between | test_product:between^5.0 | product_id:between)) DisjunctionMaxQuery((category_path_tf:rs150^4.0 | gender:rs150 | test_product:rs150^5.0 | product_id:rs150)) DisjunctionMaxQuery((product_id:and)) DisjunctionMaxQuery((category_path_tf:rs^4.0 | gender:rs | test_product:rs^5.0 | product_id:rs)) DisjunctionMaxQuery((category_path_tf:400^4.0 | gender:400 | test_product:400^5.0 | product_id:400))) DisjunctionMaxQuery(("":"handbags between rs150 ? rs 400")) FunctionQuery(sum(product(float(new_popularity),const(100)),if(exists(float(third_price)),const(50),const(0)/no_coord

As I understand it, the difference between dismax and edismax should only be some extra features plus the handling of boost functions.

Regards, Abhishek
Re: edismax parsing confusion
Hello guys, sorry for the late response. @Steve I am using Solr 5.2. @Greg I am using the default mm from the config file (as far as I know the default is mm=1). Regards, Abhishek On Tue, Apr 4, 2017 at 5:27 AM, Greg Pendlebury wrote: > eDismax uses 'mm', so knowing what that has been set to is important, or if > it has been left unset/default you would need to consider whether 'q.op' > has been set. Or the default operator from the config file. > > Ta, > Greg > > On 3 April 2017 at 23:56, Steve Rowe wrote: > > > Hi Abhishek, > > > > Which version of Solr are you using? > > > > I can see that the parsed queries are different, but they're also very > > similar, and there's a lot of detail there - can you be more specific about > > what the problem is? > > > > -- > > Steve > > www.lucidworks.com
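A likely explanation, though not confirmed in the thread: edismax, unlike dismax, can treat the lowercase words "and"/"or" as boolean operators via its lowercaseOperators parameter, which as far as I know defaults to true on Solr 5.2. That would account for "and" vanishing from the edismax parse and the clauses around it becoming required (+), while dismax simply indexes it as a term (product_id:and). A quick way to test, with debug output enabled:

    /select?defType=edismax
           &q=handbags between rs150 and rs 400
           &qf=test_product^5 category_path_tf^4 product_id gender
           &lowercaseOperators=false
           &debugQuery=true

If lowercaseOperators=false makes the edismax parse match the dismax one, the fix is either to keep that setting or to strip operator-like tokens from user queries at the application layer.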
Inconsistent recovery status of replicas
Hello guys, I am using SolrCloud 7.7 on Kubernetes. When adding a replica we sometimes see inconsistency: after a successful addition the node goes into recovery status, and sometimes it takes 2-3 minutes to recover while other times it takes more than an hour. We are getting the error below. We have 4 shards, each holding around 7GB of data. Looking at the system metrics, bandwidth usage between the leader and the new replica node is high. Do we have any way to rate-limit that transfer, like the configuration master-slave had for it, maxWriteMBPerSec or something like that? Error > 2020-12-01 13:40:34.983 ERROR > (recoveryExecutor-4-thread-1-processing-n:solr-olxid-statefulset-pull-9.solr-olxid-statefulset-headless.relevance:8983_solr > x:olxid-20200531_d6e431ec_shard2_replica_p3955 c:olxid-20200531_d6e431ec > s:shard2 r:core_node3956) [c:olxid-20200531_d6e431ec s:shard2 r:core_node3956 > x:olxid-20200531_d6e431ec_shard2_replica_p3955] o.a.s.c.RecoveryStrategy > Error while trying to > recover:org.apache.solr.client.solrj.SolrServerException: Timeout occured > while waiting response from server at: > http://solr-olxid-statefulset-tlog-7.solr-olxid-statefulset-headless.relevance:8983/solr/olxid-20200531_d6e431ec_shard2_replica_t139 > at > org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:654) > at > org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255) > at > org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244) > at > org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194) > at > org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211) > at > org.apache.solr.cloud.RecoveryStrategy.commitOnLeader(RecoveryStrategy.java:287) > at > org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:215) > at > org.apache.solr.cloud.RecoveryStrategy.doReplicateOnlyRecovery(RecoveryStrategy.java:382) > at > org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:328) > at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:307) > at > com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) > Caused by: java.net.SocketTimeoutException: Read timed out > at java.base/java.net.SocketInputStream.socketRead0(Native Method) > at > java.base/java.net.SocketInputStream.socketRead(SocketInputStream.java:115) > at java.base/java.net.SocketInputStream.read(SocketInputStream.java:168) > at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140) > at > org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) > at > org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) > at > org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282) > at > org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) > at > 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) > at > org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) > at > org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) > at > org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165) > at > org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) > at > org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) > at > org.apache.solr.util.stats.InstrumentedHttpRequestExecutor.execute(InstrumentedHttpRequestExecutor.java:120) > at > org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) > at > org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) > at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) > at > org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110) > at > org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) > at > org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83) >
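For reference, the knob the question is reaching for in master/slave mode (a sketch; SOLR-6485 added a maxWriteMBPerSec option that throttles how fast a follower writes fetched index files, but whether SolrCloud 7.7 recovery honors it is an assumption that needs testing):

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://leader-host:8983/solr/corename</str>
        <!-- cap index-copy writes at roughly 20 MB/s -->
        <str name="maxWriteMBPerSec">20</str>
      </lst>
    </requestHandler>

In SolrCloud the equivalent segment copy happens inside RecoveryStrategy/IndexFetcher, which is the code path in the stack trace above, so verify the behavior on a staging cluster before relying on it.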
Migrating from solr 7.7 to solr 8.6 issues
We are trying to migrate from Solr 7.7 to Solr 8.6 on Kubernetes, using ZooKeeper 3.4.13. While adding a replica to the cluster, the call returns a 500 status code, while in the background the replica is sometimes added successfully and sometimes is left as an inactive node. We are using HTTP/2 without SSL. Error: > { "responseHeader":{ "status":500, "QTime":307}, "failure":{ "solr-pklatest-statefulset-pull-0.solr-pklatest-statefulset-headless.relevance:8983_solr":"org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: null"}, "Operation addreplica caused exception:":"org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: ADDREPLICA failed to create replica", "exception":{ "msg":"ADDREPLICA failed to create replica", "rspCode":500}, "error":{ "metadata":[ "error-class","org.apache.solr.common.SolrException", "root-error-class","org.apache.solr.common.SolrException"], "msg":"ADDREPLICA failed to create replica", "trace":"org.apache.solr.common.SolrException: ADDREPLICA failed to create replica\n\tat org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:65)\n\tat org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:286)\n\tat org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:257)\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214)\n\tat org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:854)\n\tat org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:818)\n\tat org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:566)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:415)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)\n\tat org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:500)\n\tat org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)\n\tat org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)\n\tat org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)\n\tat org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThr
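One mitigation worth trying (an assumption on my part, not a confirmed fix): since the 500 arrives while the replica often does get created in the background, the synchronous call may simply be timing out. The Collections API supports async execution, which returns immediately and lets you poll for the outcome; collection, shard, and request-id values below are placeholders:

    /admin/collections?action=ADDREPLICA&collection=mycoll&shard=shard2&type=pull&async=add-replica-001
    /admin/collections?action=REQUESTSTATUS&requestid=add-replica-001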
solrcloud with EKS kubernetes
Hello guys, We are facing some issues (like timeouts etc.) which are very inconsistent. By any chance could they be related to EKS? We are using Solr 7.7 and ZooKeeper 3.4.13. Should we move to ECS? Regards, Abhishek
Re: solrcloud with EKS kubernetes
Hi Houston, Sorry for the late reply. Each shard is around 9GB. Yes, we are giving the pods enough resources; we are currently using c5.4xlarge, Xms and Xmx are 16GB, and the machine has 32GB and 16 cores. No, I haven't run it outside Kubernetes, but colleagues ran the same setup on 7.2 and didn't face any issue. The storage volume is gp2, 50GB. It's not the search queries where we see inconsistencies or timeouts; some internal admin APIs sometimes have issues, so adding a new replica to the cluster sometimes results in inconsistencies, e.g. recovery taking more than an hour. Regards, Abhishek On Thu, Dec 10, 2020 at 10:23 AM Houston Putman wrote: > Hello Abhishek, > > It's really hard to provide any advice without knowing any information > about your setup/usage. > > Are you giving your Solr pods enough resources on EKS? > Have you run Solr in the same configuration outside of kubernetes in the > past without timeouts? > What type of storage volumes are you using to store your data? > Are you using headless services to connect your Solr Nodes, or ingresses? > > If this is the first time that you are using this data + Solr > configuration, maybe it's just that your data within Solr isn't optimized > for the type of queries that you are doing. > If you have run it successfully in the past outside of Kubernetes, then I > would look at the resources that you are giving your pods and the storage > volumes that you are using. > If you are using Ingresses, that might be causing slow connections between > nodes, or between your client and Solr. > > - Houston
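Worth noting against Houston's storage question (my arithmetic, based on AWS's published gp2 scaling of 3 IOPS per GiB with burst up to 3,000 IOPS on credits): a 50 GB gp2 volume has a baseline of only 150 IOPS, so a full-shard copy during recovery can exhaust its burst credits and then crawl, which would fit the "recovery takes more than an hour" symptom.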
Re: solrcloud with EKS kubernetes
Hi Jonathan, Merry Christmas. Thanks for the suggestion. To manage IOPS, can we do something along the lines of rate-limiting? Regards, Abhishek On Thu, Dec 17, 2020 at 5:07 AM Jonathan Tan wrote: > Hi Abhishek, > > We're running Solr Cloud 8.6 on GKE. > 3 node cluster, running 4 cpus (configured) and 8gb of min & max JVM > configured, all with anti-affinity so they never exist on the same node. > It's got 2 collections of ~13documents each, 6 shards, 3 replicas each, > disk usage on each node is ~54gb (we've got all the shards replicated to > all nodes) > > We're also using a 200gb zonal SSD, which *has* been necessary just so that > we've got the right IOPS & bandwidth. (That's approximately 6000 IOPS for > read & write each, and 96MB/s for read & write each) > > Various lessons learnt... > You definitely don't want them ever on the same kubernetes node. From a > resilience perspective, yes, but also when one SOLR node gets busy, they > tend to all get busy, so now you'll have resource contention. Recovery can > also get very busy and resource intensive, and again, sitting on the same > node is problematic. We also saw the need to move to SSDs because of how > IOPS bound we were. > > Did I mention use SSDs? ;) > > Good luck!
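As a side note on where Jonathan's numbers come from (my arithmetic, assuming GCP's published zonal SSD persistent-disk rates of 30 IOPS per GB for reads and for writes, and 0.48 MB/s per GB): 200 GB x 30 IOPS/GB = 6,000 read and 6,000 write IOPS, and 200 GB x 0.48 MB/s/GB = 96 MB/s each way. Sizing the disk up is therefore a direct way to buy IOPS, independent of how much index actually lives on it.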
How pull replica works
I want to know how a pull replica actually replicates from the leader. Does it internally use an admin API to get the data from the leader in batches? Regards, Abhishek
Re: How pull replica works
Thanks, Tomás. It was really helpful. Regards, Abhishek On Thu, Jan 7, 2021 at 7:03 AM Tomás Fernández Löbbe wrote: > Hi Abhishek, > The pull replicas use the "/replication" endpoint to copy full segment > files (sections of the index) from the leader. It works in a similar way to > the legacy leader/follower replication. This[1] talk tries to explain the > different replica types and how they work. > > HTH, > > Tomás > > [1] https://www.youtube.com/watch?v=C8C9GRTCSzY
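That /replication endpoint can also be queried by hand to watch a pull replica catch up (a sketch; the host and core names are placeholders):

    # index version the replica currently serves
    curl 'http://pull-node:8983/solr/mycoll_shard1_replica_p1/replication?command=indexversion'

    # detailed replication status: generation, file list, bytes fetched
    curl 'http://pull-node:8983/solr/mycoll_shard1_replica_p1/replication?command=details'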
Re: Solr background merge in case of pull replicas
Hi kshitij, Here is what I can guess: pull replicas replicate segments from the tlog replicas, so whenever a merge happens on the tlog it changes more segments than the ideal case (i.e. just adding one new segment). AFAIK adding/deleting segments is something of a stop-the-world moment, which could be the reason for the increase in response time. Regards, Abhishek On Thu, Jan 7, 2021 at 12:43 PM kshitij tyagi wrote: > Hi, > > I am not querying on tlog replicas; the Solr version is 8.6, with a 2 tlog and 4 > pull replica setup. > > Why should pull replicas be affected during background segment merges? > > Regards, > kshitij > > On Wed, Jan 6, 2021 at 9:48 PM Ritvik Sharma > wrote: > > > Hi > > It may be caused by rebalancing, with querying unavailable on the > > tlog at that moment. > > You can check the tlog and pull logs when you are facing this issue. > > > > May I know which version of Solr you are using? And what is the ratio of > > tlog and pull nodes? > > > > On Wed, 6 Jan 2021 at 2:46 PM, kshitij tyagi > > wrote: > > > > > Hi, > > > > > > I am having a tlog + pull replica Solr Cloud setup. > > > > > > 1. I am observing that whenever a background segment merge is triggered > > > automatically, I see high response times on all of my Solr nodes. > > > > > > As far as I know merges must be happening on the tlog replicas, hence the > > > increased response time; I am not able to understand why my pull replicas are > > > affected during background index merges. > > > > > > Can someone give some insights on this? What is affecting my pull > > replicas > > > during index merges? > > > > > > Regards, > > > kshitij
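If the goal is to soften the merge impact, the relevant knobs live in the tlog replicas' solrconfig.xml (a sketch with illustrative values, assuming the default TieredMergePolicy of Solr 8.x):

    <indexConfig>
      <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
        <!-- allow more segments per tier, so merges fire less often -->
        <int name="segmentsPerTier">10</int>
        <int name="maxMergeAtOnce">10</int>
      </mergePolicyFactory>
      <!-- limit concurrent merge work; the scheduler auto-tunes I/O by default -->
      <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
        <int name="maxMergeCount">6</int>
        <int name="maxThreadCount">1</int>
      </mergeScheduler>
    </indexConfig>

Pull replicas then copy whatever segment set results, so fewer and smaller merges on the tlog side also mean smaller bursts of segment files for the pull replicas to fetch.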
Re: solrcloud with EKS kubernetes
Hi Jonathan, it was really helpful. Some of the metrics were indeed crossing thresholds, network bandwidth among them. Regards, Abhishek On Sat, Dec 26, 2020 at 7:54 PM Jonathan Tan wrote: > Hi Abhishek, > > Merry Christmas to you too! > I think it's really a question regarding your indexing speed NFRs. > > Have you had a chance to take a look at your IOPS & write bytes/second > graphs for that host & PVC? > > I'd suggest that's the first thing to go look at, so that you can find out > whether you're actually IOPS bound or not. > If you are, then it becomes a question of *how* you're indexing, and > whether that can be "slowed down" or not.