Hi Jonathan, that was really helpful. Some of the metrics, such as network bandwidth, were crossing their thresholds.
Regards,
Abhishek

On Sat, Dec 26, 2020 at 7:54 PM Jonathan Tan <jty....@gmail.com> wrote:

> Hi Abhishek,
>
> Merry Christmas to you too!
> I think it's really a question regarding your indexing speed NFRs.
>
> Have you had a chance to take a look at your IOPS & write bytes/second
> graphs for that host & PVC?
>
> I'd suggest that's the first thing to go look at, so that you can find out
> whether you're actually IOPS bound or not.
> If you are, then it becomes a question of *how* you're indexing, and
> whether that can be "slowed down" or not.
>
> On Thu, Dec 24, 2020 at 5:55 PM Abhishek Mishra <solrmis...@gmail.com>
> wrote:
>
> > Hi Jonathan,
> > Merry Christmas.
> > Thanks for the suggestion. To manage IOPS, can we do something like
> > rate limiting?
> >
> > Regards,
> > Abhishek
> >
> > On Thu, Dec 17, 2020 at 5:07 AM Jonathan Tan <jty....@gmail.com> wrote:
> >
> > > Hi Abhishek,
> > >
> > > We're running Solr Cloud 8.6 on GKE.
> > > 3 node cluster, running 4 cpus (configured) and 8gb of min & max JVM
> > > configured, all with anti-affinity so they never exist on the same node.
> > > It's got 2 collections of ~13documents each, 6 shards, 3 replicas each,
> > > disk usage on each node is ~54gb (we've got all the shards replicated to
> > > all nodes)
> > >
> > > We're also using a 200gb zonal SSD, which *has* been necessary just so
> > > that we've got the right IOPS & bandwidth. (That's approximately 6000
> > > IOPS for read & write each, and 96MB/s for read & write each)
> > >
> > > Various lessons learnt...
> > > You definitely don't want them ever on the same kubernetes node. From a
> > > resilience perspective, yes, but also when one SOLR node gets busy, they
> > > tend to all get busy, so now you'll have resource contention. Recovery
> > > can also get very busy and resource intensive, and again, sitting on the
> > > same node is problematic. We also saw the need to move to SSDs because
> > > of how IOPS bound we were.
> > >
> > > Did I mention use SSDs? ;)
> > >
> > > Good luck!
> > >
> > > On Mon, Dec 14, 2020 at 5:34 PM Abhishek Mishra <solrmis...@gmail.com>
> > > wrote:
> > >
> > > > Hi Houston,
> > > > Sorry for the late reply. Each shard is around 9GB in size.
> > > > Yeah, we are providing enough resources to the pods. We are currently
> > > > using c5.4xlarge.
> > > > Xms and Xmx are both 16GB. The machine has 32 GB and 16 cores.
> > > > No, I haven't run it outside Kubernetes. But I do have colleagues who
> > > > did the same on 7.2 and didn't face any issue with it.
> > > > Storage volume is gp2 50GB.
> > > > It's not search queries where we are facing inconsistencies or
> > > > timeouts. It seems some internal admin APIs sometimes have issues, so
> > > > adding a new replica to the cluster sometimes results in
> > > > inconsistencies. For example, recovery sometimes takes more than one
> > > > hour.
> > > >
> > > > Regards,
> > > > Abhishek
> > > >
> > > > On Thu, Dec 10, 2020 at 10:23 AM Houston Putman <houstonput...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hello Abhishek,
> > > > >
> > > > > It's really hard to provide any advice without knowing any
> > > > > information about your setup/usage.
> > > > >
> > > > > Are you giving your Solr pods enough resources on EKS?
> > > > > Have you run Solr in the same configuration outside of kubernetes in
> > > > > the past without timeouts?
> > > > > What type of storage volumes are you using to store your data?
> > > > > Are you using headless services to connect your Solr Nodes, or
> > > > > ingresses?
> > > > >
> > > > > If this is the first time that you are using this data + Solr
> > > > > configuration, maybe it's just that your data within Solr isn't
> > > > > optimized for the type of queries that you are doing.
> > > > > If you have run it successfully in the past outside of Kubernetes,
> > > > > then I would look at the resources that you are giving your pods and
> > > > > the storage volumes that you are using.
> > > > > If you are using Ingresses, that might be causing slow connections
> > > > > between nodes, or between your client and Solr.
> > > > >
> > > > > - Houston
> > > > >
> > > > > On Wed, Dec 9, 2020 at 3:24 PM Abhishek Mishra <solrmis...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hello guys,
> > > > > > We are facing some issues (like timeouts etc.) which are very
> > > > > > inconsistent. By any chance, could this be related to EKS? We are
> > > > > > using Solr 7.7 and ZooKeeper 3.4.13. Should we move to ECS?
> > > > > >
> > > > > > Regards,
> > > > > > Abhishek
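
For anyone following this thread later: the disk figures quoted above (a 200gb zonal SSD giving roughly 6000 IOPS and 96MB/s) imply limits that scale with volume size. The sketch below is only a rough illustration of that arithmetic and of the "are we IOPS bound?" check suggested in the thread. The function names and the 90% headroom value are hypothetical, and the gp2 numbers (3 IOPS per GB baseline, with a floor of 100) are an assumption about that volume type rather than something stated in the thread; please check your provider's documentation before relying on them.

```python
# Per-GB rates implied by the figures quoted in the thread:
# 200 GB zonal SSD ~ 6000 IOPS and ~96 MB/s (each for read and write).
SSD_IOPS_PER_GB = 6000 / 200   # 30 IOPS per GB
SSD_MBPS_PER_GB = 96 / 200     # 0.48 MB/s per GB

# ASSUMPTION (not from the thread): gp2 baseline of 3 IOPS per GB
# with a minimum of 100 IOPS. Burst behaviour is ignored here.
GP2_IOPS_PER_GB = 3
GP2_MIN_IOPS = 100


def scaled_limit(size_gb, rate_per_gb, floor=0):
    """Estimate a size-proportional disk limit, never below `floor`."""
    return max(size_gb * rate_per_gb, floor)


def is_near_limit(observed, limit, headroom=0.9):
    """Return True when the observed value reaches `headroom` of the limit.

    `headroom` is a hypothetical alerting threshold, not a provider value.
    """
    return observed >= headroom * limit


# Limits for the two volumes mentioned in the thread.
ssd_iops = scaled_limit(200, SSD_IOPS_PER_GB)
ssd_mbps = scaled_limit(200, SSD_MBPS_PER_GB)
gp2_iops = scaled_limit(50, GP2_IOPS_PER_GB, floor=GP2_MIN_IOPS)

if __name__ == "__main__":
    print(f"200 GB zonal SSD: ~{ssd_iops:.0f} IOPS, ~{ssd_mbps:.0f} MB/s")
    print(f"50 GB gp2 (assumed baseline): ~{gp2_iops:.0f} IOPS")
    # Example: compare a made-up observed IOPS value against the gp2 estimate.
    print("near limit:", is_near_limit(observed=140, limit=gp2_iops))
```

Comparing the peak values from the IOPS and write bytes/second graphs against estimates like these is one way to tell whether slow replica recovery is a disk limit or something else.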