We personally run Solr on Google Cloud's Kubernetes Engine, and each node has 512GB of persistent SSD (network-attached) storage, which gives roughly this performance (read/write):
Sustained random IOPS limit:        15,360 read / 15,360 write
Sustained throughput limit (MB/s):  245.76 read / 245.76 write
and we get very good performance.
Hi all,
What about CephFS or Lustre distributed filesystems for such a purpose?
Walter’s comment (that I’ve seen too BTW) is something
to pursue if (and only if) you have proof that Solr is spinning
up thousands of threads. Do you have any proof of that?
Having several hundred threads running is quite common BTW.
Attach jconsole or take a thread dump and it’ll be obvious.
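If you want a quick number rather than eyeballing the dump, something like this rough sketch works; it assumes jstack (from the JDK) is on the PATH and that you know the Solr PID:

# Count the threads in a running Solr JVM from a jstack thread dump.
import subprocess
import sys

def count_threads(pid: int) -> int:
    dump = subprocess.run(
        ["jstack", str(pid)], capture_output=True, text=True, check=True
    ).stdout
    # Every thread entry in a jstack dump starts with a quoted thread name.
    return sum(1 for line in dump.splitlines() if line.startswith('"'))

if __name__ == "__main__":
    n = count_threads(int(sys.argv[1]))
    print(f"Solr JVM is running {n} threads")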
Could we expose some high-level recovery info as part of the metrics API? Then
people could track the number of cores recovering, recovery time, recovery phase,
number of failed recoveries, etc., and also build alerts on top of that.
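A rough sketch of what consuming such metrics could look like; /solr/admin/metrics is the existing Metrics API endpoint, but the recovery gauge names below (CORE.recovering, CORE.recoveryPhase) are hypothetical, since the whole point of the proposal is that they don't exist yet:

# Poll the Solr Metrics API and report cores that are recovering.
# The recovery gauge names below are hypothetical -- they are the proposal.
import requests

SOLR = "http://localhost:8983"

def report_recoveries() -> None:
    resp = requests.get(f"{SOLR}/solr/admin/metrics", params={"wt": "json"})
    resp.raise_for_status()
    for registry, values in resp.json()["metrics"].items():
        if not registry.startswith("solr.core."):
            continue
        if values.get("CORE.recovering"):             # hypothetical gauge
            phase = values.get("CORE.recoveryPhase")  # hypothetical gauge
            print(f"{registry}: recovering, phase={phase}")

if __name__ == "__main__":
    report_recoveries()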
Jan Høydahl
I was wondering about using metrics myself. I confess I didn't look to see what
was already there either ;)
Actually, using metrics might be easiest all told, but I also confess I have no
clue what it takes to build a new metric in, nor how to use the same (?)
collection process for the 5 situations.
I wrote some Python that checks CLUSTERSTATUS and reports replica status to
Telegraf. Great for charts and alerts, but it only shows status, not progress.
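For anyone curious, a minimal sketch of that kind of check (not Walter's actual script); it assumes Telegraf is listening with its influxdb_listener input on localhost:8186, which is not part of the original description:

# Poll CLUSTERSTATUS and forward per-state replica counts to Telegraf.
from collections import Counter
import requests

SOLR = "http://localhost:8983"
TELEGRAF = "http://localhost:8186/write"  # influxdb_listener (assumed)

def replica_states() -> Counter:
    resp = requests.get(
        f"{SOLR}/solr/admin/collections",
        params={"action": "CLUSTERSTATUS", "wt": "json"},
    )
    resp.raise_for_status()
    states = Counter()
    for coll in resp.json()["cluster"]["collections"].values():
        for shard in coll["shards"].values():
            for replica in shard["replicas"].values():
                states[replica["state"]] += 1  # active, recovering, down, ...
    return states

def report(states: Counter) -> None:
    # One InfluxDB line-protocol point per replica state.
    lines = "\n".join(
        f"solr_replicas,state={s} count={c}" for s, c in states.items()
    )
    requests.post(TELEGRAF, data=lines).raise_for_status()

if __name__ == "__main__":
    report(replica_states())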
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
Hello Everyone,
Let's say I have an analyzer which has the following token stream as output:
token stream: [], a, ab, [], c, [], d, de, def
Now let's say I want to add another filter which will drop certain tokens
based on whether the adjacent token on the right side is [] or some string.
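To make the intent concrete, here is the rule sketched over a plain Python list rather than a real Lucene TokenFilter; it assumes one possible reading, namely that a token is dropped when the token to its right is the empty marker []:

# Drop a token whenever its right-hand neighbour is the empty marker [].
# This is only one reading of the rule, sketched over a plain list.
EMPTY = "[]"

def drop_before_empty(tokens: list[str]) -> list[str]:
    out = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if nxt == EMPTY:
            continue  # right-hand neighbour is [], so drop this token
        out.append(tok)
    return out

print(drop_before_empty(["[]", "a", "ab", "[]", "c", "[]", "d", "de", "def"]))
# -> ['[]', 'a', '[]', '[]', 'd', 'de', 'def']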
Hi All,
Our dataset is 50M records. We are using a complex graph query, and now we are
trying to do an innerJoin on the records and facing the issue below.
This is a critical issue for us.
Parent
{
  parentId: "1",
  parent.name: "foo",
  type: "parent"
}
Child
{
  childId: "2",
  parentId: "1",
  child.name: "bar",
  type: "child"
}
This is working as designed, I believe. The issue is that innerJoin relies on
the sort order of the streams in order to perform a streaming merge join. The
first join works because the sorts line up on childId.
innerJoin(search(collection_name,
q="type:grandchild",