Re: REBALANCELEADERS is not reliable

2019-01-09 Thread Bernd Fehling

Yes, your findings are also very strange.
I wonder if we can discover the "inventor" of all this and ask him
how it should work, or better, how he originally intended it to work.

Comments in the code (RebalanceLeaders.java) state that it is possible
to have more than one electionNode with the same sequence number.
Absolutely strange.

I wonder why the queue is not rotated until the new and preferred
leader is at the front (position 0)?
But why is it a queue anyway?
Wherever I see Java code that gets the content from the queue, it
is sorted. What is the point of that?

Also, the electionNodes have another attribute named "ephemeral".
What is that for, and why is it not tested in TestRebalanceLeaders.java?

Regards, Bernd


On 09.01.19 at 02:31, Erick Erickson wrote:

It's weirder than that. In the current test on master, the
assumption is that the node recorded as leader in ZK
is actually the leader, see
TestRebalanceLeaders.checkZkLeadersAgree(). The theory
is that the identified leader node in ZK is actually the leader
after the rebalance command. But you're right, I don't see
an actual check that the collection's status agrees.

That aside, though, there are several problems I'm uncovering

1> BALANCESHARDUNIQUE can wind up with multiple
"preferredLeader" properties defined. Some time between
the original code and now, someone refactored a bunch of
code and missed removing the unique property from a replica
that already had it when assigning it to another replica
in the same slice.

2> To make it much worse, I've rewritten the tests
extensively and I can beast the rewritten tests 1,000
times with no failures. If I test manually by just issuing
the commands, everything works fine. By "testing manually"
I mean (working with 4 VMs, 10 shards, 4 replicas):

create the collection
issue the BALANCESHARDUNIQUE command
issue the REBALANCELEADERS command



However, if instead I

create the collection
issue the BALANCESHARDUNIQUE command
shut down 3 of the 4 Solr instances so all the leaders are on the same host

restart the 3 instances
issue the REBALANCELEADERS command then

it doesn't work.

At least that's what I think I'm seeing, but it makes no
real sense yet.

So I'm first trying to understand why my manual test
fails so regularly, then I can incorporate that setup
into the unit test (I'm thinking of just shutting down
and restarting some of the Jetty instances).

But it's a total mystery to me why restarting Solr instances
should have any effect. But that's certainly not
something that happens in the current test so I have
hopes that tracking that down will lead to understanding
what invalid assumption I'm making, and we can
test for that too.

On Tue, Jan 8, 2019 at 1:42 AM Bernd Fehling wrote:


Hi Erick,

after some more hours of debugging, the rough result is: whoever invented
this leader election did not check whether an action returns the expected
result. There are only checks for exceptions, true/false, new sequence
numbers and so on, but never whether a leader election to the preferredLeader
really took place.

When doing a rebalanceleaders to a preferredLeader, I also have to check that:
- a rebalance took place
- the preferredLeader has really become leader (and not anyone else)

Currently this is not checked, and calling rebalanceleaders to a preferredLeader
is like a shot in the dark with hope of success. And that's why these
problems have never been discovered or reported.

Bernd


On 21.12.18 at 18:00, Erick Erickson wrote:

I looked at the test last night and it's...disturbing. It succeeds
100% of the time. Manual testing seems to fail very often.
Of course it was late and I was a bit cross-eyed, so maybe
I wasn't looking at the manual tests correctly. Or maybe the
test is buggy.

I beasted the test 100x last night and all of them succeeded.

This was with all NRT replicas.

Today I'm going to modify the test into a stand-alone program
to see if it's something in the test environment that causes
it to succeed. I've got to get this to fail as a unit test before I
have confidence in any fixes, and also confidence that things
like this will be caught going forward.

Erick

On Fri, Dec 21, 2018 at 3:59 AM Bernd Fehling wrote:


As far as I could see with the debugger, there is still a problem in requeueing.

There is a watcher, and it is recognized that the watcher is not a
preferredLeader.
So it tries to locate a preferredLeader, with success.
It then calls makeReplicaFirstWatcher and gets a new sequence number for
the preferredLeader replica. But now we have two replicas with the same
sequence number: the replica which already owned that sequence number, and
the replica which just received the same number as its new sequence number.
It now tries to solve this with queueNodesWithSameSequence.
Might be something in rejoinElection.
At least the call to rejoinElection seems right. For the preferredLeader,
rejoinAtHead is true, and for the other replica with the same sequence number,
rejoinAtHead is false.

A test case s

Solr Cloud wiping all cores when restart without proper zookeeper directories

2019-01-09 Thread Yogendra Kumar Soni
We are running a Solr Cloud cluster using Solr 7.4 with 8 shards. When we
started our Solr Cloud with a ZooKeeper node (without the collections directory
but with only solr.xml and configs), our data directory containing
core.properties and the cores' data became empty.




-- 
*Thanks and Regards,*
*Yogendra Kumar Soni*


how to recover state.json files

2019-01-09 Thread Yogendra Kumar Soni
How can we know attributes like shard names and hash ranges, with their
associated core names, if we lost the state.json file from ZooKeeper?
core.properties only contains core-level information, but hash ranges are
not stored there.

Does Solr store collection and shard information anywhere?



-- 
*Thanks and Regards,*
*Yogendra Kumar Soni*


Re: how to recover state.json files

2019-01-09 Thread Bernd Fehling

Have you lost dataDir from all ZooKeepers?

If not, first take a backup of the remaining dataDir and then start that ZooKeeper.
Use ZooInspector to connect to that ZooKeeper at localhost and get your
state.json, including all other configs and settings.


On 09.01.19 at 12:25, Yogendra Kumar Soni wrote:

How can we know attributes like shard names and hash ranges, with their
associated core names, if we lost the state.json file from ZooKeeper?
core.properties only contains core-level information, but hash ranges are
not stored there.

Does Solr store collection and shard information anywhere?





Re: Haystack Relevance Conference Announced; CFP ends Jan 9!

2019-01-09 Thread Charlie Hull

Hi all,

Just to let you know the CFP has been extended until January 30th and 
we're really looking forward to seeing your proposals! 
http://haystackconf.com


Cheers

Charlie


On 27/11/2018 22:33, Doug Turnbull wrote:

Hey everyone,

Many of you may know about/have been to Haystack - The Search Relevance
Conference.
http://haystackconf.com

We're excited to announce 2019's Haystack, April 22-25 in Charlottesville,
VA, USA. Our CFP is due January 9th.

We want to bring together practitioners who work on really interesting
search relevance problems. We want talks that really get into the
nitty-gritty of improving relevance: technically meaty talks
in applied Information Retrieval based on open-source search.

We know the Solr community is chock full of great ideas and problems
solved, and we look forward to hearing about the tough problems you've
solved with Solr/Lucene/Elasticsearch/Vespa/A Team of Trained
Hamsters/whatever.

Best
-Doug




--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


Re: Solr Cloud wiping all cores when restart without proper zookeeper directories

2019-01-09 Thread Erick Erickson
Solr doesn't just remove directories; this is very likely
something in your environment that's doing it.

In any case, there's no information here to help
diagnose. You must tell us _exactly_ what steps
you take in order to have any hope of helping.

Best,
Erick

On Wed, Jan 9, 2019 at 2:48 AM Yogendra Kumar Soni wrote:
>
> We are running a solr cloud cluster using solr 7.4 with 8 shards. When we
> started our solr cloud with a zookeeper node (without collections directory
> but with only solr.xml and configs) our data directory containing
> core.properties and cores data becomes empty.
>
>
>
>
> --
> *Thanks and Regards,*
> *Yogendra Kumar Soni*


Re: how to recover state.json files

2019-01-09 Thread Erick Erickson
How did you "lose" the data? Exactly what happened?

Where does the dataDir variable point in your
zoo.cfg file? By default it points to /tmp/zookeeper,
which can be deleted by the operating system when
the machine is restarted.

Otherwise you can get/put arbitrary znodes by
using "bin/solr zk cp". Try "bin/solr zk -help" to
see the options. What I'd do to start is create
a new collection and use the state.json
as a template.

Assuming, of course, that Bernd's suggestion
is impossible.

Best,
Erick

On Wed, Jan 9, 2019 at 5:20 AM Bernd Fehling wrote:
>
> Have you lost dataDir from all zookeepers?
>
> If not, first take a backup of remaining dataDir and then start that 
> zookeeper.
> Take ZooInspector to connect to dataDir at localhost and get your
> state.json including all other configs and settings.
>
>
> On 09.01.19 at 12:25, Yogendra Kumar Soni wrote:
> > How to know attributes like shard name and hash ranges with associated core
> > names if we lost state.json file from zookeeper.
> > core.properties only contains core level information but hash ranges are
> > not stored there.
> >
> > Does Solr store collection and shard information anywhere?
> >
> >
> >


Re: how to recover state.json files

2019-01-09 Thread Gus Heck
Not a direct solution, but manipulating data in Zookeeper can be made
easier with https://github.com/rgs1/zk_shell

On Wed, Jan 9, 2019 at 10:26 AM Erick Erickson wrote:

> How did you "lose" the data? Exactly what happened?
>
> Where does the dataDir variable point in your
> zoo.cfg file? By default it points to /tmp/zookeeper,
> which can be deleted by the operating system when
> the machine is restarted.
>
> Otherwise you can get/put arbitrary znodes by
> using "bin/solr zk cp". Try "bin/solr zk -help" to
> see the options. What I'd do to start is create
> a new collection and use the state.json
> as a template.
>
> Assuming, of course, that Bernd's suggestion
> is impossible.
>
> Best,
> Erick
>
> On Wed, Jan 9, 2019 at 5:20 AM Bernd Fehling wrote:
> >
> > Have you lost dataDir from all zookeepers?
> >
> > If not, first take a backup of remaining dataDir and then start that
> zookeeper.
> > Take ZooInspector to connect to dataDir at localhost and get your
> > state.json including all other configs and settings.
> >
> >
> > On 09.01.19 at 12:25, Yogendra Kumar Soni wrote:
> > > How to know attributes like shard name and hash ranges with associated
> core
> > > names if we lost state.json file from zookeeper.
> > > core.properties only contains core level information but hash ranges
> are
> > > not stored there.
> > >
> > > Does Solr store collection and shard information anywhere?
> > >
> > >
> > >
>


-- 
http://www.the111shift.com


Re: Web Server HTTP Header Internal IP Disclosure SOLR port

2019-01-09 Thread Gus Heck
This sounds like something that might crop up if the admin UI were exposed
to an alternate (or public) network space through a tunnel or proxy. The
server knows nothing about the proxy/tunnel, and the cloud page has nice
clickable machine names that point at the internal dns or ip names of the
nodes. This does not, however, give access to said nodes or the network
space. One might, I suppose, worry that it reveals which internal IP space is
in use, but if someone you don't trust with that information can already
see the admin UI, you have much bigger problems.

On Mon, Jan 7, 2019 at 3:15 AM Jan Høydahl wrote:

> Are you saying that the redirect from http://my.ip:8983/ to
> http://my.ip:8983/solr/ is a security issue for you? Please tell us how
> this could be by providing a real example where you believe that Solr
> exposes some secret information that the requesting client should not gain
> access to? Remember that Solr is not any random Web server and must be
> firewalled and not exposed to the internet. Your security scan tool may
> have other assumptions?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> > On 7 Jan 2019 at 05:55, Muniraj M wrote:
> >
> > Hi,
> >
> > I am using Apache SOLR 6.6.5 as my search engine and when we do security
> > scan on our server, we got the below response
> >
> > *When processing the following request : GET / HTTP/1.0 this web server
> > leaks the following private IP address : X.X.X.X as found in the
> following
> > collection of HTTP headers : HTTP/1.1 302 Found
> > Location: http://X.X.X.X:8983/solr/
> >  Content-Length: 0*
> >
> > I have searched for quite some time but haven't found any solution to fix this
> > problem. Any idea of how to solve this would be really appreciated.
> >
> > --
> > Regards,
> > *Muniraj M*
>
>

-- 
http://www.the111shift.com


Re: REBALANCELEADERS is not reliable

2019-01-09 Thread Erick Erickson
Executive summary:

The central problem is "how can I insert an ephemeral node
at a specific place in a ZK queue?". The code could be much,
much simpler if there were a reliable way to do just that. I haven't
looked at more recent ZK versions to see if it's possible; I'd love it if
there were a better way.

On to details:

bq.  wonder if we can discover the "inventor" of all this and ask him
how it should work

Yeah, I can contact that clown. That would be me ;)

The way leader election works is a ZK recipe where each
ephemeral node only watches the one in front of it. When a
node is deleted, the one watching it is notified.

So let's say we have nodes in this order: 1 watched by 2
watched by 3... In this case 1 is the leader. If 2 should
disappear, 3 gets notified and now watches 1. Nothing
else really happens.

But if the nodes are 1, 2, 3 and 1 disappears 2 gets notified
and says, in effect "I'm first in line so I will be leader". 3
doesn't get any notification.
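The two scenarios above can be sketched as a toy simulation (hypothetical Python, not Solr code): nodes form a queue, and removing a node notifies only the node that was watching it.

```python
# Toy simulation of the ZK leader-election recipe described above:
# each ephemeral node watches only the node directly in front of it.
def remove_node(queue, node):
    """Remove `node` from the queue; return the node whose watch fires, if any."""
    i = queue.index(node)
    queue.pop(i)
    # The node that was watching `node` now sits at index i; it is the
    # only one notified, and it re-watches its new predecessor.
    return queue[i] if i < len(queue) else None

# Scenario 1: a middle node disappears -- no leader change.
queue = ["node1", "node2", "node3"]      # node1 is the leader
assert remove_node(queue, "node2") == "node3"
assert queue[0] == "node1"               # node1 is still the leader

# Scenario 2: the leader disappears -- only the second-in-line is notified.
queue = ["node1", "node2", "node3"]
assert remove_node(queue, "node1") == "node2"
assert queue[0] == "node2"               # node2: "I'm first in line, I'm leader"
```

The key property is that at most one watch fires per removal, which is exactly why inserting the preferred leader second in the queue matters so much.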

bq. I wonder why the queue is not rotated until the new and preferred
leader is at front (position 0)

Because then you'd have N leader changes where N is the number
of nodes between the preferred leader and the actual leader in
the leader election queue at the start. That could result in 100s of
changes when only 1 is desired. The original case for this was
exactly that, there could be 100s of shards and several tens
to 100s of replicas.

Hmmm, I suppose you wouldn't have that many leader changes if
you sent the ephemeral node in the _second_ position to the end
of the queue until the preferredLeader became the one in the second
position. You'd still have a watch fired for every requeueing though. I
haven't measured the cost there. That would also be an added
burden for the Overseer, which has been overwhelmed in the past.

I'm not against that solution; I just don't have any real data to
evaluate it with.

bq. it is possible to have more than one electionNode with the
same sequence number.

Yeah, and it causes complexity, but I don't have a good way around it. This is
related to your sorting question. ZK itself has a simple ordering, sequential
sequence numbers. Having two the same is the only way I could see (actually I
patterned it off some other code) to insert an ephemeral node second. What you
have then is two nodes "tied" for second by having the same sequence numbers.

Which one ZK thinks is second (and
thus which one becomes the leader if the zero'th ephemeral node disappears)
is based on sorting which includes the session ID, so there's code in there that
has to deal with sending the non-preferred node that's tied for second to the
end of the queue. That's the code that got changed during various
refactorings that I didn't take part in, and the code that's messed up.

bq. Wherever I see any java code to get the content from the queue it
is sorted. Where is the sense of this?

This is implied by the above, but explicitly so the Java code can see the queue
in the same order that ZK does and "do the right thing". In this case
assign the preferred leader's ephemeral node with the same sequence number
that the current second-in-line has and move the current second-in-line to the
end of the queue.
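The tie-breaking Erick describes can be sketched like this (hypothetical Python; the node-name format and session IDs here are made up, not Solr's actual znode names):

```python
# Sketch: two election nodes "tied" at the same sequence number are
# ordered by a secondary key that includes the session id.
nodes = [
    ("n_0000000000", 0x100),  # current leader, sequence 0
    ("n_0000000001", 0x2fe),  # original second-in-line
    ("n_0000000001", 0x17a),  # preferredLeader, re-queued with the SAME sequence
]

# Order by (sequence, session id). The tie at sequence 1 is broken by
# session id, which Solr does not control -- hence the extra code that
# sends the non-preferred "twin" to the end of the queue.
ordered = sorted(nodes, key=lambda n: (n[0], n[1]))
assert ordered[0][0] == "n_0000000000"        # the leader still sorts first
assert ordered[1] == ("n_0000000001", 0x17a)  # tie won purely by session id
```

Because the tie-break is arbitrary from Solr's point of view, the Java code must sort the queue the same way ZK does before deciding which tied node to requeue.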

All that said, this code was written several years ago and I haven't looked at
whether there are better methods available now. The actions that are necessary
are:

1> ensure that the preferredLeader is the only node watching the leader in the
leader election queue

2> re-queue the leader at the end of the leader election queue. Since we'd be
sure the preferredLeader is watching the leader, this action would elect
the proper node as the leader.

Hmmm, note to self. It would help folks in the future if I, you know, documented
those two points in the code. Siiggghhh.

Last night I found what I  _think_ is the problem I was having. Note that the
current code has three problems. I think I have fixes for all of them:

1> assigning the preferredLeader (or any SHARDUNIQUE property) does not
 properly remove that property from other replicas in the shard if
present. So you may have multiple preferredLeaders in a shard.

2> the code to resolve tied sequence numbers had been changed
 during some refactoring so the wrong node could be elected.

3> the response from the rebalanceleaders command isn't very useful, it's
 on my plate to fix that. Partly it was not reporting useful
 info, and partly your comment from the other day that it returns
 without verifying the leadership has actually changed is well taken. At
 present, it just changes the election queue and assumes that the
 right thing happens. The test code was supposed to point out when
 that assumption was incorrect, but you know the story there.

Currently, the code is pretty ugly in terms of all the junk I put in trying to
track this down, but when I clean it up I'll put up a patch. I added some
code to restart some of the jettys in the test (it's now "@Slow") that
catches th

Concurrent User

2019-01-09 Thread Senthil0809
Hi Team,

I am new to this tool and we are planning to implement Apache Solr for
a search and match process. Here I have added some of my requirements.

1. We have 500 concurrent users for the search and match process to pull
details from records.
2. Around 5 requests will happen on a daily basis.

Please suggest what requirements need to be added in Solr.

Thanks,
Senthil Kumar



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Frank Greguska
Hello,

I am trying to bootstrap a SolrCloud installation and I ran into an issue
that seems rather odd. I see it is possible to bootstrap a configuration
set from an existing SOLR_HOME using

./server/scripts/cloud-scripts/zkcli.sh -zkhost ${ZK_HOST} -cmd bootstrap
-solrhome ${SOLR_HOME}

but this does not create a collection, it just uploads a configuration set.

Furthermore, I can not use

bin/solr create

to create a collection and link it to my bootstrapped configuration set
because it requires Solr to already be running.

I'm hoping someone can shed some light on why this is the case. It seems
like a collection is just some znodes stored in ZooKeeper that contain
configuration settings and such. Why should I not be able to create those
znodes before Solr is running?

I'd like to open a feature request for this if one does not already exist
and if I am not missing something obvious.

Thank you,

Frank Greguska


Re: Concurrent User

2019-01-09 Thread Shawn Heisey

On 1/9/2019 9:00 AM, Senthil0809 wrote:

  I am new to this tool and we are planning to implement Apache Solr for
a search and match process. Here I have added some of my requirements.

1. We have 500 concurrent users for the search and match process to pull
details from records.
2. Around 5 requests will happen on a daily basis.

Please suggest what requirements need to be added in Solr.


We cannot answer that question. There's not enough information here to 
even make a guess.


Yes, I'm being completely serious.

And even with more information, all we will be able to do is guess.  The 
guesses we can make will always err on the side of more hardware with 
higher cost, and usually you'll be reminded that it's just a guess that 
could turn out to be wrong.


The only way to REALLY know what you're going to need is to actually 
build it and try it.


https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Thanks,
Shawn



Re: Solr Cloud wiping all cores when restart without proper zookeeper directories

2019-01-09 Thread lstusr 5u93n4
We've seen the same thing on solr 7.5 by doing:
 - create a collection
 - add some data
 - stop solr on all servers
 - delete all contents of the solr node from zookeeper
 - start solr on all nodes
 - create a collection with the same name as in the first step

When doing this, solr wipes out the previous collection data and starts
new.

In our case, this was due to a startup script that checked for the
existence of a collection and created it if non-existent.  When not present
in ZK, solr (as it should) didn't return that collection in its list of
collections so we created it...

Possible that you have something similar in your workflow?

Kyle

On Wed, 9 Jan 2019 at 10:22, Erick Erickson wrote:

> Solr doesn't just remove directories, this is very likely
> something in your environment that's doing this.
>
> In any case, there's no information here to help
> diagnose. You must tell us _exactly_ what steps
> you take in order to have any hope of helping.
>
> Best,
> Erick
>
> On Wed, Jan 9, 2019 at 2:48 AM Yogendra Kumar Soni
>  wrote:
> >
> > We are running a solr cloud cluster using solr 7.4 with 8 shards. When we
> > started our solr cloud with a zookeeper node (without collections
> directory
> > but with only solr.xml and configs) our data directory containing
> > core.properties and cores data becomes empty.
> >
> >
> >
> >
> > --
> > *Thanks and Regards,*
> > *Yogendra Kumar Soni*
>


Re: Re: Re: Page faults

2019-01-09 Thread Branham, Jeremy (Experis)
Thanks for the information Erick –
I’ve learned there are 2 ‘classes’ of documents being stored in this collection.
There are about 4x as many documents in class A as class B.
When the documents are indexed, the document ID includes the key prefix like 
‘A/1!’ or ‘B/1!’, which I understand spreads the documents over ½ of the 
available shards.

I don’t suppose there is a way to say “I want 75% of the shards to store class 
A, and 25% to store class B”.
If we dropped the ‘/1’ from the prefix, all the documents would be indexed on a 
single shard, correct?
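The bit mechanics behind the "A/1!" prefix can be sketched as follows. This is an illustration only: Solr actually routes with MurmurHash3, and crc32 is used here purely as a stand-in hash, so the concrete values differ from Solr's; only the bit-splitting logic is the point.

```python
import zlib

# Illustration of compositeId routing with an explicit bits count, as in
# "A/1!doc42". crc32 is a stand-in for Solr's MurmurHash3.
def composite_hash(prefix, doc_id, bits):
    """Top `bits` bits come from the prefix hash, the rest from the doc-id hash."""
    prefix_hash = zlib.crc32(prefix.encode()) & 0xFFFFFFFF
    id_hash = zlib.crc32(doc_id.encode()) & 0xFFFFFFFF
    low_mask = (1 << (32 - bits)) - 1
    return (prefix_hash & ~low_mask & 0xFFFFFFFF) | (id_hash & low_mask)

# With bits=1 ("A/1!"), every "A" document keeps the top bit of hash("A"),
# so all of them land in the same half of the 32-bit hash ring,
# i.e. half of the shards.
halves = {composite_hash("A", f"doc{i}", 1) >> 31 for i in range(1000)}
assert len(halves) == 1

# With the default bits=16 (a plain "A!" prefix), the top 16 bits are
# fixed, confining "A" documents to one narrow slice of the ring --
# typically a single shard.
slice16 = {composite_hash("A", f"doc{i}", 16) >> 16 for i in range(1000)}
assert len(slice16) == 1
```

So yes: dropping the "/1" pins the top 16 bits of the hash, which with a modest shard count means all documents sharing the prefix index to one shard.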


Currently, half the servers are under heavy load, and the other half are 
under-utilized. [8 servers total, 4 shards with replication factor of 2]
I’ve considered a few remedies, but I’m not sure which would be best.

We could drop the document ID prefix and let SOLR distribute the documents 
evenly, then use a discriminator field to filter queries.
- Requires re-indexing
- Code changes in our APIs and indexing process
We could create 2 separate collections.
- Requires re-indexing
- Code changes in our APIs and indexing process
- Lost ability to query all the docs at once
We could split the shards.
- More than 1 shard would be on a node. What if we end up with 2 big replicas 
on a single node?

If we split the shards, I’m unsure how the prefix would work in this scenario.
Would ‘A/1!’ continue to use the original shard range?

Like if we split just the 2 big shards –
4 shards become 6
Does ‘A/1!’ spread the documents across 3 shards [half of the new total] or 
across the 4 new shards?

Or if we split all 4 shards, ‘A/1!’ should spread across 8 shards, which would 
be half of the new total.
Could it be difficult trying to balance 8 shards across 8 servers?
I’m concerned 2 big shards would end up on the same server, and we would have 
imbalance again.

I think dropping the prefix all-together would be the easiest to maintain and 
scale, but has a code-impact on our apps.
Or maybe I’m over-thinking the complexity of splitting the shards, and they 
will balance out naturally.

I’ll split the shards in our test environment to see what happens.

 
Jeremy Branham
jb...@allstate.com

On 1/7/19, 6:13 PM, "Erick Erickson"  wrote:

having some replicas at 90G and some at 18G is totally unexpected with
compositeId routing unless you're using "multi-level routing", see:

https://lucidworks.com/2014/01/06/multi-level-composite-id-routing-solrcloud/

But let's be clear what we're talking about here. I'm talking about
specifically the size of the index on disk for any particular
_replica_, meaning the size in places similar to:
pdv201806_shard1_replica1/data/index. I've never seen as much
disparity as you're talking about so we should get to the bottom of
that.

Do you have massive numbers of deleted docs in any of those shards?
The admin screen for any particular replica will show this number.


On another note: Your cache sizes are probably not part of the page
fault question, but on the surface they're badly misconfigured, at
least the filterCache and queryResultCache. Each entry in the
filterCache is a map entry, the key is roughly the query and the value
is bounded by maxDoc/8. So if you have, say, 8M documents, your
filterCache could theoretically be 1M each (give or take) and you
could have up to 20,000 of them. You're probably just being lucky and
either not having very many distinct fq clauses or are indexing often
enough that it isn't growing for very long before being flushed.
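Spelling out that worst-case arithmetic with the numbers from the message (8M documents, a filterCache of 20,000 entries, each entry's value a bitset bounded by maxDoc/8 bytes):

```python
# Worst-case filterCache heap footprint, per the sizing rule above.
max_doc = 8_000_000
cache_entries = 20_000

bytes_per_entry = max_doc // 8              # 1,000,000 bytes, roughly 1 MB
worst_case_bytes = cache_entries * bytes_per_entry

# 20,000,000,000 bytes, i.e. around 18-19 GiB of heap in the worst case
print(worst_case_bytes)
```

Which is why a filterCache that large only survives when relatively few distinct fq clauses are used or frequent commits keep flushing it.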

Your queryResultCache takes up a lot less space, but still it's quite
large. It has two primary purposes:
> paging. It generally stores a few integers (40 is common, maybe several 
hundred but who cares?) so hitting the next page won't have to search again. 
This isn't terribly important in modern installations.

> being used in autowarming to pre-load parts of the index into memory.

I'd consider knocking each of these back to the defaults (512), except
I'd put the autowarm count at, say, 16 or so.

The document cache is less clear, the recommendation is (number of
simultaneous queries you expect) X (your average row parameter)

Best,
Erick

On Mon, Jan 7, 2019 at 12:43 PM Branham, Jeremy (Experis)
 wrote:
>
> Thanks Erick/Chris for the information.
> The page faults are occurring on each node of the cluster.
> These are VMs running SOLR v7.2.1 on RHEL 7. CPUx8, 64GB mem.
>
> We’re collecting GC information and using a DynaTrace agent, so I’m not 
sure if / how much that contributes to the overhead.
>
> This cluster is used strictly for type-ahead/auto-complete f

Re: Re: Re: Page faults

2019-01-09 Thread Erick Erickson
bq: We could create 2 separate collections.
- Requires re-indexing
- Code changes in our APIs and indexing process
- Lost ability to query all the docs at once ***

*** Not quite true. You can create an alias that points to multiple
collections. HOWEVER,
since the scores are computed using different stats (term frequencies,
field length,
even terms) the scores may not be comparable so your results (assuming you're
sorting by score) may be skewed.
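The alias Erick mentions is set up through the Collections API's CREATEALIAS action. A minimal sketch of the request (the host, alias name, and collection names here are made up for illustration):

```python
from urllib.parse import urlencode

# Build a CREATEALIAS request: one alias that fans queries out to
# two collections, queryable like a single collection.
params = {
    "action": "CREATEALIAS",
    "name": "all_docs",                 # the alias clients will query
    "collections": "class_a,class_b",   # the collections it spans
}
url = "http://localhost:8983/solr/admin/collections?" + urlencode(params)
print(url)
```

Remember the caveat above: when the alias spans collections, scores are computed from each collection's own statistics and may not be comparable.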

That said, my first choice would be the first one you suggested: Drop the prefix
and let Solr distribute docs. Then use an "fq" clause as your discriminator

What people often do with this is create a new collection and index to it in the
background, i.e. don't server any queries from it. When you're happy
it's in good
shape, create a collection alias to it to seamlessly switch your
queries over to it.
Of course you have to be indexing to _both_ collections if your
current collection
needs to be up to date.

There are various tricks you can play to minimize hardware requirements once
you decide on whether you want to do that or not.

Best,
Erick

On Wed, Jan 9, 2019 at 11:56 AM Branham, Jeremy (Experis) wrote:
>
> Thanks for the information Erick –
> I’ve learned there are 2 ‘classes’ of documents being stored in this 
> collection.
> There are about 4x as many documents in class A as class B.
> When the documents are indexed, the document ID includes the key prefix like 
> ‘A/1!’ or ‘B/1!’, which I understand spreads the documents over ½ of the 
> available shards.
>
> I don’t suppose there is a way to say “I want 75% of the shards to store 
> class A, and 25% to store class B”.
> If we dropped the ‘/1’ from the prefix, all the documents would be indexed on 
> a single shard, correct?
>
>
> Currently, half the servers are under heavy load, and the other half are 
> under-utilized. [8 servers total, 4 shards with replication factor of 2]
> I’ve considered a few remedies, but I’m not sure which would be best.
>
> We could drop the document ID prefix and let SOLR distribute the documents 
> evenly, then use a discriminator field to filter queries.
> - Requires re-indexing
> - Code changes in our APIs and indexing process
> We could create 2 separate collections.
> - Requires re-indexing
> - Code changes in our APIs and indexing process
> - Lost ability to query all the docs at once
> We could split the shards.
> - More than 1 shard would be on a node. What if we end up with 2 big replicas 
> on a single node?
>
> If we split the shards, I’m unsure how the prefix would work in this scenario.
> Would ‘A/1!’ continue to use the original shard range?
>
> Like if we split just the 2 big shards –
> 4 shards become 6
> Does ‘A/1!’ spread the documents across 3 shards [half of the new total] or 
> across the 4 new shards?
>
> Or if we split all 4 shards, ‘A/1!’ should spread across 8 shards, which 
> would be half of the new total.
> Could it be difficult trying to balance 8 shards across 8 servers?
> I’m concerned 2 big shards would end up on the same server, and we would have 
> imbalance again.
>
> I think dropping the prefix all-together would be the easiest to maintain and 
> scale, but has a code-impact on our apps.
> Or maybe I’m over-thinking the complexity of splitting the shards, and they 
> will balance out naturally.
>
> I’ll split the shards in our test environment to see what happens.
>
>
> Jeremy Branham
> jb...@allstate.com
>
> On 1/7/19, 6:13 PM, "Erick Erickson"  wrote:
>
> having some replicas at 90G and some at 18G is totally unexpected with
> compositeId routing unless you're using "multi-level routing", see:
> 
> https://lucidworks.com/2014/01/06/multi-level-composite-id-routing-solrcloud/
>
> But let's be clear what we're talking about here. I'm talking about
> specifically the size of the index on disk for any particular
> _replica_, meaning the size in places similar to:
> pdv201806_shard1_replica1/data/index. I've never seen as much
> disparity as you're talking about so we should get to the bottom of
> that.
>
> Do you have massive numbers of deleted docs in any of those shards?
> The admin screen for any particular replica will show this number.
>
>
> On another note: Your cache sizes are probably not part of the page
> fault question, but on the surface they're badly misconfigured, at
> least the filterCache and queryResultCache. Each entry in the
> filterCache is a map entry, the key is roughly the query and the value
> is bounded by maxDoc/8. So if you have, say, 8M documents, your
> filterCache could theoretically be 1M each (give or take) and you
> could have up to 20,000 of them. You're probably just being lucky and
> either not havin
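
The filterCache arithmetic Erick quotes can be sketched as a rough worst-case estimate. The figures below (8M documents, 20,000 cache entries, one bit per document per entry) are the example numbers from his message, not measured values:

```python
# Rough worst-case filterCache memory estimate, as described above:
# each cache entry can hold a bitset of up to maxDoc bits, i.e.
# maxDoc / 8 bytes per entry.

def filter_cache_worst_case_bytes(max_doc: int, cache_size: int) -> int:
    """Upper bound on filterCache memory: one full bitset per entry."""
    bytes_per_entry = max_doc // 8  # one bit per document
    return bytes_per_entry * cache_size

per_entry = 8_000_000 // 8                                  # ~1 MB per entry
total = filter_cache_worst_case_bytes(8_000_000, 20_000)
print(per_entry, total)  # 1000000 20000000000  (~20 GB worst case)
```

In practice most entries are far smaller (sparse filters are stored as doc-id lists), which is why an oversized cache can appear harmless until query patterns change.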

Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Erick Erickson
How would you envision that working? When would the
replicas actually be created and under what heuristics?

Imagine this is possible, and there are a bunch of
placeholders in ZK for a 10-shard collection with
a replication factor of 10 (100 replicas all told). Now
I bring up a single Solr instance. Should all 100 replicas
be created immediately? Wait for N Solr nodes to be
brought online? On some command?

My gut feel is that this would be fraught with problems
and not very valuable to many people. If you could create
the "template" in ZK without any replicas actually being created,
then at some other point say "make it so", I don't see the advantage
over just the current setup. And I do think that it would be
considerable effort.

Net-net is I'd like to see a much stronger justification
before anyone embarks on something like this. First as
I mentioned above I think it'd be a lot of effort, second I
virtually guarantee it'd introduce significant bugs. How
would it interact with autoscaling for instance?

Best,
Erick

On Wed, Jan 9, 2019 at 9:59 AM Frank Greguska  wrote:
>
> Hello,
>
> I am trying to bootstrap a SolrCloud installation and I ran into an issue
> that seems rather odd. I see it is possible to bootstrap a configuration
> set from an existing SOLR_HOME using
>
> ./server/scripts/cloud-scripts/zkcli.sh -zkhost ${ZK_HOST} -cmd bootstrap
> -solrhome ${SOLR_HOME}
>
> but this does not create a collection, it just uploads a configuration set.
>
> Furthermore, I can not use
>
> bin/solr create
>
> to create a collection and link it to my bootstrapped configuration set
> because it requires Solr to already be running.
>
> I'm hoping someone can shed some light on why this is the case? It seems
> like a collection is just some znodes stored in zookeeper that contain
> configuration settings and such. Why should I not be able to create those
> nodes before Solr is running?
>
> I'd like to open a feature request for this if one does not already exist
> and if I am not missing something obvious.
>
> Thank you,
>
> Frank Greguska


Solr Query running slow in Prod node

2019-01-09 Thread Dasarathi Minjur
Hello,
We have a Solr query that runs much slower in our production Solr cluster
compared with lower environments. (Yes, they may not be an apples-to-apples
comparison, but it's really slow in prod as HDFS gets pounded.)
What are the general ways to track/troubleshoot slowness in a query? Are
there any best practices or troubleshooting links? Thanks in advance for your
responses.


Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Frank Greguska
Thanks for the response. You do raise good points.

Say I reverse your example and I have a 10 node cluster with a 10-shard
collection and a replication factor of 10. Now I kill 9 of my nodes, do all
100 replicas move to the one remaining node? I believe the answer is, well
that depends on the configuration.

I'm thinking about it from the initial cluster planning side of things. The
decisions about auto-scaling, how many replicas, and even how many shards
are at least partially dependent on the available hardware. So at
deployment time I would think there would be a way of defining what the
collection *should* look like based on the hardware I am deploying to.
Obviously this could change during runtime and I may need to add nodes,
split shards, etc...

As it is now it seems like I need to deploy my cluster then write a custom
script to ensure each node I expect to be there is running and only then
create my collection with desired shards and replication.

- Frank

On Wed, Jan 9, 2019 at 2:14 PM Erick Erickson 
wrote:

> How would you envision that working? When would the
> replicas actually be created and under what heuristics?
>
> Imagine this is possible, and there are a bunch of
> placeholders in ZK for a 10-shard, collection with
> a replication factor of 10 (100 replicas all told). Now
> I bring up a single Solr instance. Should all 100 replicas
> be created immediately? Wait for N Solr nodes to be
> brought online? On some command?
>
> My gut feel is that this would be fraught with problems
> and not very valuable to many people. If you could create
> the "template" in ZK without any replicas actually being created,
> then at some other point say "make it so", I don't see the advantage
> over just the current setup. And I do think that it would be
> considerable effort.
>
> Net-net is I'd like to see a much stronger justification
> before anyone embarks on something like this. First as
> I mentioned above I think it'd be a lot of effort, second I
> virtually guarantee it'd introduce significant bugs. How
> would it interact with autoscaling for instance?
>
> Best,
> Erick
>
> On Wed, Jan 9, 2019 at 9:59 AM Frank Greguska  wrote:
> >
> > Hello,
> >
> > I am trying to bootstrap a SolrCloud installation and I ran into an issue
> > that seems rather odd. I see it is possible to bootstrap a configuration
> > set from an existing SOLR_HOME using
> >
> > ./server/scripts/cloud-scripts/zkcli.sh -zkhost ${ZK_HOST} -cmd bootstrap
> > -solrhome ${SOLR_HOME}
> >
> > but this does not create a collection, it just uploads a configuration
> set.
> >
> > Furthermore, I can not use
> >
> > bin/solr create
> >
> > to create a collection and link it to my bootstrapped configuration set
> > because it requires Solr to already be running.
> >
> > I'm hoping someone can shed some light on why this is the case? It seems
> > like a collection is just some znodes stored in zookeeper that contain
> > configuration settings and such. Why should I not be able to create those
> > nodes before Solr is running?
> >
> > I'd like to open a feature request for this if one does not already exist
> > and if I am not missing something obvious.
> >
> > Thank you,
> >
> > Frank Greguska
>


Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Erick Erickson
bq.  do all 100 replicas move to the one remaining node?

No. The replicas are in a "down" state until the Solr
instances are brought back up (I'm skipping autoscaling here, but
even that wouldn't move all the replicas to the one remaining
node).

bq.  what the collection *should* look like based on the
hardware I am deploying to.

With the caveat that the Solr instances have to be up, this
is entirely possible. First of all, you can provide a "createNodeSet"
to the create command to specify exactly what Solr nodes you
want used for your collection. There's a special "EMPTY"
value that _almost_ does what you want: it creates
no replicas, just the configuration in ZooKeeper. Thereafter,
though, you have to ADDREPLICA (which you can do with the
"node" parameter to place it exactly where you want).
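
The skeleton-collection flow Erick describes maps onto two Collections API calls. A minimal sketch that just builds the request URLs (the host, port, collection name, and node name below are illustrative assumptions; adjust to your cluster):

```python
# Sketch of the "skeleton collection" flow described above, using the
# Collections API: CREATE with createNodeSet=EMPTY makes only the ZK
# structure, then ADDREPLICA places each replica on an exact node.
from urllib.parse import urlencode

base = "http://localhost:8983/solr/admin/collections"

# 1. Create the collection structure in ZooKeeper with no replicas.
create = base + "?" + urlencode({
    "action": "CREATE",
    "name": "skeleton",
    "numShards": 2,
    "createNodeSet": "EMPTY",
})

# 2. Later, place each replica exactly where you want it.
add_replica = base + "?" + urlencode({
    "action": "ADDREPLICA",
    "collection": "skeleton",
    "shard": "shard1",
    "node": "host1:8983_solr",
})

print(create)
print(add_replica)
```

Issuing these with curl (or any HTTP client) against a running node gives the "template first, replicas later" setup, with the caveat from above that at least one Solr instance must be up.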

bq. how many shards are at least partially dependent on the
available hardware

Not if you're using compositeID routing. The number of shards
is fixed at creation time, although you can split them later.

I don't think you can use bin/solr create_collection with the
EMPTY createNodeSet, so you need at least one
Solr node running to create your skeleton collection.

I think the thing I'm getting stuck on is how in the world the
Solr code could know enough to "do the right thing". How many
docs do you have? How big are they? How much do you expect
to grow? What kinds of searches do you want to support?

But more power to you if you can figure out how to support the kind
of thing you want. Personally I think it's harder than you might
think and not broadly useful. I've been wrong more times than I like
to recall, so maybe you have an approach that would get around
the tigers hiding in the grass I think are out there...

Best,
Erick


On Wed, Jan 9, 2019 at 3:04 PM Frank Greguska  wrote:
>
> Thanks for the response. You do raise good points.
>
> Say I reverse your example and I have a 10 node cluster with a 10-shard
> collection and a replication factor of 10. Now I kill 9 of my nodes, do all
> 100 replicas move to the one remaining node? I believe the answer is, well
> that depends on the configuration.
>
> I'm thinking about it from the initial cluster planning side of things. The
> decisions about auto-scaling, how many replicas, and even how many shards
> are at least partially dependent on the available hardware. So at
> deployment time I would think there would be a way of defining what the
> collection *should* look like based on the hardware I am deploying to.
> Obviously this could change during runtime and I may need to add nodes,
> split shards, etc...
>
> As it is now it seems like I need to deploy my cluster then write a custom
> script to ensure each node I expect to be there is running and only then
> create my collection with desired shards and replication.
>
> - Frank
>
> On Wed, Jan 9, 2019 at 2:14 PM Erick Erickson 
> wrote:
>
> > How would you envision that working? When would the
> > replicas actually be created and under what heuristics?
> >
> > Imagine this is possible, and there are a bunch of
> > placeholders in ZK for a 10-shard, collection with
> > a replication factor of 10 (100 replicas all told). Now
> > I bring up a single Solr instance. Should all 100 replicas
> > be created immediately? Wait for N Solr nodes to be
> > brought online? On some command?
> >
> > My gut feel is that this would be fraught with problems
> > and not very valuable to many people. If you could create
> > the "template" in ZK without any replicas actually being created,
> > then at some other point say "make it so", I don't see the advantage
> > over just the current setup. And I do think that it would be
> > considerable effort.
> >
> > Net-net is I'd like to see a much stronger justification
> > before anyone embarks on something like this. First as
> > I mentioned above I think it'd be a lot of effort, second I
> > virtually guarantee it'd introduce significant bugs. How
> > would it interact with autoscaling for instance?
> >
> > Best,
> > Erick
> >
> > On Wed, Jan 9, 2019 at 9:59 AM Frank Greguska  wrote:
> > >
> > > Hello,
> > >
> > > I am trying to bootstrap a SolrCloud installation and I ran into an issue
> > > that seems rather odd. I see it is possible to bootstrap a configuration
> > > set from an existing SOLR_HOME using
> > >
> > > ./server/scripts/cloud-scripts/zkcli.sh -zkhost ${ZK_HOST} -cmd bootstrap
> > > -solrhome ${SOLR_HOME}
> > >
> > > but this does not create a collection, it just uploads a configuration
> > set.
> > >
> > > Furthermore, I can not use
> > >
> > > bin/solr create
> > >
> > > to create a collection and link it to my bootstrapped configuration set
> > > because it requires Solr to already be running.
> > >
> > > I'm hoping someone can shed some light on why this is the case? It seems
> > > like a collection is just some znodes stored in zookeeper that contain
> > > configuration settings and such. Why should I not be able to create those
> > > nodes before Solr is running?
> > >

Is there a recommended open source GUI tool for monitoring 'zookeeper'?

2019-01-09 Thread 유정인
Hi 

Is there a recommended open source GUI tool for monitoring 'zookeeper'?



Re: Is there a recommended open source GUI tool for monitoring 'zookeeper'?

2019-01-09 Thread Otis Gospodnetić
Hi,

Sematext's monitoring agent with a ZooKeeper integration is open-source:
https://github.com/sematext/sematext-agent-java

The ZK integration is at
https://github.com/sematext/sematext-agent-integrations/tree/master/zookeeper
(Solr and SolrCloud integrations are in the same repo)

If you can't use Sematext Cloud for
monitoring, you can use the above open-source agent and ship
ZooKeeper/Solr/SolrCloud metrics to InfluxDB (open-source) and view it with
something like Grafana (also open-source).

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



On Wed, Jan 9, 2019 at 7:55 PM 유정인  wrote:

> Hi
>
> Is there a recommended open source GUI tool for monitoring 'zookeeper'?
>
>


Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Frank Greguska
Thanks, I am no Solr expert so I may be over-simplifying things a bit in my
ignorance.

"No. The replicas are in a "down" state the Solr instances are brought back
up" Why can't I dictate (at least initially) the "up" state somehow? It
seems Solr keeps track of where replicas were deployed so that the cluster
'heals' itself when all nodes are back. At deployment, I know which nodes
should be available so the collection could be unavailable until all
expected nodes are up.

Thank you for the pointer to the createNodeSet parameter, that might prove
useful.

"I think the thing I'm getting stuck on is how in the world the
Solr code could know enough to "do the right thing". How many
docs do you have? How big are they? How much to you expect
to grow? What kinds of searches do you want to support?"

Solr can't know these things. But me as the deployer/developer might.
For example say I know my initial data size and can say the index will be
10 TB. If I have 2 nodes with 5 TB disks well then I have to have 2 shards
because it won't fit on one node. If instead I have 4 nodes with 5 TB
disks, well I could still have 2 shards but with replicas. Or I could
choose no replicas but more shards. This is what I mean by the
shard/replica decision being partially dependent on available hardware;
there are some decisions I could make knowing my planned deployment so that
when I start the cluster it can be immediately functional. Rather than
first starting the cluster, then creating the collection, then making it
available.
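
The sizing reasoning in that example can be written down directly. The numbers below are the hypothetical figures from this message (10 TB index, 5 TB disks), not recommendations:

```python
# Sketch of the planning arithmetic above: the minimum shard count is
# whatever makes each shard fit on one node's disk, and leftover node
# capacity can hold replicas.
import math

def min_shards(index_size_tb: float, disk_size_tb: float) -> int:
    """Smallest shard count such that one shard fits on one disk."""
    return math.ceil(index_size_tb / disk_size_tb)

shards = min_shards(10, 5)   # 10 TB index on 5 TB disks -> 2 shards minimum

# With 4 nodes of 5 TB each, total capacity allows 2 full copies:
nodes, disk_tb, index_tb = 4, 5, 10
max_replication_factor = (nodes * disk_tb) // (min_shards(10, 5) * disk_tb)
print(shards, max_replication_factor)  # 2 2
```

As Erick notes later in the thread, disk fit is only one constraint; RAM for the searchable portion of the index usually dominates the real decision.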

You may be right that it is a small and complicated concern because I
really only need to care about it once when I am first deploying my
cluster. But everyone who needs to stand up a SolrCloud cluster needs to do
it. My guess is most people either do it manually as a one-time operations
thing or they write a custom script to do it for them automatically as I am
attempting. Seems like a good candidate for a new feature.

- Frank

On Wed, Jan 9, 2019 at 4:18 PM Erick Erickson 
wrote:

> bq.  do all 100 replicas move to the one remaining node?
>
> No. The replicas are in a "down" state the Solr instances
> are brought back up (I'm skipping autoscaling here, but
> even that wouldn't move all the replicas to the one remaining
> node).
>
> bq.  what the collection *should* look like based on the
> hardware I am deploying to.
>
> With the caveat that the Solr instances have to be up, this
> is entirely possible. First of all, you can provide a "createNodeSet"
> to the create command to specify exactly what Solr nodes you
> want used for your collection. There's a special "EMPTY"
> value that _almost_ does what you want, that is it creates
> no replicas, just the configuration in ZooKeeper. Thereafter,
> though, you have to ADDREPLICA (which you can do with
> "node" parameter to place it exactly where you want.
>
> bq. how many shards are at least partially dependent on the
> available hardware
>
> Not if you're using compositeID routing. The number of shards
> is fixed at creation time, although you can split them later.
>
> I don't  think you can use bin/solr create_collection with the
> EMPTY createNodeSet, so you need at least one
> Solr node running to create your skeleton collection.
>
> I think the thing I'm getting stuck on is how in the world the
> Solr code could know enough to "do the right thing". How many
> docs do you have? How big are they? How much to you expect
> to grow? What kinds of searches do you want to support?
>
> But more power to you if you can figure out how to support the kind
> of thing you want. Personally I think it's harder than you might
> think and not broadly useful. I've been wrong more times than I like
> to recall, so maybe you have an approach that would get around
> the tigers hiding in the grass I think are out there...
>
> Best,
> Erick
>
>
> On Wed, Jan 9, 2019 at 3:04 PM Frank Greguska  wrote:
> >
> > Thanks for the response. You do raise good points.
> >
> > Say I reverse your example and I have a 10 node cluster with a 10-shard
> > collection and a replication factor of 10. Now I kill 9 of my nodes, do
> all
> > 100 replicas move to the one remaining node? I believe the answer is,
> well
> > that depends on the configuration.
> >
> > I'm thinking about it from the initial cluster planning side of things.
> The
> > decisions about auto-scaling, how many replicas, and even how many shards
> > are at least partially dependent on the available hardware. So at
> > deployment time I would think there would be a way of defining what the
> > collection *should* look like based on the hardware I am deploying to.
> > Obviously this could change during runtime and I may need to add nodes,
> > split shards, etc...
> >
> > As it is now it seems like I need to deploy my cluster then write a
> custom
> > script to ensure each node I expect to be there is running and only then
> > create my collection with desired shards and replication.
> >
> > - Frank
> >
> > On Wed, Jan 9, 2019 

Re: Bootstrapping a Collection on SolrCloud

2019-01-09 Thread Erick Erickson
First, for a given data set, I can easily double or halve
the size of the index on disk depending on what options
I choose for my fields; things like how many times I may
need to copy fields to support various use-cases,
whether I need to store the input for some, all or no
fields, whether I enable docValues, whether I need to
support phrase queries, and on and on.

Even assuming you can estimate the eventual size,
it doesn't help much. As one example, if you choose
stored="true", the index size will grow by roughly 50% of
the raw data size. But that data doesn't really affect
searching that much in that it doesn't need to be
RAM resident in the same way your terms data needs
to be. So in order to be performant I may need anywhere
from a fraction of the raw index size on disk to multiples
of the index size on disk in terms of RAM.
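
Erick's point can be turned into a toy estimate. The 50% stored-fields figure is his rough rule of thumb from above; the terms-factor range is an illustrative assumption, not a measured value:

```python
# Toy on-disk index-size estimate following the reasoning above:
# stored="true" adds roughly 50% of the raw data size, while the
# searchable (terms/postings) portion varies widely with field options.

def estimate_index_size_gb(raw_gb: float, stored: bool,
                           terms_factor: float) -> float:
    """terms_factor: assumed ratio of terms data to raw data (0.5x-2.0x
    here is purely illustrative of the 'double or halve' variance)."""
    stored_part = 0.5 * raw_gb if stored else 0.0
    return raw_gb * terms_factor + stored_part

low  = estimate_index_size_gb(100, stored=True, terms_factor=0.5)  # 100.0 GB
high = estimate_index_size_gb(100, stored=True, terms_factor=2.0)  # 250.0 GB
print(low, high)
```

The spread between `low` and `high` for the same raw data is exactly why a deployment-time sizing formula is hard to bake into Solr itself.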

So you see where this is going. I'm not against your
suggestion, but I have strong doubts as to its
feasibility given all the variables I've seen. We can revisit
this after you've had a chance to kick the tires, I suspect
we'll have more shared context on which to base
the discussion.

Best,
Erick

On Wed, Jan 9, 2019 at 5:12 PM Frank Greguska  wrote:
>
> Thanks, I am no Solr expert so I may be over-simplifying things a bit in my
> ignorance.
>
> "No. The replicas are in a "down" state the Solr instances are brought back
> up" Why can't I dictate (at least initially) the "up" state somehow? It
> seems Solr keeps track of where replicas were deployed so that the cluster
> 'heals' itself when all nodes are back. At deployment, I know which nodes
> should be available so the collection could be unavailable until all
> expected nodes are up.
>
> Thank you for the pointer to the createNodeSet parameter, that might prove
> useful.
>
> "I think the thing I'm getting stuck on is how in the world the
> Solr code could know enough to "do the right thing". How many
> docs do you have? How big are they? How much to you expect
> to grow? What kinds of searches do you want to support?"
>
> Solr can't know these things. But me as the deployer/developer might.
> For example say I know my initial data size and can say the index will be
> 10 TB. If I have 2 nodes with 5 TB disks well then I have to have 2 shards
> because it won't fit on one node. If instead I have 4 nodes with 5 TB
> disks, well I could still have 2 shards but with replicas. Or I could
> choose no replicas but more shards. This is what I mean by the
> shard/replica decision being partially dependent on available hardware;
> there are some decisions I could make knowing my planned deployment so that
> when I start the cluster it can be immediately functional. Rather than
> first starting the cluster, then creating the collection, then making it
> available.
>
> You may be right that it is a small and complicated concern because I
> really only need to care about it once when I am first deploying my
> cluster. But everyone who needs to stand up a SolrCloud cluster needs to do
> it. My guess is most people either do it manually as a one-time operations
> thing or they write a custom script to do it for them automatically as I am
> attempting. Seems like a good candidate for a new feature.
>
> - Frank
>
> On Wed, Jan 9, 2019 at 4:18 PM Erick Erickson 
> wrote:
>
> > bq.  do all 100 replicas move to the one remaining node?
> >
> > No. The replicas are in a "down" state the Solr instances
> > are brought back up (I'm skipping autoscaling here, but
> > even that wouldn't move all the replicas to the one remaining
> > node).
> >
> > bq.  what the collection *should* look like based on the
> > hardware I am deploying to.
> >
> > With the caveat that the Solr instances have to be up, this
> > is entirely possible. First of all, you can provide a "createNodeSet"
> > to the create command to specify exactly what Solr nodes you
> > want used for your collection. There's a special "EMPTY"
> > value that _almost_ does what you want, that is it creates
> > no replicas, just the configuration in ZooKeeper. Thereafter,
> > though, you have to ADDREPLICA (which you can do with
> > "node" parameter to place it exactly where you want.
> >
> > bq. how many shards are at least partially dependent on the
> > available hardware
> >
> > Not if you're using compositeID routing. The number of shards
> > is fixed at creation time, although you can split them later.
> >
> > I don't  think you can use bin/solr create_collection with the
> > EMPTY createNodeSet, so you need at least one
> > Solr node running to create your skeleton collection.
> >
> > I think the thing I'm getting stuck on is how in the world the
> > Solr code could know enough to "do the right thing". How many
> > docs do you have? How big are they? How much to you expect
> > to grow? What kinds of searches do you want to support?
> >
> > But more power to you if you can figure out how to support the kind
> > of thing you want. Personally I think it's harder than you might
> > think and not

Single query to get the count for all individual collections

2019-01-09 Thread Zheng Lin Edwin Yeo
Hi,

I would like to find out, is there any way that I can send a single query
to retrieve the numFound for all the individual collections?

I have tried with this query
http://localhost:8983/solr/collection1/select?q=*:*&collection=collection1,collection2
However, this query sums the counts across all the collections, instead of
showing the count for each collection.

I am using Solr 7.5.0.
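
One workaround I have considered is querying each collection separately with rows=0 and reading each response's numFound. A minimal sketch of the parsing side (the helper function is hypothetical; the response shape matches the standard Solr JSON response, like the one I get above):

```python
# Collect per-collection counts by querying each collection individually
# instead of using the collection=... aggregation.
import json

def extract_num_found(response_body: str) -> int:
    """Pull numFound out of a standard Solr JSON query response."""
    return json.loads(response_body)["response"]["numFound"]

# Against a real cluster you would fetch each collection's count, e.g.:
#   for coll in ("collection1", "collection2"):
#       url = f"http://localhost:8983/solr/{coll}/select?q=*:*&rows=0"
#       counts[coll] = extract_num_found(urllib.request.urlopen(url).read())
# Here we parse a canned response of the same shape instead:
sample = ('{"responseHeader":{"status":0},'
          '"response":{"numFound":150,"start":0,"docs":[]}}')
print(extract_num_found(sample))  # 150
```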

Regards,
Edwin


Re: Solr Query running slow in Prod node

2019-01-09 Thread Zheng Lin Edwin Yeo
Hi,

You have to check whether both environments are using the same configurations,
and whether the production Solr server has other programs running.
Also, query performance might be affected if there is indexing going on
at the same time.
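
A concrete first step is to re-run the slow query with Solr's debug=timing parameter and compare the per-component timings between environments. A sketch of the request (host, port, and collection name are placeholders):

```python
# Build a query URL with debug=timing, whose response includes a
# debug/timing section breaking QTime down by search component
# (query, facet, highlight, ...). Diff that section between prod
# and the lower environment to localize the slowdown.
from urllib.parse import urlencode

params = urlencode({"q": "*:*", "rows": 0, "debug": "timing"})
url = f"http://localhost:8983/solr/mycollection/select?{params}"
print(url)
```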

Regards,
Edwin

On Thu, 10 Jan 2019 at 06:50, Dasarathi Minjur  wrote:

> Hello,
> We have a Solr query that runs much slower in Production Solr cluster
> compared with lower environments.(Yes they may not be apples to apples
> comparison but it's really slow in prod as HDFS gets pounded)
> What are the general ways to track/trouble shoot slowness in the query. Is
> there any best practices or troubleshoot links. Thanks in advance for your
> responses.
>


Re: Solr Cloud wiping all cores when restart without proper zookeeper directories

2019-01-09 Thread Yogendra Kumar Soni
I have an existing collection
http://10.2.12.239:11080/solr/test/select?q=*:*&rows=0

{
  "responseHeader":{
"zkConnected":true,
"status":0,
"QTime":121,
"params":{
  "q":"*:*",
  "rows":"0"}},
  "response":{"numFound":150,"start":0,"maxScore":1.0,"docs":[]
  }}

ls data?/index/shard?/

data1/index/shard1:
test_shard2_replica_n2  solr

data1/index/shard2:
test_shard4_replica_n5

data2/index/shard1:
test_shard3_replica_n3

data2/index/shard2:
test_shard1_replica_n1

...



1. deleted /collections from zookeeper

bin/solr zk rm -r /collections -z localhost:2181

2. restart solr cloud

bin/solr stop -all

bin/solr -c -s data1/index/shard1/ -p 11080 -z localhost:2181
bin/solr -c -s data1/index/shard2/ -p 12080 -z localhost:2181
bin/solr -c -s data2/index/shard1/ -p 13080 -z localhost:2181
bin/solr -c -s data2/index/shard2/ -p 14080 -z localhost:2181
bin/solr -c -s data3/index/shard1/ -p 15080 -z localhost:2181
bin/solr -c -s data3/index/shard2/ -p 16080 -z localhost:2181
bin/solr -c -s data4/index/shard1/ -p 17080 -z localhost:2181
bin/solr -c -s data4/index/shard2/ -p 18080 -z localhost:2181

3. checked  again for data

ls data?/index/shard?/


data1/index/shard1/:

data1/index/shard2/:

data2/index/shard1/:

data2/index/shard2/:

data3/index/shard1/:

data3/index/shard2/:

data4/index/shard1/:

data4/index/shard2/:


All cores are wiped




On Wed, Jan 9, 2019 at 11:39 PM lstusr 5u93n4  wrote:

> We've seen the same thing on solr 7.5 by doing:
>  - create a collection
>  - add some data
>  - stop solr on all servers
>  - delete all contents of the solr node from zookeeper
>  - start solr on all nodes
>  - create a collection with the same name as in the first step
>
> When doing this, solr wipes out the previous collection data and starts
> new.
>
> In our case, this was due to a startup script that checked for the
> existence of a collection and created it if non-existent.  When not present
> in ZK, solr (as it should) didn't return that collection in it's list of
> collections so we created it...
>
> Possible that you have something similar in your workflow?
>
> Kyle
>
> On Wed, 9 Jan 2019 at 10:22, Erick Erickson 
> wrote:
>
> > Solr doesn't just remove directories, this is very likely
> > something in your environment that's doing this.
> >
> > In any case, there's no information here to help
> > diagnose. You must tell us _exactly_ what steps
> > you take in order to have any hope of helping.
> >
> > Best,
> > Erick
> >
> > On Wed, Jan 9, 2019 at 2:48 AM Yogendra Kumar Soni
> >  wrote:
> > >
> > > We are running a solr cloud cluster using solr 7.4 with 8 shards. When
> we
> > > started our solr cloud with a zookeeper node (without collections
> > directory
> > > but with only solr.xml and configs) our data directory containing
> > > core.properties and cores data becomes empty.
> > >
> > >
> > >
> > >
> > > --
> > > *Thanks and Regards,*
> > > *Yogendra Kumar Soni*
> >
>


-- 
*Thanks and Regards,*
*Yogendra Kumar Soni*