[scottchu] How to specify multiple zk nodes using solr start command under Windows

2016-05-17 Thread scott.chu
I start 3 zk nodes at port 2181,2182, and 2183 on my local machine.
Go into Solr 5.4.1 root folder and issue the command from the article 
'Setting Up an External ZooKeeper Ensemble' in reference guide 

bin\Solr start -c -s mynodes\node1 -z 
localhost:2181,localhost:2181,localhost:2183

but it doesn't run; it just shows the help page of the start command in solr.cmd. How 
should I issue the correct command?


How to encrypt the password for basic authentication of Solr

2016-05-17 Thread tjlp
 Hi,
In the Wiki 
https://cwiki.apache.org/confluence/display/solr/Basic+Authentication+Plugin, 
the basic authentication is introduced. With basic authentication, the sample 
security.json has the line 

"credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0= 
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="}
How to get the encrypted password for the user solr?
Thanks,
Liu Peng


Creating a collection with 1 shard gives a weird range

2016-05-17 Thread John Smith
I'm trying to create a collection starting with only one shard
(numShards=1) using a compositeID router. The purpose is to start small
and begin splitting shards when the index grows larger. The shard
created gets a weird range value: 80000000-7fffffff, which doesn't look
effective. Indeed, if I try to import some documents using a DIH, none
gets added.

If I create the same collection with 2 shards, the ranges seem more
logical (0-7fffffff & 80000000-ffffffff). In this case documents are
indexed correctly.

Is this behavior by design, i.e. is a minimum of 2 shards required? If
not, how can I create a working collection with a single shard?

This is Solr-6.0.0 in cloud mode with zookeeper-3.4.8.

Thanks,
John


Sub faceting on string field using json facet runs extremely slow

2016-05-17 Thread Vijay Tiwary
Hello all,
I have an index of 8 shards with 1 replica each, distributed across an 8-node
Solr cloud. The index size is 300 GB, with 30 million documents. Solr json
facet runs extremely slowly if I am sub-faceting on a string field, even if
numFound is only around 2 (also, I am not returning any rows, i.e.
rows=0).
Is there any way to improve the performance?

Thanks,
Vijay
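For reference, a nested (sub-facet) json.facet request of the kind described might be assembled as sketched below. The field names and limit values are hypothetical, since the actual schema isn't shown:

```python
import json

# Terms facet on one string field, sub-faceted by another string field.
# rows=0 returns only the facets, as in the question above.
params = {
    "q": "*:*",
    "rows": 0,
    "json.facet": json.dumps({
        "categories": {
            "type": "terms",
            "field": "category_s",   # hypothetical string field
            "limit": 10,
            "facet": {               # the nested sub-facet that is slow at scale
                "brands": {"type": "terms", "field": "brand_s", "limit": 5}
            },
        }
    }),
}
```

These params would be sent to the collection's /select handler; for large string fields, enabling docValues on both the parent and child facet fields is generally the first thing to check.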


Re: Creating a collection with 1 shard gives a weird range

2016-05-17 Thread Tom Evans
On Tue, May 17, 2016 at 9:40 AM, John Smith  wrote:
> I'm trying to create a collection starting with only one shard
> (numShards=1) using a compositeID router. The purpose is to start small
> and begin splitting shards when the index grows larger. The shard
> created gets a weird range value: 80000000-7fffffff, which doesn't look
> effective. Indeed, if I try to import some documents using a DIH, none
> gets added.
>
> If I create the same collection with 2 shards, the ranges seem more
> logical (0-7fffffff & 80000000-ffffffff). In this case documents are
> indexed correctly.
>
> Is this behavior by design, i.e. is a minimum of 2 shards required? If
> not, how can I create a working collection with a single shard?
>
> This is Solr-6.0.0 in cloud mode with zookeeper-3.4.8.
>

I believe this is as designed, see this email from Shawn:

https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201604.mbox/%3c570d0a03.5010...@elyograg.org%3E

Cheers

Tom
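For background on why the single-shard range looks that way: the compositeId router slices the signed 32-bit hash ring into contiguous ranges starting at min-int, so one shard legitimately covers the whole ring, 80000000-7fffffff. A rough sketch of the computation, assuming equal slices (Solr's CompositeIdRouter is the authoritative implementation):

```python
def shard_ranges(num_shards):
    """Partition the signed 32-bit hash ring the way Solr's compositeId
    router (roughly) does: equal contiguous slices starting at min-int.
    Returns ranges as hex strings, like the clusterstate shows them."""
    step = (1 << 32) // num_shards
    ranges = []
    start = -(1 << 31)                       # 0x80000000 as a signed int
    for i in range(num_shards):
        # the last shard absorbs any rounding remainder up to max-int
        end = start + step - 1 if i < num_shards - 1 else (1 << 31) - 1
        ranges.append("%x-%x" % (start & 0xFFFFFFFF, end & 0xFFFFFFFF))
        start = end + 1
    return ranges

# One shard wraps the whole ring; two shards split it at zero.
print(shard_ranges(1))  # ['80000000-7fffffff']
print(shard_ranges(2))  # ['80000000-ffffffff', '0-7fffffff']
```

So a range of 80000000-7fffffff for numShards=1 is expected rather than broken, consistent with Tom's reply that this is as designed.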


Re: [scottchu] How to specify multiple zk nodes using solr start command under Windows

2016-05-17 Thread scott.chu

I issue '-z localhost:2181 -z localhost:2182 -z localhost:2183' for each 
node's start command, and later when I create a collection, all 3 zk nodes have 
registered my configset. 
I haven't tried it, but I think maybe using only -z localhost:2181 would also work, since all 3 nodes in the zk 
ensemble will synchronize themselves.

scott.chu,scott@udngroup.com
2016/5/17 (Tue)
- Original Message - 
From: scott (myself) 
To: solr-user 
CC: 
Date: 2016/5/17 (Tue) 15:35
Subject: [scottchu] How to specify multiple zk nodes using solr start 
command under Windows


I start 3 zk nodes at port 2181,2182, and 2183 on my local machine. 
Go into Solr 5.4.1 root folder and issue the command from the article 
'Setting Up an External ZooKeeper Ensemble' in reference guide 

bin\Solr start -c -s mynodes\node1 -z 
localhost:2181,localhost:2181,localhost:2183 

but it doesn't run; it just shows the help page of the start command in solr.cmd. How 
should I issue the correct command? 


- 
No viruses were found in this message. 
Checked by AVG - www.avg.com 
Version: 2015.0.6201 / Virus database: 4568/12245 - Release date: 05/16/16


RE: http request to MiniSolrCloudCluster

2016-05-17 Thread Rohana Rajapakse
Thanks Chris for your reply.

I am aware of the requirement for "baseDir" to be empty. In my test code, I 
delete the "baseDir" completely (in the JUnit test setup) and re-create it to make 
sure there is nothing in it. So, there shouldn't be a problem with that.

What puzzles me is that I have no problem accessing the same 
MiniSolrCloudCluster using solrj.  This means my cluster is fine, but it is not 
responding to requests made directly over http.

Could someone pass me simple JUnit test code that creates 
a MiniSolrCloudCluster which can be pinged from the command line (using cURL)? 
Or could someone test this and let me know if it works? Alternatively, I can share 
a sample JUnit test class (in a gist) if anyone is willing to have a look at 
it.

Thanks

Rohana

 

-Original Message-
From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
Sent: 16 May 2016 19:20
To: solr-user@lucene.apache.org
Subject: RE: http request to MiniSolrCloudCluster



Hmmm... is baseDir empty before you call new MiniSolrCloudCluster ?

My best guess is you are getting bit by this...

https://issues.apache.org/jira/browse/SOLR-8999


: I am only setting up a MiniSolrCloudCluster with 2 servers like this:
: 
:   JettyConfig  jettyConfig = 
JettyConfig.builder().waitForLoadingCoresToFinish(null).setContext("/solr").build();
:   MiniSolrCloudCluster  miniCluster = new MiniSolrCloudCluster(2, 
Paths.get(baseDir), jettyConfig);
: 
: I can see the "zookeeper", "node1", "node2" folders being created (with 
content in them) in my $baseDir. I have not added any data to the Solr index yet.
: 
: I don't know what "overseer" is and how to check status of it. My only 
concern is if things are not cleared in zookeeper. Is there any way to check 
zookeeper DB?
: 
: As I mentioned before, the cluster works fine when I access it via SolrClient 
(solrj). The issue is when making http requests.
: 
: Can someone please test making an http request to a MiniSolrCloudCluster 
(created outside of Solr) and let me know if it works fine.
: 
: The log messages during starting up the mini cloud include the following 
error messages:
: 
: ...
: 15:34:11,750 INFO  ~ Watcher 
org.apache.solr.common.cloud.ConnectionManager@6e374fbf 
name:ZooKeeperConnection Watcher:127.0.0.1:15570/solr got event WatchedEvent 
state:SyncConnected type:None path:null path:null type:None
: 15:34:11,750 INFO  ~ Client is connected to ZooKeeper
: 15:34:11,752 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0006 type:create cxid:0x1 zxid:0x10 txntype:-1 
reqpath:n/a Error Path:/solr/overseer Error:KeeperErrorCode = NodeExists for 
/solr/overseer
: 15:34:11,774 INFO  ~ makePath: /overseer/queue
: 15:34:11,774 INFO  ~ makePath: /overseer/queue
: 15:34:11,776 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0006 type:create cxid:0x5 zxid:0x12 txntype:-1 
reqpath:n/a Error Path:/solr/overseer/queue Error:KeeperErrorCode = NodeExists 
for /solr/overseer/queue
: 15:34:11,799 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0006 type:create cxid:0x6 zxid:0x13 txntype:-1 
reqpath:n/a Error Path:/solr/overseer Error:KeeperErrorCode = NodeExists for 
/solr/overseer
: 15:34:11,799 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0005 type:create cxid:0x7 zxid:0x14 txntype:-1 
reqpath:n/a Error Path:/solr/overseer Error:KeeperErrorCode = NodeExists for 
/solr/overseer
: 15:34:11,823 INFO  ~ makePath: /overseer/collection-queue-work
: 15:34:11,824 INFO  ~ makePath: /overseer/collection-queue-work
: 15:34:11,825 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0005 type:create cxid:0xb zxid:0x16 txntype:-1 
reqpath:n/a Error Path:/solr/overseer/collection-queue-work 
Error:KeeperErrorCode = NodeExists for /solr/overseer/collection-queue-work
: 15:34:11,847 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0005 type:create cxid:0xc zxid:0x17 txntype:-1 
reqpath:n/a Error Path:/solr/overseer Error:KeeperErrorCode = NodeExists for 
/solr/overseer
: 15:34:11,847 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0006 type:create cxid:0xc zxid:0x18 txntype:-1 
reqpath:n/a Error Path:/solr/overseer Error:KeeperErrorCode = NodeExists for 
/solr/overseer
: 15:34:11,860 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0005 type:create cxid:0xd zxid:0x19 txntype:-1 
reqpath:n/a Error Path:/solr/overseer Error:KeeperErrorCode = NodeExists for 
/solr/overseer
: 15:34:11,860 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0006 type:create cxid:0xd zxid:0x1a txntype:-1 
reqpath:n/a Error Path:/solr/overseer Error:KeeperErrorCode = NodeExists for 
/solr/overseer
: 15:34:11,885 INFO  ~ Got user-level KeeperException when processing 
sessionid:0x154aa89febb0005 type:create cxid:0xf zxid:0x1b 

Re: SolrCloud replicas consistently out of sync

2016-05-17 Thread Daniel Collins
Terminology question: by nodes I assume you mean machines? So "8 nodes,
with 4 shards a piece, all running one collection with about 900M
documents", is 1 collection split into 32 shards, with 4 shards located on
each machine?  Is each shard in its own JVM, or do you have 1 JVM on each
machine running 4 Solr cores?

From looking at those pastie logs, the system is trying to peersync but
always failing, and then attempting replication, which claims to succeed.
As Erick says, that should be the same whether you start 1 machine or all
of them...  BTW, what are the timestamps you are using?  I assume that's the
first column in the logs.



On 17 May 2016 at 04:50, Erick Erickson  wrote:

> OK, this is very strange. There's no _good_ reason that
> restarting the servers should make a difference. The fact
> that it took 1/2 hour leads me to believe, though, that your
> shards are somehow "incomplete", especially that you
> are indexing to the system and don't have, say,
> your autocommit settings done very well. The long startup
> implies (guessing) that you have pretty big tlogs that
> are replayed upon startup. While these were coming up,
> did you see any of the shards in the "recovering" state? That's
> the only way I can imagine that Solr "healed" itself.
>
> I've got to point back to the Solr logs. Are they showing
> any anomalies? Are any nodes in recovery when you restart?
>
> Best,
> Erick
>
>
>
> On Mon, May 16, 2016 at 4:14 PM, Stephen Weiss 
> wrote:
> > Just one more note - while experimenting, I found that if I stopped all
> nodes (full cluster shutdown), and then startup all nodes, they do in fact
> seem to repair themselves then.  We have a script to monitor the
> differences between replicas (just looking at numDocs) and before the full
> shutdown / restart, we had:
> >
> > wks53104:Downloads sweiss$ php testReplication.php
> > Found 32 mismatched shard counts.
> > instock_shard1   replica 1: 30785553 replica 2: 30777568
> > instock_shard10   replica 1: 30972662 replica 2: 30966215
> > instock_shard11   replica 2: 31036718 replica 1: 31033547
> > instock_shard12   replica 1: 30179823 replica 2: 30176067
> > instock_shard13   replica 2: 30604638 replica 1: 30599219
> > instock_shard14   replica 2: 30755117 replica 1: 30753469
> > instock_shard15   replica 2: 30891325 replica 1: 30888771
> > instock_shard16   replica 1: 30818260 replica 2: 30811728
> > instock_shard17   replica 1: 30422080 replica 2: 30414666
> > instock_shard18   replica 2: 30874530 replica 1: 30869977
> > instock_shard19   replica 2: 30917008 replica 1: 30913715
> > instock_shard2   replica 1: 31062073 replica 2: 31057583
> > instock_shard20   replica 1: 30188774 replica 2: 30186565
> > instock_shard21   replica 2: 30789012 replica 1: 30784160
> > instock_shard22   replica 2: 30820473 replica 1: 30814822
> > instock_shard23   replica 2: 30552105 replica 1: 30545802
> > instock_shard24   replica 1: 30973906 replica 2: 30971314
> > instock_shard25   replica 1: 30732287 replica 2: 30724988
> > instock_shard26   replica 1: 31465543 replica 2: 31463414
> > instock_shard27   replica 2: 30845514 replica 1: 30842665
> > instock_shard28   replica 2: 30549151 replica 1: 30543070
> > instock_shard29   replica 2: 30635711 replica 1: 30629240
> > instock_shard3   replica 1: 30930400 replica 2: 30928438
> > instock_shard30   replica 2: 30902221 replica 1: 30895176
> > instock_shard31   replica 2: 31174246 replica 1: 31169998
> > instock_shard32   replica 2: 30931550 replica 1: 30926256
> > instock_shard4   replica 2: 30755525 replica 1: 30748922
> > instock_shard5   replica 2: 31006601 replica 1: 30994316
> > instock_shard6   replica 2: 31006531 replica 1: 31003444
> > instock_shard7   replica 2: 30737098 replica 1: 30727509
> > instock_shard8   replica 2: 30619869 replica 1: 30609084
> > instock_shard9   replica 1: 31067833 replica 2: 31061238
> >
> >
> > This stayed consistent for several hours.
> >
> > After restart:
> >
> > wks53104:Downloads sweiss$ php testReplication.php
> > Found 3 mismatched shard counts.
> > instock_shard19   replica 2: 30917008 replica 1: 30913715
> > instock_shard22   replica 2: 30820473 replica 1: 30814822
> > instock_shard26   replica 1: 31465543 replica 2: 31463414
> > wks53104:Downloads sweiss$ php testReplication.php
> > Found 2 mismatched shard counts.
> > instock_shard19   replica 2: 30917008 replica 1: 30913715
> > instock_shard26   replica 1: 31465543 replica 2: 31463414
> > wks53104:Downloads sweiss$ php testReplication.php
> > Everything looks peachy
> >
> > Took about a half hour to get there.
> >
> > Maybe the question should be - any way to get solrcloud to trigger this
> *without* having to shut down / restart all nodes?  Even if we had to
> trigger that manually after indexing, it would be fine.  It's a very
> controlled indexing workflow that only happens once a day.
> >
> > --
> > Steve
> >
> > On Mon, May 16, 2016 at 6:52 PM, Stephen Weiss  > wrote:
> > Each node has 
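The replica-count comparison that testReplication.php performs (the script itself isn't shown) boils down to querying each replica core directly with q=*:*&rows=0&distrib=false and flagging shards whose replicas disagree. A sketch of that comparison logic:

```python
def find_mismatches(shard_counts):
    """shard_counts maps shard name -> {replica name: numFound}, where each
    count comes from querying that replica core with q=*:*&rows=0&distrib=false.
    Returns only the shards whose replicas report different document counts."""
    return {shard: counts
            for shard, counts in shard_counts.items()
            if len(set(counts.values())) > 1}

counts = {
    "instock_shard1": {"replica 1": 30785553, "replica 2": 30777568},
    "instock_shard2": {"replica 1": 31062073, "replica 2": 31062073},
}
print(find_mismatches(counts))  # only instock_shard1 disagrees
```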

json.facet streaming

2016-05-17 Thread Nick Vasilyev
I am on the nightly build of 6.1 and I am experimenting with json.facet
streaming; however, the response I am getting back looks like a regular query
response. I was expecting something like the streaming API. Is this right
or am I missing something?

Here is the json.facet string.

'json.facet':str({ "groups":{
"type": "terms",
"field": "group",
"method":"stream"
}}),

The group field is a string field with DocValues enabled.

Thanks


Re: json.facet streaming

2016-05-17 Thread Yonik Seeley
Perhaps try turning on request debugging and see what is actually
being received by Solr?

-Yonik


On Tue, May 17, 2016 at 8:33 AM, Nick Vasilyev  wrote:
> I am on the nightly build of 6.1 and I am experimenting with json.facet
> streaming, however the response I am getting back looks like regular query
> response. I was expecting something like the streaming api. Is this right
> or am I missing something?
>
> Hhere is the json.facet string.
>
> 'json.facet':str({ "groups":{
> "type": "terms",
> "field": "group",
> "method":"stream"
> }}),
>
> The group field is a string field with DocValues enabled.
>
> Thanks


Re: json.facet streaming

2016-05-17 Thread Nick Vasilyev
I enabled query debugging, here is the facet-trace snippet.

"facet-trace":{
  "processor":"FacetQueryProcessor",
  "elapse":0,
  "query":null,
  "domainSize":43046041,
  "sub-facet":[{
  "processor":"FacetFieldProcessorStream",
  "elapse":0,
  "field":"group",
  "limit":10,
  "domainSize":8980542},
{
  "processor":"FacetFieldProcessorStream",
  "elapse":0,
  "field":"group",
  "limit":10,
  "domainSize":9005295},
{
  "processor":"FacetFieldProcessorStream",
  "elapse":0,
  "field":"group",
  "limit":10,
  "domainSize":7555021},
{
  "processor":"FacetFieldProcessorStream",
  "elapse":0,
  "field":"group",
  "limit":10,
  "domainSize":8928379},
{
  "processor":"FacetFieldProcessorStream",
  "elapse":0,
  "field":"group",
  "limit":10,
  "domainSize":8576804}]},
"json":{"facet":{"groups":{
  "type":"terms",
  "field":"group",
  "method":"stream"}}},

On Tue, May 17, 2016 at 8:42 AM, Yonik Seeley  wrote:

> Perhaps try turning on request debugging and see what is actually
> being received by Solr?
>
> -Yonik
>
>
> On Tue, May 17, 2016 at 8:33 AM, Nick Vasilyev 
> wrote:
> > I am on the nightly build of 6.1 and I am experimenting with json.facet
> > streaming, however the response I am getting back looks like regular
> query
> > response. I was expecting something like the streaming api. Is this right
> > or am I missing something?
> >
> > Hhere is the json.facet string.
> >
> > 'json.facet':str({ "groups":{
> > "type": "terms",
> > "field": "group",
> > "method":"stream"
> > }}),
> >
> > The group field is a string field with DocValues enabled.
> >
> > Thanks
>


Re: state.json being downloaded every 10 seconds

2016-05-17 Thread Shawn Heisey
On 5/16/2016 10:28 PM, Jeff Wartes wrote:
> One thing that still feels a bit odd though is that the health check query 
> was referencing a collection that no longer existed in the cluster. So it 
> seems like it was downloading the state for ALL non-hosted collections, not a 
> requested one.
>
> This touches a bit on a sore point with me. I dislike that those 
> collection-not-here proxy requests aren’t logged on the server doing the 
> proxy, because you end up with traffic visible at the http interface but not 
> the solr level. Honestly, I dislike that transparent proxy approach in 
> general, because it means I lose the ability to dedicate entire nodes to the 
> fan-out and shard-aggregation process like I could pre-solrcloud.

Recently I was informed that CloudSolrClient operates on cached state
information, and doesn't make requests to zookeeper very often.  I don't
know if internal cloud queries are handled the same, but they probably
are.  I think you've found an exception to that behavior -- collections
that don't exist.

On a small cloud, this is likely not a performance bottleneck at all ...
but a cloud with dozens of servers and hundreds of collections would be
a different story.

I just tried a query against a small cloud install (running 4.2.1) for a
nonexistent collection, and yes indeed, there's nothing logged *at
all*.  I would have expected a log entry of SOME kind.

So, I think you've found two problems that each need an issue in Jira. 
I hesitate slightly at calling them bugs, because it's probably working
as designed ... but if so, I think the design is incorrect.

1) There is no log entry for queries to a nonexistent collection.  The
full SolrCore log entry showing all the parameters would be nice, but
even something short about a query to a nonexistent collection would be
enough.

2) When a collection doesn't exist, that fact should be cached in the
same way that a good clusterstate is cached, to reduce traffic to zookeeper.


Additional detail for 2) above:

I'm envisioning a separate cache memory structure from the
clusterstate(s), which should probably be kept fairly small.  Denial of
service attacks on publicly-accessible servers are the only time that a
cloud is *likely* to receive requests for many collections that don't
exist.  Misconfigurations are more likely to request a few nonexistent
collections repeatedly.

For older 4.x servers, nonexistent collections might actually be cached,
because in the older versions, the entire clusterstate for all
collections is contained in a single file.  If that file is cached, then
so is the fact that a given collection doesn't exist.

Thanks,
Shawn
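The small, bounded negative-lookup cache Shawn envisions could be sketched like this (a hypothetical illustration of the idea, not Solr code):

```python
from collections import OrderedDict
import time

class NegativeLookupCache:
    """Remember recently-seen nonexistent collection names for a short TTL,
    bounded in size, so repeated bad requests don't hit ZooKeeper each time."""
    def __init__(self, max_size=100, ttl_seconds=60):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self._entries = OrderedDict()  # name -> expiry timestamp

    def add(self, name):
        self._entries.pop(name, None)
        self._entries[name] = time.monotonic() + self.ttl
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)   # bounded: evict oldest entry

    def known_missing(self, name):
        expiry = self._entries.get(name)
        if expiry is None:
            return False
        if time.monotonic() > expiry:
            del self._entries[name]             # TTL expired: recheck next time
            return False
        return True
```

Keeping the cache small and short-lived limits the damage from both denial-of-service floods and misconfigured clients, the two cases described above.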



Re: How to encrypt the password for basic authentication of Solr

2016-05-17 Thread Shawn Heisey
On 5/17/2016 2:23 AM, t...@sina.com wrote:
> In the Wiki
> https://cwiki.apache.org/confluence/display/solr/Basic+Authentication+Plugin,
> the basic authentication is introduced. With basic authentication, the
> sample security.json has the line
> "credentials":{"solr":"IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
> Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="} How to get the
> encrypted password for the user solr?

You can't get the password from the encrypted version.  That's the
entire point of encrypting the password.

The cleartext version of the password for the example security.json file
is mentioned in the text below that example file in the documentation
link that you mentioned.  It is "SolrRocks" without the quotes.

Thanks,
Shawn



Re: How to encrypt the password for basic authentication of Solr

2016-05-17 Thread Shawn Heisey
On 5/17/2016 7:23 AM, Shawn Heisey wrote:
> On 5/17/2016 2:23 AM, t...@sina.com wrote:
>> How to get the encrypted password for the user solr? 
> You can't get the password from the encrypted version. That's the
> entire point of encrypting the password. 

It occurred to me after I sent this that I perhaps have answered the
wrong question.

Once you have the example security.json in place, you can use that solr
user (and the default password) with the set-user functionality in the
/admin/authentication API to create a new user or to change the password
on the solr user.  This is described in the documentation page you
linked in a section titled "Add a User or Edit a Password".  If you
choose to add a user, you can use that user to delete the initial solr
user with the delete-user functionality.

There is no utility provided with Solr that can generate encrypted
passwords.  I think we need one.

Thanks,
Shawn
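Until such a utility exists, the credentials format can be reproduced outside Solr. The sketch below assumes the scheme used by Solr's Sha256AuthenticationProvider, base64(sha256(sha256(salt + password))) followed by base64(salt), which matches the "hash salt" pair shown in the example security.json; verify against your Solr version before relying on it:

```python
import base64
import hashlib
import os

def solr_credentials(password, salt=None):
    """Build a 'hash salt' credentials string for the Basic Authentication
    Plugin, assuming the double-sha256-with-salt scheme described above."""
    if salt is None:
        salt = os.urandom(32)                  # fresh random salt per user
    inner = hashlib.sha256(salt + password.encode("utf-8")).digest()
    outer = hashlib.sha256(inner).digest()     # second sha256 pass
    return "%s %s" % (base64.b64encode(outer).decode("ascii"),
                      base64.b64encode(salt).decode("ascii"))

print(solr_credentials("SolrRocks"))  # hash and salt, space-separated
```

The resulting string would go into the "credentials" map of security.json for the chosen user.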



Re: json.facet streaming

2016-05-17 Thread Yonik Seeley
So it looks like facets are being computed... do you not see them in
the response?
-Yonik


On Tue, May 17, 2016 at 9:12 AM, Nick Vasilyev  wrote:
> I enabled query debugging, here is the facet-trace snippet.
>
> "facet-trace":{
>   "processor":"FacetQueryProcessor",
>   "elapse":0,
>   "query":null,
>   "domainSize":43046041,
>   "sub-facet":[{
>   "processor":"FacetFieldProcessorStream",
>   "elapse":0,
>   "field":"group",
>   "limit":10,
>   "domainSize":8980542},
> {
>   "processor":"FacetFieldProcessorStream",
>   "elapse":0,
>   "field":"group",
>   "limit":10,
>   "domainSize":9005295},
> {
>   "processor":"FacetFieldProcessorStream",
>   "elapse":0,
>   "field":"group",
>   "limit":10,
>   "domainSize":7555021},
> {
>   "processor":"FacetFieldProcessorStream",
>   "elapse":0,
>   "field":"group",
>   "limit":10,
>   "domainSize":8928379},
> {
>   "processor":"FacetFieldProcessorStream",
>   "elapse":0,
>   "field":"group",
>   "limit":10,
>   "domainSize":8576804}]},
> "json":{"facet":{"groups":{
>   "type":"terms",
>   "field":"group",
>   "method":"stream"}}},
>
> On Tue, May 17, 2016 at 8:42 AM, Yonik Seeley  wrote:
>
>> Perhaps try turning on request debugging and see what is actually
>> being received by Solr?
>>
>> -Yonik
>>
>>
>> On Tue, May 17, 2016 at 8:33 AM, Nick Vasilyev 
>> wrote:
>> > I am on the nightly build of 6.1 and I am experimenting with json.facet
>> > streaming, however the response I am getting back looks like regular
>> query
>> > response. I was expecting something like the streaming api. Is this right
>> > or am I missing something?
>> >
>> > Hhere is the json.facet string.
>> >
>> > 'json.facet':str({ "groups":{
>> > "type": "terms",
>> > "field": "group",
>> > "method":"stream"
>> > }}),
>> >
>> > The group field is a string field with DocValues enabled.
>> >
>> > Thanks
>>


Re: [scottchu] How to specify multiple zk nodes using solr start command under Windows

2016-05-17 Thread John Bickerstaff
In your original command, you listed the same port twice.  That may have
been at least part of the difficulty.

It's probably fine to just use one zk node - as the zookeeper instances
should be aware of each other.

I also assume that if your solr.in.sh (or Windows equivalent) has the
properly formatted entry for all the zk nodes, Solr will be able to find a
different one if the one you pass in goes down...

Here's the entry from my file in case it's helpful.  I got away without the
port number (I assume) because I'm using the default 2181 on all my
zookeeper nodes which are separate servers.

# Set the ZooKeeper connection string if using an external ZooKeeper
ensemble
# e.g. host1:2181,host2:2181/chroot
# Leave empty if not using SolrCloud
ZK_HOST="192.168.56.5,192.168.56.6,192.168.56.7/solr5_4"


On Tue, May 17, 2016 at 4:30 AM, scott.chu  wrote:

>
> I issue  '-z localhost:2181 -z localhost:2182 -z localhost:2183' for each
> node's start command and later when I create collection, all 3 zk nodes has
> registered my configset.
> Never try but I think maybe only use -z localhost:2181, then all 3 nodes
> in zk ensemble will synchronize themselves.
>
> scott.chu,scott@udngroup.com
> 2016/5/17 (週二)
> - Original Message -
> From: scott(自己)
> To: solr-user
> CC:
> Date: 2016/5/17 (週二) 15:35
> Subject: [scottchu] How to specify multiple zk nodes using solr start
> command under Windows
>
>
> I start 3 zk nodes at port 2181,2182, and 2183 on my local machine.
> Go into Solr 5.4.1 root folder and issue the command from the article
> 'Setting Up an External ZooKeeper Ensemble' in reference guide
>
> bin\Solr start -c -s mynodes\node1 -z
> localhost:2181,localhost:2181,localhost:2183
>
> but it doesn't run; it just shows the help page of the start command in solr.cmd.
> How should I issue the correct command?
>
>
>


Re: json.facet streaming

2016-05-17 Thread Nick Vasilyev
Hi Yonik, I do see them in the response, but the JSON format is like the
standard facet output. I am not sure what a streaming facet response would
look like, but I expected it to be similar to the streaming API. Is this
the case?

On Tue, May 17, 2016 at 9:35 AM, Yonik Seeley  wrote:

> So it looks like facets are being computed... do you not see them in
> the response?
> -Yonik
>
>
> On Tue, May 17, 2016 at 9:12 AM, Nick Vasilyev 
> wrote:
> > I enabled query debugging, here is the facet-trace snippet.
> >
> > "facet-trace":{
> >   "processor":"FacetQueryProcessor",
> >   "elapse":0,
> >   "query":null,
> >   "domainSize":43046041,
> >   "sub-facet":[{
> >   "processor":"FacetFieldProcessorStream",
> >   "elapse":0,
> >   "field":"group",
> >   "limit":10,
> >   "domainSize":8980542},
> > {
> >   "processor":"FacetFieldProcessorStream",
> >   "elapse":0,
> >   "field":"group",
> >   "limit":10,
> >   "domainSize":9005295},
> > {
> >   "processor":"FacetFieldProcessorStream",
> >   "elapse":0,
> >   "field":"group",
> >   "limit":10,
> >   "domainSize":7555021},
> > {
> >   "processor":"FacetFieldProcessorStream",
> >   "elapse":0,
> >   "field":"group",
> >   "limit":10,
> >   "domainSize":8928379},
> > {
> >   "processor":"FacetFieldProcessorStream",
> >   "elapse":0,
> >   "field":"group",
> >   "limit":10,
> >   "domainSize":8576804}]},
> > "json":{"facet":{"groups":{
> >   "type":"terms",
> >   "field":"group",
> >   "method":"stream"}}},
> >
> > On Tue, May 17, 2016 at 8:42 AM, Yonik Seeley  wrote:
> >
> >> Perhaps try turning on request debugging and see what is actually
> >> being received by Solr?
> >>
> >> -Yonik
> >>
> >>
> >> On Tue, May 17, 2016 at 8:33 AM, Nick Vasilyev <
> nick.vasily...@gmail.com>
> >> wrote:
> >> > I am on the nightly build of 6.1 and I am experimenting with
> json.facet
> >> > streaming, however the response I am getting back looks like regular
> >> query
> >> > response. I was expecting something like the streaming api. Is this
> right
> >> > or am I missing something?
> >> >
> >> > Hhere is the json.facet string.
> >> >
> >> > 'json.facet':str({ "groups":{
> >> > "type": "terms",
> >> > "field": "group",
> >> > "method":"stream"
> >> > }}),
> >> >
> >> > The group field is a string field with DocValues enabled.
> >> >
> >> > Thanks
> >>
>


Re: json.facet streaming

2016-05-17 Thread Yonik Seeley
On Tue, May 17, 2016 at 9:41 AM, Nick Vasilyev  wrote:
> Hi Yonik, I do see them in the response, but the JSON format is like
> standard facet output. I am not sure what streaming facet response would
> look like, but I expected it to be similar to the streaming API. Is this
> the case?

Nope.
The method is an execution hint (calculate the facets via this
method), and should not normally affect what the response looks like.

-Yonik


Re: json.facet streaming

2016-05-17 Thread Nick Vasilyev
Got it. Thanks for clarifying.

On Tue, May 17, 2016 at 9:58 AM, Yonik Seeley  wrote:

> On Tue, May 17, 2016 at 9:41 AM, Nick Vasilyev 
> wrote:
> > Hi Yonik, I do see them in the response, but the JSON format is like
> > standard facet output. I am not sure what streaming facet response would
> > look like, but I expected it to be similar to the streaming API. Is this
> > the case?
>
> Nope.
> The method is an execution hint (calculate the facets via this
> method), and should not normally affect what the response looks like.
>
> -Yonik
>


Re: Sorting for MLT results

2016-05-17 Thread Alessandro Benedetti
Using the More Like This query parser should solve your problem!
Just use that query parser and then sort as usual.

Cheers

On Wed, May 11, 2016 at 4:53 AM, Zheng Lin Edwin Yeo 
wrote:

> Hi,
>
> Would like to check, is there a function to do the sorting for MLT results
> in Solr? I understand that there is a sort function, but that only works
> for the main query results. It does not do any sorting for the MLT results
>
> I'm using Solr 5.4.0.
>
> Regards,
> Edwin
>



-- 
--

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
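Concretely, a request using the mlt query parser might look like the sketch below. Because the parser returns the similar documents as the main result set, the ordinary sort parameter applies to them (the field names, local parameters, and document id here are hypothetical):

```python
# Hypothetical /select parameters: MLT matches become the main results,
# so "sort" orders them like any other query.
params = {
    "q": "{!mlt qf=title,description mintf=1 mindf=1}DOC_ID_HERE",
    "sort": "price asc",       # ordinary sort now applies to the MLT results
    "fl": "id,title,score",
}
print(params["q"])
```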


Conditional atomic update

2016-05-17 Thread Chris Yee
I'm looking for a way to do an atomic update, but if a certain condition
exists on the existing document, abort the update.

Each document has the fields id, count, and value.  The source data has
just id and value.

When the source data is indexed, I use atomic updates to:
- Increment the count value in the existing document
- Add the source value to the existing document's value

What I'd like to do is abort the update if the existing document has a
count of 5.  Is there a way to do this with a custom update processor?
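A custom update processor is one option; a client-side alternative, sketched below, reads the document first and pins the update to Solr's optimistic-concurrency field (_version_), so a concurrent change makes the update fail with HTTP 409 and it can be retried. All names here are hypothetical, and the sketch assumes value is a numeric field:

```python
def build_atomic_update(existing_doc, source_value, max_count=5):
    """Return an atomic-update payload for the existing document,
    or None to abort when count has already reached max_count."""
    if existing_doc.get("count", 0) >= max_count:
        return None  # condition met on the existing document: abort
    return {
        "id": existing_doc["id"],
        "count": {"inc": 1},                     # increment the count
        "value": {"inc": source_value},          # add source value to value
        "_version_": existing_doc["_version_"],  # 409 if the doc changed meanwhile
    }

# Abort case: count already at the limit.
print(build_atomic_update({"id": "d1", "count": 5, "_version_": 1}, 7))  # None
```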


Re: Dynamically change solr suggest field

2016-05-17 Thread Lasitha Wattaladeniya
Hi Alessandro,

Yes, using the suggester is the correct way of doing it. But in our scenario we
thought of going with the spellchecker component since we had some legacy
setup. Our plan is to move to suggester later on.

So far spellchecker component also does the needed work for us.

Regards,
Lasitha

Lasitha Wattaladeniya
Software Engineer

Mobile : +6593896893
Blog : techreadme.blogspot.com

On Tue, May 17, 2016 at 2:15 AM, Alessandro Benedetti <
benedetti.ale...@gmail.com> wrote:

> The scenario you described should be done with the suggester component.
> Nothing prevents you from configuring multiple dictionaries for the suggester as
> well.
> Then you pass the dictionary to the suggester at query time as a request
> parameter for your suggester request handler.
>
> Cheers
> On 16 May 2016 4:29 pm, "Abdel Belkasri"  wrote:
>
> > Clever and real cool.
> > --Abdel
> >
> > On Sun, May 15, 2016 at 10:42 AM, Lasitha Wattaladeniya <
> watt...@gmail.com
> > >
> > wrote:
> >
> > > Hello all,
> > >
> > > I found a way of doing this and thought of sharing this info with you.
> I
> > > found a way to dynamically change the field which gives the
> suggestions.
> > > It's using the solr spellchecker (Not suggester). You can basically
> > > configure a  indexed field as default *spellcheck.dictionary* in the
> > config
> > > file. Later you can set what ever the field you want suggestions from
> in
> > > the request (Can set more than 1) as the spellcheck.dictionary. This
> way
> > > you can set even multiple fields as spellchecker dictionaries and
> > > suggestions will be returned according to the indexed values of those
> > > field.
> > >
> > > Regards,
> > > Lasitha.
> > >
> > > Lasitha Wattaladeniya
> > > Software Engineer
> > >
> > > Mobile : +6593896893
> > > Blog : techreadme.blogspot.com
> > >
> > > On Thu, May 12, 2016 at 8:05 AM, Lasitha Wattaladeniya <
> > watt...@gmail.com>
> > > wrote:
> > >
> > > > Hi Nick,
> > > >
> > > > Thanks for the reply. According to my requirement I can use only
> option
> > > > one. I thought about that solution but I was bit lazy to implement
> that
> > > > since I have many modules and solr cores. If I'm going to configure
> > > request
> > > > handlers for each drop down value in each component it seems like a
> lot
> > > of
> > > > work. Anyway this seems like the only way forward.
> > > >
> > > > I can't use the option two because the combo box select the filed,
> not
> > a
> > > > value specific to a single field
> > > >
> > > > Best regards,
> > > > Lasitha
> > > >
> > > > Lasitha Wattaladeniya
> > > > Software Engineer
> > > >
> > > > Mobile : +6593896893
> > > > Blog : techreadme.blogspot.com
> > > >
> > > > On Wed, May 11, 2016 at 11:41 PM, Nick D 
> wrote:
> > > >
> > > >> There are only two ways I can think of to accomplish this and
> neither
> > of
> > > >> them are dynamically setting the suggester field as is looks
> according
> > > to
> > > >> the Doc (which does sometimes have lacking info so I might be wrong)
> > you
> > > >> cannot set something like *suggest.fl=combo_box_field* at query
> time.
> > > But
> > > >> maybe they can help you get started.
> > > >>
> > > >> 1. Multiple suggester request handlers for each option in combo box.
> > > This
> > > >> way you just change the request handler in the query you submit
> based
> > on
> > > >> the context.
> > > >>
> > > >> 2. Use copy fields to put all possible suggestions into same field
> > name,
> > > >> so
> > > >> no more dynamic field settings, with another field defining whatever
> > the
> > > >> option would be for that document out of the combo box and use
> context
> > > >> filters which can be passed at query time to limit the suggestions
> to
> > > >> those
> > > >> filtered by whats in the combo box.
> > > >>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/solr/Suggester#Suggester-ContextFiltering
> > > >>
> > > >> Hope this helps a bit
> > > >>
> > > >> Nick
> > > >>
> > > >> On Wed, May 11, 2016 at 7:05 AM, Lasitha Wattaladeniya <
> > > watt...@gmail.com
> > > >> >
> > > >> wrote:
> > > >>
> > > >> > Hello devs,
> > > >> >
> > > >> > I'm trying to implement auto complete text suggestions using
> solr. I
> > > >> have a
> > > >> > text box and next to that there's a combo box. So the auto
> complete
> > > >> should
> > > >> > suggest based on the value selected in the combo box.
> > > >> >
> > > >> > Basically I should be able to change the suggest field based on
> the
> > > >> value
> > > >> > selected in the combo box. I was trying to solve this problem
> whole
> > > day
> > > >> but
> > > >> > not much luck. Can anybody tell me is there a way of doing this ?
> > > >> >
> > > >> > Regards,
> > > >> > Lasitha.
> > > >> >
> > > >> > Lasitha Wattaladeniya
> > > >> > Software Engineer
> > > >> >
> > > >> > Mobile : +6593896893
> > > >> > Blog : techreadme.blogspot.com
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Abdel K. Belkasri, PhD
> >
>


Re: [scottchu] How to specify multiple zk nodes using solr start command under Windows

2016-05-17 Thread Abdel Belkasri
Hi Scott,
what worked for me in Windows is this (no ",")
bin\Solr start -c -s mynodes\node1 -z localhost:2181 -z localhost:2181 -z
localhost:2183

-- Hope this helps
Abdel.

On Tue, May 17, 2016 at 3:35 AM, scott.chu  wrote:

> I start 3 zk nodes at port 2181,2182, and 2183 on my local machine.
> Go into the Solr 5.4.1 root folder and issue the command from the article
> 'Setting Up an External ZooKeeper Ensemble' in the reference guide
>
> bin\Solr start -c -s mynodes\node1 -z
> localhost:2181,localhost:2181,localhost:2183
>
> but it doesn't run but just show help page of start command in solr.cmd.
> How should I issue the correct command?
>



-- 
Abdel K. Belkasri, PhD


Specifying dynamic field type without polluting actual field names with type indicators

2016-05-17 Thread Horváth Péter Gergely
Hi All,

By default Solr allows you to define the type of a dynamic field by
appending a post-fix to the name itself. E.g. creating a color_s field
instructs Solr to create a string field. My understanding is that if we do
this, all queries must refer the post-fixed field name as well. So
instead of a query like color:"red", we will have to write something like
color_s:"red" -- and so on for other field types as well.

I am wondering if it is possible to specify the data type used for a field
in Solr 6.0.0, without having to modify the field name. (Or at least in a
way that would allow us to use the original field name) Do you have any
idea, how to achieve this? I am fine, if we have to specify the field type
during the insertion of a document, however, I do not want to keep using
post-fixes while running queries...

Thanks,
Peter


Re: Adding information to Solr response in custom filter query code?

2016-05-17 Thread ruby
Thanks for your reply. 

Is there any way a custom search component can access data created in custom
post filter query so that the data can be added to the response?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-information-to-Solr-response-in-custom-filter-query-code-tp4276976p4277236.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Specifying dynamic field type without polluting actual field names with type indicators

2016-05-17 Thread Steve Rowe
Hi Peter,

Are you familiar with the Schema API?: 


You can use it to create fields, field types, etc. prior to ingesting your data.

--
Steve
www.lucidworks.com

> On May 17, 2016, at 11:05 AM, Horváth Péter Gergely 
>  wrote:
> 
> Hi All,
> 
> By default Solr allows you to define the type of a dynamic field by
> appending a post-fix to the name itself. E.g. creating a color_s field
> instructs Solr to create a string field. My understanding is that if we do
> this, all queries must refer the post-fixed field name as well. So
> instead of a query like color:"red", we will have to write something like
> color_s:"red" -- and so on for other field types as well.
> 
> I am wondering if it is possible to specify the data type used for a field
> in Solr 6.0.0, without having to modify the field name. (Or at least in a
> way that would allow us to use the original field name) Do you have any
> idea, how to achieve this? I am fine, if we have to specify the field type
> during the insertion of a document, however, I do not want to keep using
> post-fixes while running queries...
> 
> Thanks,
> Peter



Re: SolrCloud replicas consistently out of sync

2016-05-17 Thread Stephen Weiss
Yes, after startup there was a recovery process, you are right.  It's just that 
this process doesn't seem to happen unless we do a full restart.

These are our autocommit settings - to be honest, we did not really use 
autocommit until we switched up to SolrCloud so it's totally possible they are 
not very good settings.  We wanted to minimize the frequency of commits because 
the commits seem to create a performance drag during indexing.   Perhaps it's 
gone overboard?


<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:1200000}</maxTime>
  <maxDocs>${solr.autoCommit.maxDocs:100000}</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
</autoSoftCommit>


By nodes, I am indeed referring to machines.  There are 8 shards per machine (2 
replicas of each), all in one JVM a piece.  We haven't specified any specific 
timestamps for the logs - they are just whatever happens by default.

--
Steve

On Mon, May 16, 2016 at 11:50 PM, Erick Erickson <erickerick...@gmail.com> wrote:
OK, this is very strange. There's no _good_ reason that
restarting the servers should make a difference. The fact
that it took 1/2 hour leads me to believe, though, that your
shards are somehow "incomplete", especially that you
are indexing to the system and don't have, say,
your autocommit settings done very well. The long startup
implies (guessing) that you have pretty big tlogs that
are replayed upon startup. While these were coming up,
did you see any of the shards in the "recovering" state? That's
the only way I can imagine that Solr "healed" itself.

I've got to point back to the Solr logs. Are they showing
any anomalies? Are any nodes in recovery when you restart?

Best,
Erick



On Mon, May 16, 2016 at 4:14 PM, Stephen Weiss <steve.we...@wgsn.com> wrote:
> Just one more note - while experimenting, I found that if I stopped all nodes 
> (full cluster shutdown), and then startup all nodes, they do in fact seem to 
> repair themselves then.  We have a script to monitor the differences between 
> replicas (just looking at numDocs) and before the full shutdown / restart, we 
> had:
>
> wks53104:Downloads sweiss$ php testReplication.php
> Found 32 mismatched shard counts.
> instock_shard1   replica 1: 30785553 replica 2: 30777568
> instock_shard10   replica 1: 30972662 replica 2: 30966215
> instock_shard11   replica 2: 31036718 replica 1: 31033547
> instock_shard12   replica 1: 30179823 replica 2: 30176067
> instock_shard13   replica 2: 30604638 replica 1: 30599219
> instock_shard14   replica 2: 30755117 replica 1: 30753469
> instock_shard15   replica 2: 30891325 replica 1: 30888771
> instock_shard16   replica 1: 30818260 replica 2: 30811728
> instock_shard17   replica 1: 30422080 replica 2: 30414666
> instock_shard18   replica 2: 30874530 replica 1: 30869977
> instock_shard19   replica 2: 30917008 replica 1: 30913715
> instock_shard2   replica 1: 31062073 replica 2: 31057583
> instock_shard20   replica 1: 30188774 replica 2: 30186565
> instock_shard21   replica 2: 30789012 replica 1: 30784160
> instock_shard22   replica 2: 30820473 replica 1: 30814822
> instock_shard23   replica 2: 30552105 replica 1: 30545802
> instock_shard24   replica 1: 30973906 replica 2: 30971314
> instock_shard25   replica 1: 30732287 replica 2: 30724988
> instock_shard26   replica 1: 31465543 replica 2: 31463414
> instock_shard27   replica 2: 30845514 replica 1: 30842665
> instock_shard28   replica 2: 30549151 replica 1: 30543070
> instock_shard29   replica 2: 30635711 replica 1: 30629240
> instock_shard3   replica 1: 30930400 replica 2: 30928438
> instock_shard30   replica 2: 30902221 replica 1: 30895176
> instock_shard31   replica 2: 31174246 replica 1: 31169998
> instock_shard32   replica 2: 30931550 replica 1: 30926256
> instock_shard4   replica 2: 30755525 replica 1: 30748922
> instock_shard5   replica 2: 31006601 replica 1: 30994316
> instock_shard6   replica 2: 31006531 replica 1: 31003444
> instock_shard7   replica 2: 30737098 replica 1: 30727509
> instock_shard8   replica 2: 30619869 replica 1: 30609084
> instock_shard9   replica 1: 31067833 replica 2: 31061238
>
>
> This stayed consistent for several hours.
>
> After restart:
>
> wks53104:Downloads sweiss$ php testReplication.php
> Found 3 mismatched shard counts.
> instock_shard19   replica 2: 30917008 replica 1: 30913715
> instock_shard22   replica 2: 30820473 replica 1: 30814822
> instock_shard26   replica 1: 31465543 replica 2: 31463414
> wks53104:Downloads sweiss$ php testReplication.php
> Found 2 mismatched shard counts.
> instock_shard19   replica 2: 30917008 replica 1: 30913715
> instock_shard26   replica 1: 31465543 replica 2: 31463414
> wks53104:Downloads sweiss$ php testReplication.php
> Everything looks peachy
>
> Took about a half hour to get there.
>
> Maybe the question should be - any way to get solrcloud to trigger this 
> *without* having to shut down / restart all nodes?  Even if we had to trigger 
> that manually after indexing, it would be fine.  It's a very controlled 
> indexing workflow that only happens once a day.
>
> --
> S

can custom search component access data in custom post filter component?

2016-05-17 Thread ruby
Is there any way a custom search component can access data created in custom
post filter query component so that the data can be added to the response?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/can-custom-search-component-access-data-in-custom-post-filter-component-tp4277245.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrCloud replicas consistently out of sync

2016-05-17 Thread Erick Erickson
OK, these autocommit settings need revisiting.

First off, I'd remove the maxDocs entirely although with the setting
you're using it probably doesn't matter.

The maxTime of 1,200,000 ms is 20 minutes, which means that if you ever
un-gracefully kill your shards you'll have up to 20 minutes worth of
data to replay from the tlog or resynch from the leader. Make this
much shorter (60000 or less) and be sure to gracefully kill your Solrs,
no "kill -9" for instance.

To be sure, before you bounce servers try either waiting 20 minutes
after the indexing stops or issue a manual commit before shutting
down your servers with
http://<host:port>/solr/collection/update?commit=true

I have a personal annoyance with the bin/solr script where it forcefully
(ungracefully) kills Solr after 5 seconds. I think this is much too short
so you might consider making it longer in prod, it's a shell script so
it's easy.


<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:1200000}</maxTime>
  <maxDocs>${solr.autoCommit.maxDocs:100000}</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>



This is probably the crux of "shards being out of sync". They're _not_
out of sync; it's just that some of them have docs visible to searches
and some do not, since the wall-clock times at which these are triggered
are _not_ the same. So you have a 10 minute window where two or more
replicas for a single shard are out-of-sync.



<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
</autoSoftCommit>
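Concretely, a shortened setup along these lines might look like the following (the specific values are illustrative, not a one-size-fits-all recommendation):

```xml
<autoCommit>
  <!-- hard commit every minute; openSearcher=false keeps it cheap -->
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <!-- new documents become searchable within a minute on every replica -->
  <maxTime>60000</maxTime>
</autoSoftCommit>
```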


You can test all this one of two ways:
1> if you have a timestamp when the docs were indexed, do all
the shards match if you do a query like
q=*:*&timestamp:[* TO NOW-15MINUTES]?
or, if indexing is _not_ occurring, issue a manual commit like
.../solr/collection/update?commit=true
and see if all the replicas match for each shard.

Here's a long blog on commits:
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Tue, May 17, 2016 at 8:18 AM, Stephen Weiss  wrote:
> Yes, after startup there was a recovery process, you are right.  It's just 
> that this process doesn't seem to happen unless we do a full restart.
>
> These are our autocommit settings - to be honest, we did not really use 
> autocommit until we switched up to SolrCloud so it's totally possible they 
> are not very good settings.  We wanted to minimize the frequency of commits 
> because the commits seem to create a performance drag during indexing.   
> Perhaps it's gone overboard?
>
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:1200000}</maxTime>
>   <maxDocs>${solr.autoCommit.maxDocs:100000}</maxDocs>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:600000}</maxTime>
> </autoSoftCommit>
>
>
> By nodes, I am indeed referring to machines.  There are 8 shards per machine 
> (2 replicas of each), all in one JVM a piece.  We haven't specified any 
> specific timestamps for the logs - they are just whatever happens by default.
>
> --
> Steve
>
> On Mon, May 16, 2016 at 11:50 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> OK, this is very strange. There's no _good_ reason that
> restarting the servers should make a difference. The fact
> that it took 1/2 hour leads me to believe, though, that your
> shards are somehow "incomplete", especially that you
> are indexing to the system and don't have, say,
> your autocommit settings done very well. The long startup
> implies (guessing) that you have pretty big tlogs that
> are replayed upon startup. While these were coming up,
> did you see any of the shards in the "recovering" state? That's
> the only way I can imagine that Solr "healed" itself.
>
> I've got to point back to the Solr logs. Are they showing
> any anomalies? Are any nodes in recovery when you restart?
>
> Best,
> Erick
>
>
>
> On Mon, May 16, 2016 at 4:14 PM, Stephen Weiss <steve.we...@wgsn.com> wrote:
>> Just one more note - while experimenting, I found that if I stopped all 
>> nodes (full cluster shutdown), and then startup all nodes, they do in fact 
>> seem to repair themselves then.  We have a script to monitor the differences 
>> between replicas (just looking at numDocs) and before the full shutdown / 
>> restart, we had:
>>
>> wks53104:Downloads sweiss$ php testReplication.php
>> Found 32 mismatched shard counts.
>> instock_shard1   replica 1: 30785553 replica 2: 30777568
>> instock_shard10   replica 1: 30972662 replica 2: 30966215
>> instock_shard11   replica 2: 31036718 replica 1: 31033547
>> instock_shard12   replica 1: 30179823 replica 2: 30176067
>> instock_shard13   replica 2: 30604638 replica 1: 30599219
>> instock_shard14   replica 2: 30755117 replica 1: 30753469
>> instock_shard15   replica 2: 30891325 replica 1: 30888771
>> instock_shard16   replica 1: 30818260 replica 2: 30811728
>> instock_shard17   replica 1: 30422080 replica 2: 30414666
>> instock_shard18   replica 2: 30874530 replica 1: 30869977
>> instock_shard19   replica 2: 30917008 replica 1: 30913715
>> instock_shard2   replica 1: 31062073 replica 2: 31057583
>> instock_shard20   replica 1: 30188774 replica 2: 30186565
>> instock_shard21   replica 2: 30789012 replica 1: 30784160
>> instock_shard22   replica 2: 30820473 replica 1: 30814822

Re: Updating error while add doc to Solrcloud

2016-05-17 Thread Erick Erickson
I _think_ you are using "schemaless" mode and the
issue is that Solr guesses the type of the field based
on the first doc it encounters. Thereafter, if any
incoming doc has a different field (say the "guess"
is an int type and later something that's not an
int is in that field) then it is rejected. This is somewhat
borne out by the fact that when you blow away the collection,
the problem doc indexes.

By default, modern Solr uses "managed schema", which is
different than, but related to "schemaless". What I'd do:

Go to managed schema or even "classic" schema, here's
a place to get you started:
https://cwiki.apache.org/confluence/display/solr/Schema+Factory+Definition+in+SolrConfig
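For the "classic" option, the switch is a one-line change in solrconfig.xml (a sketch; this turns off field guessing, so fields must be declared in schema.xml by hand):

```xml
<schemaFactory class="ClassicIndexSchemaFactory"/>
```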

Best,
Erick

On Mon, May 16, 2016 at 11:36 PM, scott.chu  wrote:
>
> I cleared the cugna collection data (by renaming the 'data' folder to 'xdata') and
> restarted Solrcloud. I added the previously suspect xml doc and it succeeded, so I'm
> sure the doc data has no problem. Is it because the index file size is too large?
> If the zk nodes fail while adding docs, could that cause this updating error,
> since I do see some "can't find leader" errors in the solr log?
>
> scott.chu,scott@udngroup.com
> 2016/5/17 (週二)
> - Original Message -
> From: scott(自己)
> To: solr-user
> CC:
> Date: 2016/5/17 (週二) 14:29
> Subject: Updating error while add doc to Solrcloud
>
>
>
> I built a Solrcloud with 2 nodes, 1 shard, 2 replicas. I added docs in xml format
> using post.jar up to 2.85M+ docs and a 10gb index size. When I add more
> docs, the solr.log shows:
>
> --
> 2016-05-17 14:01:09,024 WARN (main) [ ] o.e.j.s.h.RequestLogHandler 
> !RequestLog
> 2016-05-17 14:01:09,275 WARN (main) [ ] o.e.j.s.SecurityHandler 
> ServletContext@o.e.j.w.WebAppContext@57fffcd7{/solr,file:/D:/portable_sw/solr-5.4.1/server/solr-webapp/webapp/,STARTING}{D:\portable_sw\solr-5.4.1\server/solr-webapp/webapp}
>  has uncovered http methods for path: /
> 2016-05-17 14:01:09,346 WARN (main) [ ] o.a.s.c.CoreContainer Couldn't 
> add files from D:\portable_sw\solr-5.4.1\mynodes\cloud\node1\lib to 
> classpath: D:\portable_sw\solr-5.4.1\mynodes\cloud\node1\lib
> 2016-05-17 14:01:11,419 WARN 
> (coreLoadExecutor-7-thread-1-processing-n:10.18.59.179:8983_solr) [c:cugna 
> s:shard1 r:core_node2 x:cugna_shard1_replica2] o.a.s.u.UpdateLog Exception 
> reverse reading log
> java.io.EOFException
> ...
> --
>
> Later I stopped everything and deleted write.lock (I usually do this in Solr 3 when
> adding docs fails) and added docs again, but Solrcloud said it can't find write.lock.
> So I restored write.lock and called post.jar again. The output shows:
>
> --
> "SimplePostTool version 5.0.0
> Posting files to [base] url http://localhost:8983/solr/cugna/update using 
> content-type application/xml...
> POSTing file NMLBOym_a_UN2000_10_20160511_1014.xml to [base]
> SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for 
> url: http://localhost:8983/solr/cugna/update
> SimplePostTool: WARNING: Response: 
> <response>
> <lst name="responseHeader"><int name="status">400</int><int name="QTime">3</int></lst>
> <lst name="error"><str name="msg">Exception writing document id un_555917 to the
> index; possible analysis error.</str><int name="code">400</int></lst>
> </response>
> SimplePostTool: WARNING: IOException while reading response: 
> java.io.IOException: Server returned HTTP response code: 40
> 0 for URL: http://localhost:8983/solr/cugna/update
> 1 files indexed.
> COMMITting Solr index changes to 
> http://localhost:8983/solr/cugna/update...
> Time spent: 0:00:00.259"
> -
>
> I thought it was that doc un_555917 causing the problem, so I commented it out in
> the xml and tried again, but it keeps showing the same error for every single doc.
> I assume there's something wrong with Solrcloud. Has anyone experienced this before?
> What could be the problem? Can I recover it? Or do I have to add all docs again?
>
>
> scott.chu,scott@udngroup.com
> 2016/5/17 (週二)
> - Original Message -
> From: scott(自己)
> To: solr-user
> CC:
> Date: 2016/5/17 (週二) 09:39
> Subject: Re(2): [scottchu] Cab I migrate solrcloud by just copying 
> wholepackagefolder?
>
>
>
> OK! Thanks for reminding. I'll stick to the convention.
>
> scott.chu,scott@udngroup.com
> 2016/5/17 (週二)
> - Original Message -
> From: Chris Hostetter
> To: solr-user ; scott(自己)
> CC:
> Date: 2016/5/17 (週二) 02:43
> Subject: Re: [scottchu] Cab I migrate solrcloud by just copying whole 
> packagefolder?
>
>
>
> : Message-Id: <7fd5fd02628b55831193271f9b39a...@udngroup.com>
> : Subject: [scottchu] Cab I migrate solrcloud by just copying whole package
> : folder?
> : References:
> :  : 729...@udngroup.com>,
> : <856da447-e7b7-49a4-af87-f161b0fe5...@elyograg.org>
> : In-Reply-To: <856da447-e7b7-49a4-af87-f161b0fe5...@elyograg.org>
>
> https://people.apache.org/~hossman/#threadhijack
> Thread Hijacking on Mailing Lists
>
> When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email.

Re: [scottchu] How to specify multiple zk nodes using solr start command under Windows

2016-05-17 Thread Erick Erickson
Are you absolutely sure you're getting an _ensemble_ and
not just connecting to a single node? My suspicion (without
proof) is that you're just getting one -z option. It'll work as
long as that ZK instance stays up, but it won't be fault-tolerant.

And again you repeated the port (2181) twice.

Best,
Erick

On Tue, May 17, 2016 at 8:02 AM, Abdel Belkasri  wrote:
> Hi Scott,
> what worked for me in Windows is this (no ",")
> bin\Solr start -c -s mynodes\node1 -z localhost:2181 -z localhost:2181 -z
> localhost:2183
>
> -- Hope this helps
> Abdel.
>
> On Tue, May 17, 2016 at 3:35 AM, scott.chu  wrote:
>
>> I start 3 zk nodes at port 2181,2182, and 2183 on my local machine.
>> Go into the Solr 5.4.1 root folder and issue the command from the article
>> 'Setting Up an External ZooKeeper Ensemble' in the reference guide
>>
>> bin\Solr start -c -s mynodes\node1 -z
>> localhost:2181,localhost:2181,localhost:2183
>>
>> but it doesn't run but just show help page of start command in solr.cmd.
>> How should I issue the correct command?
>>
>
>
>
> --
> Abdel K. Belkasri, PhD


Re: Specifying dynamic field type without polluting actual field names with type indicators

2016-05-17 Thread Shawn Heisey
On 5/17/2016 9:05 AM, Horváth Péter Gergely wrote:
> By default Solr allows you to define the type of a dynamic field by
> appending a post-fix to the name itself. E.g. creating a color_s field
> instructs Solr to create a string field. My understanding is that if we do
> this, all queries must refer the post-fixed field name as well. So
> instead of a query like color:"red", we will have to write something like
> color_s:"red" -- and so on for other field types as well.
>
> I am wondering if it is possible to specify the data type used for a field
> in Solr 6.0.0, without having to modify the field name. (Or at least in a
> way that would allow us to use the original field name) Do you have any
> idea, how to achieve this? I am fine, if we have to specify the field type
> during the insertion of a document, however, I do not want to keep using
> post-fixes while running queries...

Dynamic fields are just a different way to specify field names.  Instead
of explicitly naming the field, you can put a * at the beginning or end
of the name and define an entire series of potential field names with
one declaration, and tie those names to a specific type.
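As a sketch (field names illustrative), a dynamic rule and an explicit field look like this in the schema:

```xml
<!-- any field name ending in _s becomes a string field -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>

<!-- an explicitly declared field needs no suffix: queries can use color:"red" -->
<field name="color" type="string" indexed="true" stored="true"/>
```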

You can't specify a field type when doing a query, only the field name. 
Solr looks up the field name in the schema to determine the type.  If
you want to change the type of a given field name, you need to change
your schema, restart Solr or reload the core, and quite possibly also
reindex.

Thanks,
Shawn



Re: [scottchu] How to specify multiple zk nodes using solr start command under Windows

2016-05-17 Thread Abdel Belkasri
The repetition is just a cut and paste from Scott's post.

How can I check if I am getting the ensemble or just a single zk?

Also if this is not the way to specify an ensemble, what is the right way?

Because the comma delimited list does not work, I concur with Scott.

On Tue, May 17, 2016 at 11:49 AM, Erick Erickson 
wrote:

> Are you absolutely sure you're getting an _ensemble_ and
> not just connecting to a single node? My suspicion (without
> proof) is that you're just getting one -z option. It'll work as
> long as that ZK instance stays up, but it won't be fault-tolerant.
>
> And again you repeated the port (2181) twice.
>
> Best,
> Erick
>
> On Tue, May 17, 2016 at 8:02 AM, Abdel Belkasri 
> wrote:
> > Hi Scott,
> > what worked for me in Windows is this (no ",")
> > bin\Solr start -c -s mynodes\node1 -z localhost:2181 -z localhost:2181 -z
> > localhost:2183
> >
> > -- Hope this helps
> > Abdel.
> >
> > On Tue, May 17, 2016 at 3:35 AM, scott.chu 
> wrote:
> >
> >> I start 3 zk nodes at port 2181,2182, and 2183 on my local machine.
> >> Go into the Solr 5.4.1 root folder and issue the command from the
> article
> >> 'Setting Up an External ZooKeeper Ensemble' in the reference guide
> >>
> >> bin\Solr start -c -s mynodes\node1 -z
> >> localhost:2181,localhost:2181,localhost:2183
> >>
> >> but it doesn't run but just show help page of start command in solr.cmd.
> >> How should I issue the correct command?
> >>
> >
> >
> >
> > --
> > Abdel K. Belkasri, PhD
>



-- 
Abdel K. Belkasri, PhD


Re: [scottchu] How to specify multiple zk nodes using solr start command under Windows

2016-05-17 Thread John Bickerstaff
it's roundabout, but this might work -- ask for the healthcheck status
(from the solr box) and hit each zkNode separately.

I'm on Linux so you'll have to translate to Windows...  using the solr.cmd
file I assume...

./solr healthcheck -z 192.168.56.5:2181/solr5_4 -c collectionName
./solr healthcheck -z 192.168.56.6:2181/solr5_4 -c collectionName
./solr healthcheck -z 192.168.56.7:2181/solr5_4 -c collectionName

My original command included all the IP addresses without port numbers.
This does return healthcheck info (the same info) when I do it for each
zkNode separately...

I assume that if they've all got info for all your replicas and/or shards,
they're working as an ensemble.

You could also go looking at each zkNode (using the zkCli tool) and verify
that your collection appears where you expect it.



On Tue, May 17, 2016 at 10:17 AM, Abdel Belkasri  wrote:

> The repetition is just a cut and paste from Scott's post.
>
> How can I check if I am getting the ensemble or just a single zk?
>
> Also if this is not the way to specify an ensemble, what is the right way?
>
> Because the comma delimited list does not work, I concur with Scott.
>
> On Tue, May 17, 2016 at 11:49 AM, Erick Erickson 
> wrote:
>
> > Are you absolutely sure you're getting an _ensemble_ and
> > not just connecting to a single node? My suspicion (without
> > proof) is that you're just getting one -z option. It'll work as
> > long as that ZK instance stays up, but it won't be fault-tolerant.
> >
> > And again you repeated the port (2181) twice.
> >
> > Best,
> > Erick
> >
> > On Tue, May 17, 2016 at 8:02 AM, Abdel Belkasri 
> > wrote:
> > > Hi Scott,
> > > what worked for me in Windows is this (no ",")
> > > bin\Solr start -c -s mynodes\node1 -z localhost:2181 -z localhost:2181
> -z
> > > localhost:2183
> > >
> > > -- Hope this helps
> > > Abdel.
> > >
> > > On Tue, May 17, 2016 at 3:35 AM, scott.chu 
> > wrote:
> > >
> > >> I start 3 zk nodes at port 2181,2182, and 2183 on my local machine.
> > >> Go into the Solr 5.4.1 root folder and issue the command from the
> > article
> > >> 'Setting Up an External ZooKeeper Ensemble' in the reference guide
> > >>
> > >> bin\Solr start -c -s mynodes\node1 -z
> > >> localhost:2181,localhost:2181,localhost:2183
> > >>
> > >> but it doesn't run but just show help page of start command in
> solr.cmd.
> > >> How should I issue the correct command?
> > >>
> > >
> > >
> > >
> > > --
> > > Abdel K. Belkasri, PhD
> >
>
>
>
> --
> Abdel K. Belkasri, PhD
>


Solrj Basic Authentication randomly failing - request has come without principal

2016-05-17 Thread Shamik Bandopadhyay
Hi,

  I'm facing this issue where SolrJ calls are randomly failing on basic
authentication. Here's the exception:

ERROR923629[qtp466002798-20] -
org.apache.solr.security.PKIAuthenticationPlugin.doAuthenticate(PKIAuthenticationPlugin.java:125)
- Invalid key
 INFO923630[qtp466002798-20] -
org.apache.solr.security.RuleBasedAuthorizationPlugin.checkPathPerm(RuleBasedAuthorizationPlugin.java:144)
- request has come without principal. failed permission
org.apache.solr.security.RuleBasedAuthorizationPlugin$Permission@1a343033
INFO923630[qtp466002798-20] -
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:429) -
USER_REQUIRED auth header null context : userPrincipal: [null] type:
[READ], collections: [knowledge,], Path: [/select] path : /select params
:df=text&distrib=false&qt=/select&preferLocalShards=false&fl=id&fl=score&shards.purpose=4&start=0&fsv=true&shard.url=
http://xx.xxx.x.222:8983/solr/knowledge/|http://xx.xxx.xxx.246:8983/solr/knowledge/&rows=3&version=2&q=*:*&NOW=1463512962899&isShard=true&wt=javabin

Here's my security.json. I've protected "browse" and "select" request
handler for my queries.

{
  "authentication": {
"blockUnknown": false,
"class": "solr.BasicAuthPlugin",
"credentials": {
  "solr": "IV0EHq1OnNrj6gvRCwvFwTrZ1+z1oBbnQdiVC3otuq0=
Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c="
}
  },
  "authorization": {
"class": "solr.RuleBasedAuthorizationPlugin",
"user-role": {
  "solr": "admin",
  "solradmin": "admin",
  "beehive": "dev",
  "readuser": "read"
},
"permissions": [
  {
"name": "security-edit",
"role": "admin"
  },
  {
"name": "browse",
"collection": "knowledge",
"path": "/browse",
"role": [
  "admin",
  "dev",
  "read"
]
  },
  {
"name": "select",
"collection": "knowledge",
"path": "/select",
"role": [
  "admin",
  "dev",
  "read"
]
  },
  {
"name": "admin-ui",
"path": "/",
"role": [
  "admin",
  "dev"
]
  },
  {
"name": "update",
"role": [
  "admin",
  "dev"
]
  },
  {
"name": "collection-admin-edit",
"role": [
  "admin"
]
  },
  {
"name": "schema-edit",
"role": [
  "admin"
]
  },
  {
"name": "config-edit",
"role": [
  "admin"
]
  }
]
  }
}

Here's my sample code:

SolrClient client = new
CloudSolrClient("zoohost1:2181,zoohost2:2181,zoohost3:2181");
((CloudSolrClient)client).setDefaultCollection(DEFAULT_COLLECTION);
ModifiableSolrParams param = getSearchSolrQuery();
SolrRequest solrRequest = new QueryRequest(param);
solrRequest.setBasicAuthCredentials(USER, PASSWORD);
try {
for (int j = 0; j < 20; j++) {
NamedList results = client.request(solrRequest);
}
} catch (Exception ex) {
ex.printStackTrace();
}

private static ModifiableSolrParams getSearchSolrQuery() {
ModifiableSolrParams solrParams = new ModifiableSolrParams();
solrParams.set("q", "*:*");
solrParams.set("qt","/select");
solrParams.set("rows", "3");
return solrParams;
}

The query sometime returns results, but fails probably half of the time,
there's no pattern though. This is applicable to any request handler
specified in the security.json

Looks like the SolrRequest loses the user/password in flight.

Here's the exception received at the SolrJ client:

org.apache.solr.common.SolrException.log(SolrException.java:148) -
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://xx.xxx.xxx.134:8983/solr/knowledge: Expected mime
type application/octet-stream but got text/html. 


Error 401 Unauthorized request, Response code: 401

HTTP ERROR 401
Problem accessing /solr/knowledge/select. Reason:
    Unauthorized request, Response code: 401
Powered by Jetty://




at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:544)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:240)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:229)
at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:372)
at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:325)
at
org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(HttpShardHandlerFactory.java:246)
at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:201)
at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:163)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)

Re: SolrCloud replicas consistently out of sync

2016-05-17 Thread Stephen Weiss
OK, so we did as you suggest, read through that article, and we reconfigured 
the autocommit to:


<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:3}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:60}</maxTime>
</autoSoftCommit>


However, we see no change, aside from the fact that it's clearly committing 
more frequently.  I will say on our end, we clearly misunderstood the 
difference between soft and hard commit, but even now having it configured this 
way, we are still totally out of sync, long after all indexing has completed 
(it's been about 30 minutes now).  We manually pushed through a commit on the 
whole collection as suggested, however, all we get back for that is 
o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit., which 
makes sense, because it was all committed already anyway.

We still currently have all shards mismatched:

instock_shard1   replica 1: 30788491 replica 2: 30778865
instock_shard10   replica 1: 30973059 replica 2: 30971874
instock_shard11   replica 2: 31036815 replica 1: 31034715
instock_shard12   replica 2: 30177084 replica 1: 30170511
instock_shard13   replica 2: 30608225 replica 1: 30603923
instock_shard14   replica 2: 30755739 replica 1: 30753191
instock_shard15   replica 2: 30891713 replica 1: 30891528
instock_shard16   replica 1: 30818567 replica 2: 30817152
instock_shard17   replica 1: 30423877 replica 2: 30422742
instock_shard18   replica 2: 30874979 replica 1: 30872223
instock_shard19   replica 2: 30917208 replica 1: 3090
instock_shard2   replica 1: 31062339 replica 2: 31060575
instock_shard20   replica 1: 30192046 replica 2: 30190893
instock_shard21   replica 2: 30793817 replica 1: 30791135
instock_shard22   replica 2: 30821521 replica 1: 30818836
instock_shard23   replica 2: 30553773 replica 1: 30547336
instock_shard24   replica 1: 30975564 replica 2: 30971170
instock_shard25   replica 1: 30734696 replica 2: 30731682
instock_shard26   replica 1: 31465696 replica 2: 31464738
instock_shard27   replica 1: 30844884 replica 2: 30842445
instock_shard28   replica 2: 30549826 replica 1: 30547405
instock_shard29   replica 2: 3063 replica 1: 30634091
instock_shard3   replica 1: 30930723 replica 2: 30926483
instock_shard30   replica 2: 30904528 replica 1: 30902649
instock_shard31   replica 2: 31175813 replica 1: 31174921
instock_shard32   replica 2: 30932837 replica 1: 30926456
instock_shard4   replica 2: 30758100 replica 1: 30754129
instock_shard5   replica 2: 31008893 replica 1: 31002581
instock_shard6   replica 2: 31008679 replica 1: 31005380
instock_shard7   replica 2: 30738468 replica 1: 30737795
instock_shard8   replica 2: 30620929 replica 1: 30616715
instock_shard9   replica 1: 31071386 replica 2: 31066956

The fact that the min_rf numbers aren't coming back as 2 seems to indicate to 
me that documents simply aren't making it to both replicas - why would that 
have anything to do with committing anyway?

Something else is amiss here.  Too bad, committing sounded like an easy answer!

--
Steve


On Tue, May 17, 2016 at 11:39 AM, Erick Erickson 
mailto:erickerick...@gmail.com>> wrote:
OK, these autocommit settings need revisiting.

First off, I'd remove the maxDocs entirely although with the setting
you're using it probably doesn't matter.

The maxTime of 1,200,000 is 20 minutes. Which means if you ever
un-gracefully kill your shards you'll have up to 20 minutes worth of
data to replay from the tlog or resynch from the leader. Make this
much shorter (6 or less) and be sure to gracefully kill your Solrs -
no "kill -9", for instance.

To be sure, before you bounce servers try either waiting 20 minutes
after the indexing stops or issue a manual commit before shutting
down your servers with
http:///solr/collection/update?commit=true

I have a personal annoyance with the bin/solr script where it forcefully
(ungracefully) kills Solr after 5 seconds. I think this is much too short
so you might consider making it longer in prod, it's a shell script so
it's easy.


<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:120}</maxTime>
  <maxDocs>${solr.autoCommit.maxDocs:10}</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>



this is probably the  crux of "shards being out of sync". They're _not_
out of sync, it's just that some of them have docs visible to searches
and some do not since the wall-clock time these are triggered are
_not_ the same. So you have a 10 minute window where two or more
replicas for a single shard are out-of-sync.



<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:60}</maxTime>
</autoSoftCommit>


You can test all this one of two ways:
1> if you have a timestamp when the docs were indexed, do all
the shards match if you do a query like
q=*:* AND timestamp:[* TO NOW-15MINUTES]?
or, if indexing is _not_ occurring, issue a manual commit like
.../solr/collection/update?commit=true
and see if all the replicas match for each shard.

Here's a long blog on commits:
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick

On Tue, May 17, 2016 at 8:18 AM, Stephen Weiss 
mailto:steve.we...@wgsn.com>> wrote:
> Yes, after startup there was a r
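[For anyone following along: the manual commit Erick mentions is just an HTTP call; the hostname and collection name below are placeholders for your own cluster.]

```shell
# explicit hard commit against one collection (placeholders assumed)
curl "http://localhost:8983/solr/instock/update?commit=true"
```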

Re: SolrCloud replicas consistently out of sync

2016-05-17 Thread Stephen Weiss
I should add - looking back through the logs, we're seeing frequent errors like 
this now:

78819692 WARN  (qtp110456297-1145) [   ] o.a.s.h.a.LukeRequestHandler Error 
getting file length for [segments_4o]
java.nio.file.NoSuchFileException: 
/var/solr/data/instock_shard5_replica1/data/index.20160516230059221/segments_4o

--
Steve


On Tue, May 17, 2016 at 5:07 PM, Stephen Weiss 
mailto:steve.we...@wgsn.com>> wrote:
OK, so we did as you suggest, read through that article, and we reconfigured 
the autocommit to:


${solr.autoCommit.maxTime:3}
false



${solr.autoSoftCommit.maxTime:60}


However, we see no change, aside from the fact that it's clearly committing 
more frequently.  I will say on our end, we clearly misunderstood the 
difference between soft and hard commit, but even now having it configured this 
way, we are still totally out of sync, long after all indexing has completed 
(it's been about 30 minutes now).  We manually pushed through a commit on the 
whole collection as suggested, however, all we get back for that is 
o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping IW.commit., which 
makes sense, because it was all committed already anyway.

We still currently have all shards mismatched:

[...]

The fact that the min_rf numbers aren't coming back as 2 seems to indicate to 
me that documents simply aren't making it to both replicas - why would that 
have anything to do with committing anyway?

Something else is amiss here.  Too bad, committing sounded like an easy answer!

--
Steve


On Tue, May 17, 2016 at 11:39 AM, Erick Erickson 
mailto:erickerick...@gmail.com>> wrote:
OK, these autocommit settings need revisiting.

First off, I'd remove the maxDocs entirely although with the setting
you're using it probably doesn't matter.

The maxTime of 1,200,000 is 20 minutes. Which means if you evern
un-gracefully kill your shards you'll have up to 20 minutes worth of
data to replay from the tlog or resynch from the leader. Make this
much shorter (6 or less) and be sure to gracefully kill your Solrs.
no "kill -9" for intance

To be sure, before you bounce servers try either waiting 20 minutes
after the indexing stops or issue a manual commit before shutting
down your servers with
http:///solr/collection/update?commit=true

I have a personal annoyance with the bin/solr script where it forcefully
(ungracefully) kills Solr after 5 seconds. I think this is much too short
so you might consider making it longer in prod, it's a shell script so
it's easy.


${solr.autoCommit.maxTime:120}
${solr.autoCommit.maxDocs:10}
false



this is probably the  crux of "shards being out of sync". They're _not_
out of sync, it's just that some of them have docs visible to searches
and some do not since the wall-clock time these are triggered are
_not_ the same. So you have a 10 minute window where two or more
replicas for a single shard are out-of-sync.



${solr.autoSoftCommit.maxTime:60}


You can test all this one of two ways:
1> if you have a timestamp when the docs were indexed, do all
the shards match if you do a query like
q=*:* AND timestamp:[* TO NOW-15MINUTES]?
or,

RE: SolrCloud replicas consistently out of sync

2016-05-17 Thread Markus Jelsma
Hi, that's a known issue and unrelated:
https://issues.apache.org/jira/browse/SOLR-9120

M.
 
 
-Original message-
> From:Stephen Weiss 
> Sent: Tuesday 17th May 2016 23:10
> To: solr-user@lucene.apache.org; Aleksey Mezhva ; 
> Hans Zhou 
> Subject: Re: SolrCloud replicas consistently out of sync
> 
> I should add - looking back through the logs, we're seeing frequent errors 
> like this now:
> 
> 78819692 WARN  (qtp110456297-1145) [   ] o.a.s.h.a.LukeRequestHandler Error 
> getting file length for [segments_4o]
> java.nio.file.NoSuchFileException: 
> /var/solr/data/instock_shard5_replica1/data/index.20160516230059221/segments_4o
> 
> --
> Steve
> 
> 
> On Tue, May 17, 2016 at 5:07 PM, Stephen Weiss 
> mailto:steve.we...@wgsn.com>> wrote:
> OK, so we did as you suggest, read through that article, and we reconfigured 
> the autocommit to:
> 
> 
> ${solr.autoCommit.maxTime:3}
> false
> 
> 
> 
> ${solr.autoSoftCommit.maxTime:60}
> 
> 
> However, we see no change, aside from the fact that it's clearly committing 
> more frequently.  I will say on our end, we clearly misunderstood the 
> difference between soft and hard commit, but even now having it configured 
> this way, we are still totally out of sync, long after all indexing has 
> completed (it's been about 30 minutes now).  We manually pushed through a 
> commit on the whole collection as suggested, however, all we get back for 
> that is o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping 
> IW.commit., which makes sense, because it was all committed already anyway.
> 
> We still currently have all shards mismatched:
> 
> [...]
> 
> The fact that the min_rf numbers aren't coming back as 2 seems to indicate to 
> me that documents simply aren't making it to both replicas - why would that 
> have anything to do with committing anyway?
> 
> Something else is amiss here.  Too bad, committing sounded like an easy 
> answer!
> 
> --
> Steve
> 
> 
> On Tue, May 17, 2016 at 11:39 AM, Erick Erickson 
> mailto:erickerick...@gmail.com>> wrote:
> OK, these autocommit settings need revisiting.
> 
> First off, I'd remove the maxDocs entirely although with the setting
> you're using it probably doesn't matter.
> 
> The maxTime of 1,200,000 is 20 minutes. Which means if you evern
> un-gracefully kill your shards you'll have up to 20 minutes worth of
> data to replay from the tlog or resynch from the leader. Make this
> much shorter (6 or less) and be sure to gracefully kill your Solrs.
> no "kill -9" for intance
> 
> To be sure, before you bounce servers try either waiting 20 minutes
> after the indexing stops or issue a manual commit before shutting
> down your servers with
> http:///solr/collection/update?commit=true
> 
> I have a personal annoyance with the bin/solr script where it forcefully
> (ungracefully) kills Solr after 5 seconds. I think this is much too short
> so you might consider making it longer in prod, it's a shell script so
> it's easy.
> 
> 
> ${solr.autoCommit.maxTime:120}
> ${solr.autoCommit.maxDocs:10}
> false
> 
> 
> 
> this is

Re: SolrCloud replicas consistently out of sync

2016-05-17 Thread Stephen Weiss
Gotcha - well that's nice.  Still, we seem to be permanently out of sync.

I see this thread with someone having a similar issue:

https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201601.mbox/%3c09fdab82-7600-49e0-b639-9cb9db937...@yahoo.com%3E

It seems like this is not really fixed in 5.4/6.0?  Is there any version of 
SolrCloud where this wasn't yet a problem that we could downgrade to?

--
Steve

On Tue, May 17, 2016 at 6:23 PM, Markus Jelsma 
mailto:markus.jel...@openindex.io>> wrote:
Hi, that's a known issue and unrelated:
https://issues.apache.org/jira/browse/SOLR-9120

M.


-Original message-
> From:Stephen Weiss mailto:steve.we...@wgsn.com>>
> Sent: Tuesday 17th May 2016 23:10
> To: solr-user@lucene.apache.org; Aleksey 
> Mezhva mailto:aleksey.mez...@wgsn.com>>; Hans Zhou 
> mailto:hans.z...@wgsn.com>>
> Subject: Re: SolrCloud replicas consistently out of sync
>
> I should add - looking back through the logs, we're seeing frequent errors 
> like this now:
>
> 78819692 WARN  (qtp110456297-1145) [   ] o.a.s.h.a.LukeRequestHandler Error 
> getting file length for [segments_4o]
> java.nio.file.NoSuchFileException: 
> /var/solr/data/instock_shard5_replica1/data/index.20160516230059221/segments_4o
>
> --
> Steve
>
>
> On Tue, May 17, 2016 at 5:07 PM, Stephen Weiss 
> mailto:steve.we...@wgsn.com>>>
>  wrote:
> OK, so we did as you suggest, read through that article, and we reconfigured 
> the autocommit to:
>
> 
> ${solr.autoCommit.maxTime:3}
> false
> 
>
> 
> ${solr.autoSoftCommit.maxTime:60}
> 
>
> However, we see no change, aside from the fact that it's clearly committing 
> more frequently.  I will say on our end, we clearly misunderstood the 
> difference between soft and hard commit, but even now having it configured 
> this way, we are still totally out of sync, long after all indexing has 
> completed (it's been about 30 minutes now).  We manually pushed through a 
> commit on the whole collection as suggested, however, all we get back for 
> that is o.a.s.u.DirectUpdateHandler2 No uncommitted changes. Skipping 
> IW.commit., which makes sense, because it was all committed already anyway.
>
> We still currently have all shards mismatched:
>
> [...]
>
> The fact that the min_rf numbers aren't coming back as 2 seems to indicate to 
> me that documents simply aren't making it to both replicas - why would that 
> have anything to do with committing anyway?
>
> Something else is amiss here.  Too bad, committing sounded like an easy 
> answer!
>
> --
> Steve
>
>
> On Tue, May 17, 2016 at 11:39 AM, Erick Erickson 
> mailto:erickerick...@gmail.com>>>
>  wrote:
> OK, these autocommit settings need revisiting.
>
> First off, I'd remove the maxDocs entirely although with the setting
> you're using it probably doesn't matter.
>
> The maxTime of 1,200,000 is 20 minutes. Which means if you evern
> un-gracefully kill your shards you'll have up to 20 minutes worth of
>

SolrCloud multiple collections each with unique schema via SolrJ?

2016-05-17 Thread Boman
I load the default config using scripts/cloud-scripts/zkcli.sh -cmd upconfig
after which collections are created programmatically and the schema modified
as per each collection's requirements.

I now notice that it is the SAME "default" original schema that holds ALL
the modifications (new fields). What I really want is that during collection
creation time (using SolrJ) as follows: 

CollectionAdminRequest.Create createRequest = new
CollectionAdminRequest.Create();
createRequest.setConfigName("default-config");

the new collection would "inherit" a copy of the default schema, and
following any updates to that schema, it should remain Collection-specific.

Any suggestions on how to achieve this programmatically? Thanks.

--Boman.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-multiple-collections-each-with-unique-schema-via-SolrJ-tp4277397.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrCloud multiple collections each with unique schema via SolrJ?

2016-05-17 Thread Shawn Heisey
On 5/17/2016 7:00 PM, Boman wrote:
> I load the default config using scripts/cloud-scripts/zkcli.sh -cmd upconfig
> after which collections are created programmatically and the schema modified
> as per each collection's requirements.
>
> I now notice that it is the SAME "default" original schema that holds ALL
> the modifications (new fields). What I really want is that during collection
> creation time (using SolrJ) as follows: 
>
> CollectionAdminRequest.Create createRequest = new
> CollectionAdminRequest.Create();
> createRequest.setConfigName("default-config");
>
> the new collection would "inherit" a copy of the default schema, and
> following any updates to that schema, it should remain Collection-specific.
>
> Any suggestions on how to achieve this programmatically? Thanks.

If you want a different config/schema combo for each collection, you
need to upload a different configset for every collection.  When your
collections are all using the same config, any change that you make for
one of them will affect them all (after reload).

You can't share just part of the configset -- it's a cohesive unit
covering the solrconfig.xml, the schema, and all the other files in the
configset.

Thanks,
Shawn



Re: SolrCloud multiple collections each with unique schema via SolrJ?

2016-05-17 Thread Boman
Got it! I now use uploadConfig to load the default config for each new
collection I create, and then modify the schema. Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-multiple-collections-each-with-unique-schema-via-SolrJ-tp4277397p4277406.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to encrypte the password for basic authentication of Solr

2016-05-17 Thread tjlp
Hi, Shawn,
Thanks. So we cannot add credentials up front in security.json, right? To simplify, I just want to hardcode some users before Solr starts.
Googling the topic "Basic authentication for Solr Admin panel", we can call the jetty utility to generate an encrypted password as follows:
java -cp jetty-util-8.1.10.v20130312.jar 
org.eclipse.jetty.util.security.Password  .
Thanks,
Liu Peng
- Original Message -
From: Shawn Heisey 
To: solr-user@lucene.apache.org
Subject: Re: How to encrypte the password for basic authentication of Solr
Date: 2016-05-17 21:32


On 5/17/2016 7:23 AM, Shawn Heisey wrote:
> On 5/17/2016 2:23 AM, t...@sina.com wrote:
>> How to get the encrypted password for the user solr? 
> You can't get the password from the encrypted version. That's the
> entire point of encrypting the password. 
It occurred to me after I sent this that I perhaps have answered the
wrong question.
Once you have the example security.json in place, you can use that solr
user (and the default password) with the set-user functionality in the
/admin/authentication API to create a new user or to change the password
on the solr user.  This is described in the documentation page you
linked in a section titled "Add a User or Edit a Password".  If you
choose to add a user, you can use that user to delete the initial solr
user with the delete-user functionality.
There is no utility provided with Solr that can generate encrypted
passwords.  I think we need one.
Thanks,
Shawn
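[Editor's note: until such a utility exists, the hash can be reproduced outside Solr. The sketch below is based on a reading of Solr's Sha256AuthenticationProvider — stored credentials are "base64(sha256(sha256(salt + password))) base64(salt)" with a random 32-byte salt — and is an unofficial approximation, not a tool shipped with Solr. It checks itself against the well-known "solr"/"SolrRocks" example credential from the sample security.json.]

```python
import base64
import hashlib
import os


def solr_credentials(password, salt=None):
    """Build a security.json credentials value the way Solr's
    Sha256AuthenticationProvider appears to: sha256 over salt+password,
    hashed once more with sha256, emitted as 'base64(hash) base64(salt)'."""
    if salt is None:
        salt = os.urandom(32)  # Solr uses a random 32-byte salt
    digest = hashlib.sha256(salt + password.encode("utf-8")).digest()
    digest = hashlib.sha256(digest).digest()  # second sha256 round
    return "%s %s" % (base64.b64encode(digest).decode("ascii"),
                      base64.b64encode(salt).decode("ascii"))


# Reproduce the example credential for user "solr", password "SolrRocks",
# using the salt from the sample security.json in the wiki page above:
salt = base64.b64decode("Ndd7LKvVBAaZIF0QAVi1ekCfAJXr1GGfLtRUXhgrF8c=")
print(solr_credentials("SolrRocks", salt))
```

Paste the resulting string as the user's value in the "credentials" map of security.json.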


Re: [scottchu] How to specify multiple zk nodes using solr startcommand under Windows

2016-05-17 Thread scott.chu
I tested yesterday and it proves my theory. I'll share what I do under Windows on one PC here with you experienced guys and future newbies:

1>Download zookeeper 3.4.8. I unzip it and copy to 3 other different folders: 
zk_1, zk_2, zk_3.
2>For each zk_n folder, I do these things (Note: {n} means the last digit in the zk_n folder name):
 a. Create a zoo_data folder under the root and create 'myid' with notepad; its contents are just '{n}'.
 b. Create zoo.cfg under conf folder with following contents:
 clientPort=218{n}
 initLimit=5
 syncLimit=2
 dataDir=D:/zk_{n}/zoo_data
 ;if the p2p-connect-port or leader-election-port are all the same,
then we should set maxClientCnxns=n
 ;maxClientCnxns=3
 ;server.x=host:p2p-connect-port:leader-election-port
 server.1=localhost:2888:3888
 server.2=localhost:2889:3889
 server.3=localhost:2890:3890
 3> I download ZOOKEEPER-1122's zkServer.cmd. and go into each zk_n folder and 
issue command:
 bin\zkServer.cmd start

   [Question]: There's something I'd like to ask you guys: when I start zk_1 and zk_2, the console keeps showing warning messages.
 Only after I start zk_3 do the warnings stop. Is that normal?

4>  I use zkui_win to see them all go online successfully.
5> I goto Solr-5.4.1 folder, and issue following commands:
   bin\solr start -c -s mynodes\node1 -z localhost:2181
   bin\solr start -c -s mynodes\node1 -z localhost:2181 -p 7973
   bin\solr create -c cugna -d myconfigsets\cugna -shards 1 
-replicationFactor 2 -p 8983
6> By using zkui_win again, I see:
  ** The config 'cugna' is synchronized on zk_1 to zk_3. So this proves my theory: we only have to specify one zk node and they'll sync themselves. **

[Question]: I go into a zk_n folder and issue 'bin\zkServer stop'. However, it shows an error message. It seems it can't taskkill the zk process for some reason. The only way I can stop them is by closing the DOS window that issued the 'bin\zkServer start' command. Does anybody know why 'bin\zkServer stop' doesn't work?

Note: Gotta say sorry for the repetition of localhost:2181. It's my typo.

scott.chu,scott@udngroup.com
2016/5/18 (Wednesday)
- Original Message - 
From: Abdel Belkasri 
To: solr-user 
CC: 
Date: 2016/5/18 (週三) 00:17
Subject: Re: [scottchu] How to specify multiple zk nodes using solr 
startcommand under Windows


The repetition is just a cut and paste from Scott's post. 

How can I check if I am getting the ensemble or just a single zk? 

Also if this is not the way to specify an ensemble, what is the right way? 


Because the comma delimited list does not work, I concur with Scott. 

On Tue, May 17, 2016 at 11:49 AM, Erick Erickson  

wrote: 

> Are you absolutely sure you're getting an _ensemble_ and 
> not just connecting to a single node? My suspicion (without 
> proof) is that you're just getting one -z option. It'll work as 
> long as that ZK instance stays up, but it won't be fault-tolerant. 
> 
> And again you repeated the port (2181) twice. 
> 
> Best, 
> Erick 
> 
> On Tue, May 17, 2016 at 8:02 AM, Abdel Belkasri  
> wrote: 
> > Hi Scott, 
> > what worked for me in Windows is this (no ",") 
> > bin\Solr start -c -s mynodes\node1 -z localhost:2181 -z localhost:2181 -z 
> > localhost:2183 
> > 
> > -- Hope this helps 
> > Abdel. 
> > 
> > On Tue, May 17, 2016 at 3:35 AM, scott.chu  
> wrote: 
> > 
> >> I start 3 zk nodes at port 2181,2182, and 2183 on my local machine. 
> >> Go into Solr 5.4.1 root folder and issue and issue the command in 
> article 
> >> 'Setting Up an External ZooKeeper Ensemble' in reference guide 
> >> 
> >> bin\Solr start -c -s mynodes\node1 -z 
> >> localhost:2181,localhost:2181,localhost:2183 
> >> 
> >> but it doesn't run but just show help page of start command in solr.cmd. 
> >> How should I issue the correct command? 
> >> 
> > 
> > 
> > 
> > -- 
> > Abdel K. Belkasri, PhD 
> 



-- 
Abdel K. Belkasri, PhD 



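[Editor's note on the ensemble syntax: cmd.exe treats unquoted commas as argument separators, which would explain why the comma-delimited list falls through to the help page. Quoting the whole -z value is usually enough; the ports below assume the three local nodes from this thread.]

```shell
bin\solr start -c -s mynodes\node1 -z "localhost:2181,localhost:2182,localhost:2183"
```

With the list quoted, Solr sees a single ZK connection string and connects to the full ensemble instead of one node.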


A quick way to open stay-opened DOS window on any folder under Windows [scottchu]

2016-05-17 Thread scott.chu

I've gotten a lot of help from this mailing list, so this time I'd like to share something useful with you guys.

First, I'm not a cygwin guy under Windows.

[Symptom]
When we test open source projects under Windows, we frequently need to: open 
Windows Explorer, change to a specific folder, click the address bar, select and 
copy the path, open a DOS window, issue 'cd /d ', right-click to paste the 
pathname, press ENTER, and only then type in commands. This gets tedious when 
testing this and testing that. At least that's what I've experienced these days.

[Cure]
I found a quick and universal way to avoid the tedious actions above, as follows:
1> Create a shortcut to cmd.exe.
2> Rename it to 'cdhere'.
3> Right-click on it and select 'Properties'.
4> In the 'Target' textbox, type in 'C:\Windows\System32\cmd.exe'.
5> In the 'Start in' textbox, type in '%cd%'.
6> Press 'OK'.
Now you can copy cdhere.lnk to any folder you want, double-click it, and a DOS 
window will open in that folder and stay open.
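
An alternative sketch that skips the shortcut: drop a one-line batch file into 
the folder instead (the name cdhere.bat is made up; /K keeps cmd.exe open and 
%~dp0 expands to the folder the .bat file lives in — both are standard cmd.exe 
behavior):

```
cmd /K cd /d %~dp0
```

Save that single line as cdhere.bat, copy it to any folder, and double-clicking 
it should give the same stay-open DOS window.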

If you are kinda like me (i.e. not a cygwin guy), then this should help you. 
Enjoy it!

scott.chu,scott@udngroup.com
2016/5/18 (週三 means Wednesday)


API call for optimising a collection

2016-05-17 Thread Binoy Dalal
Is there no api call that can optimize an entire collection?

I tried the collections api page on the confluence wiki but couldn't find
anything, and a Google search also yielded no meaningful results.
-- 
Regards,
Binoy Dalal


Re: API call for optimising a collection

2016-05-17 Thread Nick Vasilyev
As far as I know, you have to run it on each core.
On May 18, 2016 1:04 AM, "Binoy Dalal"  wrote:

> Is there no api call that can optimize an entire collection?
>
> I tried the collections api page on the confluence wiki but couldn't find
> anything, and a Google search also yielded no meaningful results.
> --
> Regards,
> Binoy Dalal
>
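
Nick's per-core approach can be sketched as a loop over the cores (core names 
and host below are made up — check the Core Admin screen in your admin UI for 
yours; the optimize=true parameter on a core's /update handler is the standard 
way to trigger an optimize):

```shell
# Sketch: optimize each core of a collection, one core at a time.
# Replace HOST and CORES with your own values.
HOST="http://localhost:8983/solr"
CORES="mycollection_shard1_replica1 mycollection_shard2_replica1"
for core in $CORES; do
  echo "optimizing $core"
  # curl "$HOST/$core/update?optimize=true"   # uncomment to actually send it
done
```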


Why error #400 bad request sometimes when I add docs to a Solrcloud collection? [scottchu]

2016-05-17 Thread scott.chu

I add docs by running post.jar with XML files but sometimes get error #400 (Bad 
Request). At first I suspected the XML files had a problem, but after stopping 
the nodes and restarting them, I added the same doc again and it succeeded. 
According to the post.jar output below, NMLBOym_a_UN2004_07_20160511_1018.xml 
got error #400, yet at the end it still committed the index change. However, it 
didn't write to the index successfully (I've queried that id and found nothing). 
What could be the possible cause? 

The output of post.jar is shown below:
-
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/cugna/update using 
content-type application/xml...
POSTing file NMLBOym_a_UN2004_05_20160511_1018.xml to [base]
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/cugna/update...
Time spent: 0:00:50.017
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/cugna/update using 
content-type application/xml...
POSTing file NMLBOym_a_UN2004_06_20160511_1018.xml to [base]
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/cugna/update...
Time spent: 0:00:48.518
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/cugna/update using 
content-type application/xml...
POSTing file NMLBOym_a_UN2004_07_20160511_1018.xml to [base]
SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: 
http://localhost:8983/solr/cugna/update
SimplePostTool: WARNING: Response: 

<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">110614</int></lst>
<lst name="error"><str name="msg">Exception writing document id un_2563325 to the index; possible analysis error.</str><int name="code">400</int></lst>
</response>

SimplePostTool: WARNING: IOException while reading response: 
java.io.IOException: Server returned HTTP response code: 400 for
 URL: http://localhost:8983/solr/cugna/update
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/cugna/update...
Time spent: 0:01:51.881
-
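
A quick way to double-check whether the failed document really made it into the 
index (a sketch; it assumes the uniqueKey field is named "id" — adjust for your 
schema):

```shell
# Query the collection directly for the id that post.jar reported the error on.
URL="http://localhost:8983/solr/cugna/select?q=id:un_2563325&wt=json"
echo "querying: $URL"
curl -s "$URL" || echo "(Solr not reachable from here)"
```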

scott.chu,scott@udngroup.com
2016/5/18 (Wednesday)


Re: [scottchu] How to specify multiple zk nodes using solr start command under Windows

2016-05-17 Thread John Bickerstaff
I think those zk server warning messages are expected.  Until you have 3
running instances you don't have a "Quorum" and the Zookeeper instances
complain.  Once the third one comes up they are "happy" and don't complain
any more.  You'd get similar messages if one of the Zookeeper nodes ever
went down.

As for the stopping of zk server - I've never had any problem issuing a
stop command, but I'm running Linux so I may not be much good to you in
that regard.
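
For what it's worth, on the stop problem: I can't say why 'bin\zkServer stop' 
errors out, but as a workaround one can kill the ZooKeeper JVM directly. A 
hedged sketch for an interactive DOS window (wmic and taskkill are standard 
Windows tools, and QuorumPeerMain is ZooKeeper's main class; the exact filter 
is an assumption, and in a .bat file the % signs would need doubling):

```
wmic process where "CommandLine like '%QuorumPeerMain%' and Name='java.exe'" get ProcessId
taskkill /F /PID <pid-from-above>
```

Here <pid-from-above> is a placeholder for whichever ProcessId the first 
command printed.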

On Tue, May 17, 2016 at 8:41 PM, scott.chu  wrote:

> I tested yesterday and it proves my theory. I'll share what I do under
> Windows on 1 PC here with you experienced guys and further newbies:
>
> 1>Download zookeeper 3.4.8. I unzip it and copy to 3 other different
> folders: zk_1, zk_2, zk_3.
> 2>For each zk_n folder, I do these things (Note: {n} means the last digit
> in the zk_n folder name):
>  a. Create zoo_data folder under root and create 'myid' with
> notepad, the contents is pure '{n}' only.
>  b. Create zoo.cfg under conf folder with following contents:
>  clientPort=218{n}
>  initLimit=5
>  syncLimit=2
>  dataDir=D:/zk_{n}/zoo_data
>  #if the p2p-connect-port or leader-election-port are all the
> same, then we should set maxClientCnxns=n
>  #maxClientCnxns=3
>  #server.x=host:p2p-connect-port:leader-election-port
>  server.1=localhost:2888:3888
>  server.2=localhost:2889:3889
>  server.3=localhost:2890:3890
>  3> I download ZOOKEEPER-1122's zkServer.cmd and go into each zk_n folder
> and issue command:
>  bin\zkServer.cmd start
>
>[Question]: There's something I'd like to ask you guys: when I start
> zk_1 and zk_2, the console keeps showing some warning messages.
>  Only after I start zk_3 do the warning
> messages stop. Is that normal?
>
> 4>  I use zkui_win to see them all go online successfully.
> 5> I goto Solr-5.4.1 folder, and issue following commands:
>bin\solr start -c -s mynodes\node1 -z localhost:2181
>bin\solr start -c -s mynodes\node2 -z localhost:2181 -p
> 7973
>bin\solr create -c cugna -d myconfigsets\cugna -shards
> 1 -replicationFactor 2 -p 8983
> 6> By using zkui_win again,  I see:
>   ** The config 'cugna' is synchronized across zk_1 to zk_3. So this
> proves my theory: we only have to specify one zk node and they'll
> sync themselves. **
>
> [Question]: I go into each zk_n folder and issue 'bin\zkServer stop'. However,
> this shows an error message. It seems it can't taskkill the zk process for
> some reason. The only way I can stop them
>  is by closing the DOS window that issued the
> 'bin\zkServer start' command. Does anybody know why 'bin\zkServer stop'
> doesn't work?
>
> Note: Gotta say sorry for the repetition of localhost:2181. It's my typo.
>
> scott.chu,scott@udngroup.com
> 2016/5/18 (Wednesday)

Re: API call for optimising a collection

2016-05-17 Thread John Bickerstaff
Having run the optimize from the admin UI on one of my three cores in a
Solr Cloud collection, I find that when I go to run it on one of
the other cores, it is already "optimized".

I realize that's not the same thing as an API call, but thought it might
help.

On Tue, May 17, 2016 at 11:22 PM, Nick Vasilyev 
wrote:

> As far as I know, you have to run it on each core.
> On May 18, 2016 1:04 AM, "Binoy Dalal"  wrote:
>
> > Is there no api call that can optimize an entire collection?
> >
> > I tried the collections api page on the confluence wiki but couldn't find
> > anything, and a Google search also yielded no meaningful results.
> > --
> > Regards,
> > Binoy Dalal
> >
>


Re: [scottchu] How to specify multiple zk nodes using solr start command under Windows

2016-05-17 Thread scott.chu

Thanks! Once my experiments are OK under Windows, I'll move to CentOS. Actually, I've 
already installed 3 zk nodes and Solr on 3 servers running CentOS. It seems that open 
source projects more or less have something unprepared for running under Windows.

scott.chu,scott@udngroup.com
2016/5/18 (Wednesday)
- Original Message - 
From: John Bickerstaff 
To: solr-user ; scott(自己) 
CC: 
Date: 2016/5/18 (週三) 13:53
Subject: Re: [scottchu] How to specify multiple zk nodes using solr start command 
under Windows


I think those zk server warning messages are expected. Until you have 3 
running instances you don't have a "Quorum" and the Zookeeper instances 
complain. Once the third one comes up they are "happy" and don't complain 
any more. You'd get similar messages if one of the Zookeeper nodes ever 
went down. 

As for the stopping of zk server - I've never had any problem issuing a 
stop command, but I'm running Linux so I may not be much good to you in 
that regard. 
