Re: Writing Solr Custom Components

2016-10-14 Thread Ishan Chattopadhyaya
http://home.apache.org/~ctargett/RefGuidePOC/jekyll-full/adding-custom-plugins-in-solrcloud-mode.html

On Tue, Oct 4, 2016 at 12:23 PM, John Bickerstaff wrote:

> All,
>
> I'm looking for information on writing custom Solr components.  A quick
> search showed nothing really recent and before I dig deeper, I thought I'd
> ask the community for anything you are aware of.
>
> Thanks
>


group ts-dt with multiple shards

2016-10-14 Thread xiao.ma
Hi,

We are trying to group messages by their ts-dt value in Solr 6.2.1.

* One collection with only one shard: we are able to group by ts-dt
without error, but performance gets worse over time.

* Multiple collections with only one shard each: we are able to group by
ts-dt without error in each collection, but performance gets worse over
time.

* One collection with multiple shards: we can't group by ts-dt. Example
error message: "Error from server at
http://xx.xx.xx.xx:8983/solr/test4s2rf_shard4_replica2: Invalid Date
String:'Tue Sep 06 02:53:00 UTC 2016'". The date string in the error message
is not in the format that we write into Solr, hence we can't find
it in Solr. The error message is also different every time we run the
grouping; for instance, on the second run the error could be "Error from
server at http://xx.xx.xx.xx:8983/solr/test4s2rf_shard1_replica1: Invalid Date
String:'Tue Aug 02 03:42:03 UTC 2016'".

Has anyone had a similar problem? What could we have done wrong?

 

Br,

Xiao

 



Incongruent results of numdocs

2016-10-14 Thread Davide Isoardi
Hi all,
I have indexed more than 1 million docs in a SolrCloud collection with 5
shards and 2 replicas.

After indexing, if I run the query q=id:*&rows=0 (many times), I get
different results for the number of documents found.

Why is the result not the same for all queries?

Thanks in advance
Davide Isoardi
eCube S.r.l.
isoa...@ecubecenter.it
http://www.ecubecenter.it
Tel.  +390113999301
Mobile +393288204915
Fax. +390113999309


Notice pursuant to Legislative Decree no. 196/2003 (Privacy)
ECUBE processes personal data as specified on the "Privacy Policy" page
available at http://www.ecubecenter.it/privacy.pdf. The information
contained in this message is intended exclusively for the indicated
recipient(s). If you receive this message in error, please kindly notify us
by e-mail (i...@ecubecenter.it) and delete the message received in error;
any other use is unlawful.




How to boost a field copied in the Text Field

2016-10-14 Thread Frederic MERCEUR

Hello,

We use Solr to describe datasets with several metadata fields (title,
authors, description, etc.). We copy all of these metadata fields into the
Text field to offer a default search to our end users, so they can search
all metadata in one query:


multiValued="true" required="true"/>
stored="true" multiValued="true"/>

...



...

We can then query the default Text field in this way :

http://solr[...]/select?q=fish

Is there a way to boost the Title field copied into the Text field, so
that the datasets that contain the word "fish" in their title are
displayed first?


I have found some information about the boost option in Solr, but I can't
find whether it is possible to boost a field copied into the Text field.


Thanks,
Fred


--
Fred Merceur
http://annuaire.ifremer.fr/cv/16828/


Re: How to boost a field copied in the Text Field

2016-10-14 Thread Alexandre Rafalovitch
The second you copy a bunch of things into one field, it becomes very
hard to get useful relevancy, especially since they are now parsed
according to that target field's analyzer chain and your original
fields' chains become irrelevant.

There was an article somewhere on using payloads for boosting, but it was
more of a hack.

Instead, look at eDisMax: you can specify the list of fields you
want to search and assign individual boosts to them, just as you
requested.
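
A minimal sketch of what the eDisMax suggestion could look like as request
parameters. The host, the core name "datasets", the field names and the
boost values are assumptions made up for illustration; defType and qf are
the standard eDisMax parameters.

```python
from urllib.parse import urlencode

def edismax_url(base_url, user_query, field_boosts):
    """Build a /select URL that searches several fields with per-field boosts."""
    # qf is a space-separated list of field^boost entries.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    params = {"defType": "edismax", "q": user_query, "qf": qf}
    return f"{base_url}/select?{urlencode(params)}"

# Hypothetical core and field names; a title match weighs ten times as much
# as a description match.
url = edismax_url(
    "http://localhost:8983/solr/datasets",
    "fish",
    {"title": 10, "authors": 2, "description": 1},
)
print(url)
```

With this approach the individual fields are searched directly, so the
copied Text field is no longer needed for the default search.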

Regards,
   Alex.

Solr Example reading group is starting November 2016, join us at
http://j.mp/SolrERG
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 14 October 2016 at 05:48, Frederic MERCEUR
 wrote:
> Hello,
>
> we use Solr to describe datasets with several metadata (title, authors,
> description, etc). We copy all these metadata in the Text field to offer a
> default search to our end-users so they can make a search on all metadata in
> one search :
>
>  multiValued="true" required="true"/>
>  multiValued="true"/>
> ...
>
> 
> 
> ...
>
> We can then query the default Text field in this way :
>
> http://solr[...]/select?q=fish
>
> Is there a way to boost the Title field copied in the Text Field ? So the
> dataset that contain the word  "fish" in their title will be displayed first
> ?
>
> I have find some information about the boost option in Solr but I cant' find
> if it is possible to boost a field copied in the Text field ?
>
> Thanks,
> Fred
>
>
> --
> Fred Merceur
> http://annuaire.ifremer.fr/cv/16828/


showing http 403 when accessing admin UI with solr authorization configured

2016-10-14 Thread 李爽
Hi, I'm stuck on an issue which is very strange.

1.  I have a SolrCloud environment; the Solr version is 5.5.2.
2.  I have set up Kerberos authentication and rule-based authorization; the
configuration is as follows:

3. When I access the admin UI, it shows a 403.


4. The Jetty server log shows the following exception:

497745 INFO  (qtp702282179-19) [   ] o.a.s.s.RuleBasedAuthorizationPlugin This 
resource is configured to have a permission {
  "name":"all",
  "role":"admin"}, The principal 
u=loushang&p=loush...@nie.netease.com&t=kerberos&e=1476479049131 does not have 
the right role 
SLF4J: Failed toString() invocation on an object of type 
[org.apache.solr.servlet.HttpSolrCall$2]
java.lang.NullPointerException


I have granted all permissions to user “loushang”, but it still reports
that I don't have permission.


Does anyone know why? Many thanks.

Re: qf boosts with MoreLikeThis query parser

2016-10-14 Thread Ere Maijala
I've now attached a proposed patch to a pre-existing issue 
https://issues.apache.org/jira/browse/SOLR-9267.


--Ere

On 13.10.2016 at 2.19, Ere Maijala wrote:

Answering my own question: I did some digging and found out that boosts work
if qf is repeated in the local params, at least in Solr 6.2, like this:

{!mlt qf=title^100 qf=author^50}recordid
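
A small sketch of assembling such a query string with one qf entry per
boosted field. The helper name is hypothetical, and each entry is written in
the usual field^boost form (so the author boost is written author^50).

```python
def mlt_query(doc_id, field_boosts):
    """Build an {!mlt} local-params query with a repeated qf per boosted field."""
    qf_parts = " ".join(f"qf={field}^{boost}" for field, boost in field_boosts.items())
    return "{!mlt " + qf_parts + "}" + doc_id

query = mlt_query("recordid", {"title": 100, "author": 50})
print(query)
```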

However, it doesn't work properly with CloudMLTQParser used in SolrCloud
mode. I'm working on a proposed fix for this and will post a Jira issue
with a patch when done. There appears to be another problem with
CloudMLTQParser too where it includes extraneous terms in the final
query, and I'll take a stab at fixing that too.

--Ere

On 1.8.2016 at 9.12, Ere Maijala wrote:

Hi All,

I, too, would like to know the answer to these questions. I saw a
similar question by Nikaash Puri on 22 June with subject "help with
moreLikeThis" go unanswered. Any insight?

Regards,
Ere

On 11.7.2016 at 18.32, Demian Katz wrote:

Hello,

I am currently using field-specific boosts in the qf setting of the
MoreLikeThis request handler:

https://github.com/vufind-org/vufind/blob/master/solr/vufind/biblio/conf/solrconfig.xml#L410



I would like to accomplish the same effect using the MoreLikeThis
query parser, so that I can take advantage of such benefits as
sharding support.

I am currently using Solr 5.5.0, and in spite of trying many
syntactical variations, I can't seem to get it to work. Some
discussion on this JIRA ticket seems to suggest there may have been
some problems caused by parsing limitations:

https://issues.apache.org/jira/browse/SOLR-7143

However, I think my work on this ticket should have eliminated those
limitations:

https://issues.apache.org/jira/browse/SOLR-2798

Anyway, this brings up a few questions:


1.)Is field-specific boosting in qf supported by the MLT query
parser, and if so, what syntax should I use?

2.)If this functionality is supported, but not in v5.5.0,
approximately when was it fixed?

3.)If the functionality is still not working, would it be worth my
time to try to fix it, or is it being excluded for a specific reason?

Any and all insight is appreciated. Apologies if the answers are
already out there somewhere, but I wasn't able to find them!

thanks,
Demian





--
Ere Maijala
Kansalliskirjasto / The National Library of Finland


Re: Slow indexing speed when index size is large?

2016-10-14 Thread Shawn Heisey
On 10/13/2016 9:58 PM, Zheng Lin Edwin Yeo wrote:
> Thanks for the reply Shawn. Currently, my heap allocation to each Solr
> instance is 22GB. Is that big enough? 

I can't answer that question.  I know little about your install.  Even
if I *did* know a few more things about your install, I could only make
a *guess* about how much heap you need, and I'd probably be wrong.

https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

I did write down what I consider to be a good way to figure out a
correct heap size, but it requires experimentation with your live
system, which might cause disruption of your search service:

https://wiki.apache.org/solr/SolrPerformanceProblems#How_much_heap_space_do_I_need.3F

Thanks,
Shawn



Re: SOLR Sizing

2016-10-14 Thread Shawn Heisey
On 10/14/2016 12:18 AM, Vasu Y wrote:
> Thank you all for the insight and help. Our SOLR instance has multiple
> collections.
> Do you know if the spreadsheet at LucidWorks (
> https://lucidworks.com/blog/2011/09/14/estimating-memory-and-storage-for-lucenesolr/)
> is meant to be used to calculate sizing per collection or is it meant to be
> used for the whole SOLR instance (that contains multiple collections).
>
> The reason I am asking this question is, there are some defaults like
> "Transient (MB)" (with a value 10 MB) specified "Disk Space Estimator"
> sheet; I am not sure if these default values are per collection or the
> whole SOLR instance.

You would need to include info for everything that would live on the
Solr instance ... but even that estimator can only provide you with a
*guess* as to how much heap size you need, and depending on how you
actually use Solr, it might be a completely incorrect guess.  You've
already been given the following URL in response to the initial
question, and gotten other replies about why there's no way for us to
give you an actual answer:

https://lucidworks.com/blog/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Testing and adjusting the actual live production system is the only way
to be absolutely sure what your requirements are.  Anything before that
is guesswork.

Thanks,
Shawn



Re: Incongruent results of numdocs

2016-10-14 Thread Shawn Heisey
On 10/14/2016 3:35 AM, Davide Isoardi wrote:
> I have indexed more than 1 million of docs on a SolrCloud collections whit 5 
> shards and 2 replicas.
>
> After the indexing if I try to query (many times) q=id:*&rows=0 I have 
> different result for the document number founds.
>
> Why the result is not the same for all querys?

Assuming that you are not indexing new documents between requests, there
are two reasons for this problem:

1) You have documents with the same uniqueKey value in more than one of
your shards.  This typically happens when the router on the collection
is set to "implicit" ... which basically means "manual."
2) Your two replicas are out of sync, which might have any number of causes.

Side note:  "q=id:*" is a very inefficient query.  You would be better
off with a range query -- "q=id:[* TO *]".  That would be faster and use
less memory.  If the id field is your uniqueKey, then an even faster
and 100% equivalent query is the one for all docs -- "q=*:*".
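
For reference, a small sketch that builds the three count-only requests so
they can be compared side by side. The host and collection name are
assumptions; rows=0 keeps the response down to the numFound count.

```python
from urllib.parse import urlencode

def count_url(base_url, query):
    """Build a /select URL that returns only the hit count (rows=0) for a query."""
    return f"{base_url}/select?{urlencode({'q': query, 'rows': 0, 'wt': 'json'})}"

base = "http://localhost:8983/solr/mycollection"  # hypothetical collection
urls = [count_url(base, q) for q in ("id:*", "id:[* TO *]", "*:*")]
for u in urls:
    print(u)
```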

Thanks,
Shawn



SolrJ & Ridiculously Large Queries

2016-10-14 Thread Bram Van Dam
Hey folks,

I just noticed that Jetty barfs with HTTP 414 when request URIs are very
large, which makes sense. I think the default limit is ~8k.
Unfortunately I've got users who insist on executing queries that are
16k (!1!?!?) in size.

Two questions:

1) is it possible to POST these oversized monstrosities instead?

2) can I get SolrJ to POST them?

Suggestions are welcome!

Quick disclaimer: I don't write the queries, and only the default query
parser is available, so trying to reduce the query size is not an option :-(

Thanks

 - Bram


RE: SolrJ & Ridiculously Large Queries

2016-10-14 Thread Markus Jelsma
Yes, you can use HTTP POST with SolrJ for queries.

SolrRequest request = new QueryRequest((SolrParams)query, 
SolrRequest.METHOD.POST);
QueryResponse response = new QueryResponse(client.request(request), client);

https://lucene.apache.org/solr/6_2_1/solr-solrj/org/apache/solr/client/solrj/SolrRequest.html
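
For clients that do not use SolrJ, the same idea can be sketched with plain
HTTP: put the parameters in a form-encoded POST body instead of the URL. The
host, collection name and generated query are assumptions for illustration,
and the request is only prepared here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

def post_query_request(base_url, params):
    """Prepare a POST to /select with the query parameters in the request body."""
    body = urlencode(params).encode("utf-8")
    return Request(
        f"{base_url}/select",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

# A deliberately oversized query, far larger than an ~8k URI limit.
big_query = " OR ".join(f"id:{n}" for n in range(2000))
req = post_query_request(
    "http://localhost:8983/solr/mycollection",  # hypothetical collection
    {"q": big_query, "rows": 10},
)
print(req.get_method(), req.full_url, len(req.data))
# Sending it would be: urllib.request.urlopen(req)
```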

M.

 
 
-Original message-
> From:Bram Van Dam 
> Sent: Friday 14th October 2016 15:37
> To: solr-user@lucene.apache.org
> Subject: SolrJ & Ridiculously Large Queries
> 
> Hey folks,
> 
> I just noticed that Jetty barfs with HTTP 414 when request URIs are very
> large, which makes sense. I think the default limit is ~8k.
> Unfortunately I've got users who insist on executing queries that are
> 16k (!1!?!?) in size.
> 
> Two questions:
> 
> 1) is it possible to POST these oversized monstrosities instead?
> 
> 2) can I get SolrJ to POST them?
> 
> Suggestions are welcome!
> 
> Quick disclaimer: I don't write the queries, and only the default query
> parser is available, so trying to reduce the query size is not an option :-(
> 
> Thanks
> 
>  - Bram
> 


Re: SolrJ & Ridiculously Large Queries

2016-10-14 Thread Shawn Heisey
On 10/14/2016 7:36 AM, Bram Van Dam wrote:
> I just noticed that Jetty barfs with HTTP 414 when request URIs are
> very large, which makes sense. I think the default limit is ~8k.
> Unfortunately I've got users who insist on executing queries that are
> 16k (!1!?!?) in size.

Although I think forcing SolrJ to do POST (which Markus provided) is a
better solution than what I did, just for completeness I'll mention the
other path you can take -- increasing the max header size.  I found this
in the 6.0 config file server/etc/jetty.xml:



On my systems, I have bumped this to 32768, because we have queries in
the 20K range.  I also needed to bump the header size in my load
balancer, haproxy.  I do not know what the max size of this value is,
but I personally would not go beyond 32-64K.  If the query can't fit in
that, POST is the only reasonable path.
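
A rough sketch of how a client might pick between GET and POST based on the
size of the request. The 8192 default mirrors the "~8k" figure mentioned in
this thread, and the 1024-byte headroom for the request line and other
headers is an arbitrary guess, not a Jetty constant.

```python
from urllib.parse import urlencode

ASSUMED_HEADER_LIMIT = 8192  # assumption: the "~8k" default mentioned in the thread

def choose_method(base_url, params, header_limit=ASSUMED_HEADER_LIMIT):
    """Return "GET" if the request URL fits under the header limit, else "POST"."""
    url = f"{base_url}/select?{urlencode(params)}"
    # Leave headroom for the rest of the request line and the other headers.
    return "GET" if len(url) + 1024 < header_limit else "POST"

base = "http://localhost:8983/solr/mycollection"  # hypothetical collection
print(choose_method(base, {"q": "*:*"}))
print(choose_method(base, {"q": "x" * 20000}))
print(choose_method(base, {"q": "x" * 20000}, header_limit=32768))
```

The last call models a server whose limit has been raised to 32768, as
described above.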

Thanks,
Shawn



R: Incongruent results of numdocs

2016-10-14 Thread Davide Isoardi
Thank you very much for the quick answer.

Yes, I am not indexing between requests.

How can I resync two or all of the replicas?

If I look at the overviews in the shard menu (screenshots attached), I see
that the num docs are mismatched.

[screenshot: Shard1_replica1]

[screenshot: Shard2_replica2]

Davide Isoardi
eCube S.r.l.

-----Original message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Friday, 14 October 2016 14:32
To: solr-user@lucene.apache.org
Subject: Re: Incongruent results of numdocs

On 10/14/2016 3:35 AM, Davide Isoardi wrote:
> I have indexed more than 1 million of docs on a SolrCloud collections whit 5
> shards and 2 replicas.
>
> After the indexing if I try to query (many times) q=id:*&rows=0 I have
> different result for the document number founds.
>
> Why the result is not the same for all querys?

Assuming that you are not indexing new documents between requests, there are
two reasons for this problem:

1) You have documents with the same uniqueKey value in more than one of your
shards.  This typically happens when the router on the collection is set to
"implicit" ... which basically means "manual."
2) Your two replicas are out of sync, which might have any number of causes.

Side note:  "q=id:*" is a very inefficient query.  You would be better off with
a range query -- "q=id:[* TO *]".  That would be faster and use less memory.
If the id field is your uniqueKey, then an even faster query and 100%
equivalent query is the one for all docs -- "q=*:*".

Thanks,
Shawn


Access solr in web browser

2016-10-14 Thread Mugeesh Husain
Hi,

I have successfully installed SolrCloud with 2 Solr cores [installed on
datanodes (using Hortonworks)].
As my cluster is Kerberos-enabled, I am unable to access the cluster through
a web browser.

I want to access the Solr UI only through the host's PuTTY session. I have no
clue how to operate Solr from the CLI.

Could anyone suggest how to do this?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Access-solr-in-web-browser-tp4301201.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: R: Incongruent results of numdocs

2016-10-14 Thread Shawn Heisey
On 10/14/2016 9:43 AM, Davide Isoardi wrote:
>
> thank you very much for the quick answare.
>
>  
>
> Yes, I am not indexing between request.
>
>  
>
> How can I risync two or all replicas?
>
> If I look the overviews in the shard menu (attached the screenshot) I
> see that the num docs are mismatched.
>
>  
>
> Shard1_replica1
>
>
> Shard2_replica2
>

I can't see those pictures, the attachments didn't make it.  You seem to
be comparing shard1 and shard2.  That's not a valid comparison.  There's
a very good chance that different shards will have different document
counts even if everything is working correctly.  You need to compare
replicas of shard1 to other replicas of shard1, shard2 to shard2, etc. 
They'll likely be on different servers.

Probably the best way to force a resync is to shut down a Solr instance,
decide which replicas you want to delete on that instance, delete the
data directory for those replicas, and start Solr back up.  Any replica
where you delete the data directory will copy the index from the shard
leader, and they'll be back in sync when the copy finishes.  Before you
do this, make sure that you actually do have multiple replicas of each
shard.
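
A sketch of the replica-to-replica comparison described above. Each core is
queried on its own with distrib=false so the request is not fanned out to
the other shards. The core URL and the counts are made-up values for
illustration; in practice the numFound values would come from the responses
to these URLs.

```python
from urllib.parse import urlencode

def replica_count_url(core_url):
    """URL that counts docs on one core only (distrib=false disables fan-out)."""
    params = {"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"}
    return f"{core_url}/select?{urlencode(params)}"

def mismatched_shards(counts):
    """Given {shard: {replica: numFound}}, return the shards whose replicas disagree."""
    return sorted(
        shard for shard, replicas in counts.items()
        if len(set(replicas.values())) > 1
    )

# Hypothetical core URL and made-up counts.
print(replica_count_url("http://host1:8983/solr/mycollection_shard1_replica1"))
counts = {
    "shard1": {"replica1": 200113, "replica2": 200090},
    "shard2": {"replica1": 199876, "replica2": 199876},
}
print(mismatched_shards(counts))
```

Only replicas of the same shard are compared; different totals between
shard1 and shard2 are expected.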

Thanks,
Shawn



Re: group ts-dt with multiple shards

2016-10-14 Thread Erick Erickson
Please show the _exact_ query you're using. What is ts-dt? A date field?
A field name? And if it is a field name, is it literally "ts-dt"? If so,
be aware that hyphens are not officially supported in field names. The
recommendation is that your Solr field names follow the rule given here:

https://cwiki.apache.org/confluence/display/solr/Defining+Fields

"Field names should consist of alphanumeric or underscore characters
only and not start with a digit."
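
The quoted rule can be checked mechanically. This is a sketch that assumes
"alphanumeric" means ASCII letters and digits; the function name is
hypothetical.

```python
import re

# Letters, digits and underscores only, and the first character is not a digit.
FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_recommended_field_name(name):
    """True if the name follows the recommended field-naming rule."""
    return FIELD_NAME.fullmatch(name) is not None

for name in ("ts-dt", "ts_dt", "2016_ts"):
    print(name, is_recommended_field_name(name))
```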

Best,
Erick

On Fri, Oct 14, 2016 at 4:47 AM,   wrote:
> Hi,
>
> We are trying to messages based on ts-dt value in solr 6.2.1.
>
>
>
> *One collection with only one shard, we are able to group ts-dt
> without error, but performance is getting worse over time
>
> *Multiple collections with only one shard each, we are able to group
> ts-dt without error for each collection shard, but performance is getting
> worse over time
>
> *One collection with multiple shards, we can't group ts-dt. Example
> error message" Error from server at
> http://xx.xx.xx.xx:8983/solr/test4s2rf_shard4_replica2: Invalid Date
> String:'Tue Sep 06 02:53:00 UTC 2016'" . In the error message, the Date
> string is not in the format that we write into solr, enhance we can't find
> it in solr. And every time we do grouping, the error message is different,
> for instance we run grouping second time, error could be "Error from server
> at http://xx.xx.xx.xx:8983/solr/test4s2rf_shard1_replica1: Invalid Date
> String:'Tue Aug 02 03:42:03 UTC 2016'"
>
>
>
> Has anyone had similar problem? Where could be the place we did wrong?
>
>
>
> Br,
>
> Xiao
>
>
>


Re: Access solr in web browser

2016-10-14 Thread Deeksha Sharma
If you are looking for information on the collections that you created in
SolrCloud, you can get it via the API calls that are listed here. All you
need is the host and port of one of the machines in the cluster.

https://cwiki.apache.org/confluence/display/solr/Collections+API
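
A sketch of the kind of URLs involved, which can be fetched from a shell
session without the admin UI. The host name and collection name are
assumptions; LIST and CLUSTERSTATUS are Collections API actions. On a
Kerberos-enabled cluster the HTTP client also has to authenticate (SPNEGO),
which is not shown here.

```python
from urllib.parse import urlencode

def collections_api_url(host, port, action, **params):
    """Build a Collections API URL for the given action and extra parameters."""
    query = urlencode({"action": action, "wt": "json", **params})
    return f"http://{host}:{port}/solr/admin/collections?{query}"

host = "solr-node1.example.com"  # hypothetical node in the cluster
list_url = collections_api_url(host, 8983, "LIST")
status_url = collections_api_url(host, 8983, "CLUSTERSTATUS", collection="mycollection")
print(list_url)
print(status_url)
```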

Thanks
Deeksha

From: Mugeesh Husain 
Sent: Friday, October 14, 2016 9:30 AM
To: solr-user@lucene.apache.org
Subject: Access solr in web browser

Hi,

I have sucessfully installed Solrcloud having 2 solr cores [installed in
datanodes(using hortonworks)]
As my cluster is kerberos enabled am unable to access the cluster through
web browser.

I  want to access the Solr UI only through host's putty session. I have no
clue on how to operate solr on CLI.

could anyone suggest me how to do this.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Access-solr-in-web-browser-tp4301201.html
Sent from the Solr - User mailing list archive at Nabble.com.


Filter response for lukerequest handler [SOLR 6.1.0]

2016-10-14 Thread slee
I have read the documentation outlined here: LukeRequestHandler

The response always comes back as:


What I want is just to have the response output as follows:


I have a use case where I have well over 100 dynamic fields, a number which
could grow. On the UI side, I would give users a text box with look-ahead
search that binds to these dynamic fields.

Can this be done via the LukeRequestHandler API?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Filter-response-for-lukerequest-handler-SOLR-6-1-0-tp4301218.html
Sent from the Solr - User mailing list archive at Nabble.com.


Configuration options/concerns for multiple Solr versions

2016-10-14 Thread Tim Parker
We have a ColdFusion-based CMS product which can interface with Solr for 
search functionality.  ColdFusion ships with an ancient version of Solr 
(old enough that it crashes when the search criteria includes a leading 
wildcard), so to get current Solr functionality... we have to interface 
directly to Solr.  We don't bundle Solr with our product, so it's 
important to be able to interface to as wide a variety of Solr releases 
as possible.


Our initial implementation used Solr 4.10.2, and includes customizations 
in schema.xml and solrconfig.xml.  We have since made some minor changes 
for compatibility with Solr 5.x, but our goal is to minimize the 
version-specific information so we can create new cores without having 
to worry about what Solr version is in play.


Aside from keeping the luceneMatchVersion setting in agreement with the 
actual Solr release in use, what other configuration and/or schema 
changes do we need to worry about?


[enhancement request: set luceneMatchVersion to the running version if 
it's not found in solrconfig.xml - and/or allow a setting of 'current' 
so this doesn't have to be touched without specific reason to do so - 
any thoughts?  Am I missing something?]


--
Tim Parker
Senior Engineer
PaperThin, Inc.
300 Congress Street, Suite 303
Quincy, MA 02169
Ph: 617.471.4440 x203
CommonSpot helps organizations improve engagement across the web, mobile 
devices, and social media outlets to achieve better marketing results.  Find 
out what's new in CommonSpot at www.paperthin.com.



R: R: Incongruent results of numdocs

2016-10-14 Thread Davide Isoardi
I am sorry for my typos. I compared the numDocs of shard1_replica1 with
shard1_replica2.

If I create another replica (replica3) and only after that unload replica2,
will the new replica be synchronized with replica1?

Sent from my Windows Phone

From: Shawn Heisey
Sent: 14/10/2016 18:33
To: solr-user@lucene.apache.org
Subject: Re: R: Incongruent results of numdocs

On 10/14/2016 9:43 AM, Davide Isoardi wrote:
>
> thank you very much for the quick answare.
>
>
>
> Yes, I am not indexing between request.
>
>
>
> How can I risync two or all replicas?
>
> If I look the overviews in the shard menu (attached the screenshot) I
> see that the num docs are mismatched.
>
>
>
> Shard1_replica1
>
>
> Shard2_replica2
>

I can't see those pictures, the attachments didn't make it.  You seem to
be comparing shard1 and shard2.  That's not a valid comparison.  There's
a very good chance that different shards will have different document
counts even if everything is working correctly.  You need to compare
replicas of shard1 to other replicas of shard1, shard2 to shard2, etc.
They'll likely be on different servers.

Probably the best way to force a resync is to shutdown a Solr instance,
decide which replicas you want to delete on that instance, delete the
data directory for those replicas, and start Solr back up.  Any replica
where you delete the data directory will copy the index from the shard
leader, and they'll be back in sync when the copy finishes.  Before you
do this, make sure that you actually do have multiple replicas of each
shard.

Thanks,
Shawn



Re: R: R: Incongruent results of numdocs

2016-10-14 Thread Erick Erickson
Not quite. When you add replica 3, it will be synchronized to the leader.
So I'd shut down the solr node with the bad replica, add the new replica
and then delete the old one.
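
The add-then-delete sequence can be sketched as two Collections API calls.
The node URL, the collection name and the replica name "core_node2" are
assumptions for illustration; the real replica name has to be looked up in
the cluster state first.

```python
from urllib.parse import urlencode

def collections_call(base_url, action, **params):
    """Build a Collections API URL for one action."""
    return f"{base_url}/admin/collections?{urlencode({'action': action, **params})}"

base = "http://localhost:8983/solr"  # hypothetical node
steps = [
    # 1. Add a fresh replica of shard1; it copies its index from the leader.
    collections_call(base, "ADDREPLICA", collection="mycollection", shard="shard1"),
    # 2. Once the new replica is active, remove the out-of-sync one.
    collections_call(base, "DELETEREPLICA", collection="mycollection",
                     shard="shard1", replica="core_node2"),
]
for step in steps:
    print(step)
```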

The disturbing bit is that the replicas got out of sync in the first place.
Is there anything in the logs that gives any clue as to why?

Best,
Erick

On Oct 14, 2016 18:32, "Davide Isoardi"  wrote:

> I am sorry for my typos. I have compared numdocs of  shard1_replica1 with
> shard1_replica2.
>
> If I  create another replica (replica3) and only after that I unload
> replica2, will the last replica be synchronized with replica1?
>
> Sent from my Windows Phone
>
> From: Shawn Heisey
> Sent: 14/10/2016 18:33
> To: solr-user@lucene.apache.org
> Subject: Re: R: Incongruent results of numdocs
>
> On 10/14/2016 9:43 AM, Davide Isoardi wrote:
> >
> > thank you very much for the quick answare.
> >
> >
> >
> > Yes, I am not indexing between request.
> >
> >
> >
> > How can I risync two or all replicas?
> >
> > If I look the overviews in the shard menu (attached the screenshot) I
> > see that the num docs are mismatched.
> >
> >
> >
> > Shard1_replica1
> >
> >
> > Shard2_replica2
> >
>
> I can't see those pictures, the attachments didn't make it.  You seem to
> be comparing shard1 and shard2.  That's not a valid comparison.  There's
> a very good chance that different shards will have different document
> counts even if everything is working correctly.  You need to compare
> replicas of shard1 to other replicas of shard1, shard2 to shard2, etc.
> They'll likely be on different servers.
>
> Probably the best way to force a resync is to shutdown a Solr instance,
> decide which replicas you want to delete on that instance, delete the
> data directory for those replicas, and start Solr back up.  Any replica
> where you delete the data directory will copy the index from the shard
> leader, and they'll be back in sync when the copy finishes.  Before you
> do this, make sure that you actually do have multiple replicas of each
> shard.
>
> Thanks,
> Shawn
>
>