query to get parents without children

2015-12-16 Thread Novin Novin
Hi guys,

I have a few parent documents indexed without children. What would be the
query to get those?

Thanks,
Novin


Re: query to get parents without children

2015-12-16 Thread Novin Novin
Hi Scott,

Actually, it is not a multivalued field; it is a nested document.

Novin
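
(For illustration only: with nested documents, one hedged way to get parents
that have no children is to negate a block join query. This is a sketch, not
the list's answer; the doctype:200 parent filter is borrowed from a later
thread here, it assumes every indexed doc is either a parent or a child, and
"client" is an existing SolrClient.)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.QueryResponse;

// parents (doctype:200) minus parents that have at least one child
SolrQuery q = new SolrQuery(
    "doctype:200 -_query_:\"{!parent which=doctype:200}(*:* -doctype:200)\"");
QueryResponse rsp = client.query(q);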

On 16 December 2015 at 20:33, Scott Stults <
sstu...@opensourceconnections.com> wrote:

> Hi Novin,
>
> How are you associating parents with children? Is it a "children"
> multivalued field in the parent record? If so you could query for records
> that don't have a value in that field like "-children:[* TO *]"
>
> k/r,
> Scott
>
> On Wed, Dec 16, 2015 at 7:29 AM, Novin Novin  wrote:
>
> > Hi guys,
> >
> > I have a few parent documents indexed without children. What would be the
> > query to get those?
> >
> > Thanks,
> > Novin
> >
>
>
>
> --
> Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
> | 434.409.2780
> http://www.opensourceconnections.com
>


solr 5.2.0: need to improve query response time

2016-01-05 Thread Novin Novin
Hi guys,

I'm having trouble figuring out what the ideal Solr config would be for this
situation:

I'm doing a hard commit every minute for a very small number of users,
because I have to show those docs in search results quickly when a user saves
the changes.

It causes the response to take around 2 seconds, even though I am only
getting 10 records.

Could you give me some idea of where to look?


Thanks in advance,
Novin


Re: solr 5.2.0: need to improve query response time

2016-01-05 Thread Novin Novin
Thanks David. It is quite good to use for NRT.

Apologies, I didn't mention that the facet search is really slow.

I found the explanation below, which could be the cause, since I am using
faceted spatial search, which is getting slow.

To know more about Solr hard and soft commits, have a look at this blog:
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

In this article: "soft commits are that they will make documents visible,
but at some cost. In particular the “top level” caches, which include what
you configure in solrconfig.xml (filterCache, queryResultCache, etc) will
be invalidated! Autowarming will be performed on your top level caches
(e.g. filterCache, queryResultCache), and any newSearcher queries will be
executed. Also, the FieldValueCache is invalidated, so facet queries will
have to wait until the cache is refreshed."

Do you have any idea what could possibly be done about this?
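
(For illustration: one hedged mitigation is to raise autowarmCount on the
top-level caches in solrconfig.xml, so each new searcher pre-populates them
before serving queries. The sizes below are placeholders, not
recommendations.)

<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="64"/>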



On Tue, 5 Jan 2016 at 12:31 davidphilip cherian <
davidphilipcher...@gmail.com> wrote:

> You should use Solr soft commits for this use case. Setting autoSoftCommit
> to 2 seconds and autoCommit to a minute with openSearcher=false should do
> the work.
>
> <autoCommit>
>   <maxTime>60000</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> <autoSoftCommit>
>   <maxTime>2000</maxTime>
> </autoSoftCommit>
>
> Reference link-
> https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
>
> To know more about solr hard and soft commits, have a look at this blog :
>
> https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> On Tue, Jan 5, 2016 at 5:44 PM, Novin Novin  wrote:
>
> > Hi guys,
> >
> > I'm having trouble figuring out what the ideal Solr config would be for
> > this situation:
> >
> > I'm doing a hard commit every minute for a very small number of users,
> > because I have to show those docs in search results quickly when a user
> > saves the changes.
> >
> > It causes the response to take around 2 seconds, even though I am only
> > getting 10 records.
> >
> > Could you give me some idea of where to look?
> >
> >
> > Thanks in advance,
> > Novin
> >
>


Re: solr 5.2.0: need to improve query response time

2016-01-05 Thread Novin Novin
If I'm correct, you are talking about this:

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!--
       <lst><str name="q">solr</str><str name="sort">price asc</str></lst>
       <lst><str name="q">rocks</str><str name="sort">weight asc</str></lst>
      -->
  </arr>
</listener>

*or maybe here too:*

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">static firstSearcher warming in solrconfig.xml</str>
    </lst>
  </arr>
</listener>
Thanks,
Novin
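
(For illustration, following Erick's suggestion below to warm with the facets
you actually use: a hedged newSearcher entry; "category" is a placeholder for
a real facet field, not one from this thread.)

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="rows">0</str>
      <str name="facet">true</str>
      <str name="facet.field">category</str>
    </lst>
  </arr>
</listener>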

On Tue, 5 Jan 2016 at 16:22 Erick Erickson  wrote:

> It sounds like you're not doing proper autowarming,
> which you'd need to do either with hard or
> soft commits that open new searchers.
>
> see:
> https://wiki.apache.org/solr/SolrCaching#Cache_Warming_and_Autowarming
>
> In particular, you should have a newSearcher event
> that facets on the fields you expect to need.
>
> Best,
> Erick
>
> On Tue, Jan 5, 2016 at 8:17 AM, Novin Novin  wrote:
> > Thanks David. It is quite good to use for NRT.
> >
> > Apologies, I didn't mention that the facet search is really slow.
> >
> > I found the explanation below, which could be the cause, since I am using
> > faceted spatial search, which is getting slow.
> >
> > To know more about Solr hard and soft commits, have a look at this blog:
> > https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >
> > In this article: "soft commits are that they will make documents visible,
> > but at some cost. In particular the “top level” caches, which include what
> > you configure in solrconfig.xml (filterCache, queryResultCache, etc) will
> > be invalidated! Autowarming will be performed on your top level caches
> > (e.g. filterCache, queryResultCache), and any newSearcher queries will be
> > executed. Also, the FieldValueCache is invalidated, so facet queries will
> > have to wait until the cache is refreshed."
> >
> > Do you have any idea what could possibly be done about this?
> >
> >
> >
> > On Tue, 5 Jan 2016 at 12:31 davidphilip cherian <
> > davidphilipcher...@gmail.com> wrote:
> >
> >> You should use Solr soft commits for this use case. Setting autoSoftCommit
> >> to 2 seconds and autoCommit to a minute with openSearcher=false should do
> >> the work.
> >>
> >> <autoCommit>
> >>   <maxTime>60000</maxTime>
> >>   <openSearcher>false</openSearcher>
> >> </autoCommit>
> >>
> >> <autoSoftCommit>
> >>   <maxTime>2000</maxTime>
> >> </autoSoftCommit>
> >>
> >> Reference link-
> >>
> https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
> >>
> >> To know more about solr hard and soft commits, have a look at this blog
> :
> >>
> >>
> https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >>
> >> On Tue, Jan 5, 2016 at 5:44 PM, Novin Novin 
> wrote:
> >>
> >> > Hi guys,
> >> >
> >> > I'm having trouble figuring out what the ideal Solr config would be
> >> > for this situation:
> >> >
> >> > I'm doing a hard commit every minute for a very small number of users,
> >> > because I have to show those docs in search results quickly when a
> >> > user saves the changes.
> >> >
> >> > It causes the response to take around 2 seconds, even though I am only
> >> > getting 10 records.
> >> >
> >> > Could you give me some idea of where to look?
> >> >
> >> >
> >> > Thanks in advance,
> >> > Novin
> >> >
> >>
>


Re: solr 5.2.0: need to improve query response time

2016-01-06 Thread Novin Novin
Thanks Erick, this listener is doing quite a good job, but it's not what I
needed. Does Solr have anything else I can look into to make it faster?
FYI, response time is now 1 to 1.2 sec; I actually need around 500 ms.

On Tue, 5 Jan 2016 at 18:24 Erick Erickson  wrote:

> Yep. Do note what's happening here. You're executing a query
> that potentially takes 10 seconds to execute (based on your
> earlier post). But you may be opening a new searcher every
> 2 seconds. You may start to see "too many on deck searchers"
> in your log. If you do, do _not_ try to "fix" this by upping the
> maxWarmingSearchers in solrconfig.xml; that's really an
> anti-pattern.
>
> Really, I'd consider relaxing this 2 second limit. I've often found
> it easier to tell users "it may take up to 30 seconds for newly-added
> docs to appear in search results" than try to satisfy overly-tight
> requirements.
>
> As a former co-worker often said, "Users are much more comfortable
> with predictable delays than unpredictable ones". It's surprising how
> often it's the case.
>
> Best,
> Erick
>
> P.S. What's the difference between newSearcher and firstSearcher?
> newSearcher is fired every time a commit happens (soft, or hard with
> openSearcher=true),
> whereas firstSearcher is fired only when Solr starts. This is to
> accommodate
> the fact that the autowarm counts on things like filterCache aren't
> available when Solr starts. In practice, though, many (most?) people
> put the same query in both.
>
> On Tue, Jan 5, 2016 at 9:17 AM, Novin Novin  wrote:
> > If I'm correct, you are talking about this:
> >
> > <listener event="newSearcher" class="solr.QuerySenderListener">
> >   <arr name="queries">
> >     <!--
> >        <lst><str name="q">solr</str><str name="sort">price asc</str></lst>
> >        <lst><str name="q">rocks</str><str name="sort">weight asc</str></lst>
> >       -->
> >   </arr>
> > </listener>
> >
> > *or maybe here too:*
> >
> > <listener event="firstSearcher" class="solr.QuerySenderListener">
> >   <arr name="queries">
> >     <lst>
> >       <str name="q">static firstSearcher warming in solrconfig.xml</str>
> >     </lst>
> >   </arr>
> > </listener>
> > Thanks,
> > Novin
> >
> > On Tue, 5 Jan 2016 at 16:22 Erick Erickson 
> wrote:
> >
> >> It sounds like you're not doing proper autowarming,
> >> which you'd need to do either with hard or
> >> soft commits that open new searchers.
> >>
> >> see:
> >> https://wiki.apache.org/solr/SolrCaching#Cache_Warming_and_Autowarming
> >>
> >> In particular, you should have a newSearcher event
> >> that facets on the fields you expect to need.
> >>
> >> Best,
> >> Erick
> >>
> >> On Tue, Jan 5, 2016 at 8:17 AM, Novin Novin 
> wrote:
> >> > Thanks David. It is quite good to use for NRT.
> >> >
> >> > Apologies, I didn't mention that the facet search is really slow.
> >> >
> >> > I found the explanation below, which could be the cause, since I am
> >> > using faceted spatial search, which is getting slow.
> >> >
> >> > To know more about Solr hard and soft commits, have a look at this
> >> > blog:
> >> > https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >> >
> >> > In this article: "soft commits are that they will make documents
> >> > visible, but at some cost. In particular the “top level” caches, which
> >> > include what you configure in solrconfig.xml (filterCache,
> >> > queryResultCache, etc) will be invalidated! Autowarming will be
> >> > performed on your top level caches (e.g. filterCache,
> >> > queryResultCache), and any newSearcher queries will be executed. Also,
> >> > the FieldValueCache is invalidated, so facet queries will have to wait
> >> > until the cache is refreshed."
> >> >
> >> > Do you have any idea what could possibly be done about this?
> >> >
> >> >
> >> >
> >> > On Tue, 5 Jan 2016 at 12:31 davidphilip cherian <
> >> > davidphilipcher...@gmail.com> wrote:
> >> >
> >> >> You should use Solr soft commits for this use case. Setting
> >> >> autoSoftCommit to 2 seconds and autoCommit to a minute with
> >> >> openSearcher=false should do the work.
> >> >>
> >> >> <autoCommit>
> >> >>   <maxTime>60000</maxTime>
> >> >>   <openSearcher>false</openSearcher>
> >> >> </autoCommit>
> >> >>
> >> >> <autoSoftCommit>
> >> >>   <maxTime>2000</maxTime>
> >> >> </autoSoftCommit>
> >> >>
> >> >> Reference link-
> >> >>
> >>
> https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching
> >> >>
> >> >> To know more about solr hard and soft commits, have a look at this
> blog
> >> :
> >> >>
> >> >>
> >>
> https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> >> >>
> >> >> On Tue, Jan 5, 2016 at 5:44 PM, Novin Novin 
> >> wrote:
> >> >>
> >> >> > Hi guys,
> >> >> >
> >> >> > I'm having trouble figuring out what the ideal Solr config would be
> >> >> > for this situation:
> >> >> >
> >> >> > I'm doing a hard commit every minute for a very small number of
> >> >> > users, because I have to show those docs in search results quickly
> >> >> > when a user saves the changes.
> >> >> >
> >> >> > It causes the response to take around 2 seconds, even though I am
> >> >> > only getting 10 records.
> >> >> >
> >> >> > Could you give me some idea of where to look?
> >> >> >
> >> >> >
> >> >> > Thanks in advance,
> >> >> > Novin
> >> >> >
> >> >>
> >>
>


Solr 5 with java 7 or java 8

2016-01-19 Thread Novin Novin
Hi Guys,

Is Java 8 highly recommended with Solr 5.x because the Solr 5.x code is
compiled for Java 8, or would it be OK with Java 7, possibly with some
performance cost?

Thanks
Novin


slave is getting fully synced on every poll

2016-02-11 Thread Novin Novin
Hi Guys,

I'm having a problem with master-slave syncing.

I have two cores: a small core (holding frequently used data, for fast
results) and a big core (for rare queries and for searching everything).
Both cores have the same solrconfig file. The small core's replication is
fine, but the big core does a full sync every time replication starts
(every minute).

I found this:
http://stackoverflow.com/questions/6435652/solr-replication-keeps-downloading-entire-index-from-master

But it was not really useful.

Solr version: 5.2.0.
The small core has about 10 million docs, around 10 to 15 GB.
The big core has more than 100 million docs, around 25 to 35 GB.

How can I stop the full sync?

Thanks
Novin
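
(For illustration: the replication handler's indexversion and details
commands can help diagnose why a slave decides to do a full copy; the
hostnames here are assumptions, the core name is taken from later in this
thread.)

http://master:8983/solr/big_core/replication?command=indexversion
http://slave:8983/solr/big_core/replication?command=details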


Re: slave is getting fully synced on every poll

2016-02-11 Thread Novin Novin
Hi Erick,

Below is master slave config:

Master:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">optimize</str>
    <str name="numberToKeep">2</str>
  </lst>
</requestHandler>

Slave:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master:8983/solr/big_core/replication</str>
    <str name="pollInterval">00:00:60</str>
    <str name="httpBasicAuthUser">username</str>
    <str name="httpBasicAuthPassword">password</str>
  </lst>
</requestHandler>


Do you mean the Solr is restarting every minute or the polling
interval is 60 seconds?

I meant polling is 60 minutes

I didn't see anything suspicious in the logs, and I'm not optimizing anything
with commits.

Thanks
Novin

On Thu, 11 Feb 2016 at 18:02 Erick Erickson  wrote:

> What is your replication configuration in solrconfig.xml on both
> master and slave?
>
> bq:  big core is doing full sync every time wherever it start (every
> minute).
>
> Do you mean the Solr is restarting every minute or the polling
> interval is 60 seconds?
>
> The Solr logs should tell you something about what's going on there.
> Also, if you are for
> some reason optimizing the index that'll cause a full replication.
>
> Best,
> Erick
>
> On Thu, Feb 11, 2016 at 8:41 AM, Novin Novin  wrote:
> > Hi Guys,
> >
> > I'm having a problem with master-slave syncing.
> >
> > I have two cores: a small core (holding frequently used data, for fast
> > results) and a big core (for rare queries and for searching everything).
> > Both cores have the same solrconfig file. The small core's replication is
> > fine, but the big core does a full sync every time replication starts
> > (every minute).
> >
> > I found this:
> > http://stackoverflow.com/questions/6435652/solr-replication-keeps-downloading-entire-index-from-master
> >
> > But it was not really useful.
> >
> > Solr version: 5.2.0.
> > The small core has about 10 million docs, around 10 to 15 GB.
> > The big core has more than 100 million docs, around 25 to 35 GB.
> >
> > How can I stop the full sync?
> >
> > Thanks
> > Novin
>


Re: slave is getting fully synced on every poll

2016-02-12 Thread Novin Novin
Typo? That's 60 seconds, but that's not especially interesting either way.

Yes, I was thinking about this too and I have changed it to 59 actually.

Do the actual segments look identical after the polling?

Well no.

How I am handling master and slave:
We use symlinks for the master and slave configs. When converting a master to
a slave, solrconfig.xml is pointed at slave.xml, and the same for slave to
master. Then my script restarts both the Solr master and the slave. The
script tells the website which one is the current master, and the website
saves the current master URL and uses it for searching.

What I did when the problem started: I changed the slave to master and the
master to slave. Before this, something went wrong on the machine (it could
be the reason for the problem, I'm not really sure; I checked the system logs
and everything was fine). I have no idea what was wrong and couldn't find it
yet.

How did I fix it? I had to reinstall Solr on the slave, and before
re-installation I removed all directories related to Solr. Not really an
ideal way to fix it, but it solved the problem, and I'm still curious what
could cause such a problem.

Thanks,
Novin





On Thu, 11 Feb 2016 at 22:07 Erick Erickson  wrote:

> Typo? That's 60 seconds, but that's not especially interesting either way.
>
> Do the actual segments look identical after the polling?
>
> On Thu, Feb 11, 2016 at 1:16 PM, Novin Novin  wrote:
> > Hi Erick,
> >
> > Below is master slave config:
> >
> > Master:
> >
> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> >   <lst name="master">
> >     <str name="replicateAfter">commit</str>
> >     <str name="replicateAfter">optimize</str>
> >     <str name="numberToKeep">2</str>
> >   </lst>
> > </requestHandler>
> >
> > Slave:
> >
> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> >   <lst name="slave">
> >     <str name="masterUrl">http://master:8983/solr/big_core/replication</str>
> >     <str name="pollInterval">00:00:60</str>
> >     <str name="httpBasicAuthUser">username</str>
> >     <str name="httpBasicAuthPassword">password</str>
> >   </lst>
> > </requestHandler>
> >
> >
> > Do you mean the Solr is restarting every minute or the polling
> > interval is 60 seconds?
> >
> > I meant polling is 60 minutes
> >
> > I didn't see anything suspicious in the logs, and I'm not optimizing
> > anything with commits.
> >
> > Thanks
> > Novin
> >
> > On Thu, 11 Feb 2016 at 18:02 Erick Erickson 
> wrote:
> >
> >> What is your replication configuration in solrconfig.xml on both
> >> master and slave?
> >>
> >> bq:  big core is doing full sync every time wherever it start (every
> >> minute).
> >>
> >> Do you mean the Solr is restarting every minute or the polling
> >> interval is 60 seconds?
> >>
> >> The Solr logs should tell you something about what's going on there.
> >> Also, if you are for
> >> some reason optimizing the index that'll cause a full replication.
> >>
> >> Best,
> >> Erick
> >>
> >> On Thu, Feb 11, 2016 at 8:41 AM, Novin Novin 
> wrote:
> >> > Hi Guys,
> >> >
> >> > I'm having a problem with master-slave syncing.
> >> >
> >> > I have two cores: a small core (holding frequently used data, for fast
> >> > results) and a big core (for rare queries and for searching
> >> > everything). Both cores have the same solrconfig file. The small
> >> > core's replication is fine, but the big core does a full sync every
> >> > time replication starts (every minute).
> >> >
> >> > I found this:
> >> > http://stackoverflow.com/questions/6435652/solr-replication-keeps-downloading-entire-index-from-master
> >> >
> >> > But it was not really useful.
> >> >
> >> > Solr version: 5.2.0.
> >> > The small core has about 10 million docs, around 10 to 15 GB.
> >> > The big core has more than 100 million docs, around 25 to 35 GB.
> >> >
> >> > How can I stop the full sync?
> >> >
> >> > Thanks
> >> > Novin
> >>
>


Re: slave is getting fully synced on every poll

2016-02-12 Thread Novin Novin
Details here are important.  Do you understand what Erick was asking
when he was talking about segments?  The segments are the files in the
index directory, which is usually data/index inside the core's instance
directory.

Thanks Shawn. If I am thinking right, these segments also appear on the core
admin page. I hadn't looked in the index directory; my bad.

And yes, I used diff to compare the files. No difference was found, actually.

class="solr.Re1plicationHandler" was a typing error, apologies. This is what
is in the file (copied from the file):
class="solr.ReplicationHandler"

Thanks,
Novin




On Fri, 12 Feb 2016 at 09:30 Shawn Heisey  wrote:

> On 2/12/2016 1:58 AM, Novin Novin wrote:
> > Typo? That's 60 seconds, but that's not especially interesting either
> way.
> >
> > Yes, I was thinking about this too and I have changed it to 59 actually.
>
> If you want the polling to occur once an hour, pollInterval will need to
> be set to 01:00:00 ... not 00:00:60.  If you want polling to occur once
> a minute, use 00:01:00 for this setting.
>
> > Do the actual segment's look identical after the polling?
> >
> > Well no.
>
> Details here are important.  Do you understand what Erick was asking
> when he was talking about segments?  The segments are the files in the
> index directory, which is usually data/index inside the core's instance
> directory.
>
> I did notice that the master config you gave us has this:
>
> class="solr.Re1plicationHandler"
>
> Note that there is a number 1 in there.  If this is actually in your
> config, I would expect there to be an error when Solr tries to create
> the replication handler.  Is this an error when transferring to email,
> or is it also incorrect in your solrconfig.xml file?
>
> If you use a tool like diff to compare solrconfig.xml in your small core
> to solrconfig.xml in your big core, can you see any differences in the
> replication config?
>
> Thanks,
> Shawn
>
>


Re: slave is getting fully synced on every poll

2016-02-12 Thread Novin Novin
 _9bqi.cfe, _9btr.fdx, _6s8a.si,
_9btk.fdt, _9bto_Lucene50_0.tim, _9bts.fnm, _9bto.nvm, _9btp.nvm,
_9bto.nvd, _6s8a.fdx, _9brx.si, _9btt_Lucene50_0.pos, _98nt.nvm, _9btt.nvd,
_99gt.cfe, _9bto.fdx, _9btt.fdx, _5zcy.nvm, _9bnd_Lucene50_0.doc,
_5h1s.nvd, _5h1s_Lucene50_0.doc, _9btr_Lucene50_0.doc,
_9bts_Lucene50_0.tip, write.lock, _9btg.fdx, _9br2.cfs, _9bts.nvd,
_9bts_Lucene50_0.doc, _9bsp_2.liv, _9bqi.cfs, _9bj7_8.liv, _9bh8.si,
_96i3.cfs, _98nt.si, _9b0u.cfs, _9ayb.si, _99gt.cfs, _5zcy_1kw.liv,
_9btk_Lucene50_0.tip, _9brx.cfe, _9bts.fdx, _9btg.fdt, _5zcy.nvd,
_8kbr_Lucene50_0.tim, _9btg.fnm, _17on.nvm, _9bts.fdt, _9bto.fdt,
_9btg_Lucene50_0.doc, segments_nk7, _9btn.nvd, _8kbr.nvd, _9boh.si, _98gy.si,
_9btn.nvm, _5zcy.si, _9b0u_o.liv, _9bti_Lucene50_0.pos, _9bpd.si,
_9bn9.cfs, _9bj7.cfe, _9btr.fnm, _98nt.fnm, _9btg_Lucene50_0.tip,
_9btk.nvm, _9btq_Lucene50_0.tim, _9bto_Lucene50_0.tip, _8kbr.si, _9bti.fdx,
_9bqb_4.liv, _17on.nvd, _17on_Lucene50_0.pos, _9btn_Lucene50_0.tip,
_9b0u.cfe, _5h1s.fdx, _5zcy.fdx, _9bsp.cfe, _9bpd.cfs, _98nt.fdt, _9bqb.si,
_9bts.nvm, _9bu6.cfs, _9bnd.fdx, _98ge.cfs, _9bpd.cfe, _5h1s.fnm,
_9brx.cfs, _98nt.fdx, _9btr.fdt, _9bpm.si, _96i3.si, _9btk_Lucene50_0.pos,
_9btg_Lucene50_0.pos, _8kbr.fnm, _9btt.fnm, _9btt_Lucene50_0.tim,
_17on_Lucene50_0.tim]
at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:560)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:235)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:227)
at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1220)
at org.apache.solr.handler.IndexFetcher.getDetails(IndexFetcher.java:1563)
at
org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:821)
at
org.apache.solr.handler.ReplicationHandler.handleRequestBody(ReplicationHandler.java:305)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2064)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:640)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:436)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:542)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)


Thanks,
Novin

On Fri, 12 Feb 2016 at 11:11 Novin Novin  wrote:

> Details here are important.  Do you understand what Erick was asking
> when he was talking about segments?  The segments are the files in the
> index directory, which is usually data/index inside the core's instance
> directory.
>
> Thanks Shawn. If I am thinking right, these segments also appear on the
> core admin page. I hadn't looked in the index directory; my bad.
>
> And yes, I used diff to compare the files. No difference was found,
> actually.
>
> class="solr.Re1plicationHandler" was a typing error, apologies. This is
> what is in the file (copied from the file):
> class="solr.ReplicationHandler"
>
> Thanks,
> Novin
>
>
>
>
> On Fri, 12 Feb 2016 at 09:30 Shawn Heisey  wrote:
>
>> On 2/12/2016 1:58 AM, Novin Novin wrote:
>> > Typo? That's 60 seconds, but that's not especially interesting either
>> way.
>> >
>> > Yes, I was thinking about this too and I hav

Re: slave is getting fully synced on every poll

2016-02-12 Thread Novin Novin
Sorry, the core name is wmsapp_analysis, which is the big core.

On Fri, 12 Feb 2016 at 12:01 Novin Novin  wrote:

> Well It started again.
>
> Below are the errors from the Solr log on the admin UI.
> Log error message on the master:
> 2/12/2016, 11:39:24 AM null:java.lang.IllegalStateException: file:
> MMapDirectory@/var/solr/data/wmsapp_analysis/data/index.20160211204900750
> lockFactory=org.apache.lucene.store.NativeFSLockFactory@56639f83 appears
> both in delegate and in cache: cache=[_9bu7_Lucene50_0.pos,
> _9bua_Lucene50_0.tip, _9bty.fdt, _9bu7.nvd, _9bu1.nvd, _9bu0.nvm,
> _9bu4_Lucene50_0.tim, _8kbr_uu.liv, _9bu7_Lucene50_0.doc,
> _9bu1_Lucene50_0.tip, _9bu9.fnm, _9bty.fdx, _9btv.fdx, _9bu5.nvm,
> _9bu4_Lucene50_0.pos, _9bu5.fnm, _9bu3.si, _9bua_Lucene50_0.tim,
> _9bty_Lucene50_0.pos, _9bu0.si, _9btw_Lucene50_0.tim,
> _9bu0_Lucene50_0.tim, _9bu2.nvm, _9btv_Lucene50_0.pos, _9btv.nvd,
> _9bu3_Lucene50_0.tip, _9bua_Lucene50_0.doc, _9bu7_Lucene50_0.tip,
> _9btw.nvm, _9bua.fdx, _9bu4.nvm, _9bu9_Lucene50_0.tim, _9bu4_1.liv,
> _9bu7.nvm, _9bu3_1.liv, _9bu0.fnm, _9bu5_Lucene50_0.tim, _9btx.fnm,
> _9bu2.fdx, _9bu4.fdt, _9bu2_Lucene50_0.tip, _9bu9.fdx,
> _9bu9_Lucene50_0.pos, _9bu7.fdt, _9bu9.nvd, _9btx_1.liv, _99gt_2s.liv,
> _9btw.nvd, _9bu3_Lucene50_0.doc, _9bu2.fnm, _9bua_Lucene50_0.pos,
> _9bu9.nvm, _9btx.nvm, _9btw_Lucene50_0.tip, _9bu1.nvm,
> _9bu4_Lucene50_0.doc, _9bu9_1.liv, _9bu1.fnm, _9btu.cfs,
> _9bu8_Lucene50_0.tip, _9bua.nvm, _9btx_Lucene50_0.doc, _9btu.si,
> _9bu0.fdt, _9bu7.si, _9btx_Lucene50_0.tip, _9btw.si, _9bu8.fdx,
> _9bu0_Lucene50_0.doc, _9bu3.nvm, _9btz_Lucene50_0.tip,
> _9bu3_Lucene50_0.tim, _9btz.fdt, _9btw.fdt, _9bu2.si, _9bu4.si,
> _9btx.nvd, _9bu4.fnm, _9btv_1.liv, _9btz_Lucene50_0.doc, _9bpm_7.liv,
> _9btx_Lucene50_0.pos, _9bty.fnm, _9btw_Lucene50_0.doc, _9btv.fdt,
> _9bu2_Lucene50_0.doc, _9btu.cfe, _9bu3.nvd, _9btv.si, _9bu8.nvm,
> _9btx.fdt, _9bu5.si, _9bu5.fdt, _9bu2.nvd, _9bu3.fdx, _9btv.fnm,
> _9bu5.fdx, _9btz.fnm, _9bu3_Lucene50_0.pos, _9bu9_Lucene50_0.tip,
> _9bu1.fdt, _9bu0_Lucene50_0.tip, _9bty_Lucene50_0.tim,
> _9btx_Lucene50_0.tim, _9bt9_3.liv, _9bty.si, _9bu2.fdt, _9bu9.fdt,
> _9bu2_Lucene50_0.pos, _9bua.fdt, _9bu9_Lucene50_0.doc, _9bu4.fdx,
> _9bu5_Lucene50_0.pos, _9bu4.nvd, _9btv_Lucene50_0.tim, _9bty.nvd, _9bu8.si,
> _9bu5_Lucene50_0.doc, _9bu9.si, _9btw.fnm, _9bu3.fnm, _9bh8_m.liv,
> _9bu3.fdt, _9bu5.nvd, _9bua.fnm, _9btw_1.liv, _9bu8_Lucene50_0.pos,
> _9btw_Lucene50_0.pos, _9bty_Lucene50_0.doc, _9bu6_1.liv, _9bu7.fnm,
> _5zcy_1kx.liv, _9bu7.fdx, _9bu5_1.liv, _9bua.nvd, _9bty_Lucene50_0.tip,
> _9btz.fdx, _9bu0_Lucene50_0.pos, _9bu1_Lucene50_0.doc, _9btx.fdx,
> _9btv_Lucene50_0.tip, _9bn9_9.liv, _9bu0.fdx, _9bu8.nvd,
> _9bu1_Lucene50_0.pos, _9bua.si, _9bu1.si, _9bu8_Lucene50_0.tim,
> _9btv_Lucene50_0.doc, _9bu2_Lucene50_0.tim, _9bu1_Lucene50_0.tim,
> _9bu8.fnm, _9bu4_Lucene50_0.tip, _9btx.si, _98nt_5c.liv, _9btz.nvd,
> _9btw.fdx, _9btv.nvm, _9bu7_Lucene50_0.tim, pending_segments_nk8,
> _9btz_Lucene50_0.tim, _9btz.si, _9bu8_Lucene50_0.doc,
> _9bu5_Lucene50_0.tip, _9btz_Lucene50_0.pos, _9btz.nvm, _9bty.nvm,
> _9bu0.nvd, _9bu1.fdx, _9bu8.fdt],delegate=[_9br2.cfe, pending_segments_nk8,
> _9bnd.fnm, _9btn_Lucene50_0.tim, _96i3.cfe, _9boh.cfe,
> _9bto_Lucene50_0.pos, _6s8a.fnm, _9btr.si, _9bt9.cfs, _9bh8.cfe,
> _9btg.nvd, _9bqi_3.liv, _5zcy_Lucene50_0.tip, _9boh_6.liv,
> _98nt_Lucene50_0.tim, _9btt.si, _9bqi.si, _9bsp.si, _9bsp.cfs,
> _6s8a_1la.liv, _9bn9_8.liv, _6s8a_Lucene50_0.doc, _9bqb.cfs, _9boh.cfs,
> _9btp.fdx, _5h1s_1wg.liv, _8kbr.fdx, _9bti.nvm, _9bts_Lucene50_0.pos, _
> 9bts.si, _9btr.nvd, _9bnd_Lucene50_0.pos, _5h1s_Lucene50_0.tim,
> _9btq.fdt, _9bti.nvd, _9btm_1.liv, _9btn.fdt, _9btp.fnm, _9btg.nvm,
> _9bu6.cfe, _9btm.cfe, _98nt_Lucene50_0.pos, _9bqq_6.liv,
> _8kbr_Lucene50_0.tip, _9btq.fdx, _9ayb_c.liv, _5zcy_Lucene50_0.doc,
> _5zcy.fdt, _6s8a.nvd, _9ayb.cfe, _6s8a_Lucene50_0.tim, _9bh8_l.liv,
> _17on.fdx, _9btn.fdx, _9btg.si, _5h1s.fdt, _9btp_Lucene50_0.doc,
> _99gt_2r.liv, _9br2_5.liv, _9bnd.nvm, _9bj7.si, _9bto_Lucene50_0.doc,
> _9bpm.cfs, _17on_Lucene50_0.doc, _99gt.si, _9btg_Lucene50_0.tim,
> _9btk.nvd, _9bts_Lucene50_0.tim, _9bqb.cfe, _98nt_Lucene50_0.tip,
> _9btr.nvm, _98ge.si, _9bnd_4.liv, _9bto.si, _9btq.nvd, _9bnj.cfs,
> _9btn_Lucene50_0.doc, _9btt.fdt, _17on.si, _9bnj.cfe, _17on_2wi.liv,
> _9btt_Lucene50_0.doc, _9bqq.si, _9bt9_2.liv, _9btr_Lucene50_0.tim,
> _9btk.fnm, _9btk.si, _9bn9.cfe, _8kbr_Lucene50_0.pos, _9bt9.cfe,
> _17on.fnm, _9btq.si, _98gy_d.liv, _9btp.nvd, _9bnd_Lucene50_0.tim,
> _9bqq.cfe, _9bti.fnm, _8kbr_Lucene50_0.doc, _9bqq.cfs, _9bnj.si,
> _9bti_1.liv, _9bt9.si, _5zcy_Lucene50_0.tim, _9bh8.cfs, _98ge_g.liv,
> _9btr_Lucene50_0.pos, _9bti_Lucene50_0.doc, _98ge.cfe, _8kbr.nvm,
> _9bnd.fdt, _9br2.si, _5h1s_Lucene50_0.pos, _9btq_Lucene50_

Re: slave is getting fully synced on every poll

2016-02-12 Thread Novin Novin
you're trying
to accomplish X and asking about Y where Y is the index replication. What's
X?
What is the purpose of switching the master and slave and how often do you
do
it and why?

I think I didn't explain it quite properly. I am in a situation where data is
getting indexed every 20 seconds or less, and I can't lose data while
indexing. The website uses search a lot. If I have to restart my Solr machine
because of a kernel update, a network problem, or some other reason (not
really on the top of my head), it takes a while to restart, and while it is
restarting somebody may be using a website feature which requires data to be
indexed and displayed in the results after the request completes. In this
situation, if I lose the data I have to do a full reindex (because I am using
SolrJ, it takes 3 to 4 hours, so this is not ideal). That's why I am
switching between master and slave.

So what I do here: when the slave is synced, I make it master2. At this point
I have two masters, 1 and 2, and when I promote master2 it is used by the
website, and then I convert master1 to a slave of master2. So I don't really
lose data, because it is done by a script and finishes in a couple of
seconds.

How often do I do this?
Not very often; once every two or three months.

Let me know if you need anything else, or if there is something I didn't
explain properly.


@Alessandro Benedetti

Have you customised the merge factor?
Nope, I always use a merge factor of 10.

Is it aggressive?
I am not sure what you meant here by aggressive.

When the replication is triggered, what are the differences from the master
index (in terms of segments) on the slave?

What I checked this time: it creates a new directory, index.20160213120345,
and this directory is empty. But it has another directory named
index.20160213120322, and that one has more than 90% of its index files the
same as the master index directory.


I just want to say that the time you guys are taking to help me out with this
problem is highly appreciated.

Best regards,
Novin





On Fri, 12 Feb 2016 at 16:08 Erick Erickson  wrote:

> bq: What I have done when the problem started, I changed slave to master
> and
> master to slave.
>
> OK, other things aside, if you're really saying that every time you
> switch the slave
> and master around and restart, you get a full sync then I'd reply
> "don't do that". Why
> are you switching slave and master? The whole purpose of replication is to
> have
> one master that essentially is _always_ the master. Essentially the
> slave asks the
> master "is my index up to date" and I'm not sure how that logic would
> handle
> going back and forth. Theoretically, if all the files in the index
> were exactly identical
> it wouldn't replicate when switched, but I can't say for certain that this
> is
> enforced.
>
> I think you're trying to accomplish some particular objective but
> going about it in
> a way that is causing you grief. This smells like an XY problem, i.e.
> you're trying
> to accomplish X and asking about Y where Y is the index replication.
> What's X?
> What is the purpose of switching the master and slave and how often do you
> do
> it and why?
>
> Best,
> Erick
>
> On Fri, Feb 12, 2016 at 6:46 AM, Alessandro Benedetti
>  wrote:
> > Have you customised the merge factor ?
> > Is it aggressive ?
> > In case a lot of merges happen, you can potentially incur a big transfer
> > of files on each replication.
> > You need to check the segments on the slave every minute.
> > When the replication is triggered what are the difference from the Master
> > index ( in term of segments) and the slave ?
> >
> > Cheers
> >
> > On 12 February 2016 at 12:03, Novin Novin  wrote:
> >
> >> sorry core name is wmsapp_analysis which is big core
> >>
> >> On Fri, 12 Feb 2016 at 12:01 Novin Novin  wrote:
> >>
> >> > Well It started again.
> >> >
> >> > Below are the errors from the Solr log on the admin UI.
> >> > Log error message on the master:
> >> > 2/12/2016, 11:39:24 AM null:java.lang.IllegalStateException: file:
> >> > MMapDirectory@
> >> /var/solr/data/wmsapp_analysis/data/index.20160211204900750
> >> > lockFactory=org.apache.lucene.store.NativeFSLockFactory@56639f83
> appears
> >> > both in delegate and in cache: cache=[_9bu7_Lucene50_0.pos,
> >> > _9bua_Lucene50_0.tip, _9bty.fdt, _9bu7.nvd, _9bu1.nvd, _9bu0.nvm,
> >> > _9bu4_Lucene50_0.tim, _8kbr_uu.liv, _9bu7_Lucene50_0.doc,
> >> > _9bu1_Lucene50_0.tip, _9bu9.fnm, _9bty.fdx, _9btv.fdx, _9bu5.nvm,
> >> > _9bu4_Lucene50_0.pos, _9bu5.fnm, _9bu3.si, _9bua_Lucene50_0.tim,
> >> > _9bty_Luc

Solr 5.5 timeout of solrj client

2016-04-14 Thread Novin Novin
Hi guys,

I'm getting the error below when sending a Solr doc (mid15955728):

org.apache.solr.client.solrj.SolrServerException: Timeout occured
while waiting response from server at:
http://localhost.com:8983/solr/analysis
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:585)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:240)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:229)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71)
at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:85)
at com.temetra.wms.textindexer.TextIndexer$7.run(TextIndexer.java:544)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at 
org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
at 
org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
at 
org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:140)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
at 
org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:165)
at 
org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:167)
at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
at 
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:271)
at 
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at 
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at 
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
at 
org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:481)
... 10 more


I can't really find out why this is happening.

Does anybody know what could cause such an error?


Thanks in advance.

Novin


Re: Solr 5.5 timeout of solrj client

2016-04-14 Thread Novin Novin
Thanks for the reply, Shawn.

Below are snippets from jetty.xml and jetty-https.xml:

jetty.xml:38: <Property name="solr.jetty.threads.idle.timeout" default="5000"/>
/// I presume this is the one I should increase, but I believe 5 seconds is
enough time for 250 docs to be added to Solr.

jetty.xml:39:

jetty-https.xml:45:

I'm also seeing "DirectUpdateHandler2 Starting optimize... Reading and
rewriting the entire index! Use with care". Would this be causing the delayed
response from Solr?

Thanks in advance,
Novin
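
(For illustration, following Shawn's note below that a socket timeout can
also be configured on the HttpClient used by the SolrClient implementations:
a hedged SolrJ 5.x sketch; the core URL and timeout values are assumptions.)

import org.apache.solr.client.solrj.impl.HttpSolrClient;

HttpSolrClient client = new HttpSolrClient("http://localhost:8983/solr/analysis");
client.setConnectionTimeout(5000); // ms allowed to establish the TCP connection
client.setSoTimeout(60000);        // ms to wait for a response before a read timeout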


On 14 April 2016 at 14:05, Shawn Heisey  wrote:

> On 4/14/2016 4:40 AM, Novin Novin wrote:
> > I'm getting the error below when sending a Solr doc (mid15955728):
> > org.apache.solr.client.solrj.SolrServerException: Timeout occured
> > while waiting response from server at:
> > http://localhost.com:8983/solr/analysis
>
> 
>
> > Caused by: java.net.SocketTimeoutException: Read timed out
>
> You encountered a socket timeout.  This is a low-level TCP timeout.
> It's effectively an idle timeout -- no activity for X seconds and the
> TCP connection is severed.
>
> I believe the Jetty included with Solr has a socket timeout of 50
> seconds configured.  You can also configure a socket timeout on the
> HttpClient used by various SolrClient implementations.
>
> The operating system (on either end of the connection) may also have a
> default socket timeout configured, but I believe that these defaults are
> normally measured in hours, not seconds.
>
> Thanks,
> Shawn
>
>


Re: Solr 5.5 timeout of solrj client

2016-04-14 Thread Novin Novin
How can I stop "DirectUpdateHandler2 Starting optimize... Reading and
rewriting the entire index! Use with care" from happening?

Thanks
novin

On 14 April 2016 at 14:36, Shawn Heisey  wrote:

> On 4/14/2016 7:23 AM, Novin Novin wrote:
> > Thanks for reply Shawn.
> >
> > Below is snippet of jetty.xml and jetty-https.xml
> >
> > jetty.xml:38: <Property name="solr.jetty.threads.idle.timeout" default="5000"/>
> > /// I presume this is the one I should increase, but I believe 5 seconds
> > is enough time for 250 docs to be added to Solr.
>
> 5 seconds might not be enough time.  The *add* probably completes in
> time, but the entire request might take longer, especially if you use
> commit=true with the request.  I would definitely NOT set this timeout
> so low -- requests that take longer than 5 seconds are very likely going
> to happen.
>
> > I'm also seeing "DirectUpdateHandler2 Starting optimize... Reading and
> > rewriting the entire index! Use with care". Would this be causing delay
> > response from solr?
>
> Exactly how long an optimize takes is dependent on the size of your
> index.  Rewriting an index that's a few hundred megabytes may take 30
> seconds to a minute.  Rewriting an index that's several gigabytes will
> take a few minutes.  Performance is typically lower during an optimize,
> because the CPU and disks are very busy.
>
> Thanks,
> Shawn
>
>


Re: Solr 5.5 timeout of solrj client

2016-04-14 Thread Novin Novin
Thanks, Erick, for pointing that out. You are right: I was optimizing every
10 minutes. I have changed this to once a day, at night.
On 14-Apr-2016 5:20 pm, "Erick Erickson"  wrote:

> don't issue an optimize command... either you have a solrj client that
> issues a client.optimize() command or you pressed the "optimize now"
> in the admin UI. Solr doesn't do this by itself.
>
> Best,
> Erick
>
> On Thu, Apr 14, 2016 at 8:30 AM, Novin Novin  wrote:
> > How can I stop "DirectUpdateHandler2 Starting optimize... Reading
> > and rewriting the entire index! Use with care" from happening?
> >
> > Thanks
> > novin
> >
> > On 14 April 2016 at 14:36, Shawn Heisey  wrote:
> >
> >> On 4/14/2016 7:23 AM, Novin Novin wrote:
> >> > Thanks for reply Shawn.
> >> >
> >> > Below is snippet of jetty.xml and jetty-https.xml
> >> >
> >> > jetty.xml:38: <Property name="solr.jetty.threads.idle.timeout" default="5000"/>
> >> > /// I presume this is the one I should increase, but I believe 5
> >> > seconds is enough time for 250 docs to be added to Solr.
> >>
> >> 5 seconds might not be enough time.  The *add* probably completes in
> >> time, but the entire request might take longer, especially if you use
> >> commit=true with the request.  I would definitely NOT set this timeout
> >> so low -- requests that take longer than 5 seconds are very likely going
> >> to happen.
> >>
> >> > I'm also seeing "DirectUpdateHandler2 Starting optimize... Reading and
> >> > rewriting the entire index! Use with care". Would this be causing
> delay
> >> > response from solr?
> >>
> >> Exactly how long an optimize takes is dependent on the size of your
> >> index.  Rewriting an index that's a few hundred megabytes may take 30
> >> seconds to a minute.  Rewriting an index that's several gigabytes will
> >> take a few minutes.  Performance is typically lower during an optimize,
> >> because the CPU and disks are very busy.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
>


Re: Solr 5.5 timeout of solrj client

2016-04-14 Thread Novin Novin
Thanks for the great advice Erick.

On 14 April 2016 at 18:18, Erick Erickson  wrote:

> BTW, the place optimize seems best used is when the index isn't
> updated very often. I've seen a pattern where the index is updated
> once a night (or even less). In that situation, optimization makes
> more sense. But when an index is continually updated, it's mostly
> wasted effort.
>
> Best,
> Erick
>
> On Thu, Apr 14, 2016 at 10:17 AM, Erick Erickson
>  wrote:
> > Unless you have somewhat unusual circumstances, I wouldn't optimize at
> > all, despite the name it really doesn't help all that much in _most_
> > cases.
> >
> > If your percentage deleted docs doesn't exceed, say, 15-20% I wouldn't
> > bother. Most of what optimize does is reclaim resources from deleted
> > docs. This happens as part of general background merging anyway.
> >
> > There have been some reports of 10-15% query performance after
> > optimizing, but I would measure on your system before expending the
> > resources optimizing.
> >
> > Best,
> > Erick
> >
> > On Thu, Apr 14, 2016 at 9:56 AM, Novin Novin 
> wrote:
> >> Thanks, Erick, for pointing that out. You are right: I was optimizing
> >> every 10 minutes. I have changed this to once a day, at night.
> >> On 14-Apr-2016 5:20 pm, "Erick Erickson" 
> wrote:
> >>
> >>> don't issue an optimize command... either you have a solrj client that
> >>> issues a client.optimize() command or you pressed the "optimize now"
> >>> in the admin UI. Solr doesn't do this by itself.
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>> On Thu, Apr 14, 2016 at 8:30 AM, Novin Novin 
> wrote:
> >>> > How can I stop "DirectUpdateHandler2 Starting optimize... Reading
> >>> > and rewriting the entire index! Use with care" from happening?
> >>> >
> >>> > Thanks
> >>> > novin
> >>> >
> >>> > On 14 April 2016 at 14:36, Shawn Heisey  wrote:
> >>> >
> >>> >> On 4/14/2016 7:23 AM, Novin Novin wrote:
> >>> >> > Thanks for reply Shawn.
> >>> >> >
> >>> >> > Below is snippet of jetty.xml and jetty-https.xml
> >>> >> >
> >>> >> > jetty.xml:38: <Property name="solr.jetty.threads.idle.timeout" default="5000"/>
> >>> >> > /// I presume this is the one I should increase, but I believe 5
> >>> >> > seconds is enough time for 250 docs to be added to Solr.
> >>> >>
> >>> >> 5 seconds might not be enough time.  The *add* probably completes in
> >>> >> time, but the entire request might take longer, especially if you
> use
> >>> >> commit=true with the request.  I would definitely NOT set this
> timeout
> >>> >> so low -- requests that take longer than 5 seconds are very likely
> going
> >>> >> to happen.
> >>> >>
> >>> >> > I'm also seeing "DirectUpdateHandler2 Starting optimize...
> Reading and
> >>> >> > rewriting the entire index! Use with care". Would this be causing
> >>> delay
> >>> >> > response from solr?
> >>> >>
> >>> >> Exactly how long an optimize takes is dependent on the size of your
> >>> >> index.  Rewriting an index that's a few hundred megabytes may take
> 30
> >>> >> seconds to a minute.  Rewriting an index that's several gigabytes
> will
> >>> >> take a few minutes.  Performance is typically lower during an
> optimize,
> >>> >> because the CPU and disks are very busy.
> >>> >>
> >>> >> Thanks,
> >>> >> Shawn
> >>> >>
> >>> >>
> >>>
>


Re: Block Join query

2015-12-14 Thread Novin Novin
Hi Mikhail,

I'm having a little bit of a problem constructing the query for Solr when
trying to use a block join query. As you said, I can't use + or a space in
front of a block join query, so I have to put {!parent which="doctype:200"}
first, and after that all fields are child-document fields, so I can't add
any parent-document field; if I add a parent doc field, it gives me nothing,
because the field does not exist in the child documents.

But I can still add the parent doc fields in "fq". Is that going to cause any
trouble related to highlighting or scoring, given that I was using the parent
doc field in q, not in fq?

Thanks,
Novin
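
(For illustration, a hedged SolrJ sketch of the combination being discussed;
"someParentField" is a placeholder for a real parent-document field, and
"client" is an existing SolrClient.)

// block join in q; the parent-level constraint moved into fq
SolrQuery q = new SolrQuery("{!parent which=\"doctype:200\"}flow:[624 TO 700]");
q.addFilterQuery("someParentField:value");
QueryResponse rsp = client.query(q);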

On 12 December 2015 at 00:01, Novin  wrote:

> No worries, I was just wondering what I had missed. And thanks for the blog
> link.
>
>
> On 11/12/2015 18:52, Mikhail Khludnev wrote:
>
>> Novin,
>>
>> I regret it so much. It's my pet peeve in Solr query parsing: handling of
>> a space
>> is dependent on the first symbol of the query string.
>> This will work (starts with '{!' ):
>> q={!parent which="doctype:200"}flow:[624 TO 700]
>> These won't due to " ", "+":
>> q= {!parent which="doctype:200"}flow:[624 TO 700]
>> q=+{!parent which="doctype:200"}flow:[624 TO 700]
>> Subordinate clauses with spaces are better handled with "Nested Queries"
>> or
>> so, check the post
>> <
>> http://blog.griddynamics.com/2013/12/grandchildren-and-siblings-with-block.html
>> >
>>
>>
>> On Fri, Dec 11, 2015 at 6:31 PM, Novin  wrote:
>>
>> Hi Guys,
>>>
>>> I'm trying a block join query. I tried +{!parent
>>> which="doctype:200"}flow:624, which worked fine. But when I tried +{!parent
>>> which="doctype:200"}flow:[624 TO 700]
>>>
>>> Got the below error
>>>
>>> org.apache.solr.search.SyntaxError: Cannot parse 'flow_l:[624':
>>> Encountered \"<EOF>\" at line 1, column 11.\nWas expecting one of:\n
>>> \"TO\" ...\n <RANGE_QUOTED> ...\n <RANGE_GOOP> ...\n
>>>
>>> Just wondering too: can we do ranges in block join queries?
>>>
>>> Thanks,
>>> Novin
>>>
>>>
>>>
>>>
>>>
>>>
>>
>


Re: Block Join query

2015-12-14 Thread Novin Novin
Thanks Man.

On Mon, 14 Dec 2015 at 12:19 Mikhail Khludnev 
wrote:

> In addition to the link in the previous response,
> http://blog.griddynamics.com/2013/09/solr-block-join-support.html provides
> an example of such a combination. From my experience, fq doesn't participate
> in highlighting or scoring.
>
> On Mon, Dec 14, 2015 at 2:45 PM, Novin Novin  wrote:
>
> > Hi Mikhail,
> >
> > I'm having a little bit of a problem constructing the query for Solr when
> > trying to use a block join query. As you said, I can't use + or a space
> > in front of a block join query, so I have to put {!parent
> > which="doctype:200"} first, and after that all fields are child-document
> > fields, so I can't add any parent-document field; if I add a parent doc
> > field, it gives me nothing, because the field does not exist in the child
> > documents.
> >
> > But I can still add the parent doc fields in "fq". Is that going to cause
> > any trouble related to highlighting or scoring, given that I was using
> > the parent doc field in q, not in fq?
> >
> > Thanks,
> > Novin
> >
> > On 12 December 2015 at 00:01, Novin  wrote:
> >
> > > No worries, I was just wondering what I had missed. And thanks for the
> > > blog link.
> > >
> > >
> > > On 11/12/2015 18:52, Mikhail Khludnev wrote:
> > >
> > >> Novin,
> > >>
> > >> I regret it so much. It's my pet peeve in Solr query parsing: handling
> > >> of a space
> > >> is dependent on the first symbol of the query string.
> > >> This will work (starts with '{!' ):
> > >> q={!parent which="doctype:200"}flow:[624 TO 700]
> > >> These won't due to " ", "+":
> > >> q= {!parent which="doctype:200"}flow:[624 TO 700]
> > >> q=+{!parent which="doctype:200"}flow:[624 TO 700]
> > >> Subordinate clauses with spaces are better handled with "Nested
> Queries"
> > >> or
> > >> so, check the post
> > >> <
> > >>
> >
> http://blog.griddynamics.com/2013/12/grandchildren-and-siblings-with-block.html
> > >> >
> > >>
> > >>
> > >> On Fri, Dec 11, 2015 at 6:31 PM, Novin  wrote:
> > >>
> > >> Hi Guys,
> > >>>
> > >>> I'm trying a block join query. I tried +{!parent
> > >>> which="doctype:200"}flow:624, which worked fine. But when I tried
> > >>> +{!parent
> > >>> which="doctype:200"}flow:[624 TO 700]
> > >>>
> > >>> Got the below error
> > >>>
> > >>> org.apache.solr.search.SyntaxError: Cannot parse 'flow_l:[624':
> > >>> Encountered \"<EOF>\" at line 1, column 11.\nWas expecting one of:\n
> > >>> \"TO\" ...\n <RANGE_QUOTED> ...\n <RANGE_GOOP> ...\n
> > >>>
> > >>> Just wondering too: can we do ranges in block join queries?
> > >>>
> > >>> Thanks,
> > >>> Novin
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>
> > >
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
> 
>


Replicate data from Solr to SolrCloud

2017-06-08 Thread Novin Novin
Hi Guys,

I have set up SolrCloud for production and it is ready to use, while
currently a standalone Solr with two cores is running in production. The
SolrCloud machines are separate from the standalone Solr, and SolrCloud has
two collections similar to the Solr cores.

Is it possible, and would it be useful, to replicate data from Solr to
SolrCloud the way master-slave replication does, or to use some other method
to send data from Solr to SolrCloud?

Let me know if you guys need more information.

Thanks in advance,
Navin


Re: Replicate data from Solr to SolrCloud

2017-06-08 Thread Novin Novin
Thanks Erick

No, I'm not doing distributed search. These are two cores with different
types of information.

If I understand you correctly, I can just use scp to copy the index files
from Solr to any shard of SolrCloud, and then SolrCloud would balance the
data itself.

Cheers
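
(For illustration, a hedged sketch of the fetchindex call Erick mentions
below, which pulls an index from another Solr instance on demand; the
hostnames and core names here are assumptions.)

http://cloudnode:8983/solr/collection1_shard1_replica1/replication?command=fetchindex&masterUrl=http://oldsolr:8983/solr/core1/replication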





On Thu, 8 Jun 2017 at 15:46 Erick Erickson  wrote:

> You say you have two cores. Are they the same collection? That is, are you
> doing distributed search? If not, you can use the replication API's
> fetchindex command to manually move them.
>
> For that matter, you can just scp the indexes over too, they're just files.
>
> If you're doing distributed search on your standalone Solr, then you'd
> need to ensure that the hash ranges were correct on your two-shard
> SolrCloud setup.
>
> Best,
> Erick
>
> On Jun 8, 2017 07:06, "Novin Novin"  wrote:
>
> > Hi Guys,
> >
> > I have set up SolrCloud for production and it is ready to use, while
> > currently a standalone Solr with two cores is running in production. The
> > SolrCloud machines are separate from the standalone Solr, and SolrCloud
> > has two collections similar to the Solr cores.
> >
> > Is it possible, and would it be useful, to replicate data from Solr to
> > SolrCloud the way master-slave replication does, or to use some other
> > method to send data from Solr to SolrCloud?
> >
> > Let me know if you guys need more information.
> >
> > Thanks in advance,
> > Navin
> >
>


Re: Replicate data from Solr to SolrCloud

2017-06-08 Thread Novin Novin
Thanks Erick.

On Thu, 8 Jun 2017 at 17:28 Erick Erickson  wrote:

> bq: would balance the data itself.
>
> not if you mean split it up amongst shards. The entire index would be
> on a _single_ shard. If you then do ADDREPLICA on that shard it'll
> replicate the entire index to each replica
>
> Also note that when you scp stuff around I'd recommend the destination
> Solr node be down. Otherwise use the fetchindex. Although note that
> fetchindex will prevent queries from being served in cloud mode.
>
> What I was thinking is more a one-time transfer rather than something
> ongoing. Solr 7.0 will have support for variants of the ongoing theme.
> I was thinking something like
>
> 1> move the indexes to a single-replica SolrCloud
> 2> if you need more shards, use SPLITSHARD on the SolrCloud installation.
> 3> use ADDREPLICA to build out your SolrCloud setup
> 4> thereafter index directly to your SolrCloud installation
> 5> when you've proved out your SolrCloud setup, get rid of the old
> stand-alone stuff.
>
> Best,
> Erick
>
> On Thu, Jun 8, 2017 at 8:55 AM, Novin Novin  wrote:
> > Thanks Erick
> >
> > No, I'm not doing distributed search. These are two cores with different
> > types of information.
> >
> > If I understand you correctly, I can just use scp to copy the index files
> > from Solr to any shard of SolrCloud, and then SolrCloud would balance the
> > data itself.
> >
> > Cheers
> >
> >
> >
> >
> >
> > On Thu, 8 Jun 2017 at 15:46 Erick Erickson 
> wrote:
> >
> >> You say you have two cores. Are they the same collection? That is, are
> >> you doing distributed search? If not, you can use the replication API's
> >> fetchindex command to manually move them.
> >>
> >> For that matter, you can just scp the indexes over too, they're just
> >> files.
> >>
> >> If you're doing distributed search on your standalone Solr, then you'd
> >> need to ensure that the hash ranges were correct on your two-shard
> >> SolrCloud setup.
> >>
> >> Best,
> >> Erick
> >>
> >> On Jun 8, 2017 07:06, "Novin Novin"  wrote:
> >>
> >> > Hi Guys,
> >> >
> >> > I have set up SolrCloud for production and it is ready to use, while
> >> > currently a standalone Solr with two cores is running in production.
> >> > The SolrCloud machines are separate from the standalone Solr, and
> >> > SolrCloud has two collections similar to the Solr cores.
> >> >
> >> > Is it possible, and would it be useful, to replicate data from Solr to
> >> > SolrCloud the way master-slave replication does, or to use some other
> >> > method to send data from Solr to SolrCloud?
> >> >
> >> > Let me know if you guys need more information.
> >> >
> >> > Thanks in advance,
> >> > Navin
> >> >
> >>
>


recovery information for replica in recovering state

2017-07-06 Thread Novin Novin
Hi Guys,

I was just wondering: can SolrCloud give information about how much recovery
a replica has completed while it is recovering? Some percentage would be
handy.

Thanks,
Novin


Re: recovery information for replica in recovering state

2017-07-07 Thread Novin Novin
It is 250 GB of data. It takes around 40 minutes. And yes, the recovery
completes correctly.

On Thu, 6 Jul 2017 at 23:32 Rick Leir  wrote:

> Novin, How long is recovery taking for you? I assume the recovery
> completes correctly.
> Cheers-- Rick
>
> On July 6, 2017 7:59:03 AM EDT, Novin Novin  wrote:
> >Hi Guys,
> >
> >I was just wondering: can SolrCloud give information about how much
> >recovery a replica has completed while it is recovering? Some percentage
> >would be handy.
> >
> >Thanks,
> >Novin
>
> --
> Sorry for being brief. Alternate email is rickleir at yahoo dot com


Always use leader for searching queries

2018-01-02 Thread Novin Novin
Hi guys,

I am using Solr 5.5.4 and the same version of SolrJ. My question: is there
any way I can tell CloudSolrClient to use only the leader for queries?

Thanks in advance.
Navin


Re: Always use leader for searching queries

2018-01-02 Thread Novin Novin
Hi Erick,

You are right, it is XY Problem.

Allow me to explain as best I can. I have two replicas of one collection
called "Main". When I was using the search feature in my application, I got
two different numFound counts. So I started digging, and after spending 2-3
hours I found that one replica has a higher numFound count than the other
(the higher count was not on the leader). I am not sure how it ended up like
that. This count difference affects paging on my application side, not on the
Solr side.

Extra info that might be useful:
Same query, not a single letter of difference.
auto soft commit 2
soft commit 6
Indexing data every minute.

Let me know if you need to know anything else. Any help would highly
appreciated.

Thanks in advance,
Navin



On Tue, 2 Jan 2018 at 15:14 Erick Erickson  wrote:

> This seems like an XY problem. You're asking how to do X
> because you think it will solve problem Y without telling
> us what Y is.
>
> I say this because on the surface this seems to defeat the
> purpose behind SolrCloud. Why would you want to only make
> use of one piece of hardware? That will limit your throughput,
> so why bother to have replicas in the first place?
>
> Or is this some kind of diagnostic you're trying to implement?
>
> Best,
> Erick
>
> On Tue, Jan 2, 2018 at 5:08 AM, Novin Novin  wrote:
> > Hi guys,
> >
> > I am using Solr 5.5.4 and the same version of SolrJ. My question: is
> > there any way I can tell CloudSolrClient to use only the leader for
> > queries?
> >
> > Thanks in advance.
> > Navin
>


Re: Always use leader for searching queries

2018-01-03 Thread Novin Novin
Hi Erick,

Thanks for your reply.

[ First of all, replicas can be off in terms of counts for the soft
commit interval. The commits don't all happen on the replicas at the
same wall-clock time. Solr promises eventual consistency, in this case
NOW-autocommit time.]

I realized that. To rule it out, I actually turned off auto soft commit for
the time being, but nothing changed. The non-leader replica still had extra
documents.

[ So my first question is whether the replicas in the shard are
inconsistent as of, say, NOW-your_soft_commit_time. I'd add a fudge
factor of 10 seconds earlier just to be sure I was past autowarming.
This does require that there be a time stamp. Absent a timestamp, you
could suspend indexing for a few minutes and run the test like below.]

While data was being indexed I was checking the counts in both replicas.
What I found is that the leader replica always has 3 docs fewer than the
other replica. I don't think they were off by NOW-soft_commit_time;
CloudSolrClient adds something like "_stateVer_=main:114" to the query,
which I assume is there to keep results consistent between the two replicas.

[Adding &distrib=false to your command and directing it at a specific
_core_ (something like collection1_shard1_replica1) will only return
data from that core.]
I probably did not need to do this because I have only one shard, but I did
it anyway and the count was different.
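
The per-core query I used looked roughly like this (host and core name are
placeholders):

http://localhost:8983/solr/main_shard1_replica1/select?q=*:*&rows=0&distrib=false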

[When you say you index every minute, I'm guessing you only index for
part of that minute, is that true? In that case you might get more
consistency if, instead of relying totally on your autoconfig
settings, specify commitWithin on your update command. That should
force the commits to happen more closely in-sync, although still not
perfect.]

We receive data every minute, so whenever we have new data we send it to
SolrCloud using a queue. You said not to rely on the autocommit config
alone. Do you mean I should turn off autocommit and use commitWithin via
SolrJ, or leave autoCommit as it is and also use commitWithin from the
SolrJ client?
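
In case code makes my question clearer, I mean something roughly like this
on the SolrJ side (just a sketch; client is our CloudSolrClient, docs is the
minute's batch of SolrInputDocuments, and "main" is the collection):

UpdateRequest req = new UpdateRequest();
req.add(docs);                // add the batch to the request
req.setCommitWithin(60000);   // ask Solr to commit within 60s of receipt
req.process(client, "main");  // send to the collection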

I apologize if I am not clear; thanks again for your help.

Thanks in advance,
Navin





On Tue, 2 Jan 2018 at 18:05 Erick Erickson  wrote:

> First of all, replicas can be off in terms of counts for the soft
> commit interval. The commits don't all happen on the replicas at the
> same wall-clock time. Solr promises eventual consistency, in this case
> NOW-autocommit time.
>
> So my first question is whether the replicas in the shard are
> inconsistent as of, say, NOW-your_soft_commit_time. I'd add a fudge
> factor of 10 seconds earlier just to be sure I was past autowarming.
> This does require that there be a time stamp. Absent a timestamp, you
> could suspend indexing for a few minutes and run the test like below.
>
> Adding &distrib=false to your command and directing it at a specific
> _core_ (something like collection1_shard1_replica1) will only return
> data from that core.
>
> When you say you index every minute, I'm guessing you only index for
> part of that minute, is that true? In that case you might get more
> consistency if, instead of relying totally on your autoconfig
> settings, specify commitWithin on your update command. That should
> force the commits to happen more closely in-sync, although still not
> perfect.
>
> Another option if you're totally and completely sure that your commits
> happen _only_ from your indexing program is to fire the commit at the
> end of the run from your SolrJ program.
>
> Let us know,
> Erick
>
> On Tue, Jan 2, 2018 at 9:33 AM, Novin Novin  wrote:
> > Hi Erick,
> >
> > You are right, it is XY Problem.
> >
> > Allow me to explain as best I can. I have two replicas of one collection
> > called "Main". When I use the search feature in my application I get two
> > different numFound counts. So I started digging, and after spending 2-3
> > hours I found that one replica has a higher numFound count than the other
> > (the one with the higher count was not the leader). I am not sure how it
> > ended up like that. This count difference affects paging on my
> > application side, not the Solr side.
> >
> > Extra info that might be useful:
> > Same query, not a single letter of difference.
> > auto soft commit 2
> > soft commit 6
> > indexing data every minute.
> >
> > Let me know if you need to know anything else. Any help would highly
> > appreciated.
> >
> > Thanks in advance,
> > Navin
> >
> >
> >
> > On Tue, 2 Jan 2018 at 15:14 Erick Erickson 
> wrote:
> >
> >> This seems like an XY problem. You're asking how to do X
> >> because you think it will solve problem Y without telling
> >> us what Y is.
> >>
> >> I say this because on the surface this

Zookeeper version

2016-11-24 Thread Novin Novin
Hi Guys,

I found in solr docs that "Solr currently uses Apache ZooKeeper v3.4.6".
Can I use a higher version, or do I have to use ZooKeeper 3.4.6?

Thanks in advance,
Novin


Re: Zookeeper version

2016-11-25 Thread Novin Novin
Thanks guys.

On Thu, 24 Nov 2016 at 17:03 Erick Erickson  wrote:

> Well, 3.4.6 gets the most testing, so if you want to upgrade it's at
> your own risk.
>
> See: https://issues.apache.org/jira/browse/SOLR-8724, there are
> problems with 3.4.8 in the Solr context for instance.
>
> There's currently an open Zookeeper JIRA for 3.4.9 that, when fixed,
> Solr will try to upgrade to.
>
> Best,
> Erick
>
> On Thu, Nov 24, 2016 at 2:12 AM, Novin Novin  wrote:
> > Hi Guys,
> >
> > I found in solr docs that "Solr currently uses Apache ZooKeeper v3.4.6".
> > Can I use a higher version, or do I have to use ZooKeeper 3.4.6?
> >
> > Thanks in advance,
> > Novin
>


initiate solr cloud collection

2016-11-28 Thread Novin Novin
Hi Guys,

Does Solr have any way to create a collection when SolrCloud is started
for the first time?

Best,
Novin


Re: initiate solr cloud collection

2016-11-28 Thread Novin Novin
Thanks for this, Erick. The -e option brings me to an interactive prompt. I
can't use it because I am using a script to set up SolrCloud. I need
something where I can also define shards and replicas.
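Something scriptable along the lines of the Collections API CREATE call is
what I am after (collection and config names below are placeholders):

http://localhost:8983/solr/admin/collections?action=CREATE&name=main&numShards=1&replicationFactor=2&collection.configName=myconf
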
Best,
Novin

On Mon, 28 Nov 2016 at 16:14 Erick Erickson  wrote:

> try
>
> bin/solr start -e cloud -z ZK_NODE
>
> That'll guide you through creating a collection, assuming you can get
> by with one of the stock configuration sets.
>
> Best,
> Erick
>
> On Mon, Nov 28, 2016 at 8:11 AM, Novin Novin  wrote:
> > Hi Guys,
> >
> > Does Solr have any way to create a collection when SolrCloud is started
> > for the first time?
> >
> > Best,
> > Novin
>


Re: initiate solr cloud collection

2016-11-28 Thread Novin Novin
Apologies for that; I didn't describe it properly. Thanks for the help. I
will look into this.

On Mon, 28 Nov 2016 at 16:33 Erick Erickson  wrote:

> Please state the full problem rather than make us pull things out in
> dribs and drabs.
>
> Have you looked at the bin/solr script options? Particularly the
> create_collection option?
>
> On Mon, Nov 28, 2016 at 8:24 AM, Novin Novin  wrote:
> > Thanks for this, Erick. The -e option brings me to an interactive prompt.
> > I can't use it because I am using a script to set up SolrCloud. I need
> > something where I can also define shards and replicas.
> > Best,
> > Novin
> >
> > On Mon, 28 Nov 2016 at 16:14 Erick Erickson 
> wrote:
> >
> >> try
> >>
> >> bin/solr start -e cloud -z ZK_NODE
> >>
> >> That'll guide you through creating a collection, assuming you can get
> >> by with one of the stock configuration sets.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Nov 28, 2016 at 8:11 AM, Novin Novin 
> wrote:
> >> > Hi Guys,
> >> >
> >> > Does Solr have any way to create a collection when SolrCloud is
> >> > started for the first time?
> >> >
> >> > Best,
> >> > Novin
> >>
>


not range query in block join

2018-08-21 Thread Novin Novin
Hi Guys,

I was trying to do a block join query with "not". I had no success; can
anybody please help me out here?

This works: q=+_query_:"{!parent which=type_s:parent} +time_tdt:[2018-08-01T16:00:00Z TO 2018-08-04T15:59:59Z]"
This works: q=-time_tdt:[2018-08-01T16:00:00Z TO 2018-08-04T15:59:59Z]

This does not work: q=+_query_:"{!parent which=type_s:parent} -time_tdt:[2018-08-01T16:00:00Z TO 2018-08-04T15:59:59Z]"

Did I miss something?

Thanks in advance.
Bests,
Novin


Re: not range query in block join

2018-08-22 Thread Novin Novin
Thank you very much for the help, guys.

On Wed, 22 Aug 2018 at 10:02 Mikhail Khludnev  wrote:

> q={!parent which=type_s:parent}+type_s:child
> -time_tdt:[2018-08-01T16:00:00Z TO 2018-08-04T15:59:59Z]
> or
> q={!parent which=type_s:parent v=$cq}&cq=+type_s:child
> -time_tdt:[2018-08-01T16:00:00Z TO 2018-08-04T15:59:59Z]
>
> On Tue, Aug 21, 2018 at 7:13 PM Novin Novin  wrote:
>
> > Hi Guys,
> >
> > I was trying to do a block join query with "not". I had no success; can
> > anybody please help me out here?
> >
> > This works: q=+_query_:"{!parent which=type_s:parent} +time_tdt:[2018-08-01T16:00:00Z TO 2018-08-04T15:59:59Z]"
> > This works: q=-time_tdt:[2018-08-01T16:00:00Z TO 2018-08-04T15:59:59Z]
> >
> > This does not work: q=+_query_:"{!parent which=type_s:parent} -time_tdt:[2018-08-01T16:00:00Z TO 2018-08-04T15:59:59Z]"
> >
> > Did I miss something?
> >
> > Thanks in advance.
> > Bests,
> > Novin
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Set Basic Auth to CloudSolrStream

2019-04-15 Thread Novin Novin
Hi

How can I set basic auth for CloudSolrStream? I couldn't find any
documentation. Can someone please point me in the right direction?
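
The closest I have found so far is the JVM-wide preemptive basic auth setup
for SolrJ from the ref guide, which I believe the clients CloudSolrStream
creates internally should pick up (untested on my side; user:pass is a
placeholder):

-Dsolr.httpclient.builder.factory=org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory
-Dbasicauth=user:pass

Is that the recommended way, or is there a per-stream option I am missing?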

Thanks in advance,
Navin


Re: Always use leader for searching queries

2018-01-09 Thread Novin Novin
Hi Erick,

Apologies for the delay.

[This isn't what I meant. I meant to query each replica directly
_within_ the same shard. Your problem statement is that the leader and
replicas (I use "followers") have different document counts. How are
you verifying this? Through the admin UI? Using &distrib=false is
useful when you want to query each core directly (and you have to use
the core name) in some automated fashion.]

I might be wrong here because now I can't reproduce it with distrib=false.

I also did as you said
[OK, I'm assuming then that you issue a manual commit sometime, right?
Here's what I'd do:
1> turn off indexing
2> issue a commit (soft or hard-with-opensearcher-true)
3> now look at your doc counts on each replica.]
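
For step 2> I just hit the update handler with an explicit commit, roughly
(host and collection name are placeholders):

http://localhost:8983/solr/main/update?commit=true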

Everything seems OK now; I must have been doing something wrong before.

Thanks for all of your and Walter's help.
Best,
Navin


On Wed, 3 Jan 2018 at 17:09 Walter Underwood  wrote:

> If you have a field for the indexed datetime, you can use a filter query
> to get rid of recent updates that might be in transit. I’d use double the
> autocommit time, to leave time for the followers to index.
>
> If the autocommit interval is one minute:
>
> fq=indexed_datetime:[* TO NOW-2MIN]
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Jan 3, 2018, at 8:58 AM, Erick Erickson 
> wrote:
> >
> > [I probably did not need to do this because I have only one shard, but I
> > did it anyway and the count was different.]
> >
> > This isn't what I meant. I meant to query each replica directly
> > _within_ the same shard. Your problem statement is that the leader and
> > replicas (I use "followers") have different document counts. How are
> > you verifying this? Through the admin UI? Using &distrib=false is
> > useful when you want to query each core directly (and you have to use
> > the core name) in some automated fashion.
> >
> > [I have actually turned off auto soft commit for a time being but
> > nothing changed]
> >
> > OK, I'm assuming then that you issue a manual commit sometime, right?
> > Here's what I'd do:
> > 1> turn off indexing
> > 2> issue a commit (soft or hard-with-opensearcher-true)
> > 3> now look at your doc counts on each replica.
> >
> > If the counts are different then something's not right, Solr tries
> > very hard to not lose data, it's concerning if the leader and replicas
> > have different counts.
> >
> > Best,
> > Erick
> >
> > On Wed, Jan 3, 2018 at 1:51 AM, Novin Novin  wrote:
> >> Hi Erick,
> >>
> >> Thanks for your reply.
> >>
> >> [ First of all, replicas can be off in terms of counts for the soft
> >> commit interval. The commits don't all happen on the replicas at the
> >> same wall-clock time. Solr promises eventual consistency, in this case
> >> NOW-autocommit time.]
> >>
> >> I realized that. To rule it out, I actually turned off auto soft commit
> >> for the time being, but nothing changed. The non-leader replica still
> >> had extra documents.
> >>
> >> [ So my first question is whether the replicas in the shard are
> >> inconsistent as of, say, NOW-your_soft_commit_time. I'd add a fudge
> >> factor of 10 seconds earlier just to be sure I was past autowarming.
> >> This does require that there be a time stamp. Absent a timestamp, you
> >> could suspend indexing for a few minutes and run the test like below.]
> >>
> >> While data was being indexed I was checking the counts in both replicas.
> >> What I found is that the leader replica always has 3 docs fewer than the
> >> other replica. I don't think they were off by NOW-soft_commit_time;
> >> CloudSolrClient adds something like "_stateVer_=main:114" to the query,
> >> which I assume is there to keep results consistent between the two
> >> replicas.
> >>
> >> [Adding &distrib=false to your command and directing it at a specific
> >> _core_ (something like collection1_shard1_replica1) will only return
> >> data from that core.]
> >> I probably did not need to do this because I have only one shard, but I
> >> did it anyway and the count was different.
> >>
> >> [When you say you index every minute, I'm guessing you only index for
> >> part of that minute, is that true? In that case you might get more
> >> consistency if, instead of relying totally on your autoconfig
> >> settings, specify commitWithin on your update command. That sh

Re: Always use leader for searching queries

2018-01-09 Thread Novin Novin
Thank you very much for all your help.

On Tue 9 Jan 2018, 16:32 Erick Erickson,  wrote:

> One thing to be aware of is that the commit points on the replicas in a
> shard may (will) fire at different times. So when you're comparing the
> number of docs on the replicas in a shard you have to compare before the
> last commit interval. So say you have a soft commit of 1 minute. When
> comparing the docs on each shard you need to restrict the query to things
> older than 1 minute or stop indexing and wait for 1 minute (i.e. until
> after the autocommit fires).
>
> Glad things worked out!
> Erick
>
> On Tue, Jan 9, 2018 at 4:08 AM, Novin Novin  wrote:
>
> > Hi Erick,
> >
> > Apologies for the delay.
> >
> > [This isn't what I meant. I meant to query each replica directly
> > _within_ the same shard. Your problem statement is that the leader and
> > replicas (I use "followers") have different document counts. How are
> > you verifying this? Through the admin UI? Using &distrib=false is
> > useful when you want to query each core directly (and you have to use
> > the core name) in some automated fashion.]
> >
> > I might be wrong here because now I can't reproduce it with distrib=false.
> >
> > I also did as you said
> > [OK, I'm assuming then that you issue a manual commit sometime, right?
> > Here's what I'd do:
> > 1> turn off indexing
> > 2> issue a commit (soft or hard-with-opensearcher-true)
> > 3> now look at your doc counts on each replica.]
> >
> > Everything seems OK now; I must have been doing something wrong before.
> >
> > Thanks for all of your and Walter's help.
> > Best,
> > Navin
> >
> >
> > On Wed, 3 Jan 2018 at 17:09 Walter Underwood 
> > wrote:
> >
> > > If you have a field for the indexed datetime, you can use a filter
> query
> > > to get rid of recent updates that might be in transit. I’d use double
> the
> > > autocommit time, to leave time for the followers to index.
> > >
> > > If the autocommit interval is one minute:
> > >
> > > fq=indexed_datetime:[* TO NOW-2MIN]
> > >
> > > wunder
> > > Walter Underwood
> > > wun...@wunderwood.org
> > > http://observer.wunderwood.org/  (my blog)
> > >
> > >
> > > > On Jan 3, 2018, at 8:58 AM, Erick Erickson 
> > > wrote:
> > > >
> > > > [I probably did not need to do this because I have only one shard,
> > > > but I did it anyway and the count was different.]
> > > >
> > > > This isn't what I meant. I meant to query each replica directly
> > > > _within_ the same shard. Your problem statement is that the leader
> and
> > > > replicas (I use "followers") have different document counts. How are
> > > > you verifying this? Through the admin UI? Using &distrib=false is
> > > > useful when you want to query each core directly (and you have to use
> > > > the core name) in some automated fashion.
> > > >
> > > > [I have actually turned off auto soft commit for a time being but
> > > > nothing changed]
> > > >
> > > > OK, I'm assuming then that you issue a manual commit sometime, right?
> > > > Here's what I'd do:
> > > > 1> turn off indexing
> > > > 2> issue a commit (soft or hard-with-opensearcher-true)
> > > > 3> now look at your doc counts on each replica.
> > > >
> > > > If the counts are different then something's not right, Solr tries
> > > > very hard to not lose data, it's concerning if the leader and
> replicas
> > > > have different counts.
> > > >
> > > > Best,
> > > > Erick
> > > >
> > > > On Wed, Jan 3, 2018 at 1:51 AM, Novin Novin 
> > wrote:
> > > >> Hi Erick,
> > > >>
> > > >> Thanks for your reply.
> > > >>
> > > >> [ First of all, replicas can be off in terms of counts for the soft
> > > >> commit interval. The commits don't all happen on the replicas at the
> > > >> same wall-clock time. Solr promises eventual consistency, in this
> case
> > > >> NOW-autocommit time.]
> > > >>
> > > >> I realized that, to stop it. I have actually turned off auto soft
> > commit
> > > >> for a time being but nothing changed. Non leader replica still had
> > extra
> >

Solr cloud upgrade from 5 to 6

2018-01-15 Thread Novin Novin
Hi Guys,

I need a piece of advice about upgrading SolrCloud. Would I need to
re-index data if I upgrade SolrCloud from 5.5.4 to 6.6.2?

Thanks in advance.
Navin


Re: Solr cloud upgrade from 5 to 6

2018-01-15 Thread Novin Novin
Thank you very much for your advice. I really appreciate it.

On Mon, 15 Jan 2018 at 16:51 Erick Erickson  wrote:

> No, Solr works hard to guarantee compatibility
> one major revision back so any 6x version should
> be able to work fine with any 5x version.
>
> A couple of free bits of advice, worth what you pay
> for them:
>
> 1> don't just use your configs from 5x in 6x. Rather
> start with the stock 6x configs and customize them
> as you did for 5x.
>
> 2> Really look over the CHANGES.txt, particularly the
> upgrade section for all versions between 5.5.4 and
> 6.6.2.
>
> 3> If you _can_ reindex, I always do if for no other reason
> than that'll force me to look at what's new and make use
> of it. Again you don't _have_ to though.
>
> Best,
> Erick
>
> On Mon, Jan 15, 2018 at 2:10 AM, Novin Novin  wrote:
> > Hi Guys,
> >
> > I need a piece of advice about upgrading SolrCloud. Would I need to
> > re-index data if I upgrade SolrCloud from 5.5.4 to 6.6.2?
> >
> > Thanks in advance.
> > Navin
>


duplicate doc of uniqueKey

2018-04-19 Thread Novin Novin
Hi Guys,

I ended up with duplicate docs in SolrCloud. I don't know how to debug it,
so I'm looking for help here, please.

Below are the details:
Solr 6.6.2
zookeeper 3.4.10

Below is an example of the duplicate record as JSON:

{
  "responseHeader":{
"zkConnected":true,
"status":0,
"QTime":0,
"params":{
  "q":"*:*",
  "distrib":"false",
  "indent":"on",
  "fl":"id",
  "fq":"id:mid531281",
  "wt":"json"}},
  "response":{"numFound":2,"start":0,"docs":[
  {
"id":"mid531281"},
  {
"id":"mid531281"}]
  }}

schema file contains:

<field name="id" ... required="true" multiValued="false" docValues="true"/>
<uniqueKey>id</uniqueKey>

Let me know if extra information is required. Any help would be really
appreciated.

Regards,
Novin


Re: duplicate doc of uniqueKey

2018-04-19 Thread Novin Novin
Hi Erick,

I haven't done any index merging with MergeIndexes or
MapReduceIndexerTool.
Actually, I found that one of the duplicate docs does not have child docs,
because I am using Solr parent/child docs for block join queries. As far as
I know, it is a known issue with parent/child docs that if you send only
the parent doc, it ends up as a separate single doc rather than replacing
the parent-with-children block.
If you know that this issue has been fixed in a certain Solr version,
please let me know, or any other way to handle this issue.
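
In the meantime I will take the safe route of always resending the whole
block, parent and children together. A rough SolrJ sketch of what I mean
(field names are from my schema, the child id scheme is made up, and client
is a CloudSolrClient):

SolrInputDocument parent = new SolrInputDocument();
parent.addField("id", "mid531281");
parent.addField("type_s", "parent");

SolrInputDocument child = new SolrInputDocument();
child.addField("id", "mid531281_child1");  // hypothetical child id
child.addField("type_s", "child");

parent.addChildDocument(child);  // children always travel with the parent
client.add("main", parent);      // re-index the full block every time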

Thanks in advance,
Novin



On Thu, 19 Apr 2018 at 17:26 Erick Erickson  wrote:

> Also ask for the _version_ field in your fl list. The _version_ field
> is used for optimistic locking. This is mostly a curiosity
> question
>
> The only time I've ever seen something like this is if you, for
> instance, use MergeIndexes or MapReduceIndexerTool (which does a
> MergeIndexes under the covers). Have you done anything similar?
>
> Best,
> Erick
>
>
> On Thu, Apr 19, 2018 at 8:54 AM, Novin Novin  wrote:
> > Hi Guys,
> >
> > I ended up with duplicate docs in SolrCloud. I don't know how to debug
> > it, so I'm looking for help here, please.
> >
> > Below are the details:
> > Solr 6.6.2
> > zookeeper 3.4.10
> >
> > Below is an example of the duplicate record as JSON:
> >
> > {
> >   "responseHeader":{
> > "zkConnected":true,
> > "status":0,
> > "QTime":0,
> > "params":{
> >   "q":"*:*",
> >   "distrib":"false",
> >   "indent":"on",
> >   "fl":"id",
> >   "fq":"id:mid531281",
> >   "wt":"json"}},
> >   "response":{"numFound":2,"start":0,"docs":[
> >   {
> > "id":"mid531281"},
> >   {
> > "id":"mid531281"}]
> >   }}
> >
> > schema file contains:
> > <field name="id" ... required="true" multiValued="false" docValues="true"/>
> >
> > <uniqueKey>id</uniqueKey>
> >
> > Let me know if extra information is required. Any help would be really
> > appreciated.
> >
> > Regards,
> > Novin
>


Re: duplicate doc of uniqueKey

2018-04-19 Thread Novin Novin
Hi Karthik,

*Was your system time moved to future time and then was reset to current*
*time?*

Nothing happened like this as far as I know.

Thanks in advance
Novin


On Thu, 19 Apr 2018 at 18:26 Karthik Ramachandran  wrote:

> Novin,
>
> Was your system time moved to future time and then was reset to current
> time?
>
> Solr will add the new document and will send a delete for the old document,
> but there will be no document matching the criteria.
>
>
> On Thu, Apr 19, 2018 at 1:10 PM, Novin Novin  wrote:
>
> > Hi Erick,
> >
> > I haven't done any index merging with MergeIndexes or
> > MapReduceIndexerTool.
> > Actually, I found that one of the duplicate docs does not have child docs,
> > because I am using Solr parent/child docs for block join queries. As far
> > as I know, it is a known issue with parent/child docs that if you send
> > only the parent doc, it ends up as a separate single doc rather than
> > replacing the parent-with-children block.
> > If you know that this issue has been fixed in a certain Solr version,
> > please let me know, or any other way to handle this issue.
> >
> > Thanks in advance,
> > Novin
> >
> >
> >
> > On Thu, 19 Apr 2018 at 17:26 Erick Erickson 
> > wrote:
> >
> > > Also ask for the _version_ field in your fl list. The _version_ field
> > > is used for optimistic locking. This is mostly a curiosity
> > > question
> > >
> > > The only time I've ever seen something like this is if you, for
> > > instance, use MergeIndexes or MapReduceIndexerTool (which does a
> > > MergeIndexes under the covers). Have you done anything similar?
> > >
> > > Best,
> > > Erick
> > >
> > >
> > > On Thu, Apr 19, 2018 at 8:54 AM, Novin Novin 
> > wrote:
> > > > Hi Guys,
> > > >
> > > > I ended up with duplicate docs in SolrCloud. I don't know how to
> > > > debug it, so I'm looking for help here, please.
> > > >
> > > > Below is details:
> > > > Solr 6.6.2
> > > > zookeeper 3.4.10
> > > >
> > > > Below is example of duplicate record of Json:
> > > >
> > > > {
> > > >   "responseHeader":{
> > > > "zkConnected":true,
> > > > "status":0,
> > > > "QTime":0,
> > > > "params":{
> > > >   "q":"*:*",
> > > >   "distrib":"false",
> > > >   "indent":"on",
> > > >   "fl":"id",
> > > >   "fq":"id:mid531281",
> > > >   "wt":"json"}},
> > > >   "response":{"numFound":2,"start":0,"docs":[
> > > >   {
> > > > "id":"mid531281"},
> > > >   {
> > > > "id":"mid531281"}]
> > > >   }}
> > > >
> > > > schema file contains:
> > > > <field name="id" ... required="true" multiValued="false" docValues="true"/>
> > > >
> > > > <uniqueKey>id</uniqueKey>
> > > >
> > > > Let me know if extra information required. Any help would be really
> > > > appreciated.
> > > >
> > > > Regards,
> > > > Novin
> > >
> >
>
>
>
> --
> With Thanks & Regards
> Karthik Ramachandran
>
> P Please don't print this e-mail unless you really need to
>


Re: duplicate doc of uniqueKey

2018-04-19 Thread Novin Novin
Thanks Erick and Karthik for your help.

On Thu, 19 Apr 2018 at 19:53 Erick Erickson  wrote:

> Right, parent/child docs _must_ be treated as a block. By that I mean
> you cannot add/delete individual child docs and/or parent docs.
> That's one of the limitations of parent/child blocks and I don't know
> of any plans to change that.
>
> Best,
> Erick
>
> On Thu, Apr 19, 2018 at 11:14 AM, Novin Novin  wrote:
> > Hi Karthik,
> >
> > *Was your system time moved to future time and then was reset to current*
> > *time?*
> >
> > Nothing happened like this as far as I know.
> >
> > Thanks in advance
> > Novin
> >
> >
> > On Thu, 19 Apr 2018 at 18:26 Karthik Ramachandran 
> wrote:
> >
> >> Novin,
> >>
> >> Was your system time moved to future time and then was reset to current
> >> time?
> >>
> >> Solr will add the new document and will send a delete for the old
> >> document, but there will be no document matching the criteria.
> >>
> >>
> >> On Thu, Apr 19, 2018 at 1:10 PM, Novin Novin 
> wrote:
> >>
> >> > Hi Erick,
> >> >
> >> > I haven't done any index merging with MergeIndexes or
> >> > MapReduceIndexerTool.
> >> > Actually, I found that one of the duplicate docs does not have child
> >> > docs, because I am using Solr parent/child docs for block join queries.
> >> > As far as I know, it is a known issue with parent/child docs that if
> >> > you send only the parent doc, it ends up as a separate single doc
> >> > rather than replacing the parent-with-children block.
> >> > If you know that this issue has been fixed in a certain Solr version,
> >> > please let me know, or any other way to handle this issue.
> >> >
> >> > Thanks in advance,
> >> > Novin
> >> >
> >> >
> >> >
> >> > On Thu, 19 Apr 2018 at 17:26 Erick Erickson 
> >> > wrote:
> >> >
> >> > > Also ask for the _version_ field in your fl list. The _version_
> field
> >> > > is used for optimistic locking. This is mostly a curiosity
> >> > > question
> >> > >
> >> > > The only time I've ever seen something like this is if you, for
> >> > > instance, use MergeIndexes or MapReduceIndexerTool (which does a
> >> > > MergeIndexes under the covers). Have you done anything similar?
> >> > >
> >> > > Best,
> >> > > Erick
> >> > >
> >> > >
> >> > > On Thu, Apr 19, 2018 at 8:54 AM, Novin Novin 
> >> > wrote:
> >> > > > Hi Guys,
> >> > > >
> >> > > > I ended up with duplicate docs in SolrCloud. I don't know how to
> >> > > > debug it, so I'm looking for help here, please.
> >> > > >
> >> > > > Below is details:
> >> > > > Solr 6.6.2
> >> > > > zookeeper 3.4.10
> >> > > >
> >> > > > Below is example of duplicate record of Json:
> >> > > >
> >> > > > {
> >> > > >   "responseHeader":{
> >> > > > "zkConnected":true,
> >> > > > "status":0,
> >> > > > "QTime":0,
> >> > > > "params":{
> >> > > >   "q":"*:*",
> >> > > >   "distrib":"false",
> >> > > >   "indent":"on",
> >> > > >   "fl":"id",
> >> > > >   "fq":"id:mid531281",
> >> > > >   "wt":"json"}},
> >> > > >   "response":{"numFound":2,"start":0,"docs":[
> >> > > >   {
> >> > > > "id":"mid531281"},
> >> > > >   {
> >> > > > "id":"mid531281"}]
> >> > > >   }}
> >> > > >
> >> > > > schema file contains:
> >> > > > <field name="id" ... required="true" multiValued="false" docValues="true"/>
> >> > > >
> >> > > > <uniqueKey>id</uniqueKey>
> >> > > >
> >> > > > Let me know if extra information required. Any help would be
> really
> >> > > > appreciated.
> >> > > >
> >> > > > Regards,
> >> > > > Novin
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> With Thanks & Regards
> >> Karthik Ramachandran
> >>
> >> P Please don't print this e-mail unless you really need to
> >>
>