Re: Solr Ref Guide Redesign coming in 8.6

2020-04-29 Thread Bernd Fehling
+1

And a fully indexed search for the Ref Guide.
I have to use Google to search for info in the Ref Guide of a search engine. :-(


Am 29.04.20 um 02:11 schrieb matthew sporleder:
> I highly recommend a version selector in the header!  I am *always*
> landing on 6.x docs from google.
> 
> On Tue, Apr 28, 2020 at 5:18 PM Cassandra Targett  wrote:
>>
>> In case the list breaks the URL to view the Jenkins build, here's a shorter
>> URL:
>>
>> https://s.apache.org/df7ew.
>>
>> On Tue, Apr 28, 2020 at 3:12 PM Cassandra Targett 
>> wrote:
>>
>>> The PMC would like to engage the Solr user community for feedback on an
>>> extensive redesign of the Solr Reference Guide I've just committed to the
>>> master (future 9.0) branch.
>>>
>>> You can see the new design from our Jenkins build of master:
>>>
>>> https://builds.apache.org/view/L/view/Lucene/job/Solr-reference-guide-master/javadoc/
>>>
>>> The hope is that you will receive these changes positively. If so, we'll
>>> use this for the upcoming 8.6 Ref Guide and future releases. We also may
>>> re-publish earlier 8.x versions so they use this design.
>>>
>>> I embarked on this project last December simply as an attempt to upgrade
>>> the version of Bootstrap used by the Guide. After a couple of days, I'd
>>> changed the layout entirely. In the ensuing few months I've tried to iron
>>> out the kinks and made some extensive changes to the "backend" (the CSS,
>>> JavaScript, etc.).
>>>
>>> I'm no graphic designer, but some of my guiding thoughts were to try to
>>> make full use of the browser window, improve responsiveness for different
>>> sized screens, and just give it a more modern feel. The full list of what
>>> has changed is detailed in the Jira issue if you are interested:
>>> https://issues.apache.org/jira/browse/SOLR-14173
>>>
>>> This is Phase 1 of several changes. There is one glaring remaining issue,
>>> which is that our list of top-level categories is too long for the new
>>> design. I've punted fixing that to Phase 2, which will be an extensive
>>> re-consideration of how the Ref Guide is organized with the goal of
>>> trimming down the top-level categories to only 4-6. SOLR-1 will track
>>> phase 2.
>>>
>>> One last thing to note: this redesign really only changes the presentation
>>> of the pages and some of the framework under the hood - it doesn't yet add
>>> full-text search. All of the obstacles to providing search still exist, but
>>> please know that we fully understand frustration on this point and still
>>> hope to fix it.
>>>
>>> I look forward to hearing your feedback in this thread.
>>>
>>> Best,
>>> Cassandra
>>>


Re: Solr Ref Guide Redesign coming in 8.6

2020-04-29 Thread Colvin Cowie
In addition to those points, I think it generally does look good, but the
thing I've noticed is that the increase in text size on rollover in the menu
makes it quite jumpy:
https://drive.google.com/open?id=15EF0T_C_l8OIDuW8QHOFunL4VzxtyVyb

On Wed, 29 Apr 2020 at 08:15, Bernd Fehling 
wrote:

> +1
>
> And a fully indexed search for the Ref Guide.
> I have to use Google to search for infos in Ref Guide of a search engine.
> :-(
>
>
> Am 29.04.20 um 02:11 schrieb matthew sporleder:
> > I highly recommend a version selector in the header!  I am *always*
> > landing on 6.x docs from google.
> >
> > On Tue, Apr 28, 2020 at 5:18 PM Cassandra Targett 
> wrote:
> >>
> >> In case the list breaks the URL to view the Jenkins build, here's a
> shorter
> >> URL:
> >>
> >> https://s.apache.org/df7ew.
> >>
> >> On Tue, Apr 28, 2020 at 3:12 PM Cassandra Targett 
> >> wrote:
> >>
> >>> The PMC would like to engage the Solr user community for feedback on an
> >>> extensive redesign of the Solr Reference Guide I've just committed to
> the
> >>> master (future 9.0) branch.
> >>>
> >>> You can see the new design from our Jenkins build of master:
> >>>
> >>>
> https://builds.apache.org/view/L/view/Lucene/job/Solr-reference-guide-master/javadoc/
> >>>
> >>> The hope is that you will receive these changes positively. If so,
> we'll
> >>> use this for the upcoming 8.6 Ref Guide and future releases. We also
> may
> >>> re-publish earlier 8.x versions so they use this design.
> >>>
> >>> I embarked on this project last December simply as an attempt to
> upgrade
> >>> the version of Bootstrap used by the Guide. After a couple of days, I'd
> >>> changed the layout entirely. In the ensuing few months I've tried to
> iron
> >>> out the kinks and made some extensive changes to the "backend" (the
> CSS,
> >>> JavaScript, etc.).
> >>>
> >>> I'm no graphic designer, but some of my guiding thoughts were to try to
> >>> make full use of the browser window, improve responsiveness for
> different
> >>> sized screens, and just give it a more modern feel. The full list of
> what
> >>> has changed is detailed in the Jira issue if you are interested:
> >>> https://issues.apache.org/jira/browse/SOLR-14173
> >>>
> >>> This is Phase 1 of several changes. There is one glaring remaining
> issue,
> >>> which is that our list of top-level categories is too long for the new
> >>> design. I've punted fixing that to Phase 2, which will be an extensive
> >>> re-consideration of how the Ref Guide is organized with the goal of
> >>> trimming down the top-level categories to only 4-6. SOLR-1 will
> track
> >>> phase 2.
> >>>
> >>> One last thing to note: this redesign really only changes the
> presentation
> >>> of the pages and some of the framework under the hood - it doesn't yet
> add
> >>> full-text search. All of the obstacles to providing search still
> exist, but
> >>> please know that we fully understand frustration on this point and
> still
> >>> hope to fix it.
> >>>
> >>> I look forward to hearing your feedback in this thread.
> >>>
> >>> Best,
> >>> Cassandra
> >>>
>


Hi Team Just to verify with respect to an Issue working on Solr

2020-04-29 Thread Gundala Narayana
Hi Team,



Please let me know. I have an issue working with Solr: I am importing data
from Neo4j to Solr, using a Neo4j Cypher query string to pull the data. The
query string, with all the necessary credentials (Neo4j host, user &
password), is placed in a .xml file. When I try to execute it, a zero-index
Unicode character gets added to the query string and the code fails to
execute.



Any suggestions on this?



Thanks & Regards,



G V L Narayana.
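Without seeing the code it is hard to say more, but if the stray character sits at index 0 of the query string (for example a UTF-8 BOM or a zero-width space picked up from the .xml file), a small Java sketch like the following could strip it before the query is executed. The file name and the choice of characters to remove are assumptions for illustration only:

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class StripInvisibleChars {
    public static void main(String[] args) throws Exception {
        // Hypothetical file holding the Cypher query string (adjust the path)
        String query = new String(Files.readAllBytes(Paths.get("cypher-query.xml")), StandardCharsets.UTF_8);
        // Remove a UTF-8 BOM (U+FEFF) and zero-width spaces (U+200B) that editors and copy/paste can introduce
        String cleaned = query.replace("\uFEFF", "").replace("\u200B", "");
        System.out.println("removed " + (query.length() - cleaned.length()) + " invisible character(s)");
        // pass 'cleaned' (not 'query') on to the Neo4j/Solr import code
    }
}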


off-heap OOM

2020-04-29 Thread Raji N
Hi,

We are using SolrCloud 7.6.0 and we have containerized Solr. We have around
30 collections and 7 Solr nodes in the cluster. Although containerized, we
run one ZooKeeper container and one Solr container on each host.
We have a 24GB heap and the container has 49GB of memory in total, which
leaves 25GB off-heap. We have set

max user processes  (-u) unlimited

virtual memory  (kbytes, -v) unlimited

file locks  (-x) unlimited

max memory size (kbytes, -m) unlimited


OOM for Solr occurs roughly every 5 days. When we examined heap dumps, the heap
is only around 700MB, but we have 29GB of off-heap memory in use.

The major consumer is java.nio.DirectByteBufferR


Major Reference chains



8,820,117Kb (1462.3%): java.nio.DirectByteBufferR: 64 objects
  ↖ sun.misc.Cleaner.referent
  ↖ sun.misc.Cleaner.{prev}
  ↖ java.nio.DirectByteBuffer.cleaner
  ↖ java.nio.ByteBuffer[]
  ↖ sun.nio.ch.Util$BufferCache.buffers
  ↖ j.l.ThreadLocal$ThreadLocalMap$Entry.value
  ↖ j.l.ThreadLocal$ThreadLocalMap$Entry[]
  ↖ j.l.ThreadLocal$ThreadLocalMap.table
  ↖ j.l.Thread.threadLocals
  ↖ j.l.Thread[]
  ↖ j.l.ThreadGroup.threads
  ↖ j.l.ThreadGroup[]
  ↖ j.l.ThreadGroup.groups
  ↖ Java Static sun.rmi.runtime.NewThreadAction.systemThreadGroup

3,534,863Kb (586.0%): java.nio.DirectByteBufferR: 22 objects
  ↖ sun.misc.Cleaner.referent
  ↖ sun.misc.Cleaner.{next}
  ↖ sun.nio.fs.NativeBuffer.cleaner
  ↖ sun.nio.fs.NativeBuffer[]
  ↖ j.l.ThreadLocal$ThreadLocalMap$Entry.value
  ↖ j.l.ThreadLocal$ThreadLocalMap$Entry[]
  ↖ j.l.ThreadLocal$ThreadLocalMap.table
  ↖ j.l.Thread.threadLocals
  ↖ j.l.Thread[]
  ↖ j.l.ThreadGroup.threads
  ↖ j.l.ThreadGroup[]
  ↖ j.l.ThreadGroup.groups
  ↖ Java Static sun.rmi.runtime.NewThreadAction.systemThreadGroup

3,145,728Kb (521.5%): java.nio.DirectByteBufferR: 3 objects
  ↖ java.nio.ByteBuffer[]
  ↖ org.apache.lucene.store.ByteBufferIndexInput$MultiBufferImpl.buffers
  ↖ org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.fieldsStream
  ↖ org.apache.lucene.index.SegmentCoreReaders.fieldsReaderOrig
  ↖ org.apache.lucene.index.SegmentReader.core
  ↖ org.apache.lucene.index.SegmentReader[]
  ↖ org.apache.lucene.index.StandardDirectoryReader.subReaders
  ↖ org.apache.solr.search.SolrIndexSearcher.rawReader
  ↖ {j.u.concurrent.ConcurrentHashMap}.values
  ↖ org.apache.solr.core.SolrCore.infoRegistry
  ↖ {j.u.LinkedHashMap}.values
  ↖ org.apache.solr.core.SolrCores.cores
  ↖ org.apache.solr.core.CoreContainer.solrCores
  ↖ org.apache.solr.cloud.RecoveringCoreTermWatcher.coreContainer
  ↖ {j.u.HashSet}
  ↖ org.apache.solr.cloud.ZkShardTerms.listeners
  ↖ {j.u.concurrent.ConcurrentHashMap}.keys
  ↖ Java Static org.apache.solr.common.util.ObjectReleaseTracker.OBJECTS

2,605,258Kb (431.9%): java.nio.DirectByteBufferR: 184 objects
  ↖ org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.curBuf
  ↖ org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.fieldsStream
  ↖ org.apache.lucene.index.SegmentCoreReaders.fieldsReaderOrig
  ↖ org.apache.lucene.index.SegmentReader.core
  ↖ org.apache.lucene.index.SegmentReader[]
  ↖ org.apache.lucene.index.StandardDirectoryReader.subReaders
  ↖ org.apache.solr.search.SolrIndexSearcher.rawReader
  ↖ {j.u.concurrent.ConcurrentHashMap}.values
  ↖ org.apache.solr.core.SolrCore.infoRegistry
  ↖ {j.u.LinkedHashMap}.values
  ↖ org.apache.solr.core.SolrCores.cores
  ↖ org.apache.solr.core.CoreContainer.solrCores
  ↖ org.apache.solr.cloud.RecoveringCoreTermWatcher.coreContainer
  ↖ {j.u.HashSet}
  ↖ org.apache.solr.cloud.ZkShardTerms.listeners
  ↖ {j.u.concurrent.ConcurrentHashMap}.keys
  ↖ Java Static org.apache.solr.common.util.ObjectReleaseTracker.OBJECTS

1,790,441Kb (296.8%): java.nio.DirectByteBufferR: 70 objects
  ↖ org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.curBuf
  ↖ org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.handle
  ↖ org.apache.lucene.index.SegmentCoreReaders.cfsReader
  ↖ org.apache.lucene.index.SegmentReader.core
  ↖ org.apache.lucene.index.SegmentReader[]
  ↖ org.apache.lucene.index.StandardDirectoryReader.subReaders
  ↖ org.apache.solr.search.SolrIndexSearcher.rawReader
  ↖ {j.u.concurrent.ConcurrentHashMap}.values
  ↖ org.apache.solr.core.SolrCore.infoRegistry
  ↖ {j.u.LinkedHashMap}.values
  ↖ org.apache.solr.core.SolrCores.cores
  ↖ org.apache.solr.core.CoreContainer.solrCores
  ↖ org.apache.solr.cloud.RecoveringCoreTermWatcher.coreContainer
  ↖ {j.u.HashSet}
  ↖ org.apache.solr.cloud.ZkShardTerms.listeners
  ↖ {j.u.concurrent.ConcurrentHashMap}.keys
  ↖ Java Static org.apache.solr.common.util.ObjectReleaseTracker.OBJECTS

1,385,471Kb (229.7%): java.nio.DirectByteBufferR: 85 objects
  ↖ sun.misc.Cleaner.referent
  ↖ sun.misc.Cleaner.{next}
  ↖ java.nio.DirectByteBuffer.cleaner
  ↖ java.nio.ByteBuffer[]
  ↖ sun.

Re: Which Solr metrics do you find important?

2020-04-29 Thread Radu Gheorghe
Thanks Matthew and Walter. OK, so you both use the clusterstatus output in
your regular monitoring. This seems to be missing from what we have now (we
collect everything else you mentioned, like response time percentiles, disk
IO, etc). So I guess clusterstatus deserves a priority bump :)

Best regards,
Radu
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
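For anyone who wants to pull the same clusterstatus data programmatically, a minimal SolrJ sketch might look like the one below. The ZooKeeper address is an assumption for illustration; the cast follows the NamedList structure the Collections API returns:

import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.response.CollectionAdminResponse;
import org.apache.solr.common.util.NamedList;

public class ClusterStatusExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical ZooKeeper ensemble address
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
            // CLUSTERSTATUS via the Collections API
            CollectionAdminResponse rsp = CollectionAdminRequest.getClusterStatus().process(client);
            NamedList<Object> cluster = (NamedList<Object>) rsp.getResponse().get("cluster");
            // "collections" holds per-collection shard/replica state, suitable for feeding into monitoring
            System.out.println(cluster.get("collections"));
        }
    }
}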


On Tue, Apr 28, 2020 at 6:47 PM Walter Underwood 
wrote:

> I also have some Python that pull stuff from clusterstatus and sends it to
> InfluxDB.
>
> We wrote a servlet filter that intercepts requests to Solr and sends
> performance data
> to monitoring. That gives us per-request handler traffic and response time
> percentiles.
>
> Telegraf for CPU, run queue, disk IO, etc.
>
> CloudWatch for load balancer traffic, errors, and healthy host count.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Apr 28, 2020, at 8:00 AM, matthew sporleder 
> wrote:
> >
> > I think clusterstatus is how you find some of that stuff.
> >
> > I wrote this when I was using datadog to supplement what they offered:
> > https://github.com/msporleder/dd-solrcloud/blob/master/solrcloud.py
> > (sorry for crappy python) and it got me most of the monitoring I
> > needed for my particular situation.
> >
> >
> >
> >
> > On Tue, Apr 28, 2020 at 10:52 AM Radu Gheorghe
> >  wrote:
> >>
> >> Thanks a lot, Matthew! OK, so you do care about the size of tlogs. As
> well
> >> as Collections API stuff (clusterstatus, overseerstatus).
> >>
> >> And DIH, I didn't think that these stats would be interesting, but
> surely
> >> they are for people who use DIH :)
> >>
> >> Best regards,
> >> Radu
> >> --
> >> Monitoring - Log Management - Alerting - Anomaly Detection
> >> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >>
> >>
> >> On Tue, Apr 28, 2020 at 4:17 PM matthew sporleder  >
> >> wrote:
> >>
> >>> size-on-disk of cores, size of tlogs, DIH stats over time, last
> >>> modified date of cores
> >>>
> >>> The most important alert-type things are -- collections in recovery or
> >>> down state, solrcloud election events, various error rates
> >>>
> >>> It's also important to be able to tie these back to aliases so you are
> >>> only monitoring cores you care about, even if their backing collection
> >>> name changes every so often
> >>>
> >>>
> >>>
> >>> On Tue, Apr 28, 2020 at 7:57 AM Radu Gheorghe
> >>>  wrote:
> 
>  Hi fellow Solr users,
> 
>  I'm looking into improving our Solr monitoring
>   and I was curious on
> which
>  metrics you consider relevant.
> 
>  From what we currently have, I'm only really missing fieldCache.
> Which we
>  collect, but not show in the UI yet (unless you add a custom chart -
> >>> we'll
>  add it to default soon).
> 
>  You can click on a demo account 
> >>> (there's a
>  Solr app there called PH.Prod.Solr7) to see what we already collect,
> but
>  I'll write it here in short:
>  - query rate and latency (you can group per handler, per core, per
>  collection if it's SolrCloud)
>  - index size (number of segments, files...)
>  - indexing: added/deleted docs, commits
>  - caches (size, hit ratio, warmup...)
>  - OS- and JVM-level metrics (from CPU iowait to GC latency and
> everything
>  in between)
> 
>  Anything that we should add?
> 
>  I went through the Metrics API output, and the only significant thing
> I
> >>> can
>  think of is the transaction log. But to be honest I never checked
> those
>  metrics in practice.
> 
>  Or maybe there's something outside the Metrics API that would be
> useful?
> >>> I
>  thought about the breakdown of shards that are up/down/recovering...
> as
>  well as replica types. We plan on adding those, but there's a
> challenge
> >>> in
>  de-duplicating metrics. Because one would install one agent per node,
> and
>  I'm not aware of a way to show only local shards in the Collections
> API
> >>> ->
>  CLUSTERSTATUS.
> 
>  Thanks in advance for any feedback that you may have!
>  Radu
>  --
>  Monitoring - Log Management - Alerting - Anomaly Detection
>  Solr & Elasticsearch Consulting Support Training -
> http://sematext.com/
> >>>
>
>


Re: IdleTimeout setting in Jetty (Solr 7.7.1)

2020-04-29 Thread Raji N
Try starting like this.

 bin/solr start -Dsolr.jetty.threads.idle.timeout=2000 -z localhost:2181



Hope this helps

Raji

On Sun, Apr 26, 2020 at 11:24 PM Kommu, Vinodh K.  wrote:

> Can someone shed some idea on below requirement?
>
> Thanks & Regards,
> Vinodh
>
> From: Kommu, Vinodh K.
> Sent: Friday, April 24, 2020 11:34 PM
> To: solr-user@lucene.apache.org
> Subject: IdleTimeout setting in Jetty (Solr 7.7.1)
>
> Hi,
>
> Our clients are running streaming expressions on 140M docs collection
> which has relatively huge data however query is not completing through &
> timing out after 120secs (which is default timeout in jetty*.xml files). We
> had changed the default timeout from 120s to 300s which worked fine. To
> change the default timeout setting, we had to modify the jetty files from
> installation directory, to avoid this, is there any option/way available in
> solr start arguments to overwrite the default idleTimeout value with custom
> value without modifying the actual jetty*.xml files?
>
> Thanks & Regards,
> Vinodh
>
>


RE: IdleTimeout setting in Jetty (Solr 7.7.1)

2020-04-29 Thread Kommu, Vinodh K.
Thanks Raji, let me try it.

On the other hand, I'm trying socketTimeout and connTimeout properties as well.


Thanks & Regards,
Vinodh
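As a side note on the client-side socketTimeout/connTimeout properties mentioned above, a minimal SolrJ sketch of setting them on an HttpSolrClient might look like this. The URL and values are assumptions for illustration, and they only affect the client; the Jetty idleTimeout on the server is a separate setting:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class ClientTimeoutExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical Solr URL; allow long-running requests up to 5 minutes on the client side
        SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection")
                .withConnectionTimeout(10000)   // connTimeout: 10s to establish the connection
                .withSocketTimeout(300000)      // socketTimeout: 300s to wait for the response
                .build();
        try {
            System.out.println(client.query(new SolrQuery("*:*")).getResults().getNumFound());
        } finally {
            client.close();
        }
    }
}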

-Original Message-
From: Raji N  
Sent: Wednesday, April 29, 2020 2:59 PM
To: solr-user@lucene.apache.org
Subject: Re: IdleTimeout setting in Jetty (Solr 7.7.1)


Try starting like this.

 bin/solr start -Dsolr.jetty.threads.idle.timeout=2000 -z localhost:2181



Hope this helps

Raji

On Sun, Apr 26, 2020 at 11:24 PM Kommu, Vinodh K.  wrote:

> Can someone shed some idea on below requirement?
>
> Thanks & Regards,
> Vinodh
>
> From: Kommu, Vinodh K.
> Sent: Friday, April 24, 2020 11:34 PM
> To: solr-user@lucene.apache.org
> Subject: IdleTimeout setting in Jetty (Solr 7.7.1)
>
> Hi,
>
> Our clients are running streaming expressions on 140M docs collection 
> which has relatively huge data however query is not completing through 
> & timing out after 120secs (which is default timeout in jetty*.xml 
> files). We had changed the default timeout from 120s to 300s which 
> worked fine. To change the default timeout setting, we had to modify 
> the jetty files from installation directory, to avoid this, is there 
> any option/way available in solr start arguments to overwrite the 
> default idleTimeout value with custom value without modifying the actual 
> jetty*.xml files?
>
> Thanks & Regards,
> Vinodh
>
>


Adding several new fields to managed-schema by solrj

2020-04-29 Thread Szűcs Roland
Hi folks,

I am using Solr 8.5.0 in standalone mode and use the CoreAdmin API and
Schema API of SolrJ to create a new core and its fields in managed-schema.
Is there any way to add several fields to managed-schema via SolrJ without
processing them one by one?

The following two lines get the job done, but at roughly 4 sec/field, which is
extremely slow:
SchemaRequest.AddField schemaRequest = new
SchemaRequest.AddField(fieldAttributes);
SchemaResponse.UpdateResponse response =  schemaRequest.process(solrC);

The core is empty, as field creation is part of the core creation process.
The Schema API docs say:
"It is possible to perform one or more add requests in a single command. The
API is transactional and all commands in a single call either succeed or
fail together."
I am looking for the equivalent of this approach in SolrJ.
Is there any?

Cheers,
Roland
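One possibility worth trying is SolrJ's SchemaRequest.MultiUpdate, which wraps a list of schema updates into a single Schema API call. A minimal sketch, assuming a standalone core reachable over HTTP (the core URL and field definitions below are made up for illustration):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.schema.SchemaRequest;
import org.apache.solr.client.solrj.response.schema.SchemaResponse;

public class BatchAddFields {
    public static void main(String[] args) throws Exception {
        // Hypothetical core URL; adjust to your setup
        try (SolrClient solrC = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
            List<SchemaRequest.Update> updates = new ArrayList<>();
            for (String name : new String[]{"title_s", "author_s", "publisher_s"}) {
                Map<String, Object> attrs = new HashMap<>();
                attrs.put("name", name);
                attrs.put("type", "string");
                attrs.put("stored", true);
                updates.add(new SchemaRequest.AddField(attrs));
            }
            // All field additions go to the server in one request
            SchemaRequest.MultiUpdate multi = new SchemaRequest.MultiUpdate(updates);
            SchemaResponse.UpdateResponse response = multi.process(solrC);
            System.out.println("status: " + response.getStatus());
        }
    }
}

Whether this removes the per-field overhead in practice is worth measuring, but it maps directly to the "one or more add requests in a single command" behaviour the Schema API documentation describes.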


Re: Delete on 8.5.1

2020-04-29 Thread Joe Obernberger
Hi - I also tried deleting from solrj (8.5.1) using 
CloudSolrClient.deleteByQuery.


This results in:

Error: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
Error from server at http://paradigm8:9100/solr/PROCESSOR_LOGS: Async 
exception during distributed update: Error from server at 
http://paradigm7:9100/solr/PROCESSOR_LOGS_shard6_replica_n20/: null




request: http://paradigm7:9100/solr/PROCESSOR_LOGS_shard6_replica_n20/
Remote error message: Task queue processing has stalled for 20203 ms 
with 0 remaining elements to process.
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
Error from server at http://paradigm8:9100/solr/PROCESSOR_LOGS: Async 
exception during distributed update: Error from server at 
http://paradigm7:9100/solr/PROCESSOR_LOGS_shard6_replica_n20/: null




request: http://paradigm7:9100/solr/PROCESSOR_LOGS_shard6_replica_n20/
Remote error message: Task queue processing has stalled for 20203 ms 
with 0 remaining elements to process.
    at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:665)
    at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:265)
    at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
    at 
org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:368)
    at 
org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:296)
    at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.sendRequest(BaseCloudSolrClient.java:1143)
    at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.requestWithRetryOnStaleState(BaseCloudSolrClient.java:906)
    at 
org.apache.solr.client.solrj.impl.BaseCloudSolrClient.request(BaseCloudSolrClient.java:838)
    at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
    at 
org.apache.solr.client.solrj.SolrClient.deleteByQuery(SolrClient.java:940)
    at 
org.apache.solr.client.solrj.SolrClient.deleteByQuery(SolrClient.java:903)
    at 
com.ngc.bigdata.solrsearcher.SearcherThread.doSearch(SearcherThread.java:401)
    at 
com.ngc.bigdata.solrsearcher.SearcherThread.run(SearcherThread.java:125)
    at 
com.ngc.bigdata.solrsearcher.Worker.doSearchTest(Worker.java:145)
    at 
com.ngc.bigdata.solrsearcher.SolrSearcher.main(SolrSearcher.java:60)
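For reference, a minimal SolrJ sketch of the kind of delete-by-query described above might look like this. The ZooKeeper address is an assumption for illustration; the collection name and date range are taken from the thread:

import java.util.Collections;
import java.util.Optional;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.UpdateResponse;

public class DeleteByQueryExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical ZooKeeper ensemble address
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Collections.singletonList("zk1:2181"), Optional.empty()).build()) {
            client.setDefaultCollection("PROCESSOR_LOGS");
            // Delete documents whose StartTime falls in the given range, then commit
            UpdateResponse rsp = client.deleteByQuery(
                    "StartTime:[2020-01-01T01:02:43Z TO 2020-04-25T00:00:00Z]");
            client.commit();
            System.out.println("delete status: " + rsp.getStatus());
        }
    }
}

The "Task queue processing has stalled" errors reported above come from the server side, so a client-side rewrite like this is only a starting point for reproducing the problem, not a fix.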



On 4/28/2020 11:50 AM, Joe Obernberger wrote:
Hi all - I'm running this query on solr cloud 8.5.1 with the index on 
HDFS:


curl http://enceladus:9100/solr/PROCESSOR_LOGS/update?commit=true -H 
"Content-Type: text/xml" --data-binary 
'<delete><query>StartTime:[2020-01-01T01:02:43Z TO 2020-04-25T00:00:00Z]</query></delete>'


getting this response:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="rf">1</int>
    <int name="status">500</int>
    <int name="QTime">54091</int>
  </lst>
  <lst name="error">
    <lst name="metadata">
      <str name="error-class">org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException</str>
      <str name="root-error-class">org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException</str>
    </lst>
    <str name="msg">2 Async exceptions during distributed update:
Error from server at 
http://paradigm8:9100/solr/PROCESSOR_LOGS_shard2_replica_n4/: null

request: http://paradigm8:9100/solr/PROCESSOR_LOGS_shard2_replica_n4/
Remote error message: Task queue processing has stalled for 20193 ms 
with 0 remaining elements to process.
Error from server at 
http://belinda:9100/solr/PROCESSOR_LOGS_shard10_replica_n38/: null

request: http://belinda:9100/solr/PROCESSOR_LOGS_shard10_replica_n38/
Remote error message: Task queue processing has stalled for 20021 ms 
with 0 remaining elements to process.</str>
    <str name="trace">org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException: 
2 Async exceptions during distributed update:
Error from server at 
http://paradigm8:9100/solr/PROCESSOR_LOGS_shard2_replica_n4/: null

request: http://paradigm8:9100/solr/PROCESSOR_LOGS_shard2_replica_n4/
Remote error message: Task queue processing has stalled for 20193 ms 
with 0 remaining elements to process.
Error from server at 
http://belinda:9100/solr/PROCESSOR_LOGS_shard10_replica_n38/: null

request: http://belinda:9100/solr/PROCESSOR_LOGS_shard10_replica_n38/
Remote error message: Task queue processing has stalled for 20021 ms 
with 0 remaining elements to process.
    at 
org.apache.solr.update.processor.DistributedZkUpdateProcessor.doDistribFinish(DistributedZkUpdateProcessor.java:1189)
    at 
org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:1096)
    at 
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:182)
    at 
org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
    at 
org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
    at 
org.apache.solr.update.processor.UpdateRequestProcessor.finish(UpdateRequestProcessor.java:80)
   

Re: Solr Ref Guide Redesign coming in 8.6

2020-04-29 Thread Cassandra Targett
To respond to the feedback so far:

Version picker: This is more complex than it may be initially assumed
because of the way we publish the site. Since we use a static site
generator, each page is a standalone HTML file. To add a version picker to
an older version that includes all the latest versions, we'd need to
republish all the older versions every time we published a new one. We
could get around this by pointing a version picker to a location we update
each time (but still today would need to republish all the older versions
to add it at all), which was harder to do in the past due to how the Solr
website as a whole was published. I wanted to try to address this problem
in what I'm calling Phase 3 - moving to a different static site generator
that supports multiple versions in a more native way but maybe I can find a
simple stopgap until then.

Jumpiness while hovering over nav items: The font size isn't actually
different when you hover or select a nav item, it's just bolded which makes
it overrun the allotted space and wrap. I made it bolded in response to
other feedback I got earlier on that there was not enough differentiation
between selected and not-selected items but didn't notice the jumpiness (I
had to make my screen quite large to duplicate it, and when I looked at the
Guide on larger screens earlier I was usually looking at some other problem
and just didn't notice). I'll play with it a little and find another way to
differentiate the items without bolding them.

Thanks for your comments so far!

On Wed, Apr 29, 2020 at 2:56 AM Colvin Cowie 
wrote:

> In addition to those points, I think it generally does look good but the
> thing I've noticed is that increase in text size on rollover in the menu
> makes it quite jumpy:
> https://drive.google.com/open?id=15EF0T_C_l8OIDuW8QHOFunL4VzxtyVyb
>
> On Wed, 29 Apr 2020 at 08:15, Bernd Fehling <
> bernd.fehl...@uni-bielefeld.de>
> wrote:
>
> > +1
> >
> > And a fully indexed search for the Ref Guide.
> > I have to use Google to search for infos in Ref Guide of a search engine.
> > :-(
> >
> >
> > Am 29.04.20 um 02:11 schrieb matthew sporleder:
> > > I highly recommend a version selector in the header!  I am *always*
> > > landing on 6.x docs from google.
> > >
> > > On Tue, Apr 28, 2020 at 5:18 PM Cassandra Targett  >
> > wrote:
> > >>
> > >> In case the list breaks the URL to view the Jenkins build, here's a
> > shorter
> > >> URL:
> > >>
> > >> https://s.apache.org/df7ew.
> > >>
> > >> On Tue, Apr 28, 2020 at 3:12 PM Cassandra Targett <
> ctarg...@apache.org>
> > >> wrote:
> > >>
> > >>> The PMC would like to engage the Solr user community for feedback on
> an
> > >>> extensive redesign of the Solr Reference Guide I've just committed to
> > the
> > >>> master (future 9.0) branch.
> > >>>
> > >>> You can see the new design from our Jenkins build of master:
> > >>>
> > >>>
> >
> https://builds.apache.org/view/L/view/Lucene/job/Solr-reference-guide-master/javadoc/
> > >>>
> > >>> The hope is that you will receive these changes positively. If so,
> > we'll
> > >>> use this for the upcoming 8.6 Ref Guide and future releases. We also
> > may
> > >>> re-publish earlier 8.x versions so they use this design.
> > >>>
> > >>> I embarked on this project last December simply as an attempt to
> > upgrade
> > >>> the version of Bootstrap used by the Guide. After a couple of days,
> I'd
> > >>> changed the layout entirely. In the ensuing few months I've tried to
> > iron
> > >>> out the kinks and made some extensive changes to the "backend" (the
> > CSS,
> > >>> JavaScript, etc.).
> > >>>
> > >>> I'm no graphic designer, but some of my guiding thoughts were to try
> to
> > >>> make full use of the browser window, improve responsiveness for
> > different
> > >>> sized screens, and just give it a more modern feel. The full list of
> > what
> > >>> has changed is detailed in the Jira issue if you are interested:
> > >>> https://issues.apache.org/jira/browse/SOLR-14173
> > >>>
> > >>> This is Phase 1 of several changes. There is one glaring remaining
> > issue,
> > >>> which is that our list of top-level categories is too long for the
> new
> > >>> design. I've punted fixing that to Phase 2, which will be an
> extensive
> > >>> re-consideration of how the Ref Guide is organized with the goal of
> > >>> trimming down the top-level categories to only 4-6. SOLR-1 will
> > track
> > >>> phase 2.
> > >>>
> > >>> One last thing to note: this redesign really only changes the
> > presentation
> > >>> of the pages and some of the framework under the hood - it doesn't
> yet
> > add
> > >>> full-text search. All of the obstacles to providing search still
> > exist, but
> > >>> please know that we fully understand frustration on this point and
> > still
> > >>> hope to fix it.
> > >>>
> > >>> I look forward to hearing your feedback in this thread.
> > >>>
> > >>> Best,
> > >>> Cassandra
> > >>>
> >
>


Re: Solr Ref Guide Redesign coming in 8.6

2020-04-29 Thread Mark H. Wood
At first glance, I have no big issues.  It looks clean and functional,
and I like that.  I think it will work well enough for me.

This design still has a minor annoyance that I have noted in the past:
in the table of contents pane it is easy to open a subtree, but the
only way to close it is to open another one.  Obviously not a big
deal.

I'll probably spend too much time researching how to widen the
razor-thin scrollbar in the TOC panel, since it seems to be
independent of the way I spent too much time fixing the browser's own
inadequate scrollbar width. :-) Also, the thumb's color is so close to
the surrounding color that it's really hard to see.  And for some
reason when I use the mouse wheel to scroll the TOC, when it gets to
the top or the bottom the content pane starts scrolling instead, which
is surprising and mildly inconvenient.  Final picky point:  the
scrolling is *very* insensitive -- takes a lot of wheel motion to move
the panel just a bit.

(I'm aware that a lot of the things I complain about in "modern" web
sites are the things that make them "modern".  So, I'm an old fossil. :-)

Firefox 68.7.0esr, Gentoo Linux.

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu




Re: SolrCloud degraded during backup and batch CSV update

2020-04-29 Thread matthew sporleder
FWIW I've had some luck with strategy 3 (increase zk timeout) when you
overwhelm the connection to zk or the disk on zk.

Is zk on the same boxes as solr?

On Tue, Apr 28, 2020 at 10:15 PM Sethuraman, Ganesh
 wrote:
>
> Hi
>
> We are using SolrCloud 7.2.1 with 3 node Zookeeper ensemble. We have 92 
> collection each on avg. having 8 shards and 2 replica with 2 EC2 nodes, with 
> JVM size of 18GB (G1 GC). We need your help with the Issue we faced today: 
> The issue is SolrCloud server went into a degraded collections (for few 
> collections) when the Solr backup and the Solr batch CSV update load happened 
> at the same time as backup. The CSV data load was about ~5 GB per 
> shard/replica. We think this happened after zkClient disconnect happened as 
> noted below.  We had to restart Solr to bring it back to normal.
>
>
>   1.  Is it not suggested to run backup and Solr batch CSV update large load 
> at the same time?
>   2.  In the past we have seen two CSV batch update load in parallel causes 
> issues, is this also not suggested (this issue is not related to that)?
>   3.  Do you think we should increase Zookeeper timeout?
>   4.  How do we know if  we need to up the JVM Max memory, and by how much?
>   5.  We also see that once the Solr goes into degraded collection and 
> recovery failed, it NEVER get back to normal, even after when there is no 
> load. Is this a bug?
>
> The GC information and Solr Log below
>
> https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjAvMDQvMjkvLS0wMl9zb2xyX2djLmxvZy56aXAtLTEtNDAtMzE=&channel=WEB
>
>
> 2020-04-27 07:34:07.322 WARN  
> (zkConnectionManagerCallback-6-thread-1-processing-n:mysolrsever.com:6010_solr-SendThread(zoo-prd-n1:2181))
>  [   ] o.a.z.ClientCnxn Client session timed out, have not heard from server 
> in 10775ms for sessionid 0x171a6fb51310008
> 
> 2020-04-27 07:34:07.426 WARN  
> (zkConnectionManagerCallback-6-thread-1-processing-n:mysolrsever.com:6010_solr-EventThread)
>  [   ] o.a.s.c.c.ConnectionManager zkClient has disconnected
>
>
>
>
> SOLR Log Below (Curtailed WARN log)
> 
> 2020-04-27 07:26:45.402 WARN  
> (recoveryExecutor-4-thread-697-processing-n:mysolrsever.com:6010_solr 
> x:mycollection_shard13_replica_n48 s:shard13 c:mycollection r:core_node51) 
> [c:mycollection s:shard13 r:core_node51 x:mycollection_shard13_replica_n48] 
> o.a.s.h.IndexFetcher Error in fetching file: _1kr_r.liv (downloaded 0 of 587 
> bytes)
> java.io.EOFException
>   at 
> org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168)
>   at 
> org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:160)
>   at 
> org.apache.solr.handler.IndexFetcher$FileFetcher.fetchPackets(IndexFetcher.java:1579)
>   at 
> org.apache.solr.handler.IndexFetcher$FileFetcher.fetch(IndexFetcher.java:1545)
>   at 
> org.apache.solr.handler.IndexFetcher$FileFetcher.fetchFile(IndexFetcher.java:1526)
>   at 
> org.apache.solr.handler.IndexFetcher.downloadIndexFiles(IndexFetcher.java:1008)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:566)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:345)
>   at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:420)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:225)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:626)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:308)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:292)
>   at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-04-27 07:26:45.405 WARN  
> (recoveryExecutor-4-thread-697-processing-n:mysolrsever.com:6010_solr 
> x:mycollection_shard13_replica_n48 s:shard13 c:mycollection r:core_node51) 
> [c:mycollection s:shard13 r:core_node51 x:mycollection_shard13_replica_n48] 
> o.a.s.h.IndexFetcher Error in fetching file: _1kr_r.liv (downloaded 0 of 587 
> bytes)
> java.io.EOFException
>   at 
> org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168)
>   at 
> org.apache.s

Re: Solr fields mapping

2020-04-29 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Hi, Sam!

Have you tried creating a copyField? 
https://builds.apache.org/view/L/view/Lucene/job/Solr-reference-guide-8.x/javadoc/copying-fields.html

Best,
Audrey
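If a copyField fits the use case, a minimal SolrJ sketch of adding one might look like the following. The core URL is an assumption for illustration, and note that copyField copies content into the destination for indexing/searching; it does not return the source fields as a nested object:

import java.util.Arrays;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.schema.SchemaRequest;
import org.apache.solr.client.solrj.response.schema.SchemaResponse;

public class AddCopyFieldExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical core URL; adjust to your setup
        try (SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
            // Copy everything matching createdBy.* into a single field named "createdBy"
            // (assumes a "createdBy" destination field already exists in the schema)
            SchemaRequest.AddCopyField addCopy = new SchemaRequest.AddCopyField(
                    "createdBy.*", Arrays.asList("createdBy"));
            SchemaResponse.UpdateResponse rsp = addCopy.process(client);
            System.out.println("status: " + rsp.getStatus());
        }
    }
}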

On 4/28/20, 1:07 PM, "sambasivarao giddaluri"  
wrote:

Hi All,
Is there a way we can map several fields into a single field?
Ex: the schema has the fields below:
createdBy.userName
createdBy.name
createdBy.email

To retrieve these fields I need to pass all three in the *fl*
parameter. Instead, is there a way I can have a map or an object of these
fields under createdBy, so that in fl I pass only createdBy and get all three
as output?

Regards
sam




Re: off-heap OOM

2020-04-29 Thread Shawn Heisey

On 4/29/2020 2:07 AM, Raji N wrote:

Has anyone encountered off-heap OOM. We are thinking of reducing heap
further and increasing the hardcommit interval . Any other suggestions? .
Please share your thoughts.


It sounds like it's not heap memory that's running out.

When the OutOfMemoryError is logged, it will also contain a message 
mentioning which resource ran out.


A common message that might be logged with the OOME is "Unable to create 
native thread".  This type of error, if that's what's happening, 
actually has nothing at all to do with memory, OOME is just how Java 
happens to report it.


You will need to know exactly which resource is running out before we 
can offer any assistance.


If the OOME is logged, the message you're looking for will be in the 
solr log, not the tiny special log that is created when Solr is killed 
by an OOME.  What version of Solr are you running, and what OS is it 
running on?


Thanks,
Shawn


Re: off-heap OOM

2020-04-29 Thread Raji N
Thank you for your reply.  When OOM happens it somehow doesn't generate a
dump file, so we have hourly heap dumps running to diagnose this issue. Heap is
around 700MB and threads around 150, but 29GB of native memory is used up;
it is consumed by java.nio.DirectByteBufferR (27GB, the major consumer) and
java.nio.DirectByteBuffer objects.

We use solr 7.6.0 in solrcloud mode and OS is alpine . Java version

java -version

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8

java version "1.8.0_211"

Java(TM) SE Runtime Environment (build 1.8.0_211-b12)

Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)



Thanks much for taking a look at it.

Raji
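One quick way to confirm where that native memory sits is the JDK's buffer-pool MXBeans, which report how much is held by direct and memory-mapped buffers. A small diagnostic sketch (standard java.lang.management API, offered only as an idea; the same beans are also visible over JMX against a running Solr, so no heap dump is needed):

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class BufferPoolStats {
    public static void main(String[] args) {
        // The JVM exposes "direct" and "mapped" pools; memoryUsed for both is off-heap
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%-8s count=%d used=%dMB capacity=%dMB%n",
                    pool.getName(), pool.getCount(),
                    pool.getMemoryUsed() / (1024 * 1024),
                    pool.getTotalCapacity() / (1024 * 1024));
        }
    }
}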



On Wed, Apr 29, 2020 at 10:04 AM Shawn Heisey  wrote:

> On 4/29/2020 2:07 AM, Raji N wrote:
> > Has anyone encountered off-heap OOM. We are thinking of reducing heap
> > further and increasing the hardcommit interval . Any other suggestions? .
> > Please share your thoughts.
>
> It sounds like it's not heap memory that's running out.
>
> When the OutOfMemoryError is logged, it will also contain a message
> mentioning which resource ran out.
>
> A common message that might be logged with the OOME is "Unable to create
> native thread".  This type of error, if that's what's happening,
> actually has nothing at all to do with memory, OOME is just how Java
> happens to report it.
>
> You will need to know exactly which resource is running out before we
> can offer any assistance.
>
> If the OOME is logged, the message you're looking for will be in the
> solr log, not the tiny special log that is created when Solr is killed
> by an OOME.  What version of Solr are you running, and what OS is it
> running on?
>
> Thanks,
> Shawn
>


Re: Solr Ref Guide Redesign coming in 8.6

2020-04-29 Thread Cassandra Targett
> This design still has a minor annoyance that I have noted in the past:
> in the table of contents pane it is easy to open a subtree, but the
> only way to close it is to open another one. Obviously not a big
> deal.

Thanks for pointing that out, it helped me find a big problem which was that I 
used the wrong build of JQuery to support using the caret to open/close the 
subtree. It should work now to open a subtree independently of clicking the 
heading, and should also close the tree.
> I'll probably spend too much time researching how to widen the
> razor-thin scrollbar in the TOC panel, since it seems to be
> independent of the way I spent too much time fixing the browser's own
> inadequate scrollbar width. :-) Also, the thumb's color is so close to
> the surrounding color that it's really hard to see. And for some
> reason when I use the mouse wheel to scroll the TOC, when it gets to
> the top or the bottom the content pane starts scrolling instead, which
> is surprising and mildly inconvenient. Final picky point: the
> scrolling is *very* insensitive -- takes a lot of wheel motion to move
> the panel just a bit.

I’m not totally following all of this, but if I assume you mean the left 
sidebar navigation (and not an in-page TOC) then my answer to at least part of 
it is to pare down the list of top-level topics so you don’t have to scroll it 
at all and then the only scrolling you need to do is for the content itself. 
That’s what I want to do in Phase 2, so there are several things in the 
behavior of the sidebar I’m purposely ignoring for now. Some will go away with 
a new organization and new things will be introduced that will need to be 
fixed, so to save myself some time I’m waiting to fix all of it at once.


Re: off-heap OOM

2020-04-29 Thread matthew sporleder
What does the message look like, exactly, from solr.log ?

On Wed, Apr 29, 2020 at 1:27 PM Raji N  wrote:
>
> Thank you for your reply.  When OOM happens somehow it doesn't generate
> dump file. So we have hourly heaps running to diagnose this issue. Heap is
> around 700MB and threads around 150. But 29GB of native memory is used up,
> it is consumed by java.io.DirectBufferR (27GB major consumption) and
> java.io.DirectByteBuffer  objects .
>
> We use solr 7.6.0 in solrcloud mode and OS is alpine . Java version
>
> java -version
>
> Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
>
> java version "1.8.0_211"
>
> Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
>
> Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
>
>
>
> Thanks much for taking a look at it.
>
> Raji
>
>
>
> On Wed, Apr 29, 2020 at 10:04 AM Shawn Heisey  wrote:
>
> > On 4/29/2020 2:07 AM, Raji N wrote:
> > > Has anyone encountered off-heap OOM. We are thinking of reducing heap
> > > further and increasing the hardcommit interval . Any other suggestions? .
> > > Please share your thoughts.
> >
> > It sounds like it's not heap memory that's running out.
> >
> > When the OutOfMemoryError is logged, it will also contain a message
> > mentioning which resource ran out.
> >
> > A common message that might be logged with the OOME is "Unable to create
> > native thread".  This type of error, if that's what's happening,
> > actually has nothing at all to do with memory, OOME is just how Java
> > happens to report it.
> >
> > You will need to know exactly which resource is running out before we
> > can offer any assistance.
> >
> > If the OOME is logged, the message you're looking for will be in the
> > solr log, not the tiny special log that is created when Solr is killed
> > by an OOME.  What version of Solr are you running, and what OS is it
> > running on?
> >
> > Thanks,
> > Shawn
> >


Re: off-heap OOM

2020-04-29 Thread Jan Høydahl
I have seen the same, but only in Docker.
I think it does not relate to Solr’s off-heap usage for filters and other data 
structures, but rather how Docker treats memory-mapped files as virtual memory.
As you know, when using MMapDirectoryFactory, you actually let Linux handle the 
loading and unloading of the index files, and Solr will access them as if they 
were in a huge virtual memory pool. Naturally the index files grow large, and 
there is something strange going on in the way Docker handles this, leading to 
OOM, not for Java heap but for the process.

I have no definitive answer, but so far my research has found a few possible 
settings

Set env.var MALLOC_ARENA_MAX=2
Try to limit -XX:MaxDirectMemorySize
Lower mem swappiness in Docker (--memory-swappiness 0)
More generic insight into java mem allocation in Docker: 
https://dzone.com/articles/native-memory-allocation-in-examples

Have not yet found a silver bullet, so very interested in this thread.

Jan

> 29. apr. 2020 kl. 19:26 skrev Raji N :
> 
> Thank you for your reply.  When OOM happens somehow it doesn't generate
> dump file. So we have hourly heaps running to diagnose this issue. Heap is
> around 700MB and threads around 150. But 29GB of native memory is used up,
> it is consumed by java.io.DirectBufferR (27GB major consumption) and
> java.io.DirectByteBuffer  objects .
> 
> We use solr 7.6.0 in solrcloud mode and OS is alpine . Java version
> 
> java -version
> 
> Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
> 
> java version "1.8.0_211"
> 
> Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
> 
> Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
> 
> 
> 
> Thanks much for taking a look at it.
> 
> Raji
> 
> 
> 
> On Wed, Apr 29, 2020 at 10:04 AM Shawn Heisey  wrote:
> 
>> On 4/29/2020 2:07 AM, Raji N wrote:
>>> Has anyone encountered off-heap OOM. We are thinking of reducing heap
>>> further and increasing the hardcommit interval . Any other suggestions? .
>>> Please share your thoughts.
>> 
>> It sounds like it's not heap memory that's running out.
>> 
>> When the OutOfMemoryError is logged, it will also contain a message
>> mentioning which resource ran out.
>> 
>> A common message that might be logged with the OOME is "Unable to create
>> native thread".  This type of error, if that's what's happening,
>> actually has nothing at all to do with memory, OOME is just how Java
>> happens to report it.
>> 
>> You will need to know exactly which resource is running out before we
>> can offer any assistance.
>> 
>> If the OOME is logged, the message you're looking for will be in the
>> solr log, not the tiny special log that is created when Solr is killed
>> by an OOME.  What version of Solr are you running, and what OS is it
>> running on?
>> 
>> Thanks,
>> Shawn
>> 



RE: SolrCloud degraded during backup and batch CSV update

2020-04-29 Thread Sethuraman, Ganesh
The 3-node Zookeeper ensemble runs on 3 separate boxes (EC2 instances). Each has a 
separate transaction log directory (separate EBS volume, separate disk), as this 
was Zookeeper best practice. 

It feels like the ZK timeout is more of a symptom and Solr slowness is the cause. Having 
said that, do you increase the timeout setting in Solr or in Zookeeper? If you can 
share the parameters it will certainly help

Regards
Ganesh

-Original Message-
From: matthew sporleder  
Sent: Wednesday, April 29, 2020 11:47 AM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud degraded during backup and batch CSV update



FWIW I've had some luck with strategy 3 (increase zk timeout) when you
overwhelm the connection to zk or the disk on zk.

Is zk on the same boxes as solr?

On Tue, Apr 28, 2020 at 10:15 PM Sethuraman, Ganesh
 wrote:
>
> Hi
>
> We are using SolrCloud 7.2.1 with 3 node Zookeeper ensemble. We have 92 
> collection each on avg. having 8 shards and 2 replica with 2 EC2 nodes, with 
> JVM size of 18GB (G1 GC). We need your help with the Issue we faced today: 
> The issue is SolrCloud server went into a degraded collections (for few 
> collections) when the Solr backup and the Solr batch CSV update load happened 
> at the same time as backup. The CSV data load was about ~5 GB per 
> shard/replica. We think this happened after zkClient disconnect happened as 
> noted below.  We had to restart Solr to bring it back to normal.
>
>
>   1.  Is it not suggested to run backup and Solr batch CSV update large load 
> at the same time?
>   2.  In the past we have seen two CSV batch update load in parallel causes 
> issues, is this also not suggested (this issue is not related to that)?
>   3.  Do you think we should increase Zookeeper timeout?
>   4.  How do we know if  we need to up the JVM Max memory, and by how much?
>   5.  We also see that once the Solr goes into degraded collection and 
> recovery failed, it NEVER get back to normal, even after when there is no 
> load. Is this a bug?
>
> The GC information and Solr Log below
>
> https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjAvMDQvMjkvLS0wMl9zb2xyX2djLmxvZy56aXAtLTEtNDAtMzE=&channel=WEB
>
>
> 2020-04-27 07:34:07.322 WARN  
> (zkConnectionManagerCallback-6-thread-1-processing-n:mysolrsever.com:6010_solr-SendThread(zoo-prd-n1:2181))
>  [   ] o.a.z.ClientCnxn Client session timed out, have not heard from server 
> in 10775ms for sessionid 0x171a6fb51310008
> 
> 2020-04-27 07:34:07.426 WARN  
> (zkConnectionManagerCallback-6-thread-1-processing-n:mysolrsever.com:6010_solr-EventThread)
>  [   ] o.a.s.c.c.ConnectionManager zkClient has disconnected
>
>
>
>
> SOLR Log Below (Curtailed WARN log)
> 
> 2020-04-27 07:26:45.402 WARN  
> (recoveryExecutor-4-thread-697-processing-n:mysolrsever.com:6010_solr 
> x:mycollection_shard13_replica_n48 s:shard13 c:mycollection r:core_node51) 
> [c:mycollection s:shard13 r:core_node51 x:mycollection_shard13_replica_n48] 
> o.a.s.h.IndexFetcher Error in fetching file: _1kr_r.liv (downloaded 0 of 587 
> bytes)
> java.io.EOFException
>   at 
> org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168)
>   at 
> org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:160)
>   at 
> org.apache.solr.handler.IndexFetcher$FileFetcher.fetchPackets(IndexFetcher.java:1579)
>   at 
> org.apache.solr.handler.IndexFetcher$FileFetcher.fetch(IndexFetcher.java:1545)
>   at 
> org.apache.solr.handler.IndexFetcher$FileFetcher.fetchFile(IndexFetcher.java:1526)
>   at 
> org.apache.solr.handler.IndexFetcher.downloadIndexFiles(IndexFetcher.java:1008)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:566)
>   at 
> org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:345)
>   at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:420)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:225)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:626)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:308)
>   at 
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:292)
>   at 
> com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
>   at 
> java.util.

Re: SolrCloud degraded during backup and batch CSV update

2020-04-29 Thread matthew sporleder
You can add something like this to SOLR_OPTS: -DzkClientTimeout=3
in your init script, or adjust solr.xml: <int name="zkClientTimeout">${zkClientTimeout:15000}</int>

On Wed, Apr 29, 2020 at 5:41 PM Sethuraman, Ganesh
 wrote:
>
> 3 Zookeeper ensemble are all in 3 separate boxes (EC2 instances). Each have 
> separate transactional logs directory (separate EBS volume, separate disk), 
> as this was zookeeper best practices.
>
> It feels ZK timeout is more of symptom and Solr slowness is the cause. Having 
> said that, do you increase the timeout setting  in Solr or Zookeeper, if you 
> can share parameters it will certainly help
>
> Regards
> Ganesh
>
> -Original Message-
> From: matthew sporleder 
> Sent: Wednesday, April 29, 2020 11:47 AM
> To: solr-user@lucene.apache.org
> Subject: Re: SolrCloud degraded during backup and batch CSV update
>
>
>
> FWIW I've had some luck with strategy 3 (increase zk timeout) when you
> overwhelm the connection to zk or the disk on zk.
>
> Is zk on the same boxes as solr?
>
> On Tue, Apr 28, 2020 at 10:15 PM Sethuraman, Ganesh
>  wrote:
> >
> > Hi
> >
> > We are using SolrCloud 7.2.1 with 3 node Zookeeper ensemble. We have 92 
> > collection each on avg. having 8 shards and 2 replica with 2 EC2 nodes, 
> > with JVM size of 18GB (G1 GC). We need your help with the Issue we faced 
> > today: The issue is SolrCloud server went into a degraded collections (for 
> > few collections) when the Solr backup and the Solr batch CSV update load 
> > happened at the same time as backup. The CSV data load was about ~5 GB per 
> > shard/replica. We think this happened after zkClient disconnect happened as 
> > noted below.  We had to restart Solr to bring it back to normal.
> >
> >
> >   1.  Is it not suggested to run backup and Solr batch CSV update large 
> > load at the same time?
> >   2.  In the past we have seen two CSV batch update load in parallel causes 
> > issues, is this also not suggested (this issue is not related to that)?
> >   3.  Do you think we should increase Zookeeper timeout?
> >   4.  How do we know if  we need to up the JVM Max memory, and by how much?
> >   5.  We also see that once the Solr goes into degraded collection and 
> > recovery failed, it NEVER get back to normal, even after when there is no 
> > load. Is this a bug?
> >
> > The GC information and Solr Log below
> >
> > https://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMjAvMDQvMjkvLS0wMl9zb2xyX2djLmxvZy56aXAtLTEtNDAtMzE=&channel=WEB
> >
> >
> > 2020-04-27 07:34:07.322 WARN  
> > (zkConnectionManagerCallback-6-thread-1-processing-n:mysolrsever.com:6010_solr-SendThread(zoo-prd-n1:2181))
> >  [   ] o.a.z.ClientCnxn Client session timed out, have not heard from 
> > server in 10775ms for sessionid 0x171a6fb51310008
> > 
> > 2020-04-27 07:34:07.426 WARN  
> > (zkConnectionManagerCallback-6-thread-1-processing-n:mysolrsever.com:6010_solr-EventThread)
> >  [   ] o.a.s.c.c.ConnectionManager zkClient has disconnected
> >
> >
> >
> >
> > SOLR Log Below (Curtailed WARN log)
> > 
> > 2020-04-27 07:26:45.402 WARN  
> > (recoveryExecutor-4-thread-697-processing-n:mysolrsever.com:6010_solr 
> > x:mycollection_shard13_replica_n48 s:shard13 c:mycollection r:core_node51) 
> > [c:mycollection s:shard13 r:core_node51 x:mycollection_shard13_replica_n48] 
> > o.a.s.h.IndexFetcher Error in fetching file: _1kr_r.liv (downloaded 0 of 
> > 587 bytes)
> > java.io.EOFException
> >   at 
> > org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:168)
> >   at 
> > org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:160)
> >   at 
> > org.apache.solr.handler.IndexFetcher$FileFetcher.fetchPackets(IndexFetcher.java:1579)
> >   at 
> > org.apache.solr.handler.IndexFetcher$FileFetcher.fetch(IndexFetcher.java:1545)
> >   at 
> > org.apache.solr.handler.IndexFetcher$FileFetcher.fetchFile(IndexFetcher.java:1526)
> >   at 
> > org.apache.solr.handler.IndexFetcher.downloadIndexFiles(IndexFetcher.java:1008)
> >   at 
> > org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:566)
> >   at 
> > org.apache.solr.handler.IndexFetcher.fetchLatestIndex(IndexFetcher.java:345)
> >   at 
> > org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:420)
> >   at 
> > org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:225)
> >   at 
> > org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery

Re: off-heap OOM

2020-04-29 Thread Raji N
Thanks so much Jan. Will try your suggestions; yes, we are also running
Solr inside Docker.

Thanks,
Raji

On Wed, Apr 29, 2020 at 1:46 PM Jan Høydahl  wrote:

> I have seen the same, but only in Docker.
> I think it does not relate to Solr’s off-heap usage for filters and other
> data structures, but rather how Docker treats memory-mapped files as
> virtual memory.
> As you know, when using MMapDirectoryFactory, you actually let Linux
> handle the loading and unloading of the index files, and Solr will access
> them as if they were in a huge virtual memory pool. Naturally the index
> files grow large, and there is something strange going on in the way Docker
> handles this, leading to OOM, not for Java heap but for the process.
>
> I have no definitive answer, but so far my research has found a few
> possible settings
>
> Set env.var MALLOC_ARENA_MAX=2
> Try to limit -XX:MaxDirectMemorySize
> Lower mem swappiness in Docker (--memory-swappiness 0)
> More generic insight into java mem allocation in Docker:
> https://dzone.com/articles/native-memory-allocation-in-examples
>
> Have not yet found a silver bullet, so very interested in this thread.
>
> Jan
>
> > 29. apr. 2020 kl. 19:26 skrev Raji N :
> >
> > Thank you for your reply.  When OOM happens somehow it doesn't generate
> > dump file. So we have hourly heaps running to diagnose this issue. Heap
> is
> > around 700MB and threads around 150. But 29GB of native memory is used
> up,
> > it is consumed by java.io.DirectBufferR (27GB major consumption) and
> > java.io.DirectByteBuffer  objects .
> >
> > We use solr 7.6.0 in solrcloud mode and OS is alpine . Java version
> >
> > java -version
> >
> > Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
> >
> > java version "1.8.0_211"
> >
> > Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
> >
> > Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
> >
> >
> >
> > Thanks much for taking a look at it.
> >
> > Raji
> >
> >
> >
> > On Wed, Apr 29, 2020 at 10:04 AM Shawn Heisey 
> wrote:
> >
> >> On 4/29/2020 2:07 AM, Raji N wrote:
> >>> Has anyone encountered off-heap OOM. We are thinking of reducing heap
> >>> further and increasing the hardcommit interval . Any other
> suggestions? .
> >>> Please share your thoughts.
> >>
> >> It sounds like it's not heap memory that's running out.
> >>
> >> When the OutOfMemoryError is logged, it will also contain a message
> >> mentioning which resource ran out.
> >>
> >> A common message that might be logged with the OOME is "Unable to create
> >> native thread".  This type of error, if that's what's happening,
> >> actually has nothing at all to do with memory, OOME is just how Java
> >> happens to report it.
> >>
> >> You will need to know exactly which resource is running out before we
> >> can offer any assistance.
> >>
> >> If the OOME is logged, the message you're looking for will be in the
> >> solr log, not the tiny special log that is created when Solr is killed
> >> by an OOME.  What version of Solr are you running, and what OS is it
> >> running on?
> >>
> >> Thanks,
> >> Shawn
> >>
>
>