Solr hangs / LRU operations are heavy on cpu

2015-03-19 Thread Sergey Shvets
Hi,

We have quite a problem with Solr. We are running it in a 6x3 configuration,
and suddenly Solr started to hang, taking all the available CPU on the nodes.

In the thread dumps we noticed that stacks like this one can eat a lot of CPU
time:


   - org.apache.solr.search.LRUCache.put(LRUCache.java:116)
   - org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:705)
   - org.apache.solr.response.BinaryResponseWriter$Resolver.writeResultsBody(BinaryResponseWriter.java:155)
   - org.apache.solr.response.BinaryResponseWriter$Resolver.writeResults(BinaryResponseWriter.java:183)
   - org.apache.solr.response.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:88)
   - org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:158)
   - org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:148)
   - org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:242)
   - org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:153)
   - org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:96)
   - org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:52)
   - org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:758)
   - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:426)
   - org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
   - org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
   - org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
   - org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
   - org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
   - org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
   - org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
   - org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
   - org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
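
One rough way to see how many threads sit in a frame like this is to save a
thread dump to a file and count the matching lines. The sketch below assumes
the JDK's jstack tool is on the PATH and that the dump was saved beforehand.
The helper name count_frames, the file name dump.txt and the SOLR_PID variable
are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a saved thread dump that mention
# a given method, e.g. org.apache.solr.search.LRUCache.put.
count_frames() {
    dump_file=$1
    method=$2
    # -F: match the method name as a fixed string, -c: print the match count
    grep -F -c -- "$method" "$dump_file"
}

# Example (SOLR_PID is assumed to hold the pid of the JVM running Solr):
#   jstack "$SOLR_PID" > dump.txt
#   count_frames dump.txt 'org.apache.solr.search.LRUCache.put'
```

Taking a few dumps some seconds apart and comparing the counts gives a crude
picture of whether the same frame stays hot over time.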


The cache configuration itself is very minimal (the XML markup was stripped by
the list archive; only these values survive):

true
20
200

Solr version is 4.10.3

Any help is appreciated!

sergey


Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Sergey Shvets
Hello Umesh,

Thank you, that has indeed given positive results so far.

We changed completely to LFU. Today it went quite okay. We will wait until
it shows more stability and then work out the optimal cache size.

Below is a summary of the changes.

[diff content stripped by the list archive: 4 lines removed, 5 lines added]



-- 
Best regards,
 Sergey                          mailto:ser...@bintime.com



Re: Solr hangs / LRU operations are heavy on cpu

2015-03-20 Thread Sergey Shvets
Hello Shawn,

In that case the behavior we noticed is a bit strange.
LRU was heavy on the CPU in the thread dumps, and I don't have any
reasonable explanation for that.

However, the switch to LFU has seemingly solved the problem.



-- 
Best regards,
 Sergey                          mailto:ser...@bintime.com



Re: Indexing gets significantly slower after every batch commit

2015-05-21 Thread Sergey Shvets
Hi Angel,

We also noticed that kind of performance degradation in our workloads.

That seems logical: as the index grows, the time needed to put something into
it grows as log(n).



On Thursday, May 21, 2015, Angel Todorov wrote:

> hi Shawn,
>
> Thanks a bunch for your feedback. I've played with the heap size, but I
> don't see any improvement. Even if i index, say , a million docs, and the
> throughput is about 300 docs per sec, and then I shut down solr completely
> - after I start indexing again, the throughput is dropping below 300.
>
> I should probably experiment with sharding those documents to multiple SOLR
> cores - that should help, I guess. I am talking about something like this:
>
>
> https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
>
> Thanks,
> Angel
>
>
> On Thu, May 21, 2015 at 11:36 AM, Shawn Heisey wrote:
>
> > On 5/21/2015 2:07 AM, Angel Todorov wrote:
> > > I'm crawling a file system folder and indexing 10 million docs, and I am
> > > adding them in batches of 5000, committing every 50 000 docs. The problem I
> > > am facing is that after each commit, the documents per sec that are indexed
> > > gets less and less.
> > >
> > > If I do not commit at all, I can index those docs very quickly, and then I
> > > commit once at the end, but once i start indexing docs _after_ that (for
> > > example new files get added to the folder), indexing is also slowing down a
> > > lot.
> > >
> > > Is it normal that the SOLR indexing speed depends on the number of
> > > documents that are _already_ indexed? I think it shouldn't matter if i
> > > start from scratch or I index a document in a core that already has a
> > > couple of million docs. Looks like SOLR is either doing something in a
> > > linear fashion, or there is some magic config parameter that I am not aware
> > > of.
> > >
> > > I've read all perf docs, and I've tried changing mergeFactor,
> > > autowarmCounts, and the buffer sizes - to no avail.
> > >
> > > I am using SOLR 5.1
> >
> > Have you changed the heap size?  If you use the bin/solr script to start
> > it and don't change the heap size with the -m option or another method,
> > Solr 5.1 runs with a default size of 512MB, which is *very* small.
> >
> > I bet you are running into problems with frequent and then ultimately
> > constant garbage collection, as Java attempts to free up enough memory
> > to allow the program to continue running.  If that is what is happening,
> > then eventually you will see an OutOfMemoryError exception.  The
> > solution is to increase the heap size.  I would probably start with at
> > least 4G for 10 million docs.
> >
> > Thanks,
> > Shawn
> >
> >
>


Re: How to index 20 000 files with a command line ?

2015-05-29 Thread Sergey Shvets
Hello Bruno,

You can use the find command with its -exec option.
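
A minimal sketch of that idea, using the collection name and directory from
the command quoted below. The function name index_xml_files and the POST_CMD
variable are made up for illustration, and the script is assumed to run from
the Solr install directory so that bin/post resolves.

```shell
#!/bin/sh
# Sketch: index every .xml file under a directory with find -exec.
# POST_CMD defaults to Solr's bin/post (relative to the current directory).
POST_CMD=${POST_CMD:-bin/post}

# Hypothetical helper: index_xml_files <collection> <directory>
index_xml_files() {
    collection=$1
    dir=$2
    # "{} +" appends as many file names as fit to each invocation, so find
    # runs the post tool a few times instead of once per file.
    find "$dir" -type f -name '*.xml' -exec "$POST_CMD" -c "$collection" {} +
}

# Usage:
#   index_xml_files hbl /data/hbl-201522
```

Because find builds the argument lists itself, the shell never has to expand
the *.xml pattern into one very long command line.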

regards
 Sergey

Friday, May 29, 2015, 3:11:37 PM, you wrote:

Dear Solr Users,

Habitually I use this command line to index my files:
 >bin/post -c hbl /data/hbl-201522/*.xml

but today I have a big update, so there are 20 000 xml files (each files
1ko [...]




-- 
Best regards,
 Sergey                          mailto:ser...@bintime.com