High facet.limit (with only 2-3 actual facets) -> Massive bandwidth consumption in DistributedSearch

2011-09-08 Thread Frederik Kraus
 Hi guys, 

I've just experienced an odd issue today with the following setup:

Test 1:

20 Shards
facet.limit=2 (returned facets 2-3)
about 5-6MB network traffic
Resp Time ca 3sec

Test 2:

20 Shards
facet.limit=100 (returned facets 2-3)
only a few kb network traffic
Resp Time ca 0.02sec

Does anyone know the reason for this odd behavior, or should I create a 
Jira ticket? 

Thanks,

Fred.





Re: High facet.limit (with only 2-3 actual facets) -> Massive bandwidth consumption in DistributedSearch

2011-09-08 Thread Frederik Kraus
yep - facet.mincount=1

On Thursday, 8 September 2011 at 21:37, Michael Ryan wrote:

> Are you using facet.mincount in the query?
> 
> -Michael



Re: High facet.limit (with only 2-3 actual facets) -> Massive bandwidth consumption in DistributedSearch

2011-09-08 Thread Frederik Kraus
Now that is quite interesting indeed and sounds like a bug to me. Including 
facets with a count of 0, we have a few hundred thousand values, which then 
apparently get transferred. Hmm.

Can anyone with more knowledge of the facet component maybe chime in on why 
the mincount is removed?


On Thursday, 8 September 2011 at 22:03, Michael Ryan wrote:

> What is happening is that the facet.mincount parameter is removed when the 
> query is made to the shards, so each shard is returning about 3 facet 
> values, most of them with a count of 0. I don't have a complete understanding 
> of FacetComponent, but it seems like it shouldn't have to do this. 
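To make the effect Michael describes concrete, here is a rough toy model in 
Python. It is not Solr's FacetComponent logic, just a sketch under stated 
assumptions: every shard knows the same vocabulary, only 3 terms actually 
match, and the shard request is sent either with mincount=0 or mincount=1. 
The vocabulary size, the limit and the function names are all made up for 
illustration.

```python
def shard_facet_response(term_counts, limit, mincount):
    """Return the (term, count) pairs one shard would send back.

    Toy model only: drop terms below mincount, sort by count descending
    (ties broken by term), then truncate to `limit` entries.
    """
    kept = [(t, c) for t, c in term_counts.items() if c >= mincount]
    kept.sort(key=lambda tc: (-tc[1], tc[0]))
    return kept[:limit]


def simulate(num_shards, vocabulary_size, matching_terms, limit, mincount):
    """Total number of facet entries shipped from all shards to the coordinator."""
    total = 0
    for _ in range(num_shards):
        # Hypothetical shard: every term exists, but only a few match the query.
        counts = {f"term{i}": 0 for i in range(vocabulary_size)}
        for i in range(matching_terms):
            counts[f"term{i}"] = 5
        total += len(shard_facet_response(counts, limit, mincount))
    return total


# Hypothetical numbers: 20 shards, 10,000 distinct terms, 3 real matches.
entries_with_zeros = simulate(20, 10_000, 3, limit=5_000, mincount=0)
entries_without_zeros = simulate(20, 10_000, 3, limit=5_000, mincount=1)
print(entries_with_zeros, entries_without_zeros)
```

In this model the number of transferred entries is driven by the limit as 
soon as zero counts are allowed, and by the number of real matches otherwise, 
which is the shape of the problem reported in this thread.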



Re: High facet.limit (with only 2-3 actual facets) -> Massive bandwidth consumption in DistributedSearch

2011-09-08 Thread Frederik Kraus
 In our case it's clearly the wrong tradeoff :) 

I'm going to patch our Solr for now, but either

- a config option
- a facet.whatever param
- or reversing the tradeoff

should be done in my eyes. 


On Thursday, 8 September 2011 at 22:34, Yonik Seeley wrote:

> So this is bad if you have a high facet.limit, but really few actual matches.
> It may be better for large base docsets that match a lot of facet
> values (but in that case, one would expect to see few zeros anyway).
> So perhaps using 0 as the mincount isn't the right tradeoff? 



strange performance issue with many shards on one server

2011-09-28 Thread Frederik Kraus
 Hi, 


I am experiencing a strange issue doing some load tests. Our setup:

- 2 servers, each with 24 CPU cores and 130GB of RAM
- 10 shards per server (needed for response times) running in a single Tomcat 
instance
- each query queries all 20 shards (distributed search)

- each shard holds about 1.5 million documents (small shards are needed due to 
rather complex queries)
- all caches are warmed / high cache hit rates (99%) etc.


Now for some reason we cannot seem to fully utilize all CPU power (no disk IO), 
i.e. beyond a certain point, increasing the number of concurrent users doesn't 
increase CPU load; it decreases throughput and increases the response times of 
the individual queries.

Also, 1-2% of the queries take significantly longer: the average is somewhere 
around 100ms, while 1-2% take 1.5s or longer. 

Any ideas are greatly appreciated :)

Fred.



Re: strange performance issue with many shards on one server

2011-09-28 Thread Frederik Kraus
Hi Vadim, 

the thing is that those exact same queries that take longer during a load 
test perform just fine when executed at a slower request rate, and the slow 
ones are random, i.e. there is no pattern in bad/slow queries.

My first thought was some kind of contention and/or connection starvation in 
the internal shard communication?

Fred.


On Wednesday, 28 September 2011 at 13:18, Vadim Kisselmann wrote:

> Hi Fred,
> analyze the queries which take longer.
> We observe our queries and see the problems with q-time with queries which
> are complex, with phrase queries or queries which contains numbers or
> special characters.
> if you don't know it:
> http://www.hathitrust.org/blogs/large-scale-search/tuning-search-performance
> Regards
> Vadim



Re: strange performance issue with many shards on one server

2011-09-28 Thread Frederik Kraus


On Wednesday, 28 September 2011 at 13:41, Vadim Kisselmann wrote:

> Hi Fred,
> 
> ok, it's a strange behavior with same queries.
> Another questions:
> -which solr version?

3.3 (might the NIOFSDirectory from 3.4 help?)
 
> -do you indexing during your load test? (because of index rebuilt)
nope
 
> -do you replicate your index?

nope 
> 
> Regards
> Vadim



Re: strange performance issue with many shards on one server

2011-09-28 Thread Frederik Kraus
lrDispatchFilter.java:356) 
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
 
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:554) 
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) 
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) 
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) 
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) 
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
 
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) 
at java.lang.Thread.run(Thread.java:662) 









Re: strange performance issue with many shards on one server

2011-09-28 Thread Frederik Kraus
 Hi Ken,  

the HttpConnectionManager was actually the first thing I looked at, and I 
bumped the Solr default of 20 up to 50, 100, 400, 1 (which should be more or 
less unlimited ;) ). Unfortunately that didn't really solve anything. I don't 
know if the "static" HttpClient is a problem here, as it will be the same 
HttpConnectionManager for all shards …

Obviously a way of validating this would be to spawn 20 tomcat (or jetty) 
instances, one for each shard and 10 per server - hopefully there is an easier 
way ;)

By the way: Ubuntu / GC / etc. are all tuned and shouldn't be a bottleneck 
here. The GC only spends about 50-100ms during a 10min load test, and never a 
full-GC.  

Just going through a jstack dump again, it looks like the HttpConnectionManager 
is actually waiting for a lock …

"pool-31-thread-15776" prio=10 tid=0x7ef544249000 nid=0x50be waiting for 
monitor entry [0x7ef4d38fc000]
 java.lang.Thread.State: BLOCKED (on object monitor)
 at 
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.doGetConnection(MultiThreadedHttpConnectionManager.java:447)
 - waiting to lock <0x7f07dd6bfa70> (a 
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
 at 
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.getConnectionWithTimeout(MultiThreadedHttpConnectionManager.java:416)
 at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:153)
 at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
 at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323)
 at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:427)
….

Fred.  
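For anyone wanting to check a dump like the one above for the same symptom, 
here is a small Python sketch that counts how many threads are waiting on 
each monitor. It assumes the usual jstack text layout ("- waiting to lock 
<address> (a class)"); the function name and the second thread in the sample 
are made up for illustration, only the first sample thread is taken from the 
dump in this message.

```python
import re
from collections import Counter

# Matches e.g. "- waiting to lock <0x7f07dd6bfa70> (a some.Class$Inner)".
# \s+ also tolerates the line wrap that mail clients insert after "(a".
LOCK_RE = re.compile(r"- waiting to lock <(0x[0-9a-fA-F]+)> \(a\s+([\w.$]+)\)")


def blocked_locks(dump_text):
    """Count threads waiting on each monitor in a jstack-style dump."""
    waiting = Counter()
    for match in LOCK_RE.finditer(dump_text):
        address, class_name = match.groups()
        waiting[(address, class_name)] += 1
    return waiting


POOL = "org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool"

sample_dump = '''"pool-31-thread-15776" prio=10 tid=0x7ef544249000 nid=0x50be waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
    - waiting to lock <0x7f07dd6bfa70> (a
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)

"hypothetical-second-thread" prio=10 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
    - waiting to lock <0x7f07dd6bfa70> (a org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool)
'''

counts = blocked_locks(sample_dump)
for (address, class_name), n in counts.most_common():
    print(n, address, class_name)
```

If most BLOCKED threads pile up on a single ConnectionPool address, that 
would point at contention on the one shared pool rather than at the pool 
size, which would fit the observation that raising the limits did not help.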


On Wednesday, 28 September 2011 at 17:48, Ken Krugler wrote:

> Hi Frederik,
>  
> I haven't directly run into this issue with Solr, but I have experienced 
> similar issues in a related context.
>  
> In my case, I had a custom webapp that made SolrJ requests and then generated 
> some aggregated/analyzed results.
>  
> During load testing, we ran into a few different issues...
>  
> 1. The load test software itself had an issue with scaling - I'm assuming 
> that's not the case for you, but I've seen it happen more than once.
>  
> E.g. there's a limit to max parallel connections in the client being used to 
> talk to Solr.
>  
> 2. We needed to tune up the SolrJ settings for the HttpConnectionManager
>  
> Under heavy load, this was running out of free connections.
>  
> Given you've got 20 shards, each request is going to spawn 20 HTTP 
> connections.
>  
> I don't know off the top of my head how solr.SearchHandler manages 
> connections (and whether it's possible to tune this), but from the stack 
> trace below it sure looks like you're blocked on getting free HTTP 
> connections.
>  
> 3. We needed to optimize our configuration for Jetty, Ubuntu, JVM GC, etc.
>  
> There are lots of knobs to twiddle here, for better or worse.
>  
> -- Ken
>  
> On Sep 28, 2011, at 5:21am, Frederik Kraus wrote:
>  
> > I just had a look at the thread-dump, pasting 3 examples here:
> >  
> >  
> > 'pool-31-thread-8233' Id=11626, BLOCKED on 
> > lock=org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool@19dd10d9,
> >  total cpu time=20.ms user time=20.ms
> > at 
> > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool.freeConnection(MultiThreadedHttpConnectionManager.java:982)
> >   
> > at 
> > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager.releaseConnection(MultiThreadedHttpConnectionManager.java:643)
> >   
> > at 
> > org.apache.commons.httpclient.HttpConnection.releaseConnection(HttpConnection.java:1179)
> >   
> > at 
> > org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.releaseConnection(MultiThreadedHttpConnectionManager.java:1423)
> >   
> > at 
> > org.apache.commons.httpclient.HttpMethodBase.ensureConnectionRelease(HttpMethodBase.java:2430)
> >   
> > at 
> > org.apache.commons.httpclient.HttpMethodBase.responseBodyConsumed(HttpMethodBase.java:2422)
> >   
> > at 
> > org.apache.commons.httpclient.HttpMethodBase$1.responseConsumed(HttpMethodBase.java:1892)
> >   
> > at 
> > org.apache.commons.httpclient.AutoCloseInputStream.notifyWatcher(AutoCloseInputStream.java:198)
> >   
> > at 
> > org.apache.commons.httpclient.AutoCloseInputStream.close(AutoCloseInputStream.java:158)
> >   
> > at 
> > org.apache.commons.httpclient.HttpMethodBase.releaseConnection(HttpMethodBase.java:1

Re: strange performance issue with many shards on one server

2011-09-28 Thread Frederik Kraus


On Wednesday, 28 September 2011 at 16:40, Toke Eskildsen wrote:

> On Wed, 2011-09-28 at 12:58 +0200, Frederik Kraus wrote:
> > - 10 shards per server (needed for response times) running in a single 
> > tomcat instance
> 
> Have you tested that sharding actually decreases response times in your
> case? I see the idea in decreasing response times with sharding at the
> cost of decreasing throughput, but the added overhead of merging is
> non-trivial.
Yep, unfortunately. The queries have huge boolean filter queries for ACLs etc., 
which just take too long to compute in a single thread.

> 
> > - each query queries all 20 shards (distributed search)
> > 
> > - each shard holds about 1.5 mio documents (small shards are needed due to 
> > rather complex queries)
> > - all caches are warmed / high cache hit rates (99%) etc.
> 
> > Now for some reason we cannot seem to fully utilize all CPU power (no disk 
> > IO), ie. increasing concurrent users doesn't increase CPU-Load at a point, 
> > decreases throughput and increases the response times of the individual 
> > queries.
> 
> It sounds as if there's a hard limit on the number of concurrent users
> somewhere. I am no expert in httpclient, but the blocked threads in your
> thread dump seems to indicate that they wait for connections to be
> established rather than for results to be produced.
> 
> I seem to remember that tomcat has a default limit on 200 concurrent
> connections and with 10 shards/search, that is just 200 / (10
> shard_connections + 1 incoming_connection) = 18 concurrent searches.
> 

I have gradually bumped all of this up to (almost) infinity with no effect ;)
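Toke's back-of-the-envelope estimate can be written down as a tiny Python 
sketch. The limit of 200 is the figure he "seems to remember", so treat it as 
an assumption rather than a confirmed Tomcat default; the function name is 
made up, and the second call (all 20 shard requests hitting one container) is 
a hypothetical worst case, not the setup described here.

```python
def max_concurrent_searches(connection_limit, shards_per_search):
    """Upper bound on simultaneous distributed searches in one container.

    Each top-level search holds 1 incoming connection plus one connection
    per shard request, all served by the same container.
    """
    return connection_limit // (shards_per_search + 1)


ten_shards = max_concurrent_searches(200, 10)
twenty_shards = max_concurrent_searches(200, 20)
print(ten_shards)     # 200 // 11
print(twenty_shards)  # 200 // 21
```

The point of the formula is that the fan-out multiplies the connection demand 
of every user-facing request, so a limit that looks generous for plain 
requests can be reached with a fairly small number of concurrent searches.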


> > Also 1-2% of the queries take significantly longer: avg somewhere at 100ms 
> > while 1-2% take 1.5s or longer. 
> 
> Could be garbage collection, especially since it shows under high load
> which might result in more old objects and thereby trigger full gc.
 GC is only spending something like 50-100ms total for a 10min load test 





Re: strange performance issue with many shards on one server

2011-09-28 Thread Frederik Kraus
 Yep, I'm not getting more than 50-60% CPU during those load tests. 


On Wednesday, 28 September 2011 at 23:01, Jaeger, Jay - DOT wrote:

> Yes, that thread waits (in the sense that nothing useful gets done), but 
> during that time, from the perspective of the applications and OS, that CPU 
> is busy: it is not "waiting" in such a way that you can dispatch a different 
> process.
> 
> The point is, that if this was actually the problem, it would show up in a 
> higher CPU utilization than the correspondent reported.
> 
> -Original Message-
> From: Federico Fissore [mailto:feder...@fissore.org] 
> Sent: Wednesday, September 28, 2011 2:04 PM
> To: solr-user@lucene.apache.org (mailto:solr-user@lucene.apache.org)
> Subject: Re: strange performance issue with many shards on one server
> 
> Jaeger, Jay - DOT wrote on 28/09/2011 at 18:40:
> > That would still show up as the CPU being busy.
> 
> i don't know how the program (top, htop, whatever) displays the value 
> but when the cpu has a cache miss definitely that thread sits and waits 
> for a number of clock cycles
> with 130GB of ram (per server?) I suspect caches miss as a rule
> 
> just a suspicion however, nothing I'll bet on




Re: mixing version of solr

2011-03-03 Thread Frederik Kraus
No, that won't work, as the index format has changed.

On Thursday, 3 March 2011 at 20:03, Ofer Fort wrote: 
> Hey all,
> I have a master slave using the same index folder, the master only writes,
> and the slave only reads.
> Is it possible to use different versions of solr for those two servers?
> Let's say i want to gain from the improved search speed of solr4.0 but since
> it's my production system, am not willing to index using it since it's not a
> stable release.
> Since the slave only reads, if it will crash i'll just restart it.
> 
> Can i index using solr 1.4.1 and read the same index with solr 4.0?
> 
> thanks
> 


DIH - Multiple Cores / Consistent Hashing

2011-03-04 Thread Frederik Kraus
Hi Guys, 

I'm currently working on a project with quite a few shards/cores etc. and 
ideally want to use the DIH to do the indexing. Is there any consistent 
hashing method available, other than the modulo way of selecting only 
specific documents?


Thanks,

Fred. 
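For readers unfamiliar with the term, here is a minimal Python sketch of what 
consistent hashing means, as opposed to "id modulo number of shards". It is 
not a DIH feature and the thread does not record an answer to the question; 
the class name, the shard names and the number of virtual nodes are all 
assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping document ids to shard names."""

    def __init__(self, shards, vnodes=100):
        # Each shard is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def shard_for(self, doc_id):
        # The first ring point clockwise from the document's hash owns it.
        idx = bisect.bisect_right(self._keys, self._hash(str(doc_id)))
        return self._ring[idx % len(self._ring)][1]


doc_ids = [f"doc-{i}" for i in range(2000)]
before = HashRing(["shard1", "shard2", "shard3", "shard4"])
after = HashRing(["shard1", "shard2", "shard3", "shard4", "shard5"])

moved = [d for d in doc_ids if before.shard_for(d) != after.shard_for(d)]
print(len(moved), "of", len(doc_ids), "documents change shard")
```

The design point is that adding a shard only moves documents onto the new 
shard, whereas with a plain modulo most documents would change their shard 
and need re-indexing.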

Re: Extra facet query from within a custom search component

2011-04-28 Thread Frederik Kraus
Haaa fantastic! 

Thanks a lot!

Fred.
On Thursday, 28 April 2011 at 22:21, Erick Erickson wrote: 
> Have you looked at: http://wiki.apache.org/solr/TermsComponent?
> 
> Best
> Erick
> 
> On Thu, Apr 28, 2011 at 2:44 PM, Frederik Kraus
>  wrote:
> > Hi Guys,
> > 
> > I'm currently working on a custom search component and need to fetch a list 
> > of all possible values within a certain field.
> > An internal facet (wildcard) query first came to mind, but I'm not quite 
> > sure how to best create and then execute such a query ...
> > 
> > What would be the best way to do this?
> > 
> > Can anyone please point me in the right direction?
> > 
> > Thanks,
> > 
> > Fred.
> 


force "0" results from within a search component?

2011-05-05 Thread Frederik Kraus
Hi guys,

another question on custom search components:

Is there any way to force the response to be "0 results" from within a search 
component (and break out of the component chain)?

I'm doing some checks in my first-component and in some cases would like to 
stop processing the request and just pretend, that there are 0 results ...

Thanks,

Fred. 

Re: Huge performance drop in distributed search w/ shards on the same server/container

2011-05-13 Thread Frederik Kraus
One Tomcat with multicore. I have a list of about 2 million "real" queries 
that I'm firing at the cluster with JMeter. The reason for splitting up the 
index into rather small parts is that the maximum response time of 1 sec must 
not be exceeded for any of those queries.



On Friday, 13 May 2011 at 12:57, Grant Ingersoll wrote: 
> Is that 10 different Tomcat instances or are you using multicore? How are you 
> testing?
> 
> On May 13, 2011, at 6:08 AM, Frederik Kraus wrote:
> 
> > Hi, 
> > 
> > I'm having some serious problems scaling the following setup:
> > 
> > 48 CPU / Tomcat / ...
> > 
> > localhost/shard1
> > ...
> > localhost/shard10
> > 
> > When using all 10 shards in the query the req/s drop down to about 300 
> > without fully utilizing cpu (60% idle) or ram (disk i/o is zero - 
> > everything fits into the ram)
> > 
> > When only quering one shard I get about 5k-6k req/s 
> > 
> > Are there any known limits and/or work-arounds?
> > 
> > Thanks,
> > 
> > Fred.
> 
> 
> Grant Ingersoll
> Join the LUCENE REVOLUTION
> Lucene & Solr User Conference
> May 25-26, San Francisco
> www.lucenerevolution.org
> 


Re: Huge performance drop in distributed search w/ shards on the same server/container

2011-05-15 Thread Frederik Kraus
Any ideas?
On Friday, 13 May 2011 at 13:19, Frederik Kraus wrote: 
> One Tomcat with multicore. I have a list of about 2mio "real" queries that 
> I'm firing at the cluster with jmeter. Reason for splitting up the index in 
> rather small parts is that the maximum response time of 1 sec cannot be 
> exceeded for any of those queries.


DIH / dynamic fields / ...

2011-07-06 Thread Frederik Kraus
Hi, 

I'm currently stuck with a (probably straightforward) problem concerning DIH 
and dynamic fields.

I'm having a DB-Datasource with one of the columns (metaXml) containing an xml 
string looking something like this:



The  looks something like this:


…

…

In my schema.xml I have a dynamic field looking like:



For whatever reason those meta_* fields are not populated ...

Maybe (hopefully) I'm missing something obvious. Any help would be great ;)

Thanks,

Fred.