Contribute QParserPlugin
Hi, I need a QParserPlugin that will use Redis as a backend to prepare filter queries. There are several data structures available in Redis (hash, set, etc.). For some reasons I cannot fetch data from the Redis data structures and build and send big requests from the application. That's why I want to build those filters on the backend (Solr) side. I'm wondering what I have to do to contribute a QParserPlugin to the Solr repository. Can you suggest a way (in a few steps) to publish it in the Solr repository, probably as a contrib? -- Paweł Róg
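For context, a minimal sketch of what such a plugin could look like on the Solr 4.x API; the package, class name and the Redis lookup are illustrative assumptions only, and the plugin would still have to be registered in solrconfig.xml with something like <queryParser name="redis" class="com.example.RedisQParserPlugin"/>:

  package com.example;

  import org.apache.lucene.search.Query;
  import org.apache.solr.common.params.SolrParams;
  import org.apache.solr.common.util.NamedList;
  import org.apache.solr.request.SolrQueryRequest;
  import org.apache.solr.search.QParser;
  import org.apache.solr.search.QParserPlugin;
  import org.apache.solr.search.SyntaxError;

  // Hypothetical plugin: builds a filter query from data kept in Redis.
  public class RedisQParserPlugin extends QParserPlugin {
    @Override
    public void init(NamedList args) {
      // Redis connection settings (host, port, etc.) would be read from args here.
    }

    @Override
    public QParser createParser(String qstr, SolrParams localParams,
                                SolrParams params, SolrQueryRequest req) {
      return new QParser(qstr, localParams, params, req) {
        @Override
        public Query parse() throws SyntaxError {
          // A real implementation would fetch ids/terms from a Redis hash or set
          // and turn them into a Lucene query (e.g. a BooleanQuery of TermQuerys).
          throw new SyntaxError("Redis lookup not implemented in this sketch");
        }
      };
    }
  }

Once registered it could then be used as a filter, e.g. fq={!redis ...}, with whatever local params the plugin defines.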
Solr cloud hangs
Hi, I have a quite annoying problem with SolrCloud. I have a cluster with 8 shards and 2 replicas each (Solr 4.6.1). After some time the cluster doesn't respond to any update requests. Restarting the cluster nodes doesn't help. There are a lot of stack traces like this (waiting for a very long time): - sun.misc.Unsafe.park(Native Method) - java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082) - org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:342) - org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:526) - org.eclipse.jetty.util.thread.QueuedThreadPool.access$600(QueuedThreadPool.java:44) - org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572) - java.lang.Thread.run(Thread.java:722) Do you have any idea where I can look? -- Pawel
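For anyone debugging a similar hang: the full dump asked for in the reply below can be captured with the standard JDK tools, assuming <pid> is the Solr process id:

  # dump all thread stacks of the running Solr JVM to a file
  jstack -l <pid> > solr-threaddump.txt
  # or make the JVM print the dump to its own stdout/log
  kill -3 <pid>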
Re: Solr cloud hangs
Hi, Here is the whole stack trace: https://gist.github.com/anonymous/9056783 -- Pawel On Mon, Feb 17, 2014 at 4:53 PM, Mark Miller wrote: > Can you share the full stack trace dump? > > - Mark > > http://about.me/markrmiller > > On Feb 17, 2014, at 7:07 AM, Pawel Rog wrote: > > > Hi, > > I have quite annoying problem with Solr cloud. I have a cluster with 8 > > shards and with 2 replicas in each. (Solr 4.6.1) > > After some time cluster doesn't respond to any update requests. > Restarting > > the cluster nodes doesn't help. > > > > There are a lot of such stack traces (waiting for very long time): > > > > > > - sun.misc.Unsafe.park(Native Method) > > - > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) > > - > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082) > > - > > > org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:342) > > - > > > org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:526) > > - > > > org.eclipse.jetty.util.thread.QueuedThreadPool.access$600(QueuedThreadPool.java:44) > > - > > > org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572) > > - java.lang.Thread.run(Thread.java:722) > > > > > > Do you have any idea where can I look for? > > > > -- > > Pawel > >
Re: Solr cloud hangs
There are also many errors in solr log like that one: org.apache.solr.update.StreamingSolrServers$1; error org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool at org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:232) at org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:199) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:456) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:232) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:722) -- Pawel On Mon, Feb 17, 2014 at 8:01 PM, Pawel Rog wrote: > Hi, > Here is the whole stack trace: https://gist.github.com/anonymous/9056783 > > -- > Pawel > > > On Mon, Feb 17, 2014 at 4:53 PM, Mark Miller wrote: > >> Can you share the full stack trace dump? >> >> - Mark >> >> http://about.me/markrmiller >> >> On Feb 17, 2014, at 7:07 AM, Pawel Rog wrote: >> >> > Hi, >> > I have quite annoying problem with Solr cloud. I have a cluster with 8 >> > shards and with 2 replicas in each. (Solr 4.6.1) >> > After some time cluster doesn't respond to any update requests. >> Restarting >> > the cluster nodes doesn't help. >> > >> > There are a lot of such stack traces (waiting for very long time): >> > >> > >> > - sun.misc.Unsafe.park(Native Method) >> > - >> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226) >> > - >> > >> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082) >> > - >> > >> org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:342) >> > - >> > >> org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:526) >> > - >> > >> org.eclipse.jetty.util.thread.QueuedThreadPool.access$600(QueuedThreadPool.java:44) >> > - >> > >> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572) >> > - java.lang.Thread.run(Thread.java:722) >> > >> > >> > Do you have any idea where can I look for? >> > >> > -- >> > Pawel >> >> >
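For reference, the pool-exhaustion error above combined with the idle Jetty threads in the earlier dump matches the distributed-update deadlocks reported against SolrCloud 4.x; the workaround usually suggested at the time was to make the servlet container's thread pool effectively unbounded. In the etc/jetty.xml shipped with Solr the relevant section looks roughly like this (the 10000 value is the commonly recommended ceiling, not necessarily what this cluster runs with):

  <Set name="ThreadPool">
    <New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
      <Set name="minThreads">10</Set>
      <!-- keep this high so intra-cluster update requests cannot starve each other -->
      <Set name="maxThreads">10000</Set>
    </New>
  </Set>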
Edismax parser and boosts
Hi, I use an edismax query with the q parameter set as below: q=foo^1.0+AND+bar For such a query I see a different (lower) score for the same document than for q=foo+AND+bar By default the boost of a term is 1 as far as I know, so why does the scoring differ? When I check the debugQuery parameter, in parsedQuery for "foo^1.0+AND+bar" I see a Boolean query, one of whose clauses is a phrase query "foo 1.0 bar". It seems that the edismax parser takes the whole q parameter as a phrase without removing the boost value and adds it as a boolean clause. Is it a bug or should it work like that? -- Paweł Róg
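A quick way to see exactly how each form is parsed is to compare the parsedquery entries in the debug output; host, port and core name below are only placeholders:

  curl "http://localhost:8983/solr/collection1/select?defType=edismax&q=foo%5E1.0+AND+bar&debugQuery=true&wt=json&indent=true"
  curl "http://localhost:8983/solr/collection1/select?defType=edismax&q=foo+AND+bar&debugQuery=true&wt=json&indent=true"
  # %5E is the URL-encoded '^'; the debug section shows the rewritten query,
  # including any stray phrase clause built from the pf field.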
Re: Edismax parser and boosts
Hi, Thank you for your response. I checked it in Solr 4.8, but I think this has worked as I described for a very long time. I'm not 100% sure whether it is really a bug or not. When I run a phrase query like "foo^1.0 bar" it behaves very similarly to what happens in edismax with the *pf* parameter set (the boost part is not removed). -- Paweł Róg On Thu, Oct 9, 2014 at 12:07 AM, Jack Krupansky wrote: > Definitely sounds like a bug! File a Jira. Thanks for reporting this. What > release of Solr? > > > > -- Jack Krupansky > -----Original Message- From: Pawel Rog > Sent: Wednesday, October 8, 2014 3:57 PM > To: solr-user@lucene.apache.org > Subject: Edismax parser and boosts > > > Hi, > I use edismax query with q parameter set as below: > > q=foo^1.0+AND+bar > > For such a query for the same document I see different (lower) scoring > value than for > > q=foo+AND+bar > > By default boost of term is 1 as far as i know so why the scoring differs? > > When I check debugQuery parameter in parsedQuery for "foo^1.0+AND+bar" I > see Boolean query which one of clauses is a phrase query "foo 1.0 bar". It > seems that edismax parser takes whole q parameter as a phrase without > removing boost value and add it as a boolean clause. Is it a bug or it > should work like that? > > -- > Paweł Róg >
Highlighting integer field
Hi, Is it possible to highlight int (TrieIntField) or long (TrieLongField) fields in Solr? -- Paweł
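Highlighting works on analyzed, stored text, so Trie numeric fields generally can't be highlighted directly; a common workaround is to copy the value into a stored text field and highlight that instead. A schema.xml sketch (field and type names here are made up):

  <field name="price" type="tlong" indexed="true" stored="true"/>
  <field name="price_text" type="text_general" indexed="true" stored="true"/>
  <copyField source="price" dest="price_text"/>
  <!-- then query with hl=true&hl.fl=price_text -->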
Re: Solr 3.5 very slow (performance)
examples
facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=name:(kurtka+skóry+brazowe42)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50
facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=1350&q=name:naczepa&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50
facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=it_name:(miłosz+giedroyc)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50
default operator: AND
promoted - int
ending - int
b_count - int
name - text
cat1 - int
cat2 - int
These are only a few examples; almost all queries are much slower. There were about 60 searches per second on both the old and the new version of Solr. Solr 1.4 reached 200% CPU utilization and Solr 3.5 reached 1200% CPU utilization on the same machine. On Tue, Nov 29, 2011 at 7:05 PM, Yonik Seeley wrote: > On Tue, Nov 29, 2011 at 12:25 PM, Pawel wrote: >> I've build index on solr 1.4 some time ago (about 18milions documents, >> about 8GB). I need new features from newer version of solr, so i >> decided to upgrade solr version from 1.4 to 3.5. >> >> * I created new solr master on new physical machine >> * then I created new index using the same schema as in earlier version >> * then I indexed some slave, and start sending the same requests as >> earlier but to newer version of solr (3.5, but the same situation is >> on solr 3.4). >> >> The CPU went from 200% to 1200% and load went from 3 to 15. Avarage >> QTime went from 15ms to 180ms and median went from 1ms to 150ms >> I didn't change any parameters in solrconfig and schema. > > What are the requests that look slower? > > -Yonik > http://www.lucidimagination.com
Re: Solr 3.5 very slow (performance)
in my last pos i mean default operation AND promoted - int ending - int b_count - int name - text cat1 - int cat2 - int On Tue, Nov 29, 2011 at 7:54 PM, Pawel Rog wrote: > examples > > facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=name:(kurtka+skóry+brazowe42)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50 > > facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=1350&q=name:naczepa&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50 > > facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=it_name:(miłosz+giedroyc)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50 > > default operation ANDpromoted - intending - intb_count - intname - > textcat1 - intcat2 -int > these are only few examples. almost all queries are much slower. there > was about 60 searches per second on old and new version of solr. solr > 1.4 reached 200% cpu utilization and solr 3.5 reached 1200% cpu > utilization on same machine > > On Tue, Nov 29, 2011 at 7:05 PM, Yonik Seeley > wrote: >> On Tue, Nov 29, 2011 at 12:25 PM, Pawel wrote: >>> I've build index on solr 1.4 some time ago (about 18milions documents, >>> about 8GB). I need new features from newer version of solr, so i >>> decided to upgrade solr version from 1.4 to 3.5. >>> >>> * I created new solr master on new physical machine >>> * then I created new index using the same schema as in earlier version >>> * then I indexed some slave, and start sending the same requests as >>> earlier but to newer version of solr (3.5, but the same situation is >>> on solr 3.4). >>> >>> The CPU went from 200% to 1200% and load went from 3 to 15. Avarage >>> QTime went from 15ms to 180ms and median went from 1ms to 150ms >>> I didn't change any parameters in solrconfig and schema. >> >> What are the requests that look slower? >> >> -Yonik >> http://www.lucidimagination.com
Re: Solr 3.5 very slow (performance)
On Tue, Nov 29, 2011 at 9:13 PM, Chris Hostetter wrote: > > Let's back up a minute and cover some basics... > > 1) You said that you built a brand new index on a brand new master server, > using Solr 3.5 -- how do you build your indexes? did the source data > change at all? does your new index have the same number of docs as your > previous Solr 1.4 index? what does a directory listing (including file > sizes) look like for both your old and new indexes?

Yes, both indexes have the same data. The indexes are built using a C++ program which reads data from a database and inserts it into Solr (using XML). Both indexes are about 8GB and have about 18 million documents.

> 2) Did you try using your Solr 1.4 index (and configs) directly in Solr > 3.5 w/o rebuilding from scratch?

Yes, I used the same configs in Solr 1.4 and Solr 3.5 (adding only the "luceneMatchVersion" line). As far as I can see from the Solr 3.5 example solrconfig.xml in the repository, there are not many differences.

> 3) You said you build the new index on a new mmachine, but then you said > you used a slave where the performanne was worse then Solr 1.4 "on the > same machine" ... are you running both the Solr 1.4 and Solr 3.5 instances > concurrently on your slave machine? How much physical ram is on that > machine? what JVM options are using when running the Solr 3.5 instance? > what servlet container are you using?

Maybe I didn't write it precisely enough. I have one machine on which there is the master node. I have a second machine on which there is a slave. I tested Solr 1.4 on that machine, then turned it off and turned on Solr 3.5. I have 36GB RAM on that machine. On both Solr 1.4 and 3.5 the JVM configuration is the same, and the same servlet container ... jetty-6

JVM options: -server -Xms12000m -Xmx12000m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:NewSize=1500m -XX:ParallelGCThreads=8 -XX:CMSInitiatingOccupancyFraction=60

> 4) what does your request handler configuration look like? do you have > any default/invariant/appended request params?

explicit
http://${masterHost}:${masterPort}/solr-3.5/${solr.core.instanceDir}replication
00:00:02
5000
1

> 5) The descriptions youve given of how the performance has changed sound > like you are doing concurrent load testing -- did you do cache warming before > you > started your testing? how many client threads are hitting the solr server > at one time?

Maybe I wasn't precise enough again. CPU on Solr 1.4 was 200% and on Solr 3.5 1200%. Yes, there is cache warming. There are 50-100 client threads on both 1.4 and 3.5. There are about 60 requests per second on 3.5 and on 1.4, but on 3.5 responses are slower and CPU usage is much higher.

> 6) have you tried doing some basic manual testing to see how individual > requests performe? ie: single client at a time, loading a URL, then > request the same URL again to verify that your Solr caches are in use and > the QTime is low. If you see slow respone times even when manually > executing single requests at a time, have you tried using "debug=timing" > to see which serach components are contributing the most to the slow > QTimes?

Most time is spent in org.apache.solr.handler.component.QueryComponent and org.apache.solr.handler.component.DebugComponent in process(). I didn't compare individual request performance.

> 7) What do the cache stats look like on your Solr 3.5 instance after > you've done some of this timing testing? the output of... > http://localhost:8983/solr/admin/mbeans?cat=CACHE&stats=true&wt=json&indent=true > ...would be helpful.
NOTE: you may need to add this to your solrconfig.xml > for that URL to work... > ' > Will check it :) > > : in my last pos i mean > : default operation AND > : promoted - int > : ending - int > : b_count - int > : name - text > : cat1 - int > : cat2 - int > : > : On Tue, Nov 29, 2011 at 7:54 PM, Pawel Rog wrote: > : > examples > : > > : > > facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=name:(kurtka+skóry+brazowe42)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50 > : > > : > > facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=1350&q=name:naczepa&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50 > : > > : > > facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=it_name:(miłosz+giedroyc)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50 > : > > : > default operation ANDpro
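The snippet referred to in that NOTE was stripped by the list software; it was presumably the registration that /admin/mbeans needs, which in solrconfig.xml of that era looks like:

  <requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" />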
Re: Solr 3.5 very slow (performance)
IO waits about 0-2% Didn't see any suspicious activity in logs, but I can check it again On Tue, Nov 29, 2011 at 11:40 PM, Darren Govoni wrote: > Any suspicous activity in the logs? what about disk activity? > > > On 11/29/2011 05:22 PM, Pawel Rog wrote: >> >> On Tue, Nov 29, 2011 at 9:13 PM, Chris Hostetter >> wrote: >>> >>> Let's back up a minute and cover some basics... >>> >>> 1) You said that you built a brand new index on a brand new master >>> server, >>> using Solr 3.5 -- how do you build your indexes? did the source data >>> change at all? does your new index have the same number of docs as your >>> previous Solr 1.4 index? what does a directory listing (including file >>> sizes) look like for both your old and new indexes? >> >> Yes, both indexes have same data. Indexes are build using some C++ >> programm which reads data from database and inserts it into Solr >> (using XML). Both indexes have about 8GB size and 18milions documents. >> >> >>> 2) Did you try using your Solr 1.4 index (and configs) directly in Solr >>> 3.5 w/o rebuilding from scratch? >> >> Yes I used the same configs in solr 1.4 and solr 3.5 (adding only line >> about "luceneMatchVersion") >> As I see in example of solr 3.5 in repository (solrconfig.xml) there >> are not many diffrences. >> >>> 3) You said you build the new index on a new mmachine, but then you said >>> you used a slave where the performanne was worse then Solr 1.4 "on the >>> same machine" ... are you running both the Solr 1.4 and Solr 3.5 >>> instances >>> concurrently on your slave machine? How much physical ram is on that >>> machine? what JVM options are using when running the Solr 3.5 instance? >>> what servlet container are you using? >> >> Mayby I didn't wrote precisely enough. I have some machine on which >> there is master node. I have second machine on which there is slave. I >> tested solr 1.4 on that machine, then turned it off and turned on >> solr-3.5. I have 36GB RAM on that machine. >> On both - solr 1.4 and 3.5 configuration of JVM is the same, and the >> same servlet container ... jetty-6 >> >> JVM options: -server -Xms12000m -Xmx12000m -XX:+UseParNewGC >> -XX:+UseConcMarkSweepGC -XX:NewSize=1500m -XX:ParallelGCThreads=8 >> -XX:CMSInitiatingOccupancyFraction=60 >> >>> 4) what does your request handler configuration look like? do you have >>> any default/invariant/appended request params? >> >> >> >> explicit >> >> >> > class="org.apache.solr.handler.admin.AdminHandlers" /> >> >> >> >> > name="masterUrl">http://${masterHost}:${masterPort}/solr-3.5/${solr.core.instanceDir}replication >> 00:00:02 >> 5000 >> 1 >> >> >> >> >>> 5) The descriptions youve given of how the performance has changed sound >>> like you are doing concurrent load testing -- did you do cache warming >>> before you >>> started your testing? how many client threads are hitting the solr >>> server >>> at one time? >> >> Maybe I wasn't precise enough again. CPU on solr 1.4 was 200% and on >> solr 3.5 1200% >> yes there is cache warming. There are 50-100 client threads on both >> 1.4 and 3.5. There are about 60 requests per second on 3.5 and on 1.4, >> but on 3.5 responses are slower and CPU usage much higher. >> >>> 6) have you tried doing some basic manual testing to see how individual >>> requests performe? ie: single client at a time, loading a URL, then >>> request the same URL again to verify that your Solr caches are in use and >>> the QTime is low. 
If you see slow respone times even when manually >>> executing single requests at a time, have you tried using "debug=timing" >>> to see which serach components are contributing the most to the slow >>> QTimes? >> >> Most time is in org.apache.solr.handler.component.QueryComponent and >> org.apache.solr.handler.component.DebugComponent in process. I didn't >> comare individual request performance. >> >>> 7) What do the cache stats look like on your Solr 3.5 instance after >>> you've done some of this timing testing? the output of... >>> >>> http://localhost:8983/solr/admin/mbeans?cat=CACHE&stats=true&wt=json
Re: Solr 3.5 very slow (performance)
* 1st question (ls from index directory) solr 1.4 -rw-r--r-- 1 user user2180582 Nov 30 07:26 _3g1_cf.del -rw-r--r-- 1 user user 5190652802 Nov 28 17:57 _3g1.fdt -rw-r--r-- 1 user user 139556724 Nov 28 17:57 _3g1.fdx -rw-r--r-- 1 user user 4963 Nov 28 17:56 _3g1.fnm -rw-r--r-- 1 user user 1879006175 Nov 28 18:01 _3g1.frq -rw-r--r-- 1 user user 513919573 Nov 28 18:01 _3g1.prx -rw-r--r-- 1 user user2745451 Nov 28 18:01 _3g1.tii -rw-r--r-- 1 user user 218731810 Nov 28 18:01 _3g1.tis -rw-r--r-- 1 user user 275268 Nov 30 07:26 _3uu_1a.del -rw-r--r-- 1 user user 666375513 Nov 30 03:35 _3uu.fdt -rw-r--r-- 1 user user 17616636 Nov 30 03:35 _3uu.fdx -rw-r--r-- 1 user user 4884 Nov 30 03:35 _3uu.fnm -rw-r--r-- 1 user user 243847897 Nov 30 03:35 _3uu.frq -rw-r--r-- 1 user user 64791316 Nov 30 03:35 _3uu.prx -rw-r--r-- 1 user user 545317 Nov 30 03:35 _3uu.tii -rw-r--r-- 1 user user 42993472 Nov 30 03:35 _3uu.tis -rw-r--r-- 1 user user 1178 Nov 30 07:26 _3wj_1.del -rw-r--r-- 1 user user2813124 Nov 30 07:26 _3wj.fdt -rw-r--r-- 1 user user 74852 Nov 30 07:26 _3wj.fdx -rw-r--r-- 1 user user 2175 Nov 30 07:26 _3wj.fnm -rw-r--r-- 1 user user 911051 Nov 30 07:26 _3wj.frq -rw-r--r-- 1 user user 4 Nov 30 07:26 _3wj.nrm -rw-r--r-- 1 user user 285405 Nov 30 07:26 _3wj.prx -rw-r--r-- 1 user user 7951 Nov 30 07:26 _3wj.tii -rw-r--r-- 1 user user 624702 Nov 30 07:26 _3wj.tis -rw-r--r-- 1 user user 35859092 Nov 30 07:26 _3wk.fdt -rw-r--r-- 1 user user 958148 Nov 30 07:26 _3wk.fdx -rw-r--r-- 1 user user 4104 Nov 30 07:26 _3wk.fnm -rw-r--r-- 1 user user 12228212 Nov 30 07:26 _3wk.frq -rw-r--r-- 1 user user3438508 Nov 30 07:26 _3wk.prx -rw-r--r-- 1 user user 58672 Nov 30 07:26 _3wk.tii -rw-r--r-- 1 user user4621519 Nov 30 07:26 _3wk.tis -rw-r--r-- 1 user user 0 Nov 30 07:27 lucene-9445a367a714cc9bf70d0ebdf83b9e01-write.lock -rw-r--r-- 1 user user 1010 Nov 30 07:26 segments_2tr -rw-r--r-- 1 user user 20 Nov 17 14:06 segments.gen solr 3.5 (dates are older - because I turned off feeding 3.5 instance) -rw-r--r-- 1 user user2188376 Nov 29 13:10 _2x_6g.del -rw-r--r-- 1 user user 4955406209 Nov 28 17:38 _2x.fdt -rw-r--r-- 1 user user 140054140 Nov 28 17:38 _2x.fdx -rw-r--r-- 1 user user 4852 Nov 28 17:37 _2x.fnm -rw-r--r-- 1 user user 1845719205 Nov 28 17:42 _2x.frq -rw-r--r-- 1 user user 497871055 Nov 28 17:42 _2x.prx -rw-r--r-- 1 user user3006635 Nov 28 17:42 _2x.tii -rw-r--r-- 1 user user 230304265 Nov 28 17:42 _2x.tis -rw-r--r-- 1 user user 50128 Nov 29 13:10 _5s_48.del -rw-r--r-- 1 user user 116159640 Nov 29 00:25 _5s.fdt -rw-r--r-- 1 user user3206268 Nov 29 00:25 _5s.fdx -rw-r--r-- 1 user user 4963 Nov 29 00:25 _5s.fnm -rw-r--r-- 1 user user 44556139 Nov 29 00:25 _5s.frq -rw-r--r-- 1 user user 11405232 Nov 29 00:25 _5s.prx -rw-r--r-- 1 user user 149965 Nov 29 00:25 _5s.tii -rw-r--r-- 1 user user 11662163 Nov 29 00:25 _5s.tis -rw-r--r-- 1 user user 63191 Nov 29 13:10 _97_1o.del -rw-r--r-- 1 user user 145482785 Nov 29 08:08 _97.fdt -rw-r--r-- 1 user user4042300 Nov 29 08:08 _97.fdx -rw-r--r-- 1 user user 4963 Nov 29 08:08 _97.fnm -rw-r--r-- 1 user user 55361299 Nov 29 08:08 _97.frq -rw-r--r-- 1 user user 14181208 Nov 29 08:08 _97.prx -rw-r--r-- 1 user user 187731 Nov 29 08:08 _97.tii -rw-r--r-- 1 user user 14617940 Nov 29 08:08 _97.tis -rw-r--r-- 1 user user 21310 Nov 29 13:10 _9q_1a.del -rw-r--r-- 1 user user 49864395 Nov 29 09:19 _9q.fdt -rw-r--r-- 1 user user1361884 Nov 29 09:19 _9q.fdx -rw-r--r-- 1 user user 4963 Nov 29 09:19 _9q.fnm -rw-r--r-- 1 user user 17879364 Nov 29 09:19 _9q.frq -rw-r--r-- 1 user user4970178 Nov 
29 09:19 _9q.prx -rw-r--r-- 1 user user 75969 Nov 29 09:19 _9q.tii -rw-r--r-- 1 user user5932085 Nov 29 09:19 _9q.tis -rw-r--r-- 1 user user 62661357 Nov 29 10:19 _a6.fdt -rw-r--r-- 1 user user1717820 Nov 29 10:19 _a6.fdx -rw-r--r-- 1 user user 4963 Nov 29 10:19 _a6.fnm -rw-r--r-- 1 user user 23283028 Nov 29 10:19 _a6.frq -rw-r--r-- 1 user user6196945 Nov 29 10:19 _a6.prx -rw-r--r-- 1 user user 92528 Nov 29 10:19 _a6.tii -rw-r--r-- 1 user user7209783 Nov 29 10:19 _a6.tis -rw-r--r-- 1 user user 26871 Nov 29 13:10 _a6_y.del -rw-r--r-- 1 user user 16372020 Nov 29 10:39 _ab.fdt -rw-r--r-- 1 user user 455476 Nov 29 10:39 _ab.fdx -rw-r--r-- 1 user user 4963 Nov 29 10:39 _ab.fnm -rw-r--r-- 1 user user6025966 Nov 29 10:39 _ab.frq -rw-r--r-- 1 user user1622841 Nov 29 10:39 _ab.prx -rw-r--r-- 1 user user 35252 Nov 29 10:39 _ab.tii -rw-r--r-- 1 user user2766468 Nov 29 10:39 _ab.tis -rw-r--r-- 1 user user 7147 Nov 29 13:10 _ab_u.del -rw-r--r-- 1 user user 14818116 Nov 29 11:09 _aj.fdt -rw-r--r-- 1 user user 409356 Nov 29 11:09 _aj.fdx -rw-r--r-- 1 user user 4963 Nov 29 11:09 _aj.fnm -rw-r--r-- 1 user user5461353 N
Re: Solr 3.5 very slow (performance)
I attach chart which presents cpu usage. Solr 3.5 uses almost all cpu (left side of chart). at the begining of chart there was about 60rps and about 100rps (before turning off solr 3.5). Then there was 1.4 turned on with 100rps. -- Pawel On Wed, Nov 30, 2011 at 9:07 AM, Pawel Rog wrote: > * 1st question (ls from index directory) > > solr 1.4 > > -rw-r--r-- 1 user user 2180582 Nov 30 07:26 _3g1_cf.del > -rw-r--r-- 1 user user 5190652802 Nov 28 17:57 _3g1.fdt > -rw-r--r-- 1 user user 139556724 Nov 28 17:57 _3g1.fdx > -rw-r--r-- 1 user user 4963 Nov 28 17:56 _3g1.fnm > -rw-r--r-- 1 user user 1879006175 Nov 28 18:01 _3g1.frq > -rw-r--r-- 1 user user 513919573 Nov 28 18:01 _3g1.prx > -rw-r--r-- 1 user user 2745451 Nov 28 18:01 _3g1.tii > -rw-r--r-- 1 user user 218731810 Nov 28 18:01 _3g1.tis > -rw-r--r-- 1 user user 275268 Nov 30 07:26 _3uu_1a.del > -rw-r--r-- 1 user user 666375513 Nov 30 03:35 _3uu.fdt > -rw-r--r-- 1 user user 17616636 Nov 30 03:35 _3uu.fdx > -rw-r--r-- 1 user user 4884 Nov 30 03:35 _3uu.fnm > -rw-r--r-- 1 user user 243847897 Nov 30 03:35 _3uu.frq > -rw-r--r-- 1 user user 64791316 Nov 30 03:35 _3uu.prx > -rw-r--r-- 1 user user 545317 Nov 30 03:35 _3uu.tii > -rw-r--r-- 1 user user 42993472 Nov 30 03:35 _3uu.tis > -rw-r--r-- 1 user user 1178 Nov 30 07:26 _3wj_1.del > -rw-r--r-- 1 user user 2813124 Nov 30 07:26 _3wj.fdt > -rw-r--r-- 1 user user 74852 Nov 30 07:26 _3wj.fdx > -rw-r--r-- 1 user user 2175 Nov 30 07:26 _3wj.fnm > -rw-r--r-- 1 user user 911051 Nov 30 07:26 _3wj.frq > -rw-r--r-- 1 user user 4 Nov 30 07:26 _3wj.nrm > -rw-r--r-- 1 user user 285405 Nov 30 07:26 _3wj.prx > -rw-r--r-- 1 user user 7951 Nov 30 07:26 _3wj.tii > -rw-r--r-- 1 user user 624702 Nov 30 07:26 _3wj.tis > -rw-r--r-- 1 user user 35859092 Nov 30 07:26 _3wk.fdt > -rw-r--r-- 1 user user 958148 Nov 30 07:26 _3wk.fdx > -rw-r--r-- 1 user user 4104 Nov 30 07:26 _3wk.fnm > -rw-r--r-- 1 user user 12228212 Nov 30 07:26 _3wk.frq > -rw-r--r-- 1 user user 3438508 Nov 30 07:26 _3wk.prx > -rw-r--r-- 1 user user 58672 Nov 30 07:26 _3wk.tii > -rw-r--r-- 1 user user 4621519 Nov 30 07:26 _3wk.tis > -rw-r--r-- 1 user user 0 Nov 30 07:27 > lucene-9445a367a714cc9bf70d0ebdf83b9e01-write.lock > -rw-r--r-- 1 user user 1010 Nov 30 07:26 segments_2tr > -rw-r--r-- 1 user user 20 Nov 17 14:06 segments.gen > > solr 3.5 (dates are older - because I turned off feeding 3.5 instance) > > -rw-r--r-- 1 user user 2188376 Nov 29 13:10 _2x_6g.del > -rw-r--r-- 1 user user 4955406209 Nov 28 17:38 _2x.fdt > -rw-r--r-- 1 user user 140054140 Nov 28 17:38 _2x.fdx > -rw-r--r-- 1 user user 4852 Nov 28 17:37 _2x.fnm > -rw-r--r-- 1 user user 1845719205 Nov 28 17:42 _2x.frq > -rw-r--r-- 1 user user 497871055 Nov 28 17:42 _2x.prx > -rw-r--r-- 1 user user 3006635 Nov 28 17:42 _2x.tii > -rw-r--r-- 1 user user 230304265 Nov 28 17:42 _2x.tis > -rw-r--r-- 1 user user 50128 Nov 29 13:10 _5s_48.del > -rw-r--r-- 1 user user 116159640 Nov 29 00:25 _5s.fdt > -rw-r--r-- 1 user user 3206268 Nov 29 00:25 _5s.fdx > -rw-r--r-- 1 user user 4963 Nov 29 00:25 _5s.fnm > -rw-r--r-- 1 user user 44556139 Nov 29 00:25 _5s.frq > -rw-r--r-- 1 user user 11405232 Nov 29 00:25 _5s.prx > -rw-r--r-- 1 user user 149965 Nov 29 00:25 _5s.tii > -rw-r--r-- 1 user user 11662163 Nov 29 00:25 _5s.tis > -rw-r--r-- 1 user user 63191 Nov 29 13:10 _97_1o.del > -rw-r--r-- 1 user user 145482785 Nov 29 08:08 _97.fdt > -rw-r--r-- 1 user user 4042300 Nov 29 08:08 _97.fdx > -rw-r--r-- 1 user user 4963 Nov 29 08:08 _97.fnm > -rw-r--r-- 1 user user 55361299 Nov 29 08:08 _97.frq > -rw-r--r-- 1 user user 
14181208 Nov 29 08:08 _97.prx > -rw-r--r-- 1 user user 187731 Nov 29 08:08 _97.tii > -rw-r--r-- 1 user user 14617940 Nov 29 08:08 _97.tis > -rw-r--r-- 1 user user 21310 Nov 29 13:10 _9q_1a.del > -rw-r--r-- 1 user user 49864395 Nov 29 09:19 _9q.fdt > -rw-r--r-- 1 user user 1361884 Nov 29 09:19 _9q.fdx > -rw-r--r-- 1 user user 4963 Nov 29 09:19 _9q.fnm > -rw-r--r-- 1 user user 17879364 Nov 29 09:19 _9q.frq > -rw-r--r-- 1 user user 4970178 Nov 29 09:19 _9q.prx > -rw-r--r-- 1 user user 75969 Nov 29 09:19 _9q.tii > -rw-r--r-- 1 user user 5932085 Nov 29 09:19 _9q.tis > -rw-r--r-- 1 user user 62661357 Nov 29 10:19 _a6.fdt > -rw-r--r-- 1 user user 1717820 Nov 29 10:19 _a6.fdx > -rw-r--r-- 1 user user 4963 Nov 29 10:19 _a6.fnm > -rw-r--r-- 1 user user 23283028 Nov 29 10:19 _a6.frq > -rw-r--r-- 1 user user 6196945 Nov 29 10:19 _a6.prx > -rw-r--r-- 1 user user 92528 Nov 29 10:19 _a6.tii > -rw-r--r-- 1 user user 7209783 Nov 29 10:19 _a6.ti
Re: Solr 3.5 very slow (performance)
I made thread dump. Most active threads have such trace: "471003383@qtp-536357250-245" - Thread t@270 java.lang.Thread.State: RUNNABLE at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:702) at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1144) at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:362) at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:378) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) On Wed, Nov 30, 2011 at 10:31 AM, Pawel Rog wrote: > I attach chart which presents cpu usage. Solr 3.5 uses almost all cpu > (left side of chart). > at the begining of chart there was about 60rps and about 100rps > (before turning off solr 3.5). Then there was 1.4 turned on with > 100rps. 
> > -- > Pawel > > On Wed, Nov 30, 2011 at 9:07 AM, Pawel Rog wrote: >> * 1st question (ls from index directory) >> >> solr 1.4 >> >> -rw-r--r-- 1 user user 2180582 Nov 30 07:26 _3g1_cf.del >> -rw-r--r-- 1 user user 5190652802 Nov 28 17:57 _3g1.fdt >> -rw-r--r-- 1 user user 139556724 Nov 28 17:57 _3g1.fdx >> -rw-r--r-- 1 user user 4963 Nov 28 17:56 _3g1.fnm >> -rw-r--r-- 1 user user 1879006175 Nov 28 18:01 _3g1.frq >> -rw-r--r-- 1 user user 513919573 Nov 28 18:01 _3g1.prx >> -rw-r--r-- 1 user user 2745451 Nov 28 18:01 _3g1.tii >> -rw-r--r-- 1 user user 218731810 Nov 28 18:01 _3g1.tis >> -rw-r--r-- 1 user user 275268 Nov 30 07:26 _3uu_1a.del >> -rw-r--r-- 1 user user 666375513 Nov 30 03:35 _3uu.fdt >> -rw-r--r-- 1 user user 17616636 Nov 30 03:35 _3uu.fdx >> -rw-r--r-- 1 user user 4884 Nov 30 03:35 _3uu.fnm >> -rw-r--r-- 1 user user 243847897 Nov 30 03:35 _3uu.frq >> -rw-r--r-- 1 user user 64791316 Nov 30 03:35 _3uu.prx >> -rw-r--r-- 1 user user 545317 Nov 30 03:35 _3uu.tii >> -rw-r--r-- 1 user user 42993472 Nov 30 03:35 _3uu.tis >> -rw-r--r-- 1 user user 1178 Nov 30 07:26 _3wj_1.del >> -rw-r--r-- 1 user user 2813124 Nov 30 07:26 _3wj.fdt >> -rw-r--r-- 1 user user 74852 Nov 30 07:26 _3wj.fdx >> -rw-r--r-- 1 user user 2175 Nov 30 07:26 _3wj.fnm >> -rw-r--r-- 1 user user 911051 Nov 30 07:26 _3wj.frq >> -rw-r--r-- 1 user user 4 Nov 30 07:26 _3wj.nrm >> -rw-r--r-- 1 user user 285405 Nov 30 07:26 _3wj.prx >> -rw-r--r-- 1 user user 7951 Nov 30 07:26 _3wj.tii >> -rw-r--r-- 1 user user 624702 Nov 30 07:26 _3wj.tis >> -rw-r--r-- 1 user user 35859092 Nov 30 07:26 _3wk.fdt >> -rw-r--r-- 1 user user 958148 Nov 30 07:26 _3wk.fdx >> -rw-r--r-- 1 user user 4104 Nov 30 07:26 _3wk.fnm >> -rw-r--r-- 1 user user 12228212 Nov 30 07:26 _3wk.frq >> -rw-r--r-- 1 user user 3438508 Nov 30 07:26 _3wk.prx >> -rw-r--r-- 1 user user 58672 Nov 30 07:26 _3wk.tii >> -rw-r--r-- 1 user user 4621519 Nov 30 07:26 _3wk.tis >&
Re: Solr 3.5 very slow (performance)
http://imageshack.us/photo/my-images/838/cpuusage.png/ On Wed, Nov 30, 2011 at 9:18 PM, Chris Hostetter wrote: > > : I attach chart which presents cpu usage. Solr 3.5 uses almost all cpu > : (left side of chart). > > FWIW: The mailing list software filters out most attachments (there are > some exceptions for certain text mime types) > > > -Hoss
Re: Solr 3.5 very slow (performance)
On Wed, Nov 30, 2011 at 9:05 PM, Chris Hostetter wrote: > > : I tried to use index from 1.4 (load was the same as on index from 3.5) > : but there was problem with synchronization with master (invalid > : javabin format) > : Then I built new index on 3.5 with luceneMatchVersion LUCENE_35 > > why would you need to re-replicate from the master? > > You already have a copy of the Solr 1.4 index on the slave machine where > you are doing testing correct? Just (make sure Solr 1.4 isn't running > and) point Solr 3.5 at that solr home directory for the configs and data > and time that. (Just because Solr 3.5 can't replicate from Solr 1.4 > over HTTP doesn't mean it can't open indexes built by Solr 1.4) > I made It before sending earlier e-mail. Efect was the same. > It's important to understand if the discrepencies you are seeing have to > do with *building* the index under Solr 3.5, or *searching* in Solr 3.5. > > : reader : > SolrIndexReader{this=8cca36c,r=ReadOnlyDirectoryReader@8cca36c,refCnt=1,segments=4} > : readerDir : > org.apache.lucene.store.NIOFSDirectory@/data/solr_data/itemsfull/index > : > : solr 3.5 > : reader : > SolrIndexReader{this=3d01e178,r=ReadOnlyDirectoryReader@3d01e178,refCnt=1,segments=14} > : readerDir : > org.apache.lucene.store.MMapDirectory@/data/solr_data_350/itemsfull/index > : lockFactory=org.apache.lucene.store.NativeFSLockFactory@294ce5eb > > As mentioned, the difference in the number of segments may be contributing > to the perf differences you are seeing, so optimizing both indexes (or > doing a partial optimize of your 3.5 index down to 4 segments) for > comparison would probably be worthwhile. (and if that is the entirety of > hte problem, then explicitly configuring a MergePolicy may help you in the > long run) > > but independent of that I would like to suggest that you first try > explicitly configuring Solr 3.5 to use NIOFSDirectory so it's consistent > with what Solr 1.4 was doing (I'm told MMapDirectory should be faster, but > maybe there's something about your setup that makes that not true) So it > would be helpful to also try adding this to your 3.5 solrconfig.xml and > testing ... > > > > : I made some test with quiet heavy query (with frange). In both cases > : (1.4 and 3.5) I used the same newSearcher queries and started solr > : without any load. > : Results of debug timing > > Ok, well ... honestly: giving us *one* example of the timing data for > *one* query (w/o even telling us what the exact query was) ins't really > anything we can use to help you ... the crux of the question was: "was the > slow performance you are seeing only under heavy load or was it also slow > when you did manual testing?" > > : When I send fewer than 60 rps I see that in comparsion to 1.4 median > : response time is worse, avarage is worse but maximum time is better. > : It doesn't change propotion of cpu usage (3.5 uses much more cpu). > > How much "fewer then 60 rps" ? ... I'm trying to understand if the > problems you are seeing are solely happening under "heavy" concurrent > load, or if you are seeing Solr 3.5 consistently respond much slower then > Solr 1.4 even with a single client? 
> > Also: I may still be missunderstanding how you are generating load, and > wether you are throttling the clients, but seeing higher CPU utilization > in Solr 3.5 isn't neccessarily an indication of something going wrong -- > in some cases higher CPU% (particularly under heavy concurrent load on a > multi-core machine) could just mean that Solr is now capable of utilizing > more CPU to process parallel request, where as previous versions might have > been hitting other bottle necks. -- but that doesn't explain the slower > response times. that's what concerns me the most. I don't think that 1200% CPU usage with the same traffic is better then 200%. I think you are wrong :) Using solr 1.4 I can reach 300rps and then reach 1200% on cpu and only 60rps in solr 3.5 > > FWIW: I'm still wondering what the stats from your caches wound up looking > like on both Solr 1.4 and Solr 3.5... > >>> 7) What do the cache stats look like on your Solr 3.5 instance after >>> you've done some of this timing testing? the output of... >>> http://localhost:8983/solr/admin/mbeans?cat=CACHE&stats=true&wt=json&indent=true >>> ...would be helpful. NOTE: you may need to add this to your >>> solrconfig.xml >>> for that URL to work... >>> ' > > ...but i don't think /admin/mbeans exists in Solr 1.4, so you may just > have to get the details from stats.jsp. > I forgot to write it earlier. QueryCache hit rate was about 0.03 (in solr 1.4 and 3.5). Filter cache hitrate was abaout 0.35 in both cases. Document hit rate was about 0.55 in both cases. Trace from thread wasn't helpful to diagnose problem? As I mentioned before - almost all threads were in the same line of code in SolrIndexSearcher.
Re: Solr 3.5 very slow (performance)
Yes it works. Thanks a lot. But I stil don't understand why in solr 1.4 that option was efficient but in solr 3.5 not On Wed, Nov 30, 2011 at 11:01 PM, Yonik Seeley wrote: > On Wed, Nov 30, 2011 at 7:08 AM, Pawel Rog wrote: >> at >> org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:702) >> at >> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1144) >> at >> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:362) > > This is interesting, and suggests that you have > useFilterForSortedQuery set in your solrconfig.xml > Can you try removing it (or setting it to false)? > > -Yonik > http://www.lucidimagination.com
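For readers landing on this thread later: the setting Yonik points at lives in the <query> section of solrconfig.xml and defaults to false when absent, so disabling it is simply:

  <useFilterForSortedQuery>false</useFilterForSortedQuery>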
Re: Realtime profile data
Thank you. I'll try NRT and some post-filter :) On Tue, Feb 7, 2012 at 3:09 PM, Erick Erickson wrote: > You have several options: > 1> if you can go to trunk (bleeding edge, I admit), you can > get into the near real time (NRT) stuff. > 2> You could maintain essentially a post-filter step where > your app maintains a list of deleted messages and > removes them from the response. This will cause > some of your counts (e.g. facets, grouping) to be slightly > off > 3> Train your users to expect whatever latency you've > built into the system (i.e. indexing, commit and replication) > > Best > Erick > > On Mon, Feb 6, 2012 at 10:42 AM, Pawel Rog wrote: >> Hello. I have some problem which i'd like to solve using solr. I have >> user profile which has some kind of messages in it. User can filter >> messages, sort them etc. The problem is with delete operation. If user >> click on message to delete it it's very hard to update index of solr >> in real time. When user deletes message, it will be still visible. >> Have you idea how to solve problem with removing data?
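As a rough illustration of option 1: once on a release with NRT support, the soft-commit interval is set in solrconfig.xml; a sketch (the 1000 ms value is only an example):

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoSoftCommit>
      <maxTime>1000</maxTime>
    </autoSoftCommit>
  </updateHandler>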
Re: Help with duplicate unique IDs
Once I had the same problem. I didn't know what was going on. After a few moments of analysis I created a completely new index and removed the old one (I didn't have enough time to analyze the problem). The problem didn't come back any more. -- Regards, Pawel On Fri, Mar 2, 2012 at 8:23 PM, Thomas Dowling wrote: > In a Solr index of journal articles, I thought I was safe reindexing > articles because their unique ID would cause the new record in the index to > overwrite the old one. (As stated at > http://wiki.apache.org/solr/SchemaXml#The_Unique_Key_Field - right?) > > My schema.xml includes: > > ... > required="true"/> > ... > > And: > > id > > And yet I can compose a query with two hits in the index, showing: > > #1: 03405443/v66i0003/347_mrirtaitmbpa > #2: 03405443/v66i0003/347_mrirtaitmbpa > > > Can anyone give pointers on where I'm screwing something up? > > > Thomas Dowling > thomas.dowl...@gmail.com
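A quick way to see how widespread such duplicates are is to facet on the unique key with a minimum count of 2 (host, port and field name below are whatever the setup uses); duplicates like this typically appear when documents were added with overwrite=false or when segments were merged in from another index:

  curl "http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=id&facet.mincount=2&facet.limit=20"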
Re: Boosting terms
Thanks a lot, I'll read it :) It seems to be helpful. On Sun, Mar 18, 2012 at 8:58 PM, Ahmet Arslan wrote: > >> Is there any possibility to boost >> terms during indexing? Searching >> that using google I found information that there is no such >> feature in >> Solr (we can only boost fields). Is it true? > > Yes, only field and document boosting exist. > > You might find this article interesting. > > http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/ > >
Re: Usage of * as a first character in wild card query
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ReversedWildcardFilterFactory On Mon, Mar 26, 2012 at 7:08 AM, Ishan wrote: > Hi, > > I need to query on solr with * as a first character in query. > For eg. Content indexed in* "Be careful" > *and query i want to fire is **ful > *But solr does not allow * as a first character in wildcard query. > Plz let me know if there is any other alternative for doing this*. > * > -- > Thanks & Regards, > Isan Fulia. >
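For completeness, a field type sketch using that filter (taken in spirit from the example schema; the type name and exact limits are adjustable):

  <fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
              maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

With withOriginal="true" both the original and the reversed tokens are indexed, so leading-wildcard queries like *ful can be answered without any special handling at query time.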
Re: solr hangs
You wrote that you can see the error "OutOfMemoryError". I had such problems when my caches were too big. It means that there is no more free memory in the JVM and probably full GC starts running. How big is your Java heap? Maybe the cache sizes in your Solr are too big relative to your JVM settings. -- Regards, Pawel On Tue, Apr 10, 2012 at 9:51 PM, Peter Markey wrote: > Hello, > > I have a solr cloud setup based on a blog ( > http://outerthought.org/blog/491-ot.html) and am able to bring up the > instances and cores. But when I start indexing data (through csv update), > the core throws a out of memory exception (null:java.lang.RuntimeException: > java.lang.OutOfMemoryError: unable to create new native thread). The thread > dump from new solr ui is below: > > cmdDistribExecutor-8-thread-777 (827) > > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1bd11b79 > > - sun.misc.Unsafe.park(Native Method) > - java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) > - > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await > (AbstractQueuedSynchronizer.java:2043) > - > > org.apache.http.impl.conn.tsccm.WaitingThread.await(WaitingThread.java:158) > - > org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking > (ConnPoolByRoute.java:403) > - > org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry > (ConnPoolByRoute.java:300) > - > > org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection > (ThreadSafeClientConnManager.java:224) > - > org.apache.http.impl.client.DefaultRequestDirector.execute > (DefaultRequestDirector.java:401) > - > org.apache.http.impl.client.AbstractHttpClient.execute > (AbstractHttpClient.java:820) > - > org.apache.http.impl.client.AbstractHttpClient.execute > (AbstractHttpClient.java:754) > - > org.apache.http.impl.client.AbstractHttpClient.execute > (AbstractHttpClient.java:732) > - > org.apache.solr.client.solrj.impl.HttpSolrServer.request > (HttpSolrServer.java:304) > - > org.apache.solr.client.solrj.impl.HttpSolrServer.request > (HttpSolrServer.java:209) > - > org.apache.solr.update.SolrCmdDistributor$1.call > (SolrCmdDistributor.java:320) > - > org.apache.solr.update.SolrCmdDistributor$1.call > (SolrCmdDistributor.java:301) > - java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > - java.util.concurrent.FutureTask.run(FutureTask.java:166) > - > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > - java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > - java.util.concurrent.FutureTask.run(FutureTask.java:166) > - > java.util.concurrent.ThreadPoolExecutor.runWorker > (ThreadPoolExecutor.java:1110) > - > java.util.concurrent.ThreadPoolExecutor$Worker.run > (ThreadPoolExecutor.java:603) > - java.lang.Thread.run(Thread.java:679) > > > > Apparently I do see lots of threads like above in the thread dump. I'm > using latest build from the trunk (Apr 10th). Any insights into this issue > woudl be really helpful. Thanks a lot. >
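To illustrate the suggestion about cache sizing: the cache definitions live in the <query> section of solrconfig.xml, and shrinking them looks roughly like this -- the sizes below are purely illustrative, not recommendations:

  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
  <documentCache class="solr.LRUCache" size="512" initialSize="512"/>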
Re: Difference between two solr indexes
If there are only 100,000 documents, dump all document ids and make a diff. If you're using a Linux-based system you can just use simple tools to do it. Something like this can be helpful (rows must be at least the number of documents):

curl "http://your.hostA:port/solr/index/select?q=*:*&fl=id&wt=csv&rows=100000" > /tmp/idsA
curl "http://your.hostB:port/solr/index/select?q=*:*&fl=id&wt=csv&rows=100000" > /tmp/idsB
diff /tmp/idsA /tmp/idsB | grep "<\|>" | awk '{print $2;}' | sed 's/\(.*\)/<id>\1<\/id>/g' > /tmp/ids_to_delete.xml

Now you have a file. Just wrap its contents in "<delete>" and "</delete>" and upload the file into Solr using curl:

curl -X POST -H "Content-Type: text/xml" -d @/tmp/ids_to_delete.xml "http://your.hostA:port/solr/index/update?commit=true"

On Tue, Apr 17, 2012 at 2:09 PM, nutchsolruser wrote: > I'm Also seeking solution for similar problem. > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Difference-between-two-solr-indexes-tp3916328p3917050.html > Sent from the Solr - User mailing list archive at Nabble.com. >
Re: FilterCache - maximum size of document set
Thanks for your response Yes, maybe you are right. I thought that filters can be larger than 3M. All kinds of filters uses BitSet? Moreover maxSize of filterCache is set to 16000 in my case. There are evictions during day traffic but not during night traffic. Version of Solr which I use is 3.5 I haven't used Memory Anayzer yet. Could you write more details about it? -- Regards, Pawel On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson wrote: > Hmmm, I think you may be looking at the wrong thing here. Generally, a > filterCache > entry will be maxDocs/8 (plus some overhead), so in your case they really > shouldn't be all that large, on the order of 3M/filter. That shouldn't > vary based > on the number of docs that match the fq, it's just a bitset. To see if > that makes any > sense, take a look at the admin page and the number of evictions in > your filterCache. If > that is > 0, you're probably using all the memory you're going to in > the filterCache during > the day.. > > But you haven't indicated what version of Solr you're using, I'm going > from a > relatively recent 3x knowledge-base. > > Have you put a memory analyzer against your Solr instance to see where > the memory > is being used? > > Best > Erick > > On Wed, Jun 13, 2012 at 1:05 PM, Pawel wrote: > > Hi, > > I have solr index with about 25M documents. I optimized FilterCache size > to > > reach the best performance (considering traffic characteristic that my > Solr > > handles). I see that the only way to limit size of a Filter Cace is to > set > > number of document sets that Solr can cache. There is no way to set > memory > > limit (eg. 2GB, 4GB or something like that). When I process a standard > > trafiic (during day) everything is fine. But when Solr handle night > traffic > > (and the charateristic of requests change) some problems appear. There is > > JVM out of memory error. I know what is the reason. Some filters on some > > fields are quite poor filters. They returns 15M of documents or even > more. > > You could say 'Just put that into q'. I tried to put that filters into > > "Query" part but then, the statistics of request processing time (during > > day) become much worse. Reduction of Filter Cache maxSize is also not > good > > solution because during day cache filters are very very helpful. > > You could be interested in type of filters that I use. These are range > > filters (I tried standard range filters and frange) - eg. price:[* TO > > 1]. Some fq with price can return few thousands of results (eg. > > price:[40 TO 50]), but some (eg. price:[* TO 1]) can return milions > of > > documents. I'd also like to avoid solution which will introduce strict > > ranges that user can choose. > > Have you any suggestions what can I do? Is there any way to limit for > > example maximum size of docSet which is cached in FilterCache? > > > > -- > > Pawel >
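For a rough sense of scale behind these numbers: a cached filter is essentially one bit per document in the index, so with the ~25M documents from the original question each entry is on the order of 25,000,000 / 8 ≈ 3 MB, and a maxSize of 16000 entries therefore allows up to roughly 16,000 × 3 MB ≈ 48 GB in the worst case (sparse filters are stored more compactly, so real usage is normally far lower).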
Re: FilterCache - maximum size of document set
It can be true that filters cache max size is set to high value. That is also true that. We looked at evictions and hit rate earlier. Maybe you are right that evictions are not always unwanted. Some time ago we made tests. There are not so high difference in hit rate when filters maxSize is set to 4000 (hit rate about 85%) and 16000 (hitrate about 91%). I think that also using LFU cache can be helpful but it makes me to migrate to 3.6. Do you think it is reasonable to use slave on version 3.6 and master on 3.5? Once again, Thanks for your help -- Pawel On Thu, Jun 14, 2012 at 7:22 PM, Erick Erickson wrote: > Hmmm, your maxSize is pretty high, it may just be that you've set this > much higher > than is wise. The maxSize setting governs the number of entries. I'd start > with > a much lower number here, and monitor the solr/admin page for both > hit ratio and evictions. Well, and size too. 16,000 entries puts a > ceiling of, what, > 48G on it? Ouch! It sounds like what's happening here is you're just > accumulating > more and more fqs over the course of the evening and blowing memory. > > Not all FQs will be that big, there's some heuristics in there to just > store the > document numbers for sparse filters, maxDocs/8 is pretty much the upper > bound though. > > Evictions are not necessarily a bad thing, the hit-ratio is important > here. And > if you're using a bare NOW in your filter queries, you're probably never > re-using them anyway, see: > > http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/ > > I really question whether this limit is reasonable, but you know your > situation best. > > Best > Erick > > On Wed, Jun 13, 2012 at 5:40 PM, Pawel Rog wrote: > > Thanks for your response > > Yes, maybe you are right. I thought that filters can be larger than 3M. > All > > kinds of filters uses BitSet? > > Moreover maxSize of filterCache is set to 16000 in my case. There are > > evictions during day traffic > > but not during night traffic. > > > > Version of Solr which I use is 3.5 > > > > I haven't used Memory Anayzer yet. Could you write more details about it? > > > > -- > > Regards, > > Pawel > > > > On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson < > erickerick...@gmail.com>wrote: > > > >> Hmmm, I think you may be looking at the wrong thing here. Generally, a > >> filterCache > >> entry will be maxDocs/8 (plus some overhead), so in your case they > really > >> shouldn't be all that large, on the order of 3M/filter. That shouldn't > >> vary based > >> on the number of docs that match the fq, it's just a bitset. To see if > >> that makes any > >> sense, take a look at the admin page and the number of evictions in > >> your filterCache. If > >> that is > 0, you're probably using all the memory you're going to in > >> the filterCache during > >> the day.. > >> > >> But you haven't indicated what version of Solr you're using, I'm going > >> from a > >> relatively recent 3x knowledge-base. > >> > >> Have you put a memory analyzer against your Solr instance to see where > >> the memory > >> is being used? > >> > >> Best > >> Erick > >> > >> On Wed, Jun 13, 2012 at 1:05 PM, Pawel wrote: > >> > Hi, > >> > I have solr index with about 25M documents. I optimized FilterCache > size > >> to > >> > reach the best performance (considering traffic characteristic that my > >> Solr > >> > handles). I see that the only way to limit size of a Filter Cace is to > >> set > >> > number of document sets that Solr can cache. There is no way to set > >> memory > >> > limit (eg. 
2GB, 4GB or something like that). When I process a standard > >> > trafiic (during day) everything is fine. But when Solr handle night > >> traffic > >> > (and the charateristic of requests change) some problems appear. > There is > >> > JVM out of memory error. I know what is the reason. Some filters on > some > >> > fields are quite poor filters. They returns 15M of documents or even > >> more. > >> > You could say 'Just put that into q'. I tried to put that filters into > >> > "Query" part but then, the statistics of request processing time > (during > >> > day) become much worse. Reduction of Filter Cache maxSize is also not > >> good > >> > solution because during day cache filters are very very helpful. > >> > You could be interested in type of filters that I use. These are range > >> > filters (I tried standard range filters and frange) - eg. price:[* TO > >> > 1]. Some fq with price can return few thousands of results (eg. > >> > price:[40 TO 50]), but some (eg. price:[* TO 1]) can return > milions > >> of > >> > documents. I'd also like to avoid solution which will introduce strict > >> > ranges that user can choose. > >> > Have you any suggestions what can I do? Is there any way to limit for > >> > example maximum size of docSet which is cached in FilterCache? > >> > > >> > -- > >> > Pawel > >> >
Re: FilterCache - maximum size of document set
Thanks I don't use NOW in queries. All my filters with timestamp are rounded to hundreds of seconds to increase hitrate. The only problem could be in price filters which can be varied (users are unpredictable :P), but also that filters from fq or setting cache=false" is also bad idea ... checked it :) Load rised three times :) -- Pawel On Fri, Jun 15, 2012 at 1:30 PM, Erick Erickson wrote: > Test first, of course, but slave on 3.6 and master on 3.5 should be > fine. If you're > getting evictions with the cache settings that high, you really want > to look at why. > > Note that in particular, using NOW in your filter queries virtually > guarantees > that they won't be re-used as per the link I sent yesterday. > > Best > Erick > > On Fri, Jun 15, 2012 at 1:15 AM, Pawel Rog wrote: > > It can be true that filters cache max size is set to high value. That is > > also true that. > > We looked at evictions and hit rate earlier. Maybe you are right that > > evictions are > > not always unwanted. Some time ago we made tests. There are not so high > > difference in hit rate when filters maxSize is set to 4000 (hit rate > about > > 85%) and > > 16000 (hitrate about 91%). I think that also using LFU cache can be > helpful > > but > > it makes me to migrate to 3.6. Do you think it is reasonable to use > slave on > > version 3.6 and master on 3.5? > > > > Once again, Thanks for your help > > > > -- > > Pawel > > > > On Thu, Jun 14, 2012 at 7:22 PM, Erick Erickson >wrote: > > > >> Hmmm, your maxSize is pretty high, it may just be that you've set this > >> much higher > >> than is wise. The maxSize setting governs the number of entries. I'd > start > >> with > >> a much lower number here, and monitor the solr/admin page for both > >> hit ratio and evictions. Well, and size too. 16,000 entries puts a > >> ceiling of, what, > >> 48G on it? Ouch! It sounds like what's happening here is you're just > >> accumulating > >> more and more fqs over the course of the evening and blowing memory. > >> > >> Not all FQs will be that big, there's some heuristics in there to just > >> store the > >> document numbers for sparse filters, maxDocs/8 is pretty much the upper > >> bound though. > >> > >> Evictions are not necessarily a bad thing, the hit-ratio is important > >> here. And > >> if you're using a bare NOW in your filter queries, you're probably never > >> re-using them anyway, see: > >> > >> > http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/ > >> > >> I really question whether this limit is reasonable, but you know your > >> situation best. > >> > >> Best > >> Erick > >> > >> On Wed, Jun 13, 2012 at 5:40 PM, Pawel Rog > wrote: > >> > Thanks for your response > >> > Yes, maybe you are right. I thought that filters can be larger than > 3M. > >> All > >> > kinds of filters uses BitSet? > >> > Moreover maxSize of filterCache is set to 16000 in my case. There are > >> > evictions during day traffic > >> > but not during night traffic. > >> > > >> > Version of Solr which I use is 3.5 > >> > > >> > I haven't used Memory Anayzer yet. Could you write more details about > it? > >> > > >> > -- > >> > Regards, > >> > Pawel > >> > > >> > On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson < > >> erickerick...@gmail.com>wrote: > >> > > >> >> Hmmm, I think you may be looking at the wrong thing here. Generally, > a > >> >> filterCache > >> >> entry will be maxDocs/8 (plus some overhead), so in your case they > >> really > >> >> shouldn't be all that large, on the order of 3M/filter. 
That > shouldn't > >> >> vary based > >> >> on the number of docs that match the fq, it's just a bitset. To see > if > >> >> that makes any > >> >> sense, take a look at the admin page and the number of evictions in > >> >> your filterCache. If > >> >> that is > 0, you're probably using all the memory you're going to in > >> >> the filterCache during > >> >> the day.. > >> >> > >> >> But you haven't indicated what version of Solr you're using
Re: Wildcard query vs facet.prefix for autocomplete?
Maybe try EdgeNgramFilterFactory http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/#solr.EdgeNGramFilterFactory On Mon, Jul 16, 2012 at 6:57 AM, santamaria2 wrote: > I'm about to implement an autocomplete mechanism for my search box. I've > read > about some of the common approaches, but I have a question about wildcard > query vs facet.prefix. > > Say I want autocomplete for a title: 'Shadows of the Damned'. I want this > to > appear as a suggestion if I type 'sha' or 'dam' or 'the'. I don't care that > it won't appear if I type 'hadows'. > > While indexing, I'd use a whitespace tokenizer and a lowercase filter to > store that title in the index. > Now I'm thinking two approaches for 'dam' typed in the search box: > > 1) q=title:dam* > > 2) q=*:*&facet=on&facet.field=title&facet.prefix=dam > > > So any reason that I should favour one over the other? Speed a factor? The > index has around 200,000 items. > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Wildcard-query-vs-facet-prefix-for-autocomplete-tp3995199.html > Sent from the Solr - User mailing list archive at Nabble.com. >
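A field type sketch for the edge-ngram approach (the type name and gram limits are illustrative); the title is ngrammed at index time only, so the plain lowercased prefix the user types matches directly:

  <fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>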