Contribute QParserPlugin

2014-05-28 Thread Pawel Rog
Hi,
I need a QParserPlugin that will use Redis as a backend to prepare filter
queries. There are several data structures available in Redis (hash, set,
etc.). For some reasons I cannot fetch the data from the Redis structures,
build big requests in the application and send them to Solr. That's why I want
to build those filters on the backend (Solr) side.

I'm wondering what I have to do to contribute a QParserPlugin to the Solr
repository. Can you suggest (in a few steps) how to publish it in the Solr
repository, probably as a contrib?
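
Roughly, the skeleton I have in mind looks like this (a rough, untested sketch;
the Jedis client, the key/field local params and the set-based lookup are only
illustrations of the idea):

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;
import org.apache.solr.search.SyntaxError;
import redis.clients.jedis.Jedis;

public class RedisQParserPlugin extends QParserPlugin {
  public void init(NamedList args) {}

  public QParser createParser(String qstr, SolrParams localParams,
                              SolrParams params, SolrQueryRequest req) {
    return new QParser(qstr, localParams, params, req) {
      public Query parse() throws SyntaxError {
        // e.g. fq={!redis key=myset field=id}  (parser name and params are examples)
        String key = localParams.get("key");
        String field = localParams.get("field", "id");
        Jedis jedis = new Jedis("localhost");        // connection handling/pooling omitted
        BooleanQuery bq = new BooleanQuery();
        for (String value : jedis.smembers(key)) {   // SMEMBERS on a Redis set
          bq.add(new TermQuery(new Term(field, value)), BooleanClause.Occur.SHOULD);
        }
        return bq;
      }
    };
  }
}

It would then be registered in solrconfig.xml with something like
<queryParser name="redis" class="com.example.RedisQParserPlugin"/> (class name is a placeholder).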

--
Paweł Róg


Solr cloud hangs

2014-02-17 Thread Pawel Rog
Hi,
I have a quite annoying problem with SolrCloud. I have a cluster with 8
shards and 2 replicas of each (Solr 4.6.1).
After some time the cluster stops responding to any update requests. Restarting
the cluster nodes doesn't help.

There are a lot of stack traces like this one (threads waiting for a very long time):


   - sun.misc.Unsafe.park(Native Method)
   - java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
   -
   
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
   -
   org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:342)
   -
   
org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:526)
   -
   
org.eclipse.jetty.util.thread.QueuedThreadPool.access$600(QueuedThreadPool.java:44)
   -
   
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
   - java.lang.Thread.run(Thread.java:722)


Do you have any idea where I can look?
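
If a full thread dump would help, I can grab one with jstack (assuming a stock
JDK; the PID below is just a placeholder):

jstack -l 12345 > /tmp/solr-threads.txt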

--
Pawel


Re: Solr cloud hangs

2014-02-17 Thread Pawel Rog
Hi,
Here is the whole stack trace: https://gist.github.com/anonymous/9056783

--
Pawel

On Mon, Feb 17, 2014 at 4:53 PM, Mark Miller  wrote:

> Can you share the full stack trace dump?
>
> - Mark
>
> http://about.me/markrmiller
>
> On Feb 17, 2014, at 7:07 AM, Pawel Rog  wrote:
>
> > Hi,
> > I have quite annoying problem with Solr cloud. I have a cluster with 8
> > shards and with 2 replicas in each. (Solr 4.6.1)
> > After some time cluster doesn't respond to any update requests.
> Restarting
> > the cluster nodes doesn't help.
> >
> > There are a lot of such stack traces (waiting for very long time):
> >
> >
> >   - sun.misc.Unsafe.park(Native Method)
> >   -
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> >   -
> >
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> >   -
> >
> org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:342)
> >   -
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:526)
> >   -
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.access$600(QueuedThreadPool.java:44)
> >   -
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
> >   - java.lang.Thread.run(Thread.java:722)
> >
> >
> > Do you have any idea where can I look for?
> >
> > --
> > Pawel
>
>


Re: Solr cloud hangs

2014-02-17 Thread Pawel Rog
There are also many errors in the Solr log like this one:

org.apache.solr.update.StreamingSolrServers$1; error
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for
connection from pool
at
org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:232)
at
org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:199)
at
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:456)
at
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:232)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)


--
Pawel


On Mon, Feb 17, 2014 at 8:01 PM, Pawel Rog  wrote:

> Hi,
> Here is the whole stack trace: https://gist.github.com/anonymous/9056783
>
> --
> Pawel
>
>
> On Mon, Feb 17, 2014 at 4:53 PM, Mark Miller wrote:
>
>> Can you share the full stack trace dump?
>>
>> - Mark
>>
>> http://about.me/markrmiller
>>
>> On Feb 17, 2014, at 7:07 AM, Pawel Rog  wrote:
>>
>> > Hi,
>> > I have quite annoying problem with Solr cloud. I have a cluster with 8
>> > shards and with 2 replicas in each. (Solr 4.6.1)
>> > After some time cluster doesn't respond to any update requests.
>> Restarting
>> > the cluster nodes doesn't help.
>> >
>> > There are a lot of such stack traces (waiting for very long time):
>> >
>> >
>> >   - sun.misc.Unsafe.park(Native Method)
>> >   -
>> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>> >   -
>> >
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>> >   -
>> >
>> org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:342)
>> >   -
>> >
>> org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:526)
>> >   -
>> >
>> org.eclipse.jetty.util.thread.QueuedThreadPool.access$600(QueuedThreadPool.java:44)
>> >   -
>> >
>> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
>> >   - java.lang.Thread.run(Thread.java:722)
>> >
>> >
>> > Do you have any idea where can I look for?
>> >
>> > --
>> > Pawel
>>
>>
>


Edismax parser and boosts

2014-10-08 Thread Pawel Rog
Hi,
I use edismax query with q parameter set as below:

q=foo^1.0+AND+bar

For such a query for the same document I see different (lower) scoring
value than for

q=foo+AND+bar

By default the boost of a term is 1 as far as I know, so why does the scoring differ?

When I check the debugQuery output, in parsedQuery for "foo^1.0+AND+bar" I see
a Boolean query in which one of the clauses is a phrase query "foo 1.0 bar". It
seems that the edismax parser takes the whole q parameter as a phrase without
removing the boost value and adds it as a boolean clause. Is it a bug, or
should it work like that?
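
To reproduce, requests along these lines show the two parsed queries side by
side (host and collection are just placeholders; %5E is the URL-encoded ^):

curl "http://localhost:8983/solr/collection1/select?defType=edismax&q=foo%5E1.0+AND+bar&debugQuery=true&wt=json&indent=true"
curl "http://localhost:8983/solr/collection1/select?defType=edismax&q=foo+AND+bar&debugQuery=true&wt=json&indent=true"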

--
Paweł Róg


Re: Edismax parser and boosts

2014-10-09 Thread Pawel Rog
Hi,
Thank you for your response.
I checked it in Solr 4.8, but I think it has worked as I described for a very
long time. I'm not 100% sure whether it is really a bug or not. When I run a
phrase query like "foo^1.0 bar" it works very similarly to what happens in
edismax with the *pf* parameter set (the boost part is not removed).

--
Paweł Róg

On Thu, Oct 9, 2014 at 12:07 AM, Jack Krupansky 
wrote:

> Definitely sounds like a bug! File a Jira. Thanks for reporting this. What
> release of Solr?
>
>
>
> -- Jack Krupansky
> -----Original Message- From: Pawel Rog
> Sent: Wednesday, October 8, 2014 3:57 PM
> To: solr-user@lucene.apache.org
> Subject: Edismax parser and boosts
>
>
> Hi,
> I use edismax query with q parameter set as below:
>
> q=foo^1.0+AND+bar
>
> For such a query for the same document I see different (lower) scoring
> value than for
>
> q=foo+AND+bar
>
> By default boost of term is 1 as far as i know so why the scoring differs?
>
> When I check debugQuery parameter in parsedQuery for "foo^1.0+AND+bar" I
> see Boolean query which one of clauses is a phrase query "foo 1.0 bar". It
> seems that edismax parser takes whole q parameter as a phrase without
> removing boost value and add it as a boolean clause. Is it a bug or it
> should work like that?
>
> --
> Paweł Róg
>


Highlighting integer field

2014-12-11 Thread Pawel Rog
Hi,
Is it possible to highlight an int (TrieIntField) or long (TrieLongField)
field in Solr?
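
Roughly what I am trying (host, collection and field name are just examples):

curl "http://localhost:8983/solr/collection1/select?q=price:42&hl=true&hl.fl=price&wt=json&indent=true"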

--
Paweł


Re: Solr 3.5 very slow (performance)

2011-11-29 Thread Pawel Rog
examples

facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=name:(kurtka+skóry+brazowe42)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50

facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=1350&q=name:naczepa&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50

facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=it_name:(miłosz+giedroyc)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50

default operation AND
promoted - int
ending - int
b_count - int
name - text
cat1 - int
cat2 - int

These are only a few examples; almost all queries are much slower. There
were about 60 searches per second on both the old and the new version of Solr.
Solr 1.4 reached 200% CPU utilization and Solr 3.5 reached 1200% CPU
utilization on the same machine.

On Tue, Nov 29, 2011 at 7:05 PM, Yonik Seeley
 wrote:
> On Tue, Nov 29, 2011 at 12:25 PM, Pawel  wrote:
>> I've build index on solr 1.4 some time ago (about 18milions documents,
>> about 8GB). I need new features from newer version of solr, so i
>> decided to upgrade solr version from 1.4 to 3.5.
>>
>> * I created new solr master on new physical machine
>> * then I created new index using the same schema as in earlier version
>> * then I indexed some slave, and start sending the same requests as
>> earlier but to newer version of solr (3.5, but the same situation is
>> on solr 3.4).
>>
>> The CPU went from 200% to 1200% and load went from 3 to 15. Avarage
>> QTime went from 15ms to 180ms and median went from 1ms to 150ms
>> I didn't change any parameters in solrconfig and schema.
>
> What are the requests that look slower?
>
> -Yonik
> http://www.lucidimagination.com


Re: Solr 3.5 very slow (performance)

2011-11-29 Thread Pawel Rog
In my last post I meant:
default operation AND
promoted - int
ending - int
b_count - int
name - text
cat1 - int
cat2 - int

On Tue, Nov 29, 2011 at 7:54 PM, Pawel Rog  wrote:
> examples
>
> facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=name:(kurtka+skóry+brazowe42)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50
>
> facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=1350&q=name:naczepa&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50
>
> facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=it_name:(miłosz+giedroyc)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50
>
> default operation ANDpromoted - intending - intb_count - intname -
> textcat1 - intcat2 -int
> these are only few examples. almost all queries are much slower. there
> was about 60 searches per second on old and new version of solr. solr
> 1.4 reached 200% cpu utilization and solr 3.5 reached 1200% cpu
> utilization on same machine
>
> On Tue, Nov 29, 2011 at 7:05 PM, Yonik Seeley
>  wrote:
>> On Tue, Nov 29, 2011 at 12:25 PM, Pawel  wrote:
>>> I've build index on solr 1.4 some time ago (about 18milions documents,
>>> about 8GB). I need new features from newer version of solr, so i
>>> decided to upgrade solr version from 1.4 to 3.5.
>>>
>>> * I created new solr master on new physical machine
>>> * then I created new index using the same schema as in earlier version
>>> * then I indexed some slave, and start sending the same requests as
>>> earlier but to newer version of solr (3.5, but the same situation is
>>> on solr 3.4).
>>>
>>> The CPU went from 200% to 1200% and load went from 3 to 15. Avarage
>>> QTime went from 15ms to 180ms and median went from 1ms to 150ms
>>> I didn't change any parameters in solrconfig and schema.
>>
>> What are the requests that look slower?
>>
>> -Yonik
>> http://www.lucidimagination.com


Re: Solr 3.5 very slow (performance)

2011-11-29 Thread Pawel Rog
On Tue, Nov 29, 2011 at 9:13 PM, Chris Hostetter
 wrote:
>
> Let's back up a minute and cover some basics...
>
> 1) You said that you built a brand new index on a brand new master server,
> using Solr 3.5 -- how do you build your indexes?  did the source data
> change at all? does your new index have the same number of docs as your
> previous Solr 1.4 index?  what does a directory listing (including file
> sizes) look like for both your old and new indexes?

Yes, both indexes have the same data. The indexes are built using a C++
program which reads data from a database and inserts it into Solr
(using XML). Both indexes are about 8GB and contain 18 million documents.


> 2) Did you try using your Solr 1.4 index (and configs) directly in Solr
> 3.5 w/o rebuilding from scratch?

Yes, I used the same configs in Solr 1.4 and Solr 3.5 (adding only the line
about "luceneMatchVersion").
As far as I can see from the Solr 3.5 example solrconfig.xml in the repository,
there are not many differences.

> 3) You said you build the new index on a new mmachine, but then you said
> you used a slave where the performanne was worse then Solr 1.4 "on the
> same machine" ... are you running both the Solr 1.4 and Solr 3.5 instances
> concurrently on your slave machine?  How much physical ram is on that
> machine? what JVM options are using when running the Solr 3.5 instance?
> what servlet container are you using?

Maybe I didn't write precisely enough. I have one machine on which
there is the master node, and a second machine on which there is the slave. I
tested Solr 1.4 on that slave machine, then turned it off and turned on
Solr 3.5. I have 36GB of RAM on that machine.
For both Solr 1.4 and 3.5 the JVM configuration is the same, and so is the
servlet container ... jetty-6.

JVM options: -server -Xms12000m -Xmx12000m -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:NewSize=1500m -XX:ParallelGCThreads=8
-XX:CMSInitiatingOccupancyFraction=60

> 4) what does your request handler configuration look like?  do you have
> any default/invariant/appended request params?



explicit






http://${masterHost}:${masterPort}/solr-3.5/${solr.core.instanceDir}replication
00:00:02
5000
1
 



> 5) The descriptions youve given of how the performance has changed sound
> like you are doing concurrent load testing -- did you do cache warming before 
> you
> started your testing?  how many client threads are hitting the solr server
> at one time?

Maybe I wasn't precise enough again. CPU usage on Solr 1.4 was 200% and on
Solr 3.5 it was 1200%.
Yes, there is cache warming. There are 50-100 client threads on both
1.4 and 3.5, and about 60 requests per second on both, but on 3.5 the
responses are slower and the CPU usage is much higher.

> 6) have you tried doing some basic manual testing to see how individual
> requests performe?  ie: single client at a time, loading a URL, then
> request the same URL again to verify that your Solr caches are in use and
> the QTime is low.  If you see slow respone times even when manually
> executing single requests at a time, have you tried using "debug=timing"
> to see which serach components are contributing the most to the slow
> QTimes?

Most of the time is spent in org.apache.solr.handler.component.QueryComponent and
org.apache.solr.handler.component.DebugComponent in process. I didn't
compare individual request performance.
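
(For reference, the debug=timing check you suggested is just the normal query
with that parameter added, e.g., with host and core as placeholders:)

curl "http://localhost:8983/solr/select?q=name:naczepa&debug=timing&wt=json&indent=true"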

> 7) What do the cache stats look like on your Solr 3.5 instance after
> you've done some of this timing testing?  the output of...
> http://localhost:8983/solr/admin/mbeans?cat=CACHE&stats=true&wt=json&indent=true
> ...would be helpful. NOTE: you may need to add this to your solrconfig.xml
> for that URL to work...
>  '
>

Will check it :)

>
> : in my last pos i mean
> : default operation AND
> : promoted - int
> : ending - int
> : b_count - int
> : name - text
> : cat1 - int
> : cat2 - int
> :
> : On Tue, Nov 29, 2011 at 7:54 PM, Pawel Rog  wrote:
> : > examples
> : >
> : > 
> facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=name:(kurtka+skóry+brazowe42)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50
> : >
> : > 
> facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=1350&q=name:naczepa&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50
> : >
> : > 
> facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=it_name:(miłosz+giedroyc)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50
> : >
> : > default operation ANDpro

Re: Solr 3.5 very slow (performance)

2011-11-29 Thread Pawel Rog
IO waits are about 0-2%.
I didn't see any suspicious activity in the logs, but I can check again.

On Tue, Nov 29, 2011 at 11:40 PM, Darren Govoni  wrote:
> Any suspicous activity in the logs? what about disk activity?
>
>
> On 11/29/2011 05:22 PM, Pawel Rog wrote:
>>
>> On Tue, Nov 29, 2011 at 9:13 PM, Chris Hostetter
>>   wrote:
>>>
>>> Let's back up a minute and cover some basics...
>>>
>>> 1) You said that you built a brand new index on a brand new master
>>> server,
>>> using Solr 3.5 -- how do you build your indexes?  did the source data
>>> change at all? does your new index have the same number of docs as your
>>> previous Solr 1.4 index?  what does a directory listing (including file
>>> sizes) look like for both your old and new indexes?
>>
>> Yes, both indexes have same data. Indexes are build using some C++
>> programm which reads data from database and inserts it into Solr
>> (using XML). Both indexes have about 8GB size and 18milions documents.
>>
>>
>>> 2) Did you try using your Solr 1.4 index (and configs) directly in Solr
>>> 3.5 w/o rebuilding from scratch?
>>
>> Yes I used the same configs in solr 1.4 and solr 3.5 (adding only line
>> about "luceneMatchVersion")
>> As I see in example of solr 3.5 in repository (solrconfig.xml) there
>> are not many diffrences.
>>
>>> 3) You said you build the new index on a new mmachine, but then you said
>>> you used a slave where the performanne was worse then Solr 1.4 "on the
>>> same machine" ... are you running both the Solr 1.4 and Solr 3.5
>>> instances
>>> concurrently on your slave machine?  How much physical ram is on that
>>> machine? what JVM options are using when running the Solr 3.5 instance?
>>> what servlet container are you using?
>>
>> Mayby I didn't wrote precisely enough. I have some machine on which
>> there is master node. I have second machine on which there is slave. I
>> tested solr 1.4 on that machine, then turned it off and turned on
>> solr-3.5. I have 36GB RAM on that machine.
>> On both - solr 1.4 and 3.5 configuration of JVM is the same, and the
>> same servlet container ... jetty-6
>>
>> JVM options: -server -Xms12000m -Xmx12000m -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC -XX:NewSize=1500m -XX:ParallelGCThreads=8
>> -XX:CMSInitiatingOccupancyFraction=60
>>
>>> 4) what does your request handler configuration look like?  do you have
>>> any default/invariant/appended request params?
>>
>> 
>>        
>>        explicit
>>        
>> 
>> > class="org.apache.solr.handler.admin.AdminHandlers" />
>> 
>>        
>>                        
>>                > name="masterUrl">http://${masterHost}:${masterPort}/solr-3.5/${solr.core.instanceDir}replication
>>                00:00:02
>>                5000
>>                1
>>        
>> 
>>
>>
>>> 5) The descriptions youve given of how the performance has changed sound
>>> like you are doing concurrent load testing -- did you do cache warming
>>> before you
>>> started your testing?  how many client threads are hitting the solr
>>> server
>>> at one time?
>>
>> Maybe I wasn't precise enough again. CPU on solr 1.4 was 200% and on
>> solr 3.5 1200%
>> yes there is cache warming. There are 50-100 client threads on both
>> 1.4 and 3.5. There are about 60 requests per second on 3.5 and on 1.4,
>> but on 3.5 responses are slower and CPU usage much higher.
>>
>>> 6) have you tried doing some basic manual testing to see how individual
>>> requests performe?  ie: single client at a time, loading a URL, then
>>> request the same URL again to verify that your Solr caches are in use and
>>> the QTime is low.  If you see slow respone times even when manually
>>> executing single requests at a time, have you tried using "debug=timing"
>>> to see which serach components are contributing the most to the slow
>>> QTimes?
>>
>> Most time is in org.apache.solr.handler.component.QueryComponent and
>> org.apache.solr.handler.component.DebugComponent in process. I didn't
>> comare individual request performance.
>>
>>> 7) What do the cache stats look like on your Solr 3.5 instance after
>>> you've done some of this timing testing?  the output of...
>>>
>>> http://localhost:8983/solr/admin/mbeans?cat=CACHE&stats=true&wt=json

Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
* 1st question (ls from index directory)

solr 1.4

-rw-r--r-- 1 user user2180582 Nov 30 07:26 _3g1_cf.del
-rw-r--r-- 1 user user 5190652802 Nov 28 17:57 _3g1.fdt
-rw-r--r-- 1 user user  139556724 Nov 28 17:57 _3g1.fdx
-rw-r--r-- 1 user user   4963 Nov 28 17:56 _3g1.fnm
-rw-r--r-- 1 user user 1879006175 Nov 28 18:01 _3g1.frq
-rw-r--r-- 1 user user  513919573 Nov 28 18:01 _3g1.prx
-rw-r--r-- 1 user user2745451 Nov 28 18:01 _3g1.tii
-rw-r--r-- 1 user user  218731810 Nov 28 18:01 _3g1.tis
-rw-r--r-- 1 user user 275268 Nov 30 07:26 _3uu_1a.del
-rw-r--r-- 1 user user  666375513 Nov 30 03:35 _3uu.fdt
-rw-r--r-- 1 user user   17616636 Nov 30 03:35 _3uu.fdx
-rw-r--r-- 1 user user   4884 Nov 30 03:35 _3uu.fnm
-rw-r--r-- 1 user user  243847897 Nov 30 03:35 _3uu.frq
-rw-r--r-- 1 user user   64791316 Nov 30 03:35 _3uu.prx
-rw-r--r-- 1 user user 545317 Nov 30 03:35 _3uu.tii
-rw-r--r-- 1 user user   42993472 Nov 30 03:35 _3uu.tis
-rw-r--r-- 1 user user   1178 Nov 30 07:26 _3wj_1.del
-rw-r--r-- 1 user user2813124 Nov 30 07:26 _3wj.fdt
-rw-r--r-- 1 user user  74852 Nov 30 07:26 _3wj.fdx
-rw-r--r-- 1 user user   2175 Nov 30 07:26 _3wj.fnm
-rw-r--r-- 1 user user 911051 Nov 30 07:26 _3wj.frq
-rw-r--r-- 1 user user  4 Nov 30 07:26 _3wj.nrm
-rw-r--r-- 1 user user 285405 Nov 30 07:26 _3wj.prx
-rw-r--r-- 1 user user   7951 Nov 30 07:26 _3wj.tii
-rw-r--r-- 1 user user 624702 Nov 30 07:26 _3wj.tis
-rw-r--r-- 1 user user   35859092 Nov 30 07:26 _3wk.fdt
-rw-r--r-- 1 user user 958148 Nov 30 07:26 _3wk.fdx
-rw-r--r-- 1 user user   4104 Nov 30 07:26 _3wk.fnm
-rw-r--r-- 1 user user   12228212 Nov 30 07:26 _3wk.frq
-rw-r--r-- 1 user user3438508 Nov 30 07:26 _3wk.prx
-rw-r--r-- 1 user user  58672 Nov 30 07:26 _3wk.tii
-rw-r--r-- 1 user user4621519 Nov 30 07:26 _3wk.tis
-rw-r--r-- 1 user user  0 Nov 30 07:27
lucene-9445a367a714cc9bf70d0ebdf83b9e01-write.lock
-rw-r--r-- 1 user user   1010 Nov 30 07:26 segments_2tr
-rw-r--r-- 1 user user 20 Nov 17 14:06 segments.gen

solr 3.5 (the dates are older because I turned off feeding the 3.5 instance)

-rw-r--r-- 1 user user2188376 Nov 29 13:10 _2x_6g.del
-rw-r--r-- 1 user user 4955406209 Nov 28 17:38 _2x.fdt
-rw-r--r-- 1 user user  140054140 Nov 28 17:38 _2x.fdx
-rw-r--r-- 1 user user   4852 Nov 28 17:37 _2x.fnm
-rw-r--r-- 1 user user 1845719205 Nov 28 17:42 _2x.frq
-rw-r--r-- 1 user user  497871055 Nov 28 17:42 _2x.prx
-rw-r--r-- 1 user user3006635 Nov 28 17:42 _2x.tii
-rw-r--r-- 1 user user  230304265 Nov 28 17:42 _2x.tis
-rw-r--r-- 1 user user  50128 Nov 29 13:10 _5s_48.del
-rw-r--r-- 1 user user  116159640 Nov 29 00:25 _5s.fdt
-rw-r--r-- 1 user user3206268 Nov 29 00:25 _5s.fdx
-rw-r--r-- 1 user user   4963 Nov 29 00:25 _5s.fnm
-rw-r--r-- 1 user user   44556139 Nov 29 00:25 _5s.frq
-rw-r--r-- 1 user user   11405232 Nov 29 00:25 _5s.prx
-rw-r--r-- 1 user user 149965 Nov 29 00:25 _5s.tii
-rw-r--r-- 1 user user   11662163 Nov 29 00:25 _5s.tis
-rw-r--r-- 1 user user  63191 Nov 29 13:10 _97_1o.del
-rw-r--r-- 1 user user  145482785 Nov 29 08:08 _97.fdt
-rw-r--r-- 1 user user4042300 Nov 29 08:08 _97.fdx
-rw-r--r-- 1 user user   4963 Nov 29 08:08 _97.fnm
-rw-r--r-- 1 user user   55361299 Nov 29 08:08 _97.frq
-rw-r--r-- 1 user user   14181208 Nov 29 08:08 _97.prx
-rw-r--r-- 1 user user 187731 Nov 29 08:08 _97.tii
-rw-r--r-- 1 user user   14617940 Nov 29 08:08 _97.tis
-rw-r--r-- 1 user user  21310 Nov 29 13:10 _9q_1a.del
-rw-r--r-- 1 user user   49864395 Nov 29 09:19 _9q.fdt
-rw-r--r-- 1 user user1361884 Nov 29 09:19 _9q.fdx
-rw-r--r-- 1 user user   4963 Nov 29 09:19 _9q.fnm
-rw-r--r-- 1 user user   17879364 Nov 29 09:19 _9q.frq
-rw-r--r-- 1 user user4970178 Nov 29 09:19 _9q.prx
-rw-r--r-- 1 user user  75969 Nov 29 09:19 _9q.tii
-rw-r--r-- 1 user user5932085 Nov 29 09:19 _9q.tis
-rw-r--r-- 1 user user   62661357 Nov 29 10:19 _a6.fdt
-rw-r--r-- 1 user user1717820 Nov 29 10:19 _a6.fdx
-rw-r--r-- 1 user user   4963 Nov 29 10:19 _a6.fnm
-rw-r--r-- 1 user user   23283028 Nov 29 10:19 _a6.frq
-rw-r--r-- 1 user user6196945 Nov 29 10:19 _a6.prx
-rw-r--r-- 1 user user  92528 Nov 29 10:19 _a6.tii
-rw-r--r-- 1 user user7209783 Nov 29 10:19 _a6.tis
-rw-r--r-- 1 user user  26871 Nov 29 13:10 _a6_y.del
-rw-r--r-- 1 user user   16372020 Nov 29 10:39 _ab.fdt
-rw-r--r-- 1 user user 455476 Nov 29 10:39 _ab.fdx
-rw-r--r-- 1 user user   4963 Nov 29 10:39 _ab.fnm
-rw-r--r-- 1 user user6025966 Nov 29 10:39 _ab.frq
-rw-r--r-- 1 user user1622841 Nov 29 10:39 _ab.prx
-rw-r--r-- 1 user user  35252 Nov 29 10:39 _ab.tii
-rw-r--r-- 1 user user2766468 Nov 29 10:39 _ab.tis
-rw-r--r-- 1 user user   7147 Nov 29 13:10 _ab_u.del
-rw-r--r-- 1 user user   14818116 Nov 29 11:09 _aj.fdt
-rw-r--r-- 1 user user 409356 Nov 29 11:09 _aj.fdx
-rw-r--r-- 1 user user   4963 Nov 29 11:09 _aj.fnm
-rw-r--r-- 1 user user5461353 N

Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
I attach a chart which presents CPU usage. Solr 3.5 uses almost all of the CPU
(left side of the chart).
At the beginning of the chart there were about 60 rps and then about 100 rps
(before turning off Solr 3.5). Then 1.4 was turned on with
100 rps.

--
Pawel

On Wed, Nov 30, 2011 at 9:07 AM, Pawel Rog  wrote:
> * 1st question (ls from index directory)
>
> solr 1.4
>
> -rw-r--r-- 1 user user    2180582 Nov 30 07:26 _3g1_cf.del
> -rw-r--r-- 1 user user 5190652802 Nov 28 17:57 _3g1.fdt
> -rw-r--r-- 1 user user  139556724 Nov 28 17:57 _3g1.fdx
> -rw-r--r-- 1 user user       4963 Nov 28 17:56 _3g1.fnm
> -rw-r--r-- 1 user user 1879006175 Nov 28 18:01 _3g1.frq
> -rw-r--r-- 1 user user  513919573 Nov 28 18:01 _3g1.prx
> -rw-r--r-- 1 user user    2745451 Nov 28 18:01 _3g1.tii
> -rw-r--r-- 1 user user  218731810 Nov 28 18:01 _3g1.tis
> -rw-r--r-- 1 user user     275268 Nov 30 07:26 _3uu_1a.del
> -rw-r--r-- 1 user user  666375513 Nov 30 03:35 _3uu.fdt
> -rw-r--r-- 1 user user   17616636 Nov 30 03:35 _3uu.fdx
> -rw-r--r-- 1 user user       4884 Nov 30 03:35 _3uu.fnm
> -rw-r--r-- 1 user user  243847897 Nov 30 03:35 _3uu.frq
> -rw-r--r-- 1 user user   64791316 Nov 30 03:35 _3uu.prx
> -rw-r--r-- 1 user user     545317 Nov 30 03:35 _3uu.tii
> -rw-r--r-- 1 user user   42993472 Nov 30 03:35 _3uu.tis
> -rw-r--r-- 1 user user       1178 Nov 30 07:26 _3wj_1.del
> -rw-r--r-- 1 user user    2813124 Nov 30 07:26 _3wj.fdt
> -rw-r--r-- 1 user user      74852 Nov 30 07:26 _3wj.fdx
> -rw-r--r-- 1 user user       2175 Nov 30 07:26 _3wj.fnm
> -rw-r--r-- 1 user user     911051 Nov 30 07:26 _3wj.frq
> -rw-r--r-- 1 user user          4 Nov 30 07:26 _3wj.nrm
> -rw-r--r-- 1 user user     285405 Nov 30 07:26 _3wj.prx
> -rw-r--r-- 1 user user       7951 Nov 30 07:26 _3wj.tii
> -rw-r--r-- 1 user user     624702 Nov 30 07:26 _3wj.tis
> -rw-r--r-- 1 user user   35859092 Nov 30 07:26 _3wk.fdt
> -rw-r--r-- 1 user user     958148 Nov 30 07:26 _3wk.fdx
> -rw-r--r-- 1 user user       4104 Nov 30 07:26 _3wk.fnm
> -rw-r--r-- 1 user user   12228212 Nov 30 07:26 _3wk.frq
> -rw-r--r-- 1 user user    3438508 Nov 30 07:26 _3wk.prx
> -rw-r--r-- 1 user user      58672 Nov 30 07:26 _3wk.tii
> -rw-r--r-- 1 user user    4621519 Nov 30 07:26 _3wk.tis
> -rw-r--r-- 1 user user          0 Nov 30 07:27
> lucene-9445a367a714cc9bf70d0ebdf83b9e01-write.lock
> -rw-r--r-- 1 user user       1010 Nov 30 07:26 segments_2tr
> -rw-r--r-- 1 user user         20 Nov 17 14:06 segments.gen
>
> solr 3.5 (dates are older - because I turned off feeding 3.5 instance)
>
> -rw-r--r-- 1 user user    2188376 Nov 29 13:10 _2x_6g.del
> -rw-r--r-- 1 user user 4955406209 Nov 28 17:38 _2x.fdt
> -rw-r--r-- 1 user user  140054140 Nov 28 17:38 _2x.fdx
> -rw-r--r-- 1 user user       4852 Nov 28 17:37 _2x.fnm
> -rw-r--r-- 1 user user 1845719205 Nov 28 17:42 _2x.frq
> -rw-r--r-- 1 user user  497871055 Nov 28 17:42 _2x.prx
> -rw-r--r-- 1 user user    3006635 Nov 28 17:42 _2x.tii
> -rw-r--r-- 1 user user  230304265 Nov 28 17:42 _2x.tis
> -rw-r--r-- 1 user user      50128 Nov 29 13:10 _5s_48.del
> -rw-r--r-- 1 user user  116159640 Nov 29 00:25 _5s.fdt
> -rw-r--r-- 1 user user    3206268 Nov 29 00:25 _5s.fdx
> -rw-r--r-- 1 user user       4963 Nov 29 00:25 _5s.fnm
> -rw-r--r-- 1 user user   44556139 Nov 29 00:25 _5s.frq
> -rw-r--r-- 1 user user   11405232 Nov 29 00:25 _5s.prx
> -rw-r--r-- 1 user user     149965 Nov 29 00:25 _5s.tii
> -rw-r--r-- 1 user user   11662163 Nov 29 00:25 _5s.tis
> -rw-r--r-- 1 user user      63191 Nov 29 13:10 _97_1o.del
> -rw-r--r-- 1 user user  145482785 Nov 29 08:08 _97.fdt
> -rw-r--r-- 1 user user    4042300 Nov 29 08:08 _97.fdx
> -rw-r--r-- 1 user user       4963 Nov 29 08:08 _97.fnm
> -rw-r--r-- 1 user user   55361299 Nov 29 08:08 _97.frq
> -rw-r--r-- 1 user user   14181208 Nov 29 08:08 _97.prx
> -rw-r--r-- 1 user user     187731 Nov 29 08:08 _97.tii
> -rw-r--r-- 1 user user   14617940 Nov 29 08:08 _97.tis
> -rw-r--r-- 1 user user      21310 Nov 29 13:10 _9q_1a.del
> -rw-r--r-- 1 user user   49864395 Nov 29 09:19 _9q.fdt
> -rw-r--r-- 1 user user    1361884 Nov 29 09:19 _9q.fdx
> -rw-r--r-- 1 user user       4963 Nov 29 09:19 _9q.fnm
> -rw-r--r-- 1 user user   17879364 Nov 29 09:19 _9q.frq
> -rw-r--r-- 1 user user    4970178 Nov 29 09:19 _9q.prx
> -rw-r--r-- 1 user user      75969 Nov 29 09:19 _9q.tii
> -rw-r--r-- 1 user user    5932085 Nov 29 09:19 _9q.tis
> -rw-r--r-- 1 user user   62661357 Nov 29 10:19 _a6.fdt
> -rw-r--r-- 1 user user    1717820 Nov 29 10:19 _a6.fdx
> -rw-r--r-- 1 user user       4963 Nov 29 10:19 _a6.fnm
> -rw-r--r-- 1 user user   23283028 Nov 29 10:19 _a6.frq
> -rw-r--r-- 1 user user    6196945 Nov 29 10:19 _a6.prx
> -rw-r--r-- 1 user user      92528 Nov 29 10:19 _a6.tii
> -rw-r--r-- 1 user user    7209783 Nov 29 10:19 _a6.ti

Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
I made a thread dump. Most of the active threads have a trace like this:

"471003383@qtp-536357250-245" - Thread t@270
   java.lang.Thread.State: RUNNABLE
at 
org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:702)
at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1144)
at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:362)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:378)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at 
org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at 
org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)


On Wed, Nov 30, 2011 at 10:31 AM, Pawel Rog  wrote:
> I attach chart which presents cpu usage. Solr 3.5 uses almost all cpu
> (left side of chart).
> at the begining of chart there was about 60rps and about 100rps
> (before turning off solr 3.5). Then there was 1.4 turned on with
> 100rps.
>
> --
> Pawel
>
> On Wed, Nov 30, 2011 at 9:07 AM, Pawel Rog  wrote:
>> * 1st question (ls from index directory)
>>
>> solr 1.4
>>
>> -rw-r--r-- 1 user user    2180582 Nov 30 07:26 _3g1_cf.del
>> -rw-r--r-- 1 user user 5190652802 Nov 28 17:57 _3g1.fdt
>> -rw-r--r-- 1 user user  139556724 Nov 28 17:57 _3g1.fdx
>> -rw-r--r-- 1 user user       4963 Nov 28 17:56 _3g1.fnm
>> -rw-r--r-- 1 user user 1879006175 Nov 28 18:01 _3g1.frq
>> -rw-r--r-- 1 user user  513919573 Nov 28 18:01 _3g1.prx
>> -rw-r--r-- 1 user user    2745451 Nov 28 18:01 _3g1.tii
>> -rw-r--r-- 1 user user  218731810 Nov 28 18:01 _3g1.tis
>> -rw-r--r-- 1 user user     275268 Nov 30 07:26 _3uu_1a.del
>> -rw-r--r-- 1 user user  666375513 Nov 30 03:35 _3uu.fdt
>> -rw-r--r-- 1 user user   17616636 Nov 30 03:35 _3uu.fdx
>> -rw-r--r-- 1 user user       4884 Nov 30 03:35 _3uu.fnm
>> -rw-r--r-- 1 user user  243847897 Nov 30 03:35 _3uu.frq
>> -rw-r--r-- 1 user user   64791316 Nov 30 03:35 _3uu.prx
>> -rw-r--r-- 1 user user     545317 Nov 30 03:35 _3uu.tii
>> -rw-r--r-- 1 user user   42993472 Nov 30 03:35 _3uu.tis
>> -rw-r--r-- 1 user user       1178 Nov 30 07:26 _3wj_1.del
>> -rw-r--r-- 1 user user    2813124 Nov 30 07:26 _3wj.fdt
>> -rw-r--r-- 1 user user      74852 Nov 30 07:26 _3wj.fdx
>> -rw-r--r-- 1 user user       2175 Nov 30 07:26 _3wj.fnm
>> -rw-r--r-- 1 user user     911051 Nov 30 07:26 _3wj.frq
>> -rw-r--r-- 1 user user          4 Nov 30 07:26 _3wj.nrm
>> -rw-r--r-- 1 user user     285405 Nov 30 07:26 _3wj.prx
>> -rw-r--r-- 1 user user       7951 Nov 30 07:26 _3wj.tii
>> -rw-r--r-- 1 user user     624702 Nov 30 07:26 _3wj.tis
>> -rw-r--r-- 1 user user   35859092 Nov 30 07:26 _3wk.fdt
>> -rw-r--r-- 1 user user     958148 Nov 30 07:26 _3wk.fdx
>> -rw-r--r-- 1 user user       4104 Nov 30 07:26 _3wk.fnm
>> -rw-r--r-- 1 user user   12228212 Nov 30 07:26 _3wk.frq
>> -rw-r--r-- 1 user user    3438508 Nov 30 07:26 _3wk.prx
>> -rw-r--r-- 1 user user      58672 Nov 30 07:26 _3wk.tii
>> -rw-r--r-- 1 user user    4621519 Nov 30 07:26 _3wk.tis
>&

Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
http://imageshack.us/photo/my-images/838/cpuusage.png/

On Wed, Nov 30, 2011 at 9:18 PM, Chris Hostetter
 wrote:
>
> : I attach chart which presents cpu usage. Solr 3.5 uses almost all cpu
> : (left side of chart).
>
> FWIW: The mailing list software filters out most attachments (there are
> some exceptions for certain text mime types)
>
>
> -Hoss


Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
On Wed, Nov 30, 2011 at 9:05 PM, Chris Hostetter
 wrote:
>
> : I tried to use index from 1.4 (load was the same as on index from 3.5)
> : but there was problem with synchronization with master (invalid
> : javabin format)
> : Then I built new index on 3.5 with luceneMatchVersion LUCENE_35
>
> why would you need to re-replicate from the master?
>
> You already have a copy of the Solr 1.4 index on the slave machine where
> you are doing testing correct? Just (make sure Solr 1.4 isn't running
> and) point Solr 3.5 at that solr home directory for the configs and data
> and time that.  (Just because Solr 3.5 can't replicate from Solr 1.4
> over HTTP doesn't mean it can't open indexes built by Solr 1.4)
>

I did that before sending the earlier e-mail. The effect was the same.

> It's important to understand if the discrepencies you are seeing have to
> do with *building* the index under Solr 3.5, or *searching* in Solr 3.5.
>
> : reader : 
> SolrIndexReader{this=8cca36c,r=ReadOnlyDirectoryReader@8cca36c,refCnt=1,segments=4}
> : readerDir : 
> org.apache.lucene.store.NIOFSDirectory@/data/solr_data/itemsfull/index
> :
> : solr 3.5
> : reader : 
> SolrIndexReader{this=3d01e178,r=ReadOnlyDirectoryReader@3d01e178,refCnt=1,segments=14}
> : readerDir : 
> org.apache.lucene.store.MMapDirectory@/data/solr_data_350/itemsfull/index
> : lockFactory=org.apache.lucene.store.NativeFSLockFactory@294ce5eb
>
> As mentioned, the difference in the number of segments may be contributing
> to the perf differences you are seeing, so optimizing both indexes (or
> doing a partial optimize of your 3.5 index down to 4 segments) for
> comparison would probably be worthwhile.  (and if that is the entirety of
> hte problem, then explicitly configuring a MergePolicy may help you in the
> long run)
>
> but independent of that I would like to suggest that you first try
> explicitly configuring Solr 3.5 to use NIOFSDirectory so it's consistent
> with what Solr 1.4 was doing (I'm told MMapDirectory should be faster, but
> maybe there's something about your setup that makes that not true) So it
> would be helpful to also try adding this to your 3.5 solrconfig.xml and
> testing ...
>
> 
>
> : I made some test with quiet heavy query (with frange). In both cases
> : (1.4 and 3.5) I used the same newSearcher queries and started solr
> : without any load.
> : Results of debug timing
>
> Ok, well ... honestly: giving us *one* example of the timing data for
> *one* query (w/o even telling us what the exact query was) ins't really
> anything we can use to help you ... the crux of the question was: "was the
> slow performance you are seeing only under heavy load or was it also slow
> when you did manual testing?"
>
> : When I send fewer than 60 rps I see that in comparsion to 1.4 median
> : response time is worse, avarage is worse but maximum time is better.
> : It doesn't change propotion of cpu usage (3.5 uses much more cpu).
>
> How much "fewer then 60 rps" ? ... I'm trying to understand if the
> problems you are seeing are solely happening under "heavy" concurrent
> load, or if you are seeing Solr 3.5 consistently respond much slower then
> Solr 1.4 even with a single client?
>
> Also: I may still be missunderstanding how you are generating load, and
> wether you are throttling the clients, but seeing higher CPU utilization
> in Solr 3.5 isn't neccessarily an indication of something going wrong --
> in some cases higher CPU% (particularly under heavy concurrent load on a
> multi-core machine) could just mean that Solr is now capable of utilizing
> more CPU to process parallel request, where as previous versions might have
> been hitting other bottle necks. -- but that doesn't explain the slower
> response times. that's what concerns me the most.

I don't think that 1200% CPU usage with the same traffic is better
than 200%. I think you are wrong :) Using Solr 1.4 I can reach 300 rps
before hitting 1200% CPU, while Solr 3.5 hits 1200% at only 60 rps.

>
> FWIW: I'm still wondering what the stats from your caches wound up looking
> like on both Solr 1.4 and Solr 3.5...
>
>>> 7) What do the cache stats look like on your Solr 3.5 instance after
>>> you've done some of this timing testing?  the output of...
>>> http://localhost:8983/solr/admin/mbeans?cat=CACHE&stats=true&wt=json&indent=true
>>> ...would be helpful. NOTE: you may need to add this to your
>>> solrconfig.xml
>>> for that URL to work...
>>>  '
>
> ...but i don't think /admin/mbeans exists in Solr 1.4, so you may just
> have to get the details from stats.jsp.
>

I forgot to write it earlier. The query cache hit rate was about 0.03 (in
Solr 1.4 and 3.5). The filter cache hit rate was about 0.35 in both cases.
The document cache hit rate was about 0.55 in both cases.

Wasn't the thread trace helpful for diagnosing the problem? As I mentioned
before, almost all threads were in the same line of code in
SolrIndexSearcher.


Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
Yes, it works. Thanks a lot.
But I still don't understand why that option was efficient in Solr 1.4
but not in Solr 3.5.
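
For the record, the setting in question sits in the <query> section of
solrconfig.xml; disabled it looks like this:

<useFilterForSortedQuery>false</useFilterForSortedQuery>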

On Wed, Nov 30, 2011 at 11:01 PM, Yonik Seeley
 wrote:
> On Wed, Nov 30, 2011 at 7:08 AM, Pawel Rog  wrote:
>>        at 
>> org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:702)
>>        at 
>> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1144)
>>        at 
>> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:362)
>
> This is interesting, and suggests that you have
> useFilterForSortedQuery set in your solrconfig.xml
> Can you try removing it (or setting it to false)?
>
> -Yonik
> http://www.lucidimagination.com


Re: Realtime profile data

2012-02-07 Thread Pawel Rog
Thank you. I'll try NRT and some post-filter :)


On Tue, Feb 7, 2012 at 3:09 PM, Erick Erickson  wrote:
> You have several options:
> 1> if you can go to trunk (bleeding edge, I admit), you can
>     get into the near real time (NRT) stuff.
> 2> You could maintain essentially a post-filter step where
>      your app maintains a list of deleted messages and
>     removes them from the response. This will cause
>     some of your counts (e.g. facets, grouping) to be slightly
>     off
> 3> Train your users to expect whatever latency you've
>      built into the system (i.e. indexing, commit and replication)
>
> Best
> Erick
>
> On Mon, Feb 6, 2012 at 10:42 AM, Pawel Rog  wrote:
>> Hello. I have some problem which i'd like to solve using solr. I have
>> user profile which has some kind of messages in it. User can filter
>> messages, sort them etc. The problem is with delete operation. If user
>> click on message to delete it it's very hard to update index of solr
>> in real time. When user deletes message, it will be still visible.
>> Have you idea how to solve problem with removing data?


Re: Help with duplicate unique IDs

2012-03-02 Thread Pawel Rog
Once I had the same problem. I didn't know what was going on. After a few
moments of analysis I created a completely new index and removed the old one
(I didn't have enough time to analyze the problem). The problem didn't come back
any more.

--
Regards,
Pawel

On Fri, Mar 2, 2012 at 8:23 PM, Thomas Dowling  wrote:
> In a Solr index of journal articles, I thought I was safe reindexing
> articles because their unique ID would cause the new record in the index to
> overwrite the old one. (As stated at
> http://wiki.apache.org/solr/SchemaXml#The_Unique_Key_Field - right?)
>
> My schema.xml includes:
>
> ...
>   required="true"/>
> ...
>
> And:
>
> id
>
> And yet I can compose a query with two hits in the index, showing:
>
> #1: 03405443/v66i0003/347_mrirtaitmbpa
> #2: 03405443/v66i0003/347_mrirtaitmbpa
>
>
> Can anyone give pointers on where I'm screwing something up?
>
>
> Thomas Dowling
> thomas.dowl...@gmail.com


Re: Boosting terms

2012-03-19 Thread Pawel Rog
Thanks a lot, I'll read it :) It seems to be helpful.

On Sun, Mar 18, 2012 at 8:58 PM, Ahmet Arslan  wrote:
>
>> Is there any possibility to boost
>> terms during indexing? Searching
>> that using google I found information that there is no such
>> feature in
>> Solr (we can only boost fields). Is it true?
>
> Yes, only field and document boosting exist.
>
> You might find this article interesting.
>
> http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/
>
>


Re: Usage of * as a first character in wild card query

2012-03-25 Thread Pawel Rog
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ReversedWildcardFilterFactory
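
A minimal example for the index-time analyzer of the field (the attribute
values are only illustrative):

<filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
        maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>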

On Mon, Mar 26, 2012 at 7:08 AM, Ishan  wrote:

> Hi,
>
> I need to query on solr with * as a first character in query.
> For eg. Content indexed in*  "Be careful"
> *and query i want to fire is  **ful
> *But solr does not allow * as  a first character in wildcard query.
> Plz let me know if there is any other alternative for doing this*.
> *
> --
> Thanks & Regards,
> Isan Fulia.
>


Re: solr hangs

2012-04-11 Thread Pawel Rog
You wrote that you see an "OutOfMemoryError". I had such
problems when my caches were too big. It means that there is no more free
memory in the JVM and a full GC probably starts running. How big is your Java
heap? Maybe the cache sizes in your Solr are too big for your JVM
settings.
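
For comparison, the cache definitions in solrconfig.xml look roughly like this
(the sizes here are just the defaults from the example config):

<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>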

--
Regards,
Pawel

On Tue, Apr 10, 2012 at 9:51 PM, Peter Markey  wrote:

> Hello,
>
> I have a solr cloud setup based on a blog (
> http://outerthought.org/blog/491-ot.html) and am able to bring up the
> instances and cores. But when I start indexing data (through csv update),
> the core throws a out of memory exception (null:java.lang.RuntimeException:
> java.lang.OutOfMemoryError: unable to create new native thread). The thread
> dump from new solr ui is below:
>
> cmdDistribExecutor-8-thread-777 (827)
>
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1bd11b79
>
>   - sun.misc.Unsafe.park​(Native Method)
>   - java.util.concurrent.locks.LockSupport.park​(LockSupport.java:186)
>   -
>
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await
> (AbstractQueuedSynchronizer.java:2043)
>   -
>
> org.apache.http.impl.conn.tsccm.WaitingThread.await​(WaitingThread.java:158)
>   -
>   org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking
> (ConnPoolByRoute.java:403)
>   -
>   org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry
> (ConnPoolByRoute.java:300)
>   -
>
> org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection
> (ThreadSafeClientConnManager.java:224)
>   -
>   org.apache.http.impl.client.DefaultRequestDirector.execute
> (DefaultRequestDirector.java:401)
>   -
>   org.apache.http.impl.client.AbstractHttpClient.execute
> (AbstractHttpClient.java:820)
>   -
>   org.apache.http.impl.client.AbstractHttpClient.execute
> (AbstractHttpClient.java:754)
>   -
>   org.apache.http.impl.client.AbstractHttpClient.execute
> (AbstractHttpClient.java:732)
>   -
>   org.apache.solr.client.solrj.impl.HttpSolrServer.request
> (HttpSolrServer.java:304)
>   -
>   org.apache.solr.client.solrj.impl.HttpSolrServer.request
> (HttpSolrServer.java:209)
>   -
>   org.apache.solr.update.SolrCmdDistributor$1.call
> (SolrCmdDistributor.java:320)
>   -
>   org.apache.solr.update.SolrCmdDistributor$1.call
> (SolrCmdDistributor.java:301)
>   - java.util.concurrent.FutureTask$Sync.innerRun​(FutureTask.java:334)
>   - java.util.concurrent.FutureTask.run​(FutureTask.java:166)
>   -
>   java.util.concurrent.Executors$RunnableAdapter.call​(Executors.java:471)
>   - java.util.concurrent.FutureTask$Sync.innerRun​(FutureTask.java:334)
>   - java.util.concurrent.FutureTask.run​(FutureTask.java:166)
>   -
>   java.util.concurrent.ThreadPoolExecutor.runWorker
> (ThreadPoolExecutor.java:1110)
>   -
>   java.util.concurrent.ThreadPoolExecutor$Worker.run
> (ThreadPoolExecutor.java:603)
>   - java.lang.Thread.run​(Thread.java:679)
>
>
>
> Apparently I do see lots of threads like above in the thread dump. I'm
> using latest build from the trunk (Apr 10th). Any insights into this issue
> woudl be really helpful. Thanks a lot.
>


Re: Difference between two solr indexes

2012-04-17 Thread Pawel Rog
If there are only 100'000 documents, dump all document ids and diff them.
If you're using a Linux-based system you can just use simple tools to do it.
Something like this can be helpful:

curl "http://your.hostA:port/solr/index/select?q=*:*&fl=id&wt=csv&rows=100000" > /tmp/idsA
curl "http://your.hostB:port/solr/index/select?q=*:*&fl=id&wt=csv&rows=100000" > /tmp/idsB
diff /tmp/idsA /tmp/idsB | grep "<\|>" | awk '{print $2;}' | sed
's/\(.*\)/<id>\1<\/id>/g' > /tmp/ids_to_delete.xml

Now you have the file. Wrap its contents in <delete> and </delete> and upload
the file into Solr using curl:
curl -X POST -H "Content-Type: text/xml" -d @/tmp/ids_to_delete.xml "http://your.hostA:port/solr/index/update"

On Tue, Apr 17, 2012 at 2:09 PM, nutchsolruser wrote:

> I'm Also seeking solution for similar problem.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Difference-between-two-solr-indexes-tp3916328p3917050.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: FilterCache - maximum size of document set

2012-06-13 Thread Pawel Rog
Thanks for your response.
Yes, maybe you are right. I thought that filters can be larger than 3M. Do all
kinds of filters use a BitSet?
Moreover, maxSize of the filterCache is set to 16000 in my case. There are
evictions during day traffic
but not during night traffic.

The version of Solr I use is 3.5.

I haven't used a memory analyzer yet. Could you write more details about it?

--
Regards,
Pawel

On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson wrote:

> Hmmm, I think you may be looking at the wrong thing here. Generally, a
> filterCache
> entry will be maxDocs/8 (plus some overhead), so in your case they really
> shouldn't be all that large, on the order of 3M/filter. That shouldn't
> vary based
> on the number of docs that match the fq, it's just a bitset. To see if
> that makes any
> sense, take a look at the admin page and the number of evictions in
> your filterCache. If
> that is > 0, you're probably using all the memory you're going to in
> the filterCache during
> the day..
>
> But you haven't indicated what version of Solr you're using, I'm going
> from a
> relatively recent 3x knowledge-base.
>
> Have you put a memory analyzer against your Solr instance to see where
> the memory
> is being used?
>
> Best
> Erick
>
> On Wed, Jun 13, 2012 at 1:05 PM, Pawel  wrote:
> > Hi,
> > I have solr index with about 25M documents. I optimized FilterCache size
> to
> > reach the best performance (considering traffic characteristic that my
> Solr
> > handles). I see that the only way to limit size of a Filter Cace is to
> set
> > number of document sets that Solr can cache. There is no way to set
> memory
> > limit (eg. 2GB, 4GB or something like that). When I process a standard
> > trafiic (during day) everything is fine. But when Solr handle night
> traffic
> > (and the charateristic of requests change) some problems appear. There is
> > JVM out of memory error. I know what is the reason. Some filters on some
> > fields are quite poor filters. They returns 15M of documents or even
> more.
> > You could say 'Just put that into q'. I tried to put that filters into
> > "Query" part but then, the statistics of request processing time (during
> > day) become much worse. Reduction of Filter Cache maxSize is also not
> good
> > solution because during day cache filters are very very helpful.
> > You could be interested in type of filters that I use. These are range
> > filters (I tried standard range filters and frange) - eg. price:[* TO
> > 1]. Some fq with price can return few thousands of results (eg.
> > price:[40 TO 50]), but some (eg. price:[* TO 1]) can return milions
> of
> > documents. I'd also like to avoid solution which will introduce strict
> > ranges that user can choose.
> > Have you any suggestions what can I do? Is there any way to limit for
> > example maximum size of docSet which is cached in FilterCache?
> >
> > --
> > Pawel
>


Re: FilterCache - maximum size of document set

2012-06-14 Thread Pawel Rog
It may be true that the filter cache maxSize is set to too high a value. It is
also true that
we looked at evictions and hit rate earlier. Maybe you are right that
evictions are
not always unwanted. Some time ago we ran tests: there is not such a big
difference in hit rate between a filterCache maxSize of 4000 (hit rate about
85%) and
16000 (hit rate about 91%). I think using the LFU cache could also be helpful,
but
it would make me migrate to 3.6. Do you think it is reasonable to run the slave
on version 3.6 and the master on 3.5?
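
Just to sanity-check the 48G ceiling against our numbers (25M documents,
maxDocs/8 bytes per cached filter):

25,000,000 / 8 ≈ 3 MB per filterCache entry
16,000 entries x 3 MB ≈ 48 GB worst case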

Once again, Thanks for your help

--
Pawel

On Thu, Jun 14, 2012 at 7:22 PM, Erick Erickson wrote:

> Hmmm, your maxSize is pretty high, it may just be that you've set this
> much higher
> than is wise. The maxSize setting governs the number of entries. I'd start
> with
> a much lower number here, and monitor the solr/admin page for both
> hit ratio and evictions. Well, and size too. 16,000 entries puts a
> ceiling of, what,
> 48G on it? Ouch! It sounds like what's happening here is you're just
> accumulating
> more and more fqs over the course of the evening and blowing memory.
>
> Not all FQs will be that big, there's some heuristics in there to just
> store the
> document numbers for sparse filters, maxDocs/8 is pretty much the upper
> bound though.
>
> Evictions are not necessarily a bad thing, the hit-ratio is important
> here. And
> if you're using a bare NOW in your filter queries, you're probably never
> re-using them anyway, see:
>
> http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/
>
> I really question whether this limit is reasonable, but you know your
> situation best.
>
> Best
> Erick
>
> On Wed, Jun 13, 2012 at 5:40 PM, Pawel Rog  wrote:
> > Thanks for your response
> > Yes, maybe you are right. I thought that filters can be larger than 3M.
> All
> > kinds of filters uses BitSet?
> > Moreover maxSize of filterCache is set to 16000 in my case. There are
> > evictions during day traffic
> > but not during night traffic.
> >
> > Version of Solr which I use is 3.5
> >
> > I haven't used Memory Anayzer yet. Could you write more details about it?
> >
> > --
> > Regards,
> > Pawel
> >
> > On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson <
> erickerick...@gmail.com>wrote:
> >
> >> Hmmm, I think you may be looking at the wrong thing here. Generally, a
> >> filterCache
> >> entry will be maxDocs/8 (plus some overhead), so in your case they
> really
> >> shouldn't be all that large, on the order of 3M/filter. That shouldn't
> >> vary based
> >> on the number of docs that match the fq, it's just a bitset. To see if
> >> that makes any
> >> sense, take a look at the admin page and the number of evictions in
> >> your filterCache. If
> >> that is > 0, you're probably using all the memory you're going to in
> >> the filterCache during
> >> the day..
> >>
> >> But you haven't indicated what version of Solr you're using, I'm going
> >> from a
> >> relatively recent 3x knowledge-base.
> >>
> >> Have you put a memory analyzer against your Solr instance to see where
> >> the memory
> >> is being used?
> >>
> >> Best
> >> Erick
> >>
> >> On Wed, Jun 13, 2012 at 1:05 PM, Pawel  wrote:
> >> > Hi,
> >> > I have solr index with about 25M documents. I optimized FilterCache
> size
> >> to
> >> > reach the best performance (considering traffic characteristic that my
> >> Solr
> >> > handles). I see that the only way to limit size of a Filter Cace is to
> >> set
> >> > number of document sets that Solr can cache. There is no way to set
> >> memory
> >> > limit (eg. 2GB, 4GB or something like that). When I process a standard
> >> > trafiic (during day) everything is fine. But when Solr handle night
> >> traffic
> >> > (and the charateristic of requests change) some problems appear.
> There is
> >> > JVM out of memory error. I know what is the reason. Some filters on
> some
> >> > fields are quite poor filters. They returns 15M of documents or even
> >> more.
> >> > You could say 'Just put that into q'. I tried to put that filters into
> >> > "Query" part but then, the statistics of request processing time
> (during
> >> > day) become much worse. Reduction of Filter Cache maxSize is also not
> >> good
> >> > solution because during day cache filters are very very helpful.
> >> > You could be interested in type of filters that I use. These are range
> >> > filters (I tried standard range filters and frange) - eg. price:[* TO
> >> > 1]. Some fq with price can return few thousands of results (eg.
> >> > price:[40 TO 50]), but some (eg. price:[* TO 1]) can return
> milions
> >> of
> >> > documents. I'd also like to avoid solution which will introduce strict
> >> > ranges that user can choose.
> >> > Have you any suggestions what can I do? Is there any way to limit for
> >> > example maximum size of docSet which is cached in FilterCache?
> >> >
> >> > --
> >> > Pawel
> >>
>


Re: FilterCache - maximum size of document set

2012-06-15 Thread Pawel Rog
Thanks.
I don't use NOW in queries. All my filters with a timestamp are rounded to
hundreds of
seconds to increase the hit rate. The only problem could be the price filters,
which can be
quite varied (users are unpredictable :P), but moving those filters out of fq or
setting cache=false
is also a bad idea ... I checked it :) Load rose three times :)
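
(For reference, the cache=false test looked roughly like this; the price bound
is only an example:)

fq={!cache=false}price:[* TO 100]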

--
Pawel

On Fri, Jun 15, 2012 at 1:30 PM, Erick Erickson wrote:

> Test first, of course, but slave on 3.6 and master on 3.5 should be
> fine. If you're
> getting evictions with the cache settings that high, you really want
> to look at why.
>
> Note that in particular, using NOW in your filter queries virtually
> guarantees
> that they won't be re-used as per the link I sent yesterday.
>
> Best
> Erick
>
> On Fri, Jun 15, 2012 at 1:15 AM, Pawel Rog  wrote:
> > It can be true that filters cache max size is set to high value. That is
> > also true that.
> > We looked at evictions and hit rate earlier. Maybe you are right that
> > evictions are
> > not always unwanted. Some time ago we made tests. There are not so high
> > difference in hit rate when filters maxSize is set to 4000 (hit rate
> about
> > 85%) and
> > 16000 (hitrate about 91%). I think that also using LFU cache can be
> helpful
> > but
> > it makes me to migrate to 3.6. Do you think it is reasonable to use
> slave on
> > version 3.6 and master on 3.5?
> >
> > Once again, Thanks for your help
> >
> > --
> > Pawel
> >
> > On Thu, Jun 14, 2012 at 7:22 PM, Erick Erickson  >wrote:
> >
> >> Hmmm, your maxSize is pretty high, it may just be that you've set this
> >> much higher
> >> than is wise. The maxSize setting governs the number of entries. I'd
> start
> >> with
> >> a much lower number here, and monitor the solr/admin page for both
> >> hit ratio and evictions. Well, and size too. 16,000 entries puts a
> >> ceiling of, what,
> >> 48G on it? Ouch! It sounds like what's happening here is you're just
> >> accumulating
> >> more and more fqs over the course of the evening and blowing memory.
> >>
> >> Not all FQs will be that big, there's some heuristics in there to just
> >> store the
> >> document numbers for sparse filters, maxDocs/8 is pretty much the upper
> >> bound though.
> >>
> >> Evictions are not necessarily a bad thing, the hit-ratio is important
> >> here. And
> >> if you're using a bare NOW in your filter queries, you're probably never
> >> re-using them anyway, see:
> >>
> >>
> http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/
> >>
> >> I really question whether this limit is reasonable, but you know your
> >> situation best.
> >>
> >> Best
> >> Erick
> >>
> >> On Wed, Jun 13, 2012 at 5:40 PM, Pawel Rog 
> wrote:
> >> > Thanks for your response
> >> > Yes, maybe you are right. I thought that filters can be larger than
> 3M.
> >> All
> >> > kinds of filters uses BitSet?
> >> > Moreover maxSize of filterCache is set to 16000 in my case. There are
> >> > evictions during day traffic
> >> > but not during night traffic.
> >> >
> >> > Version of Solr which I use is 3.5
> >> >
> >> > I haven't used Memory Anayzer yet. Could you write more details about
> it?
> >> >
> >> > --
> >> > Regards,
> >> > Pawel
> >> >
> >> > On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson <
> >> erickerick...@gmail.com>wrote:
> >> >
> >> >> Hmmm, I think you may be looking at the wrong thing here. Generally,
> a
> >> >> filterCache
> >> >> entry will be maxDocs/8 (plus some overhead), so in your case they
> >> really
> >> >> shouldn't be all that large, on the order of 3M/filter. That
> shouldn't
> >> >> vary based
> >> >> on the number of docs that match the fq, it's just a bitset. To see
> if
> >> >> that makes any
> >> >> sense, take a look at the admin page and the number of evictions in
> >> >> your filterCache. If
> >> >> that is > 0, you're probably using all the memory you're going to in
> >> >> the filterCache during
> >> >> the day..
> >> >>
> >> >> But you haven't indicated what version of Solr you're using

Re: Wildcard query vs facet.prefix for autocomplete?

2012-07-16 Thread Pawel Rog
Maybe try EdgeNGramFilterFactory:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/#solr.EdgeNGramFilterFactory
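
Roughly, an autocomplete field type built on it could look like this (field
type name and gram sizes are just examples):

<fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>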


On Mon, Jul 16, 2012 at 6:57 AM, santamaria2 wrote:

> I'm about to implement an autocomplete mechanism for my search box. I've
> read
> about some of the common approaches, but I have a question about wildcard
> query vs facet.prefix.
>
> Say I want autocomplete for a title: 'Shadows of the Damned'. I want this
> to
> appear as a suggestion if I type 'sha' or 'dam' or 'the'. I don't care that
> it won't appear if I type 'hadows'.
>
> While indexing, I'd use a whitespace tokenizer and a lowercase filter to
> store that title in the index.
> Now I'm thinking two approaches for 'dam' typed in the search box:
>
> 1) q=title:dam*
>
> 2) q=*:*&facet=on&facet.field=title&facet.prefix=dam
>
>
> So any reason that I should favour one over the other? Speed a factor? The
> index has around 200,000 items.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Wildcard-query-vs-facet-prefix-for-autocomplete-tp3995199.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>