What is the org.apache.solr.uninverting.FieldCacheImpl?

2017-08-24 Thread Sundeep T
Hi,

In our enterprise application, we occasionally get range facet queries
ordered by the timestamp field. The timestamp field is of date type.

Below is the query from solr.log -

2017-08-25 05:18:51.048 INFO  (qtp1321530272-90) [   x:drums]
o.a.s.c.S.Request [drums]  webapp=/solr path=/select
params={df=text&distrib=false&_facet_={}&fl=id&fl=score&shards.purpose=1048580&start=0&fsv=true&shard.url=
http://localhost:8983/solr/drums&rows=0&version=2&q=*:*&json.facet={"timestamp":{"type":"range","field":"timestamp","start":"2016-05-28T16:19:09.857Z","end":"2017-08-18T10:57:10.365Z","gap":"+5000SECOND","limit":10,"sort":{"index":"desc"},"facet":{}}}&NOW=1503638261623&isShard=true&timeAllowed=-1&wt=javabin}
hits=68541066 status=0 QTime=69422

Whenever such a query runs, we see that
org.apache.solr.uninverting.FieldCacheImpl is being populated in the
backend JVM heap. When we analyzed a heap dump, all the underlying
objects in the FieldCacheImpl had timestamp as the cache key. It seems to
be taking quite a bit of memory.

Does anyone have an idea what this cache is and why it is being populated?
Also, what are the criteria for clearing this cache?

Really appreciate your response. Thanks!
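[Editorial note: as far as we can tell, the uninverting FieldCache is filled when a facet or sort runs over an indexed field that has no docValues, which matches the timestamp cache keys seen in the heap dump. For reference, a minimal SolrJ sketch of the same kind of range facet request; the collection name, URL, and facet parameters are taken from the log above, while the client code itself is an illustration, not part of the original report.]

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class RangeFacetSketch {
    public static void main(String[] args) throws Exception {
        // Assumes a local standalone Solr with the "drums" core from the log above.
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/drums").build()) {
            SolrQuery q = new SolrQuery("*:*");
            q.setRows(0);
            // Same JSON Facet range request as in the logged query.
            q.set("json.facet",
                "{timestamp:{type:range,field:timestamp,"
              + "start:\"2016-05-28T16:19:09.857Z\",end:\"2017-08-18T10:57:10.365Z\","
              + "gap:\"+5000SECOND\",limit:10,sort:{index:desc}}}");
            QueryResponse rsp = client.query(q);
            System.out.println("hits=" + rsp.getResults().getNumFound()
                + " QTime=" + rsp.getQTime());
        }
    }
}
```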


Error opening new searcher due to LockObtainFailedException

2017-08-30 Thread Sundeep T
Hello,

Occasionally we are seeing errors opening a new searcher for certain Solr
cores. Whenever this happens, we are unable to query or ingest new data
into these cores. It seems to clear up after some time, though. The root
cause seems to be - *"org.apache.lucene.store.LockObtainFailedException:
Lock held by this virtual machine:
/opt/solr/volumes/data9/7d50b38e114af075-core-24/data/index/write.lock"*

Below is the full stack trace. Any ideas on what could be going on that
causes such an exception and how to mitigate it? Thanks a lot for your
help!

Unable to create core [7d50b38e114af075-core-24],trace=org.apache.solr.common.SolrException: Unable to create core [7d50b38e114af075-core-24]
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:903)
at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1167)
at org.apache.solr.servlet.HttpSolrCall.init(HttpSolrCall.java:252)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:418)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:952)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:816)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:890)
... 30 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1891)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2011)
at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1041)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:925)
... 32 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine: /opt/solr/volumes/data9/7d50b38e114af075-core-24/data/index/write.lock
at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:127)
at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41)
at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45)
at org.apache.lucene.store.FilterDirectory.obtainLock(FilterDirectory.java:104)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:804)
at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:125)
at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100)
at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:240)
at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:114)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1852)
... 35 more
,code=500}
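[Editorial note: this exception comes from Lucene's NativeFSLockFactory when a second IndexWriter is opened on the same index directory from within the same JVM. A minimal, self-contained Lucene sketch that reproduces the same "Lock held by this virtual machine" message; the index path and analyzer choice are illustrative, not taken from the Solr setup above.]

```java
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.LockObtainFailedException;

public class WriteLockDemo {
    public static void main(String[] args) throws Exception {
        FSDirectory dir = FSDirectory.open(Paths.get("/tmp/lock-demo-index"));
        // First writer acquires write.lock in the index directory.
        IndexWriter first = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
        try {
            // Second writer on the same directory in the same JVM: NativeFSLockFactory
            // refuses with "Lock held by this virtual machine: .../write.lock".
            new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
        } catch (LockObtainFailedException e) {
            System.out.println(e.getMessage());
        } finally {
            first.close();   // releasing the first writer frees the lock
            dir.close();
        }
    }
}
```

In Solr this typically corresponds to two IndexWriter/SolrCore instances ending up pointed at the same data directory inside one process, for example a core create or reload racing with an older core instance that has not been closed yet.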


Possible memory leak with VersionBucket objects

2017-09-25 Thread Sundeep T
Hello,

We are running our Solr 6.4.2 instance on a single node without ZooKeeper,
so we are not using SolrCloud. We have been ingesting about 50k messages
per second into this instance, spread over 4 cores.

When we looked at a heap dump, we saw around 385 million instances of
VersionBucket objects taking about 8 GB of memory. This number seems to
grow with the number of cores we are ingesting data into. PFA a screen
capture of the heap recording.

Browsing through the JIRA list we saw a similar issue -
https://issues.apache.org/jira/browse/SOLR-9803

This issue was recently resolved by Erick, but it seems to be specifically
tied to SolrCloud mode and ZooKeeper. We are not using either of these.

So, we are thinking this could be a different issue. Does anyone have ideas
on what this could be and whether there is a fix for it?

Thanks
Sundeep
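[Editorial note: for scale, a quick back-of-the-envelope check based on the numbers reported above. The per-update-log default bucket count used below is an assumption based on SOLR-6820; everything else comes from the heap dump figures quoted in this message.]

```java
public class VersionBucketMath {
    public static void main(String[] args) {
        long instances = 385_000_000L;           // from the heap dump
        long bytes = 8L * 1024 * 1024 * 1024;    // ~8 GB reported
        System.out.printf("~%d bytes per VersionBucket%n", bytes / instances);

        // Assumed default: Solr allocates 65536 version buckets per update log
        // (SOLR-6820), so 4 cores would normally account for only:
        long expected = 65_536L * 4;
        System.out.println("expected buckets for 4 cores = " + expected);
        // 385M is vastly more than 262,144, which is why this looks like a leak
        // rather than normal version-bucket allocation.
    }
}
```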


Re: Possible memory leak with VersionBucket objects

2017-09-25 Thread Sundeep T
Yes, but that issue seems specific to SolrCloud like I mentioned. We are
running Solr in cloud mode and don't have Zookeeper configured

Thanks
Sundeep

On Mon, Sep 25, 2017 at 12:52 PM, Steve Rowe  wrote:

> Hi Sundeep,
>
> This looks to me like <https://issues.apache.org/jira/browse/SOLR-9803> /
> <https://issues.apache.org/jira/browse/SOLR-10506>, which was fixed in
> Solr 7.0.
>
> --
> Steve
> www.lucidworks.com
>
> > On Sep 25, 2017, at 2:42 PM, Sundeep T  wrote:
> >
> > Hello,
> >
> > We are running our solr 6.4.2 instance on a single node without
> zookeeper. So, we are not using solr cloud. We have been ingesting about
> 50k messages per second into this instance spread over 4 cores.
> >
> > When we looked at the heapdump we see that it has there are around 385
> million instances of VersionBucket objects taking about 8gb memory. This
> number seems to grow based on the number of cores into which we are
> ingesting data into.PFA a screen cap of heap recording.
> >
> > Browsing through the jira list we saw a similar issue -
> https://issues.apache.org/jira/browse/SOLR-9803
> >
> > This issue is recently resolved by Erick. But this issue seems be
> specifically tied to SolrCloud mode and Zookeeper. We are not using any of
> these.
> >
> > So, we are thinking this could be another issue. Any one has ideas on
> what this could be and if there is a fix for it?
> >
> > Thanks
> > Sundeep
>
>


Re: Possible memory leak with VersionBucket objects

2017-09-25 Thread Sundeep T
Sorry, I meant we are "not" running Solr in cloud mode

On Mon, Sep 25, 2017 at 1:29 PM, Sundeep T  wrote:

> Yes, but that issue seems specific to SolrCloud like I mentioned. We are
> running Solr in cloud mode and don't have Zookeeper configured
>
> Thanks
> Sundeep
>
> On Mon, Sep 25, 2017 at 12:52 PM, Steve Rowe  wrote:
>
>> Hi Sundeep,
>>
>> This looks to me like <https://issues.apache.org/jira/browse/SOLR-9803>
>> / <https://issues.apache.org/jira/browse/SOLR-10506>, which was fixed in
>> Solr 7.0.
>>
>> --
>> Steve
>> www.lucidworks.com
>>
>> > On Sep 25, 2017, at 2:42 PM, Sundeep T  wrote:
>> >
>> > Hello,
>> >
>> > We are running our solr 6.4.2 instance on a single node without
>> zookeeper. So, we are not using solr cloud. We have been ingesting about
>> 50k messages per second into this instance spread over 4 cores.
>> >
>> > When we looked at the heapdump we see that it has there are around 385
>> million instances of VersionBucket objects taking about 8gb memory. This
>> number seems to grow based on the number of cores into which we are
>> ingesting data into.PFA a screen cap of heap recording.
>> >
>> > Browsing through the jira list we saw a similar issue -
>> https://issues.apache.org/jira/browse/SOLR-9803
>> >
>> > This issue is recently resolved by Erick. But this issue seems be
>> specifically tied to SolrCloud mode and Zookeeper. We are not using any of
>> these.
>> >
>> > So, we are thinking this could be another issue. Any one has ideas on
>> what this could be and if there is a fix for it?
>> >
>> > Thanks
>> > Sundeep
>>
>>
>


Solr deep paging queries run very slow due to redundant q param

2017-10-14 Thread Sundeep T
Hello,

In our scale environment, we see that deep paging queries using
cursorMark are running really slow. When we traced the calls, we saw
that the second query, which fetches the individual ids of the matched
documents, sends the q param again even though the first query already
applied it. If we remove the q param and query directly for the ids, the
query runs really fast.

For example, the initial pagination query looks like this, with the q param
on the timestamp field -

2017-10-14 12:20:51.647 UTC INFO  (qtp331844619-393343)
[core='x:c6e422fc3054c475-core-1']
org.apache.solr.core.SolrCore.Request@2304 [c6e422fc3054c475-core-1]
webapp=/solr path=/select
params={distrib=false&df=text&paginatedQuery=true&fl=id&shards.purpose=4&start=0&fsv=true&sort=timestamp+desc+,id+asc&shard.url=
http://ops-data-solr-svc-1.rattle.svc.cluster.local:80/solr/c6e422fc3054c475-core-1&;
*rows=50*&version=2&
*q=(timestamp:["2017-10-13T18:42:36Z"+TO+"2017-10-13T21:09:00Z"])*
&shards.tolerant=true&*cursorMark=**&NOW=1507928978918&isShard=true&timeAllowed=-1&wt=javabin&trackingId=d5eff5476247487555b7413214648}
hits=40294067 status=0 QTime=12727

This query results in a second query, shown below, due to Solr's
implementation of deep paging. In this second query we already know the ids
to be matched, so there is no reason to pass the q param again. We tried
manually executing the query below without the q param, passing just the
ids, and it executes in 50ms. So this looks like a bug in that Solr passes
the q param again. Any ideas whether there is a workaround for this problem
we can use?

2017-10-14 12:21:09.193 UTC INFO  (qtp331844619-742579)
[core='x:6d63f95961c46475-core-1']
org.apache.solr.core.SolrCore.Request@2304 [6d63f95961c46475-core-1]
webapp=/solr path=/select
params={distrib=false&df=text&paginatedQuery=true&fl=*,[removedocvaluesuffix]&shards.purpose=64&shard.url=
http://ops-data-solr-svc-1.rattle.svc.cluster.local:80/solr/6d63f95961c46475-core-1&rows=50&version=2&;
*q=(timestamp:["2017-10-14T08:50:16.340Z"+TO+"2017-10-14T19:19:50Z"])*&shards.tolerant=true&NOW=1507983581099&ids=00f037832e571941ed46ddd195920502,145c82e3eaa7678564b9e520822a3de1,09633cfabc6c830dfb44e04c313ba6b4,0032a76ed4ea01207c2891070348ea39,1b5179ee23fe3e17236da37d6b8d991f,04ee42e481b2a657bd3bb3c9f91b5ed5,2a910cf8a259925046a0c9fb5ee013c3,1d1d607b03c18ec59c14c2f9ca0ab47f,034e775c96633dae7e629a1d37da86e6,2759ca26d449d5df9f41689aa8ed3bac,16995a57699a7bb56d5018fe145028ce,0509d16399e679470ffc07a8af22a918,1797ab6e0174c65bf2f6b650b3538407,11c804ec4ae153a31929abe8613de336,11d20ed5dc0cf3d71f57aefc4e4b3ee2,0135baecd2d3ae819909a0c021bbd48b,224b0671196fd141196b15add2e49b91,271088227cf81e3641130d3bd5de8cc6,01f266b9c130239a06b00e45bda277a0,1438bed6ffd956f1c49d765b942f1988,2fc9fef6500124b1b48218169a7cf974,2d85d00593847398bf09e168bb3a190c,10e1c2803df1db3d47e525d3bd8a1868,28b6d72729e79da3ad65ac3879740133,14be34af9995721b358b3fdb0bcb18d7,1f2e0867bd495b8a332b8c8bd8ce2508,12cf1a1c07d9b9550ece4079b15f7583,022cd0b3eef93cd9a6389c0394cf3899,11aa3132e00a96df6a49540612b91c8f,0ff348e0433c9e751f1475db6dcab213,2b48279c9ff833f43a910edfa455a31d,241e002d744ff0215155f214166fdd49,0fee30860c82d9a24bb8389317cd772c,07f04d380832f514b0575866958eebaa,20b0efa5d88e2a9950fa4fd8ba930455,14a9cadb7c75274bfc028bb9ae31236b,1829730aa4ee4750eb242266830b576b,1ad5012e83bd271cf00b0c70ea86a856,0af4247d057bd833753e1f7bef959fc4,0a09767d81cb351ab1598987022b6955,2f166fae9ca809642b8e20cea3020c24,2c4d900575d8594a040c94751af59cb5,03f1c46a004a4e3b995295b512e1e324,2c2aae83afc7426424c7de5301f8c692,034baf21ac1db436a7f3f2cf2cc668b0,1dda29d03fb8611f8de80b90685fd9ee,0632292ab704dcaa606440cb1fee017b,0fbd68f293c6964458a93f3034348625,2cdff46ab2e4d44b42f3381d5e3250b7,1b2c90dce4a51b5e5c344fc2f9ab431d&isShard=true&timeAllowed=-1&wt=javabin&trackingId=d5eff5476247487555b80c9ac7b82}
status=0 QTime=18136

Thanks
Sundeep
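[Editorial note: for reference, a minimal SolrJ sketch of the cursorMark pagination pattern being described. The sort-on-timestamp-plus-id scheme, the page size of 50, and the date range come from the logs above; the core name and URL are placeholders.]

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.CursorMarkParams;

public class CursorMarkSketch {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
            SolrQuery q = new SolrQuery(
                "timestamp:[\"2017-10-13T18:42:36Z\" TO \"2017-10-13T21:09:00Z\"]");
            q.setRows(50);
            // cursorMark requires a deterministic sort that ends on the uniqueKey field.
            q.setSort(SolrQuery.SortClause.desc("timestamp"));
            q.addSort(SolrQuery.SortClause.asc("id"));
            q.setFields("id");

            String cursor = CursorMarkParams.CURSOR_MARK_START;  // "*"
            while (true) {
                q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
                QueryResponse rsp = client.query(q);
                rsp.getResults().forEach(doc -> System.out.println(doc.getFieldValue("id")));
                String next = rsp.getNextCursorMark();
                if (cursor.equals(next)) {
                    break;  // cursor stopped advancing: no more pages
                }
                cursor = next;
            }
        }
    }
}
```

This only illustrates the documented client-side pattern; it does not avoid the re-sent q parameter described above, which appears to happen inside Solr's distributed request handling (the second, ids=... request in the log).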


Re: Solr deep paging queries run very slow due to redundant q param

2017-10-23 Thread Sundeep T
Pinging again. Does anyone have ideas on this? Thanks

On Sat, Oct 14, 2017 at 4:52 PM, Sundeep T  wrote:

> Hello,
>
> In our scale environment, we see that the deep paging queries  using
> cursormark are running really slow. When we traced out the calls, we see
> that the second query which queries the individual id's of matched pages is
> sending the q param that is already sent by the first query again. If we
> remove the q param and directly query for ids, the query runs really fast.
>
> For example, the initial pagination query is like this with q param on
> timestamp field -
>
> 2017-10-14 12:20:51.647 UTC INFO  (qtp331844619-393343)
> [core='x:c6e422fc3054c475-core-1'] org.apache.solr.core.SolrCore.
> Request@2304 [c6e422fc3054c475-core-1]  webapp=/solr path=/select
> params={distrib=false&df=text&paginatedQuery=true&fl=id&
> shards.purpose=4&start=0&fsv=true&sort=timestamp+desc+,id+asc&shard.url=
> http://ops-data-solr-svc-1.rattle.svc.cluster.local:80/solr/
> c6e422fc3054c475-core-1&*rows=50*&version=2&
> *q=(timestamp:["2017-10-13T18:42:36Z"+TO+"2017-10-13T21:09:00Z"])*&
> shards.tolerant=true&*cursorMark=**&NOW=1507928978918&isShard=
> true&timeAllowed=-1&wt=javabin&trackingId=d5eff5476247487555b7413214648}
> hits=40294067 status=0 QTime=12727
>
> This query results in a second query due to solr implementation of deep
> paging like below. In this query, we already know the ids to be matched.
> So, there is no reason to pass the q param again. We tried manually
> executing the below query without the q param and just passing the ids
> alone and that executes in 50ms. So, this looks like a bug that Solr is
> passing in the q param again. Any ideas if there is workaround for this
> problem we can use?
>
> 2017-10-14 12:21:09.193 UTC INFO  (qtp331844619-742579)
> [core='x:6d63f95961c46475-core-1'] org.apache.solr.core.SolrCore.
> Request@2304 [6d63f95961c46475-core-1]  webapp=/solr path=/select
> params={distrib=false&df=text&paginatedQuery=true&fl=*,[
> removedocvaluesuffix]&shards.purpose=64&shard.url=http://
> ops-data-solr-svc-1.rattle.svc.cluster.local:80/solr/
> 6d63f95961c46475-core-1&rows=50&version=2&
> *q=(timestamp:["2017-10-14T08:50:16.340Z"+TO+"2017-10-14T19:19:50Z"])*&
> shards.tolerant=true&NOW=1507983581099&ids=00f037832e571941ed46ddd1959205
> 02,145c82e3eaa7678564b9e520822a3de1,09633cfabc6c830dfb44e04c313ba6b4,
> 0032a76ed4ea01207c2891070348ea39,1b5179ee23fe3e17236da37d6b8d991f,
> 04ee42e481b2a657bd3bb3c9f91b5ed5,2a910cf8a259925046a0c9fb5ee013c3,
> 1d1d607b03c18ec59c14c2f9ca0ab47f,034e775c96633dae7e629a1d37da86e6,
> 2759ca26d449d5df9f41689aa8ed3bac,16995a57699a7bb56d5018fe145028ce,
> 0509d16399e679470ffc07a8af22a918,1797ab6e0174c65bf2f6b650b3538407,
> 11c804ec4ae153a31929abe8613de336,11d20ed5dc0cf3d71f57aefc4e4b3ee2,
> 0135baecd2d3ae819909a0c021bbd48b,224b0671196fd141196b15add2e49b91,
> 271088227cf81e3641130d3bd5de8cc6,01f266b9c130239a06b00e45bda277a0,
> 1438bed6ffd956f1c49d765b942f1988,2fc9fef6500124b1b48218169a7cf974,
> 2d85d00593847398bf09e168bb3a190c,10e1c2803df1db3d47e525d3bd8a1868,
> 28b6d72729e79da3ad65ac3879740133,14be34af9995721b358b3fdb0bcb18d7,
> 1f2e0867bd495b8a332b8c8bd8ce2508,12cf1a1c07d9b9550ece4079b15f7583,
> 022cd0b3eef93cd9a6389c0394cf3899,11aa3132e00a96df6a49540612b91c8f,
> 0ff348e0433c9e751f1475db6dcab213,2b48279c9ff833f43a910edfa455a31d,
> 241e002d744ff0215155f214166fdd49,0fee30860c82d9a24bb8389317cd772c,
> 07f04d380832f514b0575866958eebaa,20b0efa5d88e2a9950fa4fd8ba930455,
> 14a9cadb7c75274bfc028bb9ae31236b,1829730aa4ee4750eb242266830b576b,
> 1ad5012e83bd271cf00b0c70ea86a856,0af4247d057bd833753e1f7bef959fc4,
> 0a09767d81cb351ab1598987022b6955,2f166fae9ca809642b8e20cea3020c24,
> 2c4d900575d8594a040c94751af59cb5,03f1c46a004a4e3b995295b512e1e324,
> 2c2aae83afc7426424c7de5301f8c692,034baf21ac1db436a7f3f2cf2cc668b0,
> 1dda29d03fb8611f8de80b90685fd9ee,0632292ab704dcaa606440cb1fee017b,
> 0fbd68f293c6964458a93f3034348625,2cdff46ab2e4d44b42f3381d5e3250b7,
> 1b2c90dce4a51b5e5c344fc2f9ab431d&isShard=true&timeAllowed=-
> 1&wt=javabin&trackingId=d5eff5476247487555b80c9ac7b82} status=0
> QTime=18136
>
> Thanks
> Sundeep
>


Leading wildcard searches very slow

2017-11-17 Thread Sundeep T
Hi,

We have several indexed string fields that are not tokenized and do not
have docValues enabled.

When we do leading wildcard searches on these fields, they run very
slowly. We were thinking that since these fields are indexed, such queries
should run pretty quickly. We are using Solr 6.6.1. Does anyone have ideas
on why these queries are slow and whether there are any ways to speed
them up?

Thanks
Sundeep


Trailing wild card searches very slow in Solr

2017-11-20 Thread Sundeep T
Hi,

We have several indexed string fields that are not tokenized and do not
have docValues enabled.

When we do trailing wildcard searches on these fields, they run very
slowly. We were thinking that since these fields are indexed, such queries
should run pretty quickly. We are using Solr 6.6.1. Does anyone have ideas
on why these queries are slow and whether there are any ways to speed
them up?

Thanks
Sundeep


Re: Trailing wild card searches very slow in Solr

2017-11-20 Thread Sundeep T
Hi Erick.

I initially asked this question regarding leading wildcards. That was a
typo; what I meant was that trailing wildcard queries were slow. So queries
like text:"hello*" are slow. We were expecting that, since the string field
is already indexed, such searches would be fast, but that does not seem to
be the case.

Thanks
Sundeep

On Mon, Nov 20, 2017 at 9:39 AM, Erick Erickson 
wrote:

> You already asked that question and got several answers, did you not
> see them? If you did see them, what is unclear?
>
> Best,
> Erick
>
> On Mon, Nov 20, 2017 at 9:33 AM, Sundeep T  wrote:
> > Hi,
> >
> > We have several indexed string fields which is not tokenized and does not
> > have docValues enabled.
> >
> > When we do trailing wildcard searches on these fields they are running
> very
> > slow. We were thinking that since this field is indexed, such queries
> > should be running pretty quickly. We are using Solr 6.6.1. Anyone has
> ideas
> > on why these queries are running slow and if there are any ways to speed
> > them up?
> >
> > Thanks
> > Sundeep
>


Re: Trailing wild card searches very slow in Solr

2017-11-20 Thread Sundeep T
Hi Erick,

Thanks for the reply. Here are more details on our setup -

*Setup/schema details -*

- 100 million doc Solr core
- String field (not tokenized) is docValues=true, indexed=true and stored=true
- Field is almost unique in the index; around 80 million values are unique
- no commits on the index
- all caches disabled in solrconfig.xml
- Solr JVM heap 1GB
- single Solr core in the JVM
- Solr core is not optimized and has about 50 segment files, some up to 5GB
- index size on disk is around 150GB
- Solr v6.5.0

*Performance -*

- q=myfield:abc* has QTime=30secs+ the first time
- q=myfield:abc* has QTime=17-20secs after the OS file cache is primed


Thanks
Sundeep
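[Editorial note: a hedged SolrJ sketch of the test described above, with debug output enabled so the parsed query can be inspected, along the lines Erick suggests below. The field name myfield and the abc* prefix come from the performance notes above; the URL and core name are placeholders.]

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PrefixQueryDebug {
    public static void main(String[] args) throws Exception {
        // Placeholder URL/core; myfield is the untokenized string field described above.
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {
            SolrQuery q = new SolrQuery("myfield:abc*");
            q.setRows(10);
            q.set("debugQuery", "true");   // ask Solr to return the parsed query and timings
            QueryResponse rsp = client.query(q);
            System.out.println("QTime=" + rsp.getQTime() + " ms");
            // "parsedquery" should show the rewritten form, ideally a PrefixQuery on myfield.
            System.out.println(rsp.getDebugMap().get("parsedquery"));
        }
    }
}
```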


On Mon, Nov 20, 2017 at 12:16 PM, Erick Erickson 
wrote:

> Well, define "slow". Conceptually a large OR clause is created that
> contains all the terms that start with the indicated text. (actually a
> PrefixQuery should be formed).
>
> That said, I'd expect hello* to be reasonably fast as not many terms
> _probably_ start with 'hello'. Not the same at all for, say, h*.
>
> You might review: https://wiki.apache.org/solr/UsingMailingLists,
> you're not really providing much information to go on here.
>
> What is the result of adding &debug=query? Particularly it would be
> useful to see the parsed query.
>
> Are all such queries slow? What happens if you submit hel* followed by
> hello*, the first one will bring the underlying index structures into
> memory, for all we know this could simply be an autowarming issue.
>
> Are you indexing at the same time? Do you have a short autocommit interval?
>
> What version of Solr?
>
> Details matter.
> Best,
> Erick
>
> On Mon, Nov 20, 2017 at 11:50 AM, Sundeep T  wrote:
> > Hi Erick.
> >
> > I initially asked this question regarding leading wildcards. This was a
> > typo, and what I meant was trailing wild card queries were slow. So
> queries
> > like text:'hello*" are slow. We were expecting since the string field is
> > already indexed, the searches should be fast, but that seems to be not
> the
> > case
> >
> > Thanks
> > Sundeep
> >
> > On Mon, Nov 20, 2017 at 9:39 AM, Erick Erickson  >
> > wrote:
> >
> >> You already asked that question and got several answers, did you not
> >> see them? If you did see them, what is unclear?
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Nov 20, 2017 at 9:33 AM, Sundeep T 
> wrote:
> >> > Hi,
> >> >
> >> > We have several indexed string fields which is not tokenized and does
> not
> >> > have docValues enabled.
> >> >
> >> > When we do trailing wildcard searches on these fields they are running
> >> very
> >> > slow. We were thinking that since this field is indexed, such queries
> >> > should be running pretty quickly. We are using Solr 6.6.1. Anyone has
> >> ideas
> >> > on why these queries are running slow and if there are any ways to
> speed
> >> > them up?
> >> >
> >> > Thanks
> >> > Sundeep
> >>
>


Problems executing boolean queries involving NOT clauses

2017-03-08 Thread Sundeep T
Hi,

I am using Solr 6.3.

We are seeing issues involving NOT clauses when they are used in boolean
expressions. The issues specifically occur when the "NOT" clause is surrounded
by parentheses.

For example, the following Solr query does not return any results -

(timestamp:[* TO "2017-08-17T07:12:55.807Z"]) AND (-text:"Daemon")

But if I remove the parentheses around the "NOT" clause for the text param, it
returns the expected results. For example, the query below works as expected -

(timestamp:[* TO "2017-08-17T07:12:55.807Z"]) AND -text:"Daemon"

This problem seems to happen only for boolean expression queries. If I issue a
singular query like the one below, involving NOT with parentheses, it still
works -

(-text:"Daemon")

I see that the parentheses around the expression are added in the SQLVisitor
class in these lines. I tried removing the parentheses for the NOT case and the
code works.

case NOT_EQUAL:
  buf.append('-').append(field).append(":").append(value);
  return null;

Any ideas what's going on here and why the parentheses are causing an issue?

Thanks
Sundeep
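[Editorial note: Erick's remark in the /export thread below ("Pure negative queries in main clauses require a *:* in front unless there's some special handling") points at the usual workaround. A hedged SolrJ illustration; the field names and date come from the example queries above, while the core name and client setup are placeholders.]

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class PureNegativeClauseSketch {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client =
                 new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build()) {

            // Matches nothing: the parenthesized sub-clause contains only a MUST_NOT,
            // so by itself it selects no documents.
            SolrQuery broken = new SolrQuery(
                "(timestamp:[* TO \"2017-08-17T07:12:55.807Z\"]) AND (-text:\"Daemon\")");

            // Works: *:* gives the negative clause an explicit "all documents" base
            // to subtract from.
            SolrQuery fixed = new SolrQuery(
                "(timestamp:[* TO \"2017-08-17T07:12:55.807Z\"]) AND (*:* -text:\"Daemon\")");

            System.out.println("broken hits = " + client.query(broken).getResults().getNumFound());
            System.out.println("fixed hits  = " + client.query(fixed).getResults().getNumFound());
        }
    }
}
```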




Re: Problems executing boolean queries involving NOT clauses

2017-03-08 Thread Sundeep T
I am just trying to clarify whether there is a bug here in Solr. It seems
that when Solr translates SQL into the underlying Solr query, it puts
parentheses around "NOT" clause expressions. But that does not seem to
work correctly and does not return the expected results. If the parentheses
around the "NOT" clause are removed, then the correct results are returned.

On Wed, Mar 8, 2017 at 7:39 PM, Erick Erickson 
wrote:

> What _exactly_ are you testing? It's unclear whether you're asking
> about general Lucene/Solr syntax or some of the recent streaming SQL
> work.
>
> On Wed, Mar 8, 2017 at 7:34 PM, Sundeep T  wrote:
> > Hi,
> >
> > I am using solr 6.3 version.
> >
> > We are seeing issues involving NOT clauses when they are paired in
> boolean expressions. The issues specifically occur when the “NOT” clause is
> surrounded by paratheses.
> >
> > For example, the following solr query does not return any results -
> >
> > (timestamp:[* TO "2017-08-17T07:12:55.807Z"]) AND (-text:"Daemon”)
> >
> > But if I remove the parantheses around the “NOT” clause for text param
> it returns expected results. Like, the below query works as expected -
> >
> > (timestamp:[* TO "2017-08-17T07:12:55.807Z"]) AND -text:”Daemon”
> >
> > This problem seems to happen only for boolean expression queries. If i
> give a singular query like below involving NOT with parantheses, it still
> works  -
> > (-text:"Daemon”)
> >
> > I see that the parantheses around the expression is added in SQLVisitor
> class in these lines. I tried removing the parantheses for NOT case and the
> code works.
> >
> > case NOT_EQUAL:
> > buf.append('-').append(field).append(":").append(value);
> > return null;
> >
> > Any ideas what’s going on here and why parantheses are causing an issue?
> >
> > Thanks
> > Sundeep
> >
> >
>


Re: q=-id:xxxx in export handler does not work but works ok in select.

2017-03-12 Thread Sundeep T
Hi Erick,

It looks like Solr, by default, takes care of adding the *:* in the /select
API for NOT queries like this. The newer /export API does not do that by
default. So it is kind of inconsistent, and a lot of users will run into
this if they try to use the /export API for streaming results.

I think the ExportQParserPlugin, which the /export API calls, does not add
this automatically. Is it possible to fix this in Solr, so that /export
and /select behave the same way?

Thanks
Sundeep
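[Editorial note: a minimal SolrJ sketch of the *:* workaround applied to the /export handler, consumed through SolrStream as streaming /export results usually are. The core name bucket4 and the id/_version_ fields come from the quoted URLs below; the localhost URL is a placeholder, and this is a sketch of the workaround rather than a fix for the underlying inconsistency.]

```java
import java.io.IOException;
import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.SolrStream;
import org.apache.solr.common.params.ModifiableSolrParams;

public class ExportNegativeQuerySketch {
    public static void main(String[] args) throws IOException {
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("q", "*:* -id:8733*");   // *:* prefix makes the pure negative query work
        params.set("qt", "/export");        // route the request to the export handler
        params.set("fl", "id");
        params.set("sort", "_version_ desc");

        SolrStream stream = new SolrStream("http://localhost:8983/solr/bucket4", params);
        try {
            stream.open();
            while (true) {
                Tuple tuple = stream.read();
                if (tuple.EOF) {
                    break;  // export handler signals the end of the stream
                }
                System.out.println(tuple.getString("id"));
            }
        } finally {
            stream.close();
        }
    }
}
```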

On Sun, Mar 12, 2017 at 9:13 AM, Erick Erickson 
wrote:

> Oh, you're running into a "quirk" of Solr. Pure negative queries in
> main clauses require a *:* in front unless there's some special
> handling. So try:
> q=*:* -id:8733
> instead in both cases.
>
> Best,
> Erick
>
> On Sun, Mar 12, 2017 at 7:57 AM, radha krishnan
>  wrote:
> > q=-id: in export handler does not work but works ok in select.
> >
> > Works:
> > http://localhost:8983/solr/bucket4/select?q=id:8733*&;
> rows=1&sort=_version_%20desc&fl=id
> > Does not work:
> > http://localhost:8983/solr/bucket4/export?q=id:8733*&;
> rows=1&sort=_version_%20desc&fl=id
> >
> > looks a bug with solr or am i making a mistake here
> >
> >
> > Thanks,
> > Radhakrishnan
>


Parallelizing post filter for better performance

2017-03-17 Thread Sundeep T
Hello,

Is there a way to execute a post filter in parallel mode so that
multiple query results can be filtered in parallel?

Right now, in our code, the post filter is becoming a bottleneck because
we have to do some post-processing on every returned result, and it runs
serially in a single thread.

Thanks
Sundeep


How to do sorting in lucene layer instead of Solr?

2017-04-14 Thread Sundeep T
Hi,

I am using the /export API, and in Solr 6.3 the sorting is done by Solr in
the SortingResponseWriter class after the Lucene query execution is done.

I want to know whether it is possible to do the sorting in the Lucene layer
itself and get the results, so that it is more efficient if we only want the
top 10 rows, for instance. We have docValues enabled for all the fields.

Thanks
Sundeep
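[Editorial note: for comparison, a minimal Lucene-level sketch of fetching only the top N documents already sorted. The index path, field name, and reader setup are illustrative; this shows what "sorting in the Lucene layer" would look like, not what the /export path in Solr 6.3 actually does.]

```java
import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class LuceneTopNSortSketch {
    public static void main(String[] args) throws Exception {
        try (DirectoryReader reader =
                 DirectoryReader.open(FSDirectory.open(Paths.get("/path/to/index")))) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // Sort on a docValues-backed numeric field, descending, keeping only the top 10.
            Sort sort = new Sort(new SortField("timestamp", SortField.Type.LONG, true));
            TopDocs top = searcher.search(new MatchAllDocsQuery(), 10, sort);
            for (ScoreDoc sd : top.scoreDocs) {
                System.out.println("docId=" + sd.doc);
            }
        }
    }
}
```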


Is there a way to specify word position in solr search query on text fields?

2017-04-25 Thread Sundeep T
Hello,

We have a text field in our schema that is indexed using the
StandardTokenizerFactory. We have set omitPositions=false, so that
the positional information of individual tokens is also included in the
index data.

The question is whether there is a way to construct a query in which we can
specify the position information as well.

For example, suppose I have two text strings, "foo bar" and "bar foo".

Now, I want to find strings that start with "foo". Is there a way to
do that? Basically, we are asking whether something like position=0 for the
word "foo" can be specified as a parameter in the query.

Thanks
Sundeep
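[Editorial note: one way to express "term at the start of the field" at the Lucene level is a SpanFirstQuery, which matches spans ending within the first N positions. A minimal sketch; the index path, field name, and reader setup are illustrative, and exposing this through a Solr query parser is a separate question.]

```java
import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.spans.SpanFirstQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.FSDirectory;

public class FirstPositionQuerySketch {
    public static void main(String[] args) throws Exception {
        try (DirectoryReader reader =
                 DirectoryReader.open(FSDirectory.open(Paths.get("/path/to/index")))) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // Matches documents where "foo" occurs in the first position of the
            // "text" field, i.e. "foo bar" but not "bar foo".
            SpanFirstQuery startsWithFoo =
                new SpanFirstQuery(new SpanTermQuery(new Term("text", "foo")), 1);
            TopDocs hits = searcher.search(startsWithFoo, 10);
            System.out.println("matches: " + hits.totalHits);
        }
    }
}
```

Going through Solr would require a query parser that can produce span queries (the xmlparser, for example), but that is an assumption we have not verified here.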