Boosting grouped documents in Solr based on the number of results in a group

2020-04-22 Thread Ajay Sharma
Hi Community Members,

There is a logic that I need to implement using Solr.

Suppose there are two suppliers on a site dealing in Mobile Phones

   1. Supplier 1 has 10 products related to mobile phones
   2. Supplier 2 has 20 products related to mobile phones


I need to boost supplier 2 because it has more products related to mobile phones.

Is there a way in Solr to boost a supplier, i.e. to boost grouped documents based on the number of results in a group?
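One common approach (a sketch of one option, not the only way): precompute a per-supplier product count into each document at index time, then use it as a boost with eDisMax while grouping by supplier. Field names below (supplier_product_count, supplier_id) are hypothetical:

```
# Hypothetical request; field names are assumptions
http://localhost:8983/solr/products/select?q=mobile+phone
  &defType=edismax
  &boost=log(sum(supplier_product_count,1))
  &group=true&group.field=supplier_id
```

Note the count has to be maintained by your indexing pipeline; out of the box, Solr does not compute per-group result counts at query time for scoring.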

Any help will be appreciated.

-- 
Thanks & Regards,
Ajay Sharma
Software Engineer, Product-Search,
IndiaMART InterMESH Ltd,
Mob.: +91-8954492245



ResourceManager : unable to find resource 'custom.vm' in any resource loader.

2020-04-22 Thread Prakhar Kumar
Hello Team,

I am getting this weird error in Solr logs.

null:java.io.IOException: Unable to find resource 'custom.vm'
    at org.apache.solr.response.VelocityResponseWriter.getTemplate(VelocityResponseWriter.java:308)
    at org.apache.solr.response.VelocityResponseWriter.write(VelocityResponseWriter.java:141)
    at org.apache.solr.response.QueryResponseWriterUtil.writeQueryResponse(QueryResponseWriterUtil.java:53)
    at org.apache.solr.servlet.HttpSolrCall.writeResponse(HttpSolrCall.java:727)
    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:459)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:227)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:196)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:497)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)


Could anyone please tell me how to fix this?


-- 
Kind Regards,
Prakhar Kumar
Sr. Enterprise Software Engineer

*HotWax Systems*
*Enterprise open source experts*
cell: +91-89628-81820
office: 0731-409-3684
http://www.hotwaxsystems.com


Re: ResourceManager : unable to find resource 'custom.vm' in any resource loader.

2020-04-22 Thread Erick Erickson
You haven’t told us what version of Solr you’re using, so this is largely a
guess.

You need to add a lib directive to solrconfig.xml: Velocity is a contrib and is
no longer accessible by default. See SOLR-13978.
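For reference, the contrib jars can be pulled in with lib directives in solrconfig.xml, roughly like this (the exact paths are assumptions and depend on your install layout):

```xml
<!-- Paths are resolved relative to the core's instanceDir; adjust to your layout -->
<lib dir="${solr.install.dir:../../../..}/contrib/velocity/lib" regex=".*\.jar" />
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-velocity-\d.*\.jar" />
```

After adding the directives, reload the core (or restart the node) so the jars are picked up.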

Best,
Erick

> On Apr 22, 2020, at 8:07 AM, Prakhar Kumar  
> wrote:
> 
> Hello Team,
> 
> I am getting this weird error in Solr logs.
> 
> null:java.io.IOException: Unable to find resource 'custom.vm'
> [...]



Re: ResourceManager : unable to find resource 'custom.vm' in any resource loader.

2020-04-22 Thread Erik Hatcher
What's the full request that is logged?   You're using the Velocity response 
writer (wt=velocity) and a request is being made to render a custom.vm template 
(v.template=custom, or a template is #parse'ing("custom.vm")) that doesn't 
exist.
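In other words, the error fires on requests shaped roughly like this (the template name comes from the log; the other parameters are illustrative):

```
/solr/<core>/select?q=*:*&wt=velocity&v.template=custom
```

Either the request should stop asking for that template, or a custom.vm file needs to exist in the configured template directory.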

Erik

> On Apr 22, 2020, at 8:07 AM, Prakhar Kumar  
> wrote:
> 
> Hello Team,
> 
> I am getting this weird error in Solr logs.
> 
> null:java.io.IOException: Unable to find resource 'custom.vm'
> [...]



Re: How to implement spellcheck for custom suggest component?

2020-04-22 Thread Paras Lehana
Hi Buddy,

We have built Auto-Suggest over Solr with EdgeNGrams, Custom Spellcheck
Factory and Synonyms (for spelling mistakes). This solves for most cases.

If you have a dictionary of common spelling mistakes, EdgeNGrams after the
Synonym factory will do the job.
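A minimal field type along those lines might look like this in the managed-schema (a sketch, not our exact config; the field type name and the synonyms.txt file, which would map misspellings such as iphonn => iphone to the canonical term, are assumptions):

```xml
<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- map known misspellings to canonical terms before ngramming -->
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"/>
    <filter class="solr.FlattenGraphFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"/>
  </analyzer>
</fieldType>
```

Note that SynonymGraphFilterFactory at index time needs to be followed by FlattenGraphFilterFactory, and the EdgeNGram filter is applied only on the index side so that the user's partial input matches the stored grams.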

On Thu, 16 Apr 2020 at 13:35, aTan  wrote:

> Hello.
> I'm new to Solr and would be thankful for advice on the following case:
> We have a Suggest API running in production using Solr 6, which currently
> prevents changes to the response and query parameters. That's why the SpellCheck
> component can't be used (the parameter is custom, not 'q' or 'spellcheck.q').
> I've tried to search for a solution, but many threads end without any
> clear answer.
>
> To my understanding there are two main ways:
> 1. Combine default filters to emulate spellcheck behavior.
> Question: which combination might give a good enough result?
> Advantage: will be very easy to integrate.
> Disadvantage: the quality and flexibility will not be very good.
> 2. Implement a custom filter with advanced spellcheck
> functionality, using some open-source library.
> Advantage: quality will be much higher.
> Disadvantage: reinventing the wheel, and even adding a custom filter to
> production is currently quite complicated.
> 3. Something else... open for suggestions :)
>
> The expected behavior:
> myrequestparam.q=iphon
> suggest: iphone, iphone 8...
>
> myrequestparam.q=iphonn
> suggest: iphone, iphone 8...
>
> If both cases are possible and a corrected suggestion is very likely
> alongside the original one, maybe put it in the list with a lower weight. But
> the
> response list should be a single (merged) entity.
>
> Thanks.
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


-- 
-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, *Auto-Suggest*,
IndiaMART InterMESH Ltd,

11th Floor, Tower 2, Assotech Business Cresterra,
Plot No. 22, Sector 135, Noida, Uttar Pradesh, India 201305

Mob.: +91-9560911996
Work: 0120-4056700 | Extn:
*1196*



Re: How upgrade to Solr 8 impact performance

2020-04-22 Thread Paras Lehana
Hi Rajeswari,

I can only share my experience of moving from Solr 6 to Solr 8. I suggest
you move first and then re-evaluate your performance metrics. To recall another
change from that migration, we also moved from Java 8 to Java 11 for Solr 8.

Please note experiences can differ! :)
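On the CaffeineCache suggestion quoted below: since Solr 8.3 the caches in solrconfig.xml can be switched to the Caffeine implementation like this (sizes here are illustrative, not recommendations):

```xml
<filterCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="0"/>
<documentCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="0"/>
```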

On Wed, 22 Apr 2020 at 00:50, Natarajan, Rajeswari <
rajeswari.natara...@sap.com> wrote:

> Any other experience from solr 7 to sol8 upgrade performance  .Please
> share.
>
> Thanks,
> Rajeswari
>
> On 4/15/20, 4:00 PM, "Paras Lehana"  wrote:
>
> In January, we upgraded Solr from version 6 to 8, skipping all versions in
> between.
>
> The hardware and Solr configurations were kept the same but we still faced
> degradation in response time by 30-50%. We had exceptional query times
> around 25 ms with Solr 6 and now we are hovering around 36 ms.
>
> Since response times under 50 ms are very good even for Auto-Suggest, we
> have not tried any changes regarding this. Nevertheless, you can try using
> Caffeine Cache. Looking forward to reading community inputs as well.
>
>
>
> On Thu, 16 Apr 2020 at 01:34, ChienHuaWang 
> wrote:
>
> > Does anyone have experience upgrading an application from Solr 7.x to 8.x?
> > How's the query performance?
> > We found a slightly slower response time from the application with Solr 8
> > based on current measurements, and we are still looking into the details.
> > But wondering: does anyone have a similar experience? Is that something we
> > should expect with Solr 8.x?
> >
> > Please kindly share, thanks.
> >
> > Regards,
> > ChienHua
> >
> >
> >
> > --
> > Sent from:
> https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> >
>

-- 
-- 
Regards,

*Paras Lehana* [65871]
Development Engineer, *Auto-Suggest*,
IndiaMART InterMESH Ltd,

11th Floor, Tower 2, Assotech Business Cresterra,
Plot No. 22, Sector 135, Noida, Uttar Pradesh, India 201305

Mob.: +91-9560911996
Work: 0120-4056700 | Extn:
*1196*



Re: SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

2020-04-22 Thread Jason Gerlowski
Hi Samuel,

Thanks for the very detailed description of the problem here.  Very
thorough!  I don't think you're missing anything obvious, please file the
jira tickets if you haven't already.

Best,

Jason

On Mon, Apr 13, 2020 at 6:12 PM Samuel Garcia Martinez <
samuel...@inditex.com> wrote:

> Reading the last two paragraphs again, I realized that those two especially
> are very poorly worded (grammar 😓). I have rephrased them and corrected
> some of the errors below.
>
> Here I can see three different problems:
>
> * HttpSolrCall should not use HttpServletResponse#setCharacterEncoding to
> set the Content-Encoding header. This is obviously a mistake.
> * HttpSolrClient, specifically HttpClientUtil, should be modified so that if
> the Content-Encoding header lies about the actual content, the connection is
> not leaked forever. It should still throw the exception, though.
> * HttpSolrClient should allow clients to customize HttpClient's
> connectionRequestTimeout, preventing the application from being blocked forever
> waiting for a connection to become available. This way, the application could
> respond to requests that don't use Solr instead of rejecting all incoming
> requests because every thread is blocked forever on a connection that will
> never become available.
>
> I think the first two points are bugs that should be fixed. The third one
> is a feature improvement to me.
>
> Unless I missed something, I'll file the two bugs and provide a patch for
> them. The same goes for the feature improvement.
>
>
> If you receive this message by error, please notify the sender by return
> e-mail and delete it. Its use is forbidden.
>
>
>
> 
> From: Samuel Garcia Martinez 
> Sent: Monday, April 13, 2020 10:08:36 PM
> To: solr-user@lucene.apache.orG 
> Subject: SolrJ connection leak with SolrCloud and Jetty Gzip compression
> enabled
>
> Hi!
>
> Today, I've seen a weird issue in production workloads when the gzip
> compression was enabled. After some minutes, the client app ran out of
> connections and stopped responding.
>
> The cluster setup is pretty simple:
> Solr version: 7.7.2
> Solr cloud enabled
> Cluster topology: 6 nodes, 1 single collection, 10 shards and 3 replicas.
> 1 HTTP LB using Round Robin over all nodes
> All cluster nodes have gzip enabled for all paths, all HTTP verbs and all
> MIME types.
> Solr client: HttpSolrClient targeting the HTTP LB
>
> Problem description: when the Solr node that receives the request has to
> forward the request to a Solr Node that actually can perform the query, the
> response headers are added incorrectly to the client response, causing the
> SolrJ client to fail and to never release the connection back to the pool.
>
> To simplify the case, let's try to start from the following repro scenario:
>
>   *   Start one node with cloud mode and port 8983
>   *   Create one single collection (1 shard, 1 replica)
>   *   Start another node with port 8984 and the previously started ZooKeeper
> (-z localhost:9983)
>   *   Start a java application and query the cluster using the node on
> port 8984 (the one that doesn't host the collection)
>
> So, the steps occur like:
>
>   *   The application queries node:8984 with compression enabled
> ("Accept-Encoding: gzip") and wt=javabin
>   *   Node:8984 can't perform the query itself and makes an HTTP request
> behind the scenes to node:8983
>   *   Node:8983 returns a gzipped response with "Content-Encoding: gzip"
> and "Content-Type: application/octet-stream"
>   *   Node:8984 adds "Content-Encoding: gzip" as the character encoding of
> the response (it should be forwarded as the "Content-Encoding" header, not
> as the character encoding)
>   *   HttpSolrClient receives a "Content-Type:
> application/octet-stream;charset=gzip", causing an exception.
>   *   HttpSolrClient tries to quietly close the connection, but since the
> stream is broken, the Utils.consumeFully fails to actually consume the
> entity (it throws another exception in GzipDecompressingEntity#getContent()
> with "not in GZIP format")
>
> The exception thrown by HttpSolrClient is:
> java.nio.charset.UnsupportedCharsetException: gzip
>at java.nio.charset.Charset.forName(Charset.java:531)
>at
> org.apache.http.entity.ContentType.create(ContentType.java:271)
>at
> org.apache.http.entity.ContentType.create(ContentType.java:261)
>at
> org.apache.http.entity.ContentType.parse(ContentType.java:319)
>at
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:591)
>at
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
>at
> org.apache.solr.client.solrj.impl.HttpSolrCli

Re: Solr indexing with Tika DIH - ZeroByteFileException

2020-04-22 Thread ravi kumar amaravadi
Hi,
I am also facing the same issue. Does anyone have an update/solution on how to
fix this issue as part of DIH?

Thanks.

Regards,
Ravi kumar



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Potential bug with optimistic concurrency

2020-04-22 Thread Sachin Divekar
Hi all,

I am facing the exact same issue reported
https://issues.apache.org/jira/browse/SOLR-8733 and
https://issues.apache.org/jira/browse/SOLR-7404

I have tried it with Solr v8.4.1 and v8.5.1. In both cases, the cluster
consisted of three nodes and a collection with 3 shards and 2 replicas.

Following simple test case fails.

Collection "test" contains only two documents with ids "1" and "2"

Update operation:

curl -X POST -H 'Content-Type: application/json' '
http://localhost:8983/solr/test/update?versions=true&failOnVersionConflicts=false'
--data-binary '
[ { "id" : "2", "attr": "val", },
  { "id" : "1", "attr": "val", "_version_": -1 } ]'

Consistent response:

{
  "adds":[
"2",0,
"1",0],
  "error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","org.apache.solr.common.SolrException",

"error-class","org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException",

"root-error-class","org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException"],
"msg":"Async exception during distributed update: Error from server at
http://10.0.5.237:8983/solr/test_shard1_replica_n1/: null\n\n\n\nrequest:
http://10.0.5.237:8983/solr/test_shard1_replica_n1/\nRemote error message:
version conflict for 1 expected=-1 actual=1664690075695316992",
"code":409}}

I tried different updates using combinations of _version_ and document
values to generate conflicts. Every time the result is the same. There is
no problem with system resources. These servers are running only these Solr
nodes and Solr has been given a few GB of heap.

Are those issues SOLR-7404 and SOLR-8733 still unfixed? Unlike these
issues, I am not using the schema and config from example templates. These
nodes are set up by following Solr's production deployment document.

What are your thoughts/suggestions?

thanks
Sachin


FuzzyQuery causing Out of Memory Errors in 8.5.x

2020-04-22 Thread Colvin Cowie
Hello,

I'm moving our product from 8.3.1 to 8.5.1 in dev and we've got tests
failing because Solr is getting OOMEs with a 512mb heap where it was
previously fine.

I ran our tests on both versions with jconsole to track the heap usage.
Here's a little comparison. 8.5.1 dies part way through
https://drive.google.com/open?id=113Ujts-lzv9ZBJOUB78LA2Qw5PsIsajO

We have our own query parser as an extension to Solr, and we do various
things with user queries, including generating FuzzyQuery-s. Our
implementation of org.apache.solr.search.QParser.parse() isn't stateful and
parses the qstr and returns new Query objects each time it's called.
With JProfiler on I can see that the majority of the heap is being
allocated through FuzzyQuery's constructor.
https://issues.apache.org/jira/browse/LUCENE-9068 moved construction of the
automata from the FuzzyTermsEnum to the FuzzyQuery's constructor.

When profiling on 8.3.1 we still have a fairly large number of
FuzzyTermsEnums created at times, but that accounts for about ~40 MB of
the heap for a few seconds rather than the 100 MB to 300 MB of continual
allocation for FuzzyQuery I'm seeing in 8.5.

It's definitely possible that we're doing something wrong in our extension
(which I can't share the source of) but it seems like the memory cost of
FuzzyQuery now is totally disproportionate to what it was before. We've not
had issues like this with our extension before (which doesn't mean that our
parser is flawless, but it's not been causing noticeable problems for the
last 4 years).


So I suppose the question is, are we misusing FuzzyQuery in some way (hard
for you to say without seeing the source), or are the recent changes using
more memory than they should?

I will investigate further into what we're doing. But I could maybe use
some help to create a stress test for Lucene itself that compares the
memory consumption of the old FuzzyQuery vs the new, to see whether it's
fundamentally bad for memory or if it's just how we're using it.

Regards,
Colvin


SegmentsInfoRequestHandler does not release IndexWriter

2020-04-22 Thread Tiziano Degaetano
Hello,

I’m digging into an issue where a managed schema change via the Schema API
times out. The call hangs while reloading the cores (and does not recover
until the node is restarted):

sun.misc.Unsafe.park​(Native Method)
java.util.concurrent.locks.LockSupport.parkNanos​(Unknown Source)
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos​(Unknown 
Source)
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos​(Unknown 
Source)
java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock​(Unknown 
Source)
org.apache.solr.update.DefaultSolrCoreState.lock​(DefaultSolrCoreState.java:179)
org.apache.solr.update.DefaultSolrCoreState.newIndexWriter​(DefaultSolrCoreState.java:230)
org.apache.solr.core.SolrCore.reload​(SolrCore.java:696)
org.apache.solr.core.CoreContainer.reload​(CoreContainer.java:1558)
org.apache.solr.schema.SchemaManager.doOperations​(SchemaManager.java:133)
org.apache.solr.schema.SchemaManager.performOperations​(SchemaManager.java:92)
org.apache.solr.handler.SchemaHandler.handleRequestBody​(SchemaHandler.java:90)
org.apache.solr.handler.RequestHandlerBase.handleRequest​(RequestHandlerBase.java:211)
org.apache.solr.core.SolrCore.execute​(SolrCore.java:2596)
org.apache.solr.servlet.HttpSolrCall.execute​(HttpSolrCall.java:802)
org.apache.solr.servlet.HttpSolrCall.call​(HttpSolrCall.java:579)

After a while I realized it had only deadlocked after I used the Admin UI to
view the segments info of the core.

So my question: is this line correct? If withCoreInfo is false, iwRef.decref()
will not be called to release the reference, preventing any further writer
locks.
https://github.com/apache/lucene-solr/blob/3a743ea953f0ecfc35fc7b198f68d142ce99d789/solr/core/src/java/org/apache/solr/handler/admin/SegmentsInfoRequestHandler.java#L144

Regards,
Tiziano



using S3 as the Directory for Solr

2020-04-22 Thread dhurandar S
Hi,

I am looking to use S3 as the place to store indexes, just as Solr uses
HdfsDirectory to store the index and all the other documents.

We want to provide a search capability that can be a little slow but is
cheaper in terms of cost. We have close to 2 petabytes of data on which
we want to provide search using Solr.

Are there any open-source implementations of using S3 as the Directory
for Solr?

Any recommendations on this approach?

regards,
Rahul