Re: Solr server partial update is very slow

2017-11-09 Thread Sujay Bawaskar
Is there any reason we get the log below even if the client does not issue a commit, or can we ignore this log? Log: 2017-11-10 05:13:33.730 INFO (qtp225493257-38746) [ x:collection] o.a.s.s.SolrIndexSearcher Opening [Searcher@7010b1c6[collection] realtime] On Fri, Nov 10, 2017 at 12:06 PM, Sujay Bawaskar wrote:

Re: How to routing document for send to particular shard range

2017-11-09 Thread Erick Erickson
You cannot just make configuration changes; whether you use implicit or compositeId is defined when you _create_ the collection and cannot be changed later. You need to create a new collection and specify router.name=implicit when you create it. Then you can route documents as you desire. I would
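A minimal sketch of the kind of Collections API CREATE request described above. The host, collection name, and shard names are assumptions for illustration and do not come from the thread. The sketch only builds the URL and does not send it.

```python
from urllib.parse import urlencode


def create_implicit_collection_url(base_url, name, shards, replication_factor=1):
    """Build a Collections API CREATE URL that selects the implicit router."""
    params = {
        "action": "CREATE",
        "name": name,
        # The router is fixed at creation time and cannot be changed later.
        "router.name": "implicit",
        # With the implicit router, shards are named explicitly.
        "shards": ",".join(shards),
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical host, collection, and shard names.
url = create_implicit_collection_url(
    "http://localhost:8983/solr", "mycollection", ["shard1", "shard2"]
)
print(url)
```

Once such a collection exists, a document can typically be directed to a named shard at index time, for example with the `_route_` request parameter.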

Re: Solr server partial update is very slow

2017-11-09 Thread Sujay Bawaskar
We are not issuing a client-side commit for partial updates. We have openSearcher=false in solrconfig.xml, and we have set the softCommit interval to 15 minutes. Solr version is 6.4.1. Thanks, Sujay On Fri, Nov 10, 2017 at 11:58 AM, Erick Erickson wrote: > bq: We are getting below log without

Re: Solr server partial update is very slow

2017-11-09 Thread Erick Erickson
bq: We are getting below log without invoking commit operation after every partial update call Not sure what you mean here. If you're issuing a commit from the client every time you update a doc (or even a batch) that's an anti-pattern and you're opening searchers all the time. Don't do that ;).
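One way to read this advice, sketched under assumptions: the client groups documents into batches and posts them to the update handler without `commit=true`, leaving visibility to the server-side autoCommit and autoSoftCommit settings. The helper names, host, and batch size are hypothetical, and no request is actually sent.

```python
def batches(docs, size):
    """Yield successive lists of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def update_url(base_url, collection):
    """Update handler URL with no commit parameter attached.

    Commits are left to the autoCommit / autoSoftCommit intervals
    configured in solrconfig.xml.
    """
    return f"{base_url}/{collection}/update"


# Hypothetical documents and batch size.
docs = [{"id": str(i)} for i in range(5)]
for batch in batches(docs, 2):
    print(update_url("http://localhost:8983/solr", "collection"), len(batch))
```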

RE: How to routing document for send to particular shard range

2017-11-09 Thread Ketan Thanki
Thanks, Amrit, for suggesting the approach. I have some understanding of it now, and I need to implement implicit routing to specific shards. I tried making changes to core.properties, but it didn't work. Can you please let me know what configuration changes are needed? Is it n

ygc problem on solr 5.5.1

2017-11-09 Thread 胡一博
Hello everyone! I run a Solr cloud on version 5.5.1. Sometimes the ygc time increases from 0.1s to 10+ seconds and stays at 10+ seconds for several hours. Even after I trigger a full GC, the ygc still costs 10+ seconds. This happens rarely. jvm params:(java version "1.7.0_60", Java HotSpot (TM) 64-

Solr server partial update is very slow

2017-11-09 Thread Sujay Bawaskar
Hi, We are getting below log without invoking commit operation after every partial update call. We have configured soft commit and commit time as below. With the configuration below we are able to perform 800 partial updates per minute, which I think is very slow. Our index size is 10GB for this parti

Re: Solr7: Bad query throughput around commit time

2017-11-09 Thread Erick Erickson
What evidence do you have that the changes you've made to your configs are useful? There's lots of things in here that are suspect: 1 First, this is useless unless you are forceMerging/optimizing. Which you shouldn't be doing under most circumstances. And you're going to be rewriting a lot of d

Solr7: Bad query throughput around commit time

2017-11-09 Thread Nawab Zada Asad Iqbal
Hi, I am committing every 5 minutes using a periodic cron job "curl http://localhost:8984/solr/core1/update?commit=true". Besides this, my app doesn't do any soft or hard commits. With Solr 7 upgrade, I am noticing that query throughput plummets every 5 minutes - probably when the commit happens

Re: Streaming and large resultsets

2017-11-09 Thread Lanny Ripple
First, Joel, thanks for your help on this. 1) I have to admit we really haven't played with a lot of system tuning recently (before DocValues for sure). We'll go through another tuning round. 2) At the time I ran these numbers this morning we were not indexing. We build this collection once a

Re: Multiple collections for a write-alias

2017-11-09 Thread Shawn Heisey
On 11/9/2017 11:09 AM, S G wrote: > However, re-ingestion takes several hours to complete and during that time, > the customer has to write to both the collections - previous collection and > the one being bootstrapped. > This dual-write is harder to do from the client side (because client needs >

Re: Streaming and large resultsets

2017-11-09 Thread Joel Bernstein
In my experience this should be very fast: search(graph-october, q="outs:tokenA", fl="id,token", sort="id asc", qt="/export", wt=javabin) When the DocValues cache is statically warmed for the two output fields I would see somewhere around 500,000 docs per second exported fro
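A small sketch of how the expression above could be sent to the `/stream` handler as a URL-encoded `expr` parameter. The host is an assumption, and the collection in the URL path is taken from the expression itself. Nothing is sent here; the sketch only builds the URL.

```python
from urllib.parse import urlencode


def stream_url(base_url, collection, expr):
    """Build a /stream request URL carrying the expression in `expr`."""
    return f"{base_url}/{collection}/stream?{urlencode({'expr': expr})}"


# The expression quoted in the message above, unchanged.
expr = (
    'search(graph-october, q="outs:tokenA", fl="id,token", '
    'sort="id asc", qt="/export", wt=javabin)'
)

# Hypothetical host.
url = stream_url("http://localhost:8983/solr", "graph-october", expr)
print(url)
```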

Re: Core size - distinguish between merge and deletes

2017-11-09 Thread Erick Erickson
Please don't do that ;) Unless you're willing to do it frequently. See: https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/ expungeDeletes is really a variety of optimize, so the issues outlined in that blog apply. Best, Erick On Thu, Nov 9, 2017 at 12:24 PM, S

Re: Core size - distinguish between merge and deletes

2017-11-09 Thread Shashank Pedamallu
Thanks for the response Erick. I’m deleting the documents with expungeDeletes option set as true. So, that does trigger a merge to throw away the deleted documents. On 11/9/17, 12:17 PM, "Erick Erickson" wrote: bq: Is there a way to distinguish between when size is being reduced becaus

Re: Core size - distinguish between merge and deletes

2017-11-09 Thread Erick Erickson
bq: Is there a way to distinguish between when size is being reduced because of a delete from that of during a lucene merge. Not sure what you're really looking for here. Size on disk is _never_ reduced by a delete operation, the document is only 'marked as deleted'. Only when segments are merged
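To illustrate the point about documents only being marked as deleted: Lucene reports both a live document count (`numDocs`) and a count that still includes deleted but not yet merged documents (`maxDoc`). The difference is the number of deleted documents still occupying space. A rough sketch with made-up numbers:

```python
def deleted_doc_stats(num_docs, max_doc):
    """Return (deleted_count, deleted_fraction) from numDocs and maxDoc.

    deleted_count is the number of documents marked as deleted that
    have not yet been removed by a segment merge.
    """
    deleted = max_doc - num_docs
    fraction = deleted / max_doc if max_doc else 0.0
    return deleted, fraction


# Hypothetical counts, for illustration only.
deleted, fraction = deleted_doc_stats(num_docs=900, max_doc=1000)
print(deleted, fraction)
```

Watching the deleted count alongside the size on disk is one way to tell the two effects apart: a delete raises the count without shrinking the index, while a merge lowers the count and may shrink the index.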

Core size - distinguish between merge and deletes

2017-11-09 Thread Shashank Pedamallu
Hi, I wanted to get accurate metrics regarding the amount of data being indexed in Solr. In this regard, I observe that sometimes, this number decreases due to lucene merges. But I’m also deleting data at times. Is there a way to distinguish between when size is being reduced because of a de

Re: Long blocking during indexing + deleteByQuery

2017-11-09 Thread Chris Troullis
Thanks Mike, I will experiment with that and see if it does anything for this particular issue. I implemented Shawn's workaround and the problem has gone away, so that is good at least for the time being. Do we think that this is something that should be tracked in JIRA for 6.X? Or should I confi

RE: Phrase suggester - field limit and order

2017-11-09 Thread Peter Lancaster
Hi, The weight field in combination with the BlenderType will determine the order, so yes you can control the order. I don't think you can return only the matched phrase, but I would guess that highlighting would enable you to pick off the phrase that was matched in your client. Cheers, Peter

Phrase suggester - field limit and order

2017-11-09 Thread ruby
I'm using the BlendedInfixLookupFactory to get phrase suggestions. It returns the entire field content. I've tried the others and they do the same. AnalyzingInfixSuggester BlendedInfixLookupFactory DocumentDictionaryFactory title price text_en Is there a way to only return a fracti

Re: Streaming and large resultsets

2017-11-09 Thread Lanny Ripple
Happy to do so. I am testing streams for the first time so we don't have any 5.x experience. The collection I'm testing was loaded after going to 6.6.1 and fixing up the solrconfig for lucene_version and removing the /export clause. The indexes run 57G per replica. We are using 64G hosts with 4

Re: Multiple collections for a write-alias

2017-11-09 Thread Erick Erickson
Aliases can already point to multiple collections, have you just tried that? I'm not totally sure what the behavior would be, but nothing you've written indicates you tried so I thought I'd point it out. It's not clear to me how useful this is though, or what failure messages are returned. Or how

Re: A problem of tracking the commits of Lucene using SHA num

2017-11-09 Thread Chris Hostetter
: In the first few weeks of 2016, the Lucene/Solr project migrated from : svn to git.  Prior to this, there was a github mirror of the subversion : repository, but when the official repository was converted, that github : mirror was completely deleted, and replaced with an exact mirror of the : off

Multiple collections for a write-alias

2017-11-09 Thread S G
Hi, We have a use-case to re-create a solr-collection by re-ingesting everything without tolerating downtime while that is happening. We are using the collection alias feature to point to the new collection when it has been re-ingested fully. However, re-ingestion takes several hours to complete and

Re: Streaming and large resultsets

2017-11-09 Thread Joel Bernstein
Can you post the exact streaming query you are using? The size of the index and field types will help understand the issue as well. Also are you seeing different performance behaviors after the upgrade or just testing the streaming for the first time on 6.6.1? When using the /export handler to str

Re: A problem of tracking the commits of Lucene using SHA num

2017-11-09 Thread Shawn Heisey
On 11/9/2017 3:56 AM, TOM wrote: > Thanks for your patience and helps. > > Recently, I acquired a batch of commits’ SHA data of Lucene, of which > the time span is from 2010 to 2015. In order to get original info, I tried to > use these SHA data to track commits. First, I cloned Lucene repos

Re: Make search on the particular field to be case sensitive

2017-11-09 Thread Amrit Sarkar
Ah ok. I didn't test and laid it over. Thank you, Erick, for correcting me. On 9 Nov 2017 9:06 p.m., "Erick Erickson" wrote: > This won't quite work. "string" types are totally un-analyzed you > cannot add filters to a solr.StrField, you must use solr.TextField > rather than solr.StrField. >

Re: Make search on the particular field to be case sensitive

2017-11-09 Thread Erick Erickson
This won't quite work. "string" types are totally un-analyzed; you cannot add filters to a solr.StrField, so you must use solr.TextField rather than solr.StrField. start over and re-index from scratch in a new collection of course. You also need to make sure you really want to

Streaming and large resultsets

2017-11-09 Thread Lanny Ripple
We've recently upgraded our SolrCloud (16 shards, 2 replicas) to 6.6.1 on our way to 7 and I'm getting surprising /stream results. In one example I /select (wt=csv) and /stream [using search(...,wt=javabin)] with a query that gives a resultset size of 541 tuples. The select comes back in under a

Re: Make search on the particular field to be case sensitive

2017-11-09 Thread Amrit Sarkar
Behavior of the field values is defined by fieldType analyzer declaration. If you look at the managed-schema; You will find fieldType declarations like: > > ignoreCase="true"/> class="solr.EnglishPossessiveFilterFactory"/> "solr.KeywordMarkerFilterFactory" protected="protwords.txt"/> cla

Re: Java 9

2017-11-09 Thread Furkan KAMACI
Hi, Here is an explanation about deprecation of https://docs.oracle.com/javase/9/gctuning/concurrent-mark-sweep-cms-collector.htm Kind Regards, Furkan KAMACI On Tue, Nov 7, 2017 at 10:46 AM, Daniel Collins wrote: > Oh, blimey, have Oracle gone with Ubuntu-style numbering now? :) > > On 7 Novem

Re: solr cloud updatehandler stats mismatch

2017-11-09 Thread Amrit Sarkar
Wei, Are the requests going to a collection that has multiple shards and replicas? Please note that an update request is received by a node, redirected to the particular shard the doc belongs to, and then distributed to the replicas of the collection. On each replica, each core, the update request is played. Can be

Make search on the particular field to be case sensitive

2017-11-09 Thread Karan Saini
Hi guys, Solr version: 6.6.1. I have around 10 fields in my core. I want to make the search on this specific field case sensitive. Please advise how to introduce case sensitivity at the field level. What changes do I need to make for this field? Thanks, Karan

Re: solr cloud updatehandler stats mismatch

2017-11-09 Thread Furkan KAMACI
Hi Wei, Do you compare it with files which are under /var/solr/logs by default? Kind Regards, Furkan KAMACI On Sun, Nov 5, 2017 at 6:59 PM, Wei wrote: > Hi, > > I use the following api to track the number of update requests: > > /solr/collection1/admin/mbeans?cat=UPDATE&stats=true&wt=json > >

A problem of tracking the commits of Lucene using SHA num

2017-11-09 Thread TOM
Thanks for your patience and helps. Recently, I acquired a batch of commits’ SHA data of Lucene, of which the time span is from 2010 to 2015. In order to get original info, I tried to use these SHA data to track commits. First, I cloned Lucene repository to my local host, using the cmd gi

Re: Quick Query about

2017-11-09 Thread Karan Saini
Thanks Charlie Hull for the quick answer. It worked for me on Windows. *baseDir="\\CLDserver02\RemoteK1Depot"* Regards, Karan On 9 November 2017 at 14:58, Charlie Hull wrote: > On 09/11/2017 09:13, Karan Saini wrote: > >> Hi there, >> > > Hi Karan, > > Have you tried the syntax baseDir="//se

Re: Faceting Word Count

2017-11-09 Thread Toke Eskildsen
On Wed, 2017-11-08 at 16:58 +0200, Wael Kader wrote: > Facets are taking around 1 minute to return data now. Can you verify if this is due to updates causing a new searcher to be opened or if it just takes that long? An easy way to test it is to stop updating the index, then do a few calls with different

Re: Quick Query about

2017-11-09 Thread Charlie Hull
On 09/11/2017 09:13, Karan Saini wrote: Hi there, Hi Karan, Have you tried the syntax baseDir="//servername/sharedfoldername" ? I believe this should work on a Windows network. Regards Charlie I am new to the Apache Solr and currently exploring how to use this technology to search in th

Re: Quick Query about

2017-11-09 Thread Karan Saini
Hi Deepak, I think you have misunderstood my query. I am looking to access the PDF files from another server, not the database. Thanks, Karan On 9 November 2017 at 14:49, Deepak Vohra wrote: > Provide the url to the data source on a different server. > dataConfig> > driver="com.microso

Re: Atomic Updates with SolrJ

2017-11-09 Thread Amrit Sarkar
Hi Martin, I tested the same SolrJ application code on my system, and it worked just fine on Solr 6.6.x. My Solrclient is "CloudSolrJClient", which I think doesn't make any difference. Can you show the response and field declarations if you are still facing the issue? Amrit Sarkar Search Engin

Re: Quick Query about

2017-11-09 Thread Deepak Vohra
Provide the url to the data source on a different server. dataConfig> On Thu, 11/9/17, Karan Saini wrote: Subject: Quick Query about To: solr-user@

Quick Query about

2017-11-09 Thread Karan Saini
Hi there, I am new to the Apache Solr and currently exploring how to use this technology to search in the PDF files. https://lucene.apache.org/solr/guide/6_6/u

Atomic Updates with SolrJ

2017-11-09 Thread Martin Keller
Hello, I’m trying to update a field in a document via SolrJ. Unfortunately, while the field itself is updated correctly, values of some other fields are removed. The code looks like this: SolrInputDocument updateDoc = new SolrInputDocument(); updateDoc.addField("id", "1234"); Map updateValue =
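For comparison, a sketch of the same kind of atomic update expressed as the JSON body that Solr's update handler accepts. The truncated SolrJ code above is not completed here. The field name `title` and its new value are hypothetical; only the id `1234` comes from the message.

```python
import json


def atomic_set(doc_id, field, value):
    """Build a one-document atomic update that replaces `field` with `value`.

    The {"set": ...} wrapper is what marks this as an atomic update
    rather than a full document replacement.
    """
    return [{"id": doc_id, field: {"set": value}}]


# Hypothetical field name and value.
payload = json.dumps(atomic_set("1234", "title", "New title"))
print(payload)
```

One thing that may be worth checking when other fields lose their values: Solr rebuilds the document from its stored (or docValues) fields during an atomic update, so fields that are neither stored nor docValues would not be carried over.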