Re: is ConcurrentUpdateSolrClient.Builder thread safe?

2018-01-10 Thread Shawn Heisey
On 1/11/2018 12:05 AM, Bernd Fehling wrote: This will never pass a Jepsen test and I call it _NOT_ thread safe. I haven't looked into the code yet, to see if the queue is FIFO, otherwise this would be stupid. I was not thinking about order of operations when I said that the client was thread

Re: is ConcurrentUpdateSolrClient.Builder thread safe?

2018-01-10 Thread Bernd Fehling
Hi Shawn, from your answer I see that you are obviously not using ConcurrentUpdateSolrClient. I didn't say that I use ConcurrentUpdateSolrClient in multiple threads. I say that ConcurrentUpdateSolrClient.Builder has a method to set "withThreadCount", to empty the client's queue with multiple threa
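The ordering concern in this thread can be illustrated with a generic model. The sketch below is not SolrJ code and makes no claim about how ConcurrentUpdateSolrClient is implemented; it only assumes a FIFO queue emptied by a configurable number of worker threads. The names `drain` and `thread_count` are made up for the illustration.

```python
import queue
import threading


def drain(items, thread_count):
    """Empty a FIFO queue with `thread_count` workers and return the
    order in which the workers finished handling the items."""
    q = queue.Queue()  # FIFO: items leave the queue in insertion order
    for item in items:
        q.put(item)

    done = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                item = q.get_nowait()
            except queue.Empty:
                return
            # Another worker may run between the get above and the
            # append below, so completion order can differ from queue order.
            with lock:
                done.append(item)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done


if __name__ == "__main__":
    docs = list(range(20))
    print("1 worker :", drain(docs, 1))  # same order as the input
    print("4 workers:", drain(docs, 4))  # same items, order not guaranteed
```

With a single worker the completion order matches the queue order. With several workers every item is still handled exactly once, but nothing in this model guarantees the order in which they complete.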

Re: Ingestion not scaling horizontally as I add more cores to Solr

2018-01-10 Thread Shawn Heisey
On 1/10/2018 12:58 PM, Shashank Pedamallu wrote: As you can see, the number of documents being ingested per core is not scaling horizontally as I'm adding more cores. Rather the total number of documents getting ingested for Solr JVM is topping out at around 90k documents per second. I would ca

Re: Regarding document routing

2018-01-10 Thread manish tanger
Hello Shawn, First of all, thanks for your response. >*For redundancy with ZK, you need three hosts minimum. A two-host ZK ensemble is actually *less* reliable than using one server. You aren't protected against failure until you have at least three. You would only need a minimum of two Solr h

Re: Spatial search, nested docs, feature density

2018-01-10 Thread Mikhail Khludnev
The problem itself sounds really challenging, but taken literally, the two points from the last question are: - https://lucene.apache.org/solr/guide/6_6/other-parsers.html#OtherParsers-Scoring - find field in https://lucene.apache.org/solr/guide/6_6/function-queries.html#FunctionQueries-AvailableFunctions

Re: ClassicTokenizer

2018-01-10 Thread Shawn Heisey
On 1/10/2018 2:27 PM, Rick Leir wrote: I did not express that clearly. The reference guide says "The Classic Tokenizer preserves the same behavior as the Standard Tokenizer of Solr versions 3.1 and previous." So I am curious to know why they changed StandardTokenizer after 3.1 to break on hyp

Re: ./fs-manager process run under solr

2018-01-10 Thread Shawn Heisey
On 1/10/2018 12:19 PM, Andy Fake wrote: I use Solr 5.5, I recently noticed a process ./fs-manager running under user solr that takes quite high CPU usage. I don't think I have seen such a process before. I have never heard of this, and have never seen it. Searching the source code, I cannot fin

Re: Ingestion not scaling horizontally as I add more cores to Solr

2018-01-10 Thread Shashank Pedamallu
They are separate cases. In attempt 1 – I was ingesting to only 1 core. Then to 3 cores and then 5 cores. Yes, they are completely independent cores. I think I was not reading the ‘iostat’ output right. With the -x option, the ‘avgrq-sz’ parameter is constantly above 300. From some readings online, I see

Re: regarding exposing merge metrics

2018-01-10 Thread Shawn Heisey
On 1/10/2018 11:08 AM, S G wrote: Last comment by Shawn on SOLR-10130 is: Metrics was just a theory, sounds like that's not it. It would be very interesting to know what really caused the slowdown and do we really need the config or not. That comment wasn't actually about SOLR-10130 itself. I c

Re: is ConcurrentUpdateSolrClient.Builder thread safe?

2018-01-10 Thread Shawn Heisey
On 1/10/2018 8:33 AM, Bernd Fehling wrote: after some strange search results I was trying to locate the problem and it turned out that it starts with bulk loading with SolrJ and ConcurrentUpdateSolrClient.Builder with several threads. I assume that ConcurrentUpdateSolrClient.Builder is _NOT_ thr

Re: Ingestion not scaling horizontally as I add more cores to Solr

2018-01-10 Thread Erick Erickson
OK, so I'm assuming your indexer indexes to 1, 3 and 5 separate cores depending on how many are available, right? And these cores are essentially totally independent. I'd guess your gating factor is your ingestion process. Try spinning up two identical ones from two separate clients. Eventually yo

Mixing simple and nested docs in same update?

2018-01-10 Thread Jan Høydahl
Hi, We index several large nested documents. We found that querying the data behaves differently depending on how the documents are indexed. To reproduce: solr start solr create -c nested # Index one plain document, “friend” and a nested one, “mother” and “daughter”, in same request: curl loca

Spatial search, nested docs, feature density

2018-01-10 Thread Leila Deljkovic
Hi, https://lucene.apache.org/solr/guide/7_0/uploading-data-with-index-handlers.html#UploadingDatawithIndexHandlers-NestedChildDocuments I have never used neste

Re: Spatial search (and nested docs)

2018-01-10 Thread Leila Deljkovic
Hi Emir, Thanks for the reply. My problem has been simplified a bit now. https://lucene.apache.org/solr/guide/7_0/uploading-data-with-index-handlers.html#UploadingDatawithIndexHandlers-NestedChildDocuments

Re: Ingestion not scaling horizontally as I add more cores to Solr

2018-01-10 Thread Shashank Pedamallu
- Did you set up an actual multiple node cluster or are you running this all on one box? Sorry, I should have mentioned this earlier. I’m running Solr in non-cloud mode. It is just a single node Solr. - Are you configuring Jmeter to send with multiple threads? Yes, multiple threads looping a fi

./fs-manager process run under solr

2018-01-10 Thread Andy Fake
Hi, I use Solr 5.5, I recently noticed a process ./fs-manager running under user solr that takes quite high CPU usage. I don't think I have seen such a process before. Is that a legitimate process from Solr? Thanks.

Re: Ingestion not scaling horizontally as I add more cores to Solr

2018-01-10 Thread Erick Erickson
And I'd add - are you sending one document at a time or batching them up? See: https://lucidworks.com/2015/10/05/really-batch-updates-solr-2/ Best, Erick On Wed, Jan 10, 2018 at 1:35 PM, Gus Heck wrote: > Ok then here's a few things to check... > >- Did you set up an actual multiple node c
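The batching question can be made concrete with a small client-side helper. This is only a sketch: `batches` is a made-up name, and the send step is left as a comment because the thread does not say which client call would be used.

```python
from itertools import islice


def batches(docs, size):
    """Yield successive lists of at most `size` documents from `docs`."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    docs = [{"id": str(i)} for i in range(5)]
    for batch in batches(docs, 2):
        # A real indexer would send the whole batch in one update
        # request here instead of one request per document.
        print(len(batch), [d["id"] for d in batch])
```

Sending one request per batch instead of one per document cuts the number of round trips by roughly the batch size.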

Re: Very high number of deleted docs, part 2

2018-01-10 Thread Erick Erickson
There's some background here: https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/ The 2.5G "live" document limit is really "50% of the max segment size", hard-coded in TieredMergePolicy. bq: Well, maxSegments with optimize or commit with expungeDeletes did not do

Re: Ingestion not scaling horizontally as I add more cores to Solr

2018-01-10 Thread Gus Heck
Ok then here's a few things to check... - Did you set up an actual multiple node cluster or are you running this all on one box? - Are you configuring Jmeter to send with multiple threads? - Are they all sending to the same node, or are you distributing across nodes? Is there a loa

Re: ClassicTokenizer

2018-01-10 Thread Rick Leir
Shawn, I did not express that clearly. The reference guide says "The Classic Tokenizer preserves the same behavior as the Standard Tokenizer of Solr versions 3.1 and previous." So I am curious to know why they changed StandardTokenizer after 3.1 to break on hyphens, when it seems to me to work

Re: Ingestion not scaling horizontally as I add more cores to Solr

2018-01-10 Thread Shashank Pedamallu
Hi Gus, Thanks for the reply. I’m sending via jmeter running on my local machine to Solr running on a remote vm. Thanks, Shashank On 1/10/18, 12:34 PM, "Gus Heck" wrote: Ingested how? Sounds like your document sending mechanism is maxed, not the solr cluster... On Wed, Jan 10

Re: Ingestion not scaling horizontally as I add more cores to Solr

2018-01-10 Thread Gus Heck
Ingested how? Sounds like your document sending mechanism is maxed, not the solr cluster... On Wed, Jan 10, 2018 at 2:58 PM, Shashank Pedamallu wrote: > Hi, > > > > I’m trying to find the upper thresholds of ingestion and I have tried the > following. In each of the experiments, I’m ingesting ra

Ingestion not scaling horizontally as I add more cores to Solr

2018-01-10 Thread Shashank Pedamallu
Hi, I’m trying to find the upper thresholds of ingestion and I have tried the following. In each of the experiments, I’m ingesting random documents with 5 fields.

Number of Cores    Number of documents ingested per second per core
1                  89000
3                  33000
5                  18000

As you can see, the
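Multiplying each per-core rate by its core count gives the aggregate throughput and makes the roughly 90k documents per second ceiling mentioned in the replies easy to see. The figures below are the ones reported in the message; the variable names are just for illustration.

```python
# Per-core ingestion rates reported in the message (documents per second).
per_core_rate = {1: 89000, 3: 33000, 5: 18000}

# Aggregate throughput = number of cores * per-core rate.
totals = {cores: cores * rate for cores, rate in per_core_rate.items()}

if __name__ == "__main__":
    for cores, total in totals.items():
        print(f"{cores} core(s): {total} docs/sec in total")
```

The totals are 89000, 99000 and 90000 documents per second, so the aggregate stays nearly flat while the per-core rate falls.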

Re: regarding exposing merge metrics

2018-01-10 Thread S G
Last comment by Shawn on SOLR-10130 is: Metrics was just a theory, sounds like that's not it. It would be very interesting to know what really caused the slowdown and do we really need the config or not. Thanks SG On Tue, Jan 9, 2018 at 12:00 PM, suresh pendap wrote: > Thanks Shalin for shar

RE: Very high number of deleted docs, part 2

2018-01-10 Thread Markus Jelsma
Well, maxSegments with optimize or commit with expungeDeletes did not do the job in testing. But tell me more about 2.5G live documents limit, no idea what it is. Thanks, Markus -Original message- > From:Erick Erickson > Sent: Friday 5th January 2018 17:56 > To: solr-user > Subject:

is ConcurrentUpdateSolrClient.Builder thread safe?

2018-01-10 Thread Bernd Fehling
Hi list, after some strange search results I was trying to locate the problem and it turned out that it starts with bulk loading with SolrJ and ConcurrentUpdateSolrClient.Builder with several threads. I assume that ConcurrentUpdateSolrClient.Builder is _NOT_ thread safe according to the docs sent to

Re: Regarding document routing

2018-01-10 Thread Shawn Heisey
On 1/10/2018 12:18 AM, manish tanger wrote: I am having a doubt in implicit routing and didn't find much info about this over the internet, so Please help me out on this. *About environment:* M/c 1: Zookeeper 1 and Solr 1 M/c 2: Zookeeper 2 and Solr 2 For redundancy with ZK, you need three hos
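The ensemble-size advice follows from majority-quorum arithmetic: ZooKeeper needs a strict majority of the ensemble to be up. The function names below are made up for this sketch.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many hosts can fail while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} host(s): quorum {quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

A two-host ensemble needs both hosts up, so it tolerates no failures while having twice as many machines that can fail. Three hosts is the smallest ensemble that survives the loss of one.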

Re: Spatial search

2018-01-10 Thread Emir Arnautović
Hi Leila, Maybe I need to refresh my spatial terminology, but I am having trouble following your case. Can you explain a bit more: what is the dataset that is indexed, what are the query inputs, and what should the result be? The one thing that puzzles me the most is “nested documents”. Thanks, Emir