Re: deleted master index files replica did not replicate

2018-06-04 Thread Jeff Courtade
Yes unix. It was an amazing moment. On Mon, Jun 4, 2018, 11:28 PM Erick Erickson wrote: > bq. To be clear I deleted the actual index files out from under the > running master > > I'm assuming *nix here since Windows won't let you delete a file that > has an open file handle... > > Did you the

Re: UUIDUpdateProcessorFactory can cause duplicate documents?

2018-06-04 Thread Erick Erickson
First, your assumption is correct. It would be A Bad Thing if two identical UUIDs were generated. Is this SolrCloud? If so, then the deduplication idea won't work. The problem is that the uuid is used for routing and there is a decent (1 - 1/numShards) chance that the two "identical" docs would
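
The (1 - 1/numShards) figure quoted above can be checked with a little arithmetic. This is only a minimal sketch, assuming each independently generated UUID is routed uniformly at random to one of `num_shards` shards (a simplification of hash-based routing); the function name is hypothetical and not part of Solr.

```python
from fractions import Fraction


def prob_different_shards(num_shards: int) -> Fraction:
    """Chance that two independently generated ids route to different shards.

    Assumes uniform routing across shards, which is a simplification.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return 1 - Fraction(1, num_shards)


# Print the probability for a few shard counts.
for n in (1, 2, 4, 8):
    print(n, prob_different_shards(n))
```

With a single shard the chance is zero, and it climbs toward 1 as shards are added, which is why a per-shard deduplication step would usually never see both copies.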

Re: A good KV store/plugins to go with Solr

2018-06-04 Thread Erick Erickson
Well, you can always throw more replicas at the problem as well. But Andrea's comment is spot on. When Solr stores a field, it compresses it. So to fetch the stored info, it has to: 1> seek the disk 2> decompress at minimum 16K 3> assemble the response. All the while perhaps causing memory to be

Re: deleted master index files replica did not replicate

2018-06-04 Thread Erick Erickson
bq. To be clear I deleted the actual index files out from under the running master I'm assuming *nix here since Windows won't let you delete a file that has an open file handle... Did you then restart the master? Aside from any checks about refusing to replicate an empty index, just deleting the

Re: deleted master index files replica did not replicate

2018-06-04 Thread Walter Underwood
Check the logs. I bet it says something like “refusing to fetch empty index.” wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Jun 4, 2018, at 1:41 PM, Jeff Courtade wrote: > > I am thankful for that! > > Could you point me at something that explain

Re: sharding guidelines

2018-06-04 Thread Erik Hatcher
I’d say that 100M/shard is in the smallest doc use case possible, such as straight up log items with only a timestamp, id, and short message kind of thing. In other contexts, big full text docs, 10M/shard is kind of a max. How many documents do you have in your collection? Erik Hatcher
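
As a rough illustration of how the per-shard figures above translate into a shard count, here is a minimal sketch. It assumes the quoted numbers (about 100M docs per shard for tiny documents, about 10M for large full-text documents) are treated as upper bounds; the function name and the example collection size are hypothetical.

```python
def estimate_shards(total_docs: int, max_docs_per_shard: int) -> int:
    """Smallest shard count that keeps every shard at or under the limit."""
    if total_docs < 0 or max_docs_per_shard < 1:
        raise ValueError("total_docs must be >= 0 and max_docs_per_shard >= 1")
    # Integer ceiling division; an empty collection still needs one shard.
    return max(1, -(-total_docs // max_docs_per_shard))


SMALL_DOC_LIMIT = 100_000_000  # tiny log-style documents (figure from the message)
LARGE_DOC_LIMIT = 10_000_000   # big full-text documents (figure from the message)

total = 250_000_000  # hypothetical collection size
print(estimate_shards(total, SMALL_DOC_LIMIT))
print(estimate_shards(total, LARGE_DOC_LIMIT))
```

These are rules of thumb only; real limits depend on document size, query load, and hardware, so testing with actual data is still needed.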

sharding guidelines

2018-06-04 Thread Oakley, Craig (NIH/NLM/NCBI) [C]
I have a sharding question. We have a collection (one shard, two replicas, currently running Solr 6.6) which sometimes becomes unresponsive on the non-leader node. It is 214 gigabytes, and we were wondering whether there is a rule of thumb for how large to allow a core to grow before sharding. I

Re: deleted master index files replica did not replicate

2018-06-04 Thread Jeff Courtade
I am thankful for that! Could you point me at something that explains this maybe? J On Mon, Jun 4, 2018, 4:31 PM Shawn Heisey wrote: > On 6/4/2018 12:15 PM, Jeff Courtade wrote: > > This was strange as I would have thought the replica would have > replicated > > an empty index from the master.

Re: deleted master index files replica did not replicate

2018-06-04 Thread Shawn Heisey
On 6/4/2018 12:15 PM, Jeff Courtade wrote: > This was strange as I would have thought the replica would have replicated > an empty index from the master. Solr actually has protections in place to specifically PREVENT index replication when the master has an empty index. This is so that an accident

Re: A good KV store/plugins to go with Solr

2018-06-04 Thread Andrea Gazzarini
Hi Sam, I have been in a similar scenario (not recently, so my answer could be outdated). As far as I remember, caching, at least in that scenario, didn't help so much, probably because of the field size. So we went with the second option: a custom SearchComponent connected with Redis. I'm not aware if

Re: deleted master index files replica did not replicate

2018-06-04 Thread Jeff Courtade
This is what I thought too. It happened on all 5; really weird behavior. I entirely expected blank indexes on the replica. On Mon, Jun 4, 2018, 2:38 PM Aman Tandon wrote: > Hi Jeff, > > I suppose there should be slave configuration in solrconfig files which > says to ping master to check for the

A good KV store/plugins to go with Solr

2018-06-04 Thread Sambhav Kothari
Hi everyone, We at MetaBrainz are trying to scale our Solr cloud instance but are hitting a bottleneck. Each of the documents in our Solr index is accompanied by a '_store' field that stores our API compatible response for that document (which is basically parsed and displayed by our custom respo

Re: deleted master index files replica did not replicate

2018-06-04 Thread Aman Tandon
Hi Jeff, I suppose there should be slave configuration in solrconfig files which says to ping master to check for the version and get the modified files. If replication is configured in slave you will see commands getting triggered and you could get some idea from there. Also you could paste tha

Re: UUIDUpdateProcessorFactory can cause duplicate documents?

2018-06-04 Thread Aman Tandon
Hi, Suppose the id field is the UUID linked field in the configuration; if it is missing in the document coming to index, then it will generate a UUID and set it in the id field. However, if the id field is present with some value, then it shouldn't. Kindly refer to http://lucene.apache.org/solr/5_5_0/solr-co

Re: deleted master index files replica did not replicate

2018-06-04 Thread Jeff Courtade
To be clear, I deleted the actual index files out from under the running master. On Mon, Jun 4, 2018, 2:25 PM Jeff Courtade wrote: > So are you saying it should have? > > It really acted like a normal function this happened on 5 different pairs > in the same way. > > > On Mon, Jun 4, 2018, 2:23 PM

Re: deleted master index files replica did not replicate

2018-06-04 Thread Jeff Courtade
So are you saying it should have? It really acted like a normal function; this happened on 5 different pairs in the same way. On Mon, Jun 4, 2018, 2:23 PM Aman Tandon wrote: > Could you please check the replication request commands in solr logs of > slave and see if it is complaining anything. >

Re: deleted master index files replica did not replicate

2018-06-04 Thread Aman Tandon
Could you please check the replication request commands in the Solr logs of the slave and see if it is complaining about anything. On Mon, Jun 4, 2018, 23:45 Jeff Courtade wrote: > Hi, > > This I think is a very simple question. > > I have a solr 4.3 master slave setup. > > Simple replication. > > The master

UUIDUpdateProcessorFactory can cause duplicate documents?

2018-06-04 Thread S G
Hi, Is it correct to assume that UUIDUpdateProcessorFactory will produce 2 documents even if the same document is indexed twice without the "id" field ? And to avoid such a thing, we can use the technique mentioned in https://wiki.apache.org/solr/Deduplication ? Thanks SG
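
To illustrate the question above, here is a minimal sketch of why a generated id does not deduplicate while a content signature does. It is not Solr code: `uuid.uuid4` stands in for the id generation, an MD5 hash over selected fields stands in for a signature-based deduplication step, and the function and field names are hypothetical.

```python
import hashlib
import uuid


def assign_generated_id(doc: dict) -> dict:
    """Fill in a random UUID when the document has no "id" (illustration only)."""
    out = dict(doc)
    out.setdefault("id", str(uuid.uuid4()))
    return out


def content_signature(doc: dict, fields: tuple) -> str:
    """Hash the chosen fields so identical content yields the same signature."""
    joined = "\x1f".join(f"{name}={doc.get(name, '')}" for name in fields)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


doc = {"title": "hello", "body": "same text"}
first, second = assign_generated_id(doc), assign_generated_id(doc)

# Indexing the same content twice gives two distinct generated ids.
print(first["id"] != second["id"])

# A signature computed from the content fields is identical for both copies.
sig_fields = ("title", "body")
print(content_signature(first, sig_fields) == content_signature(second, sig_fields))
```

The design point is that the signature depends only on the content fields, so it must exclude the generated id itself.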

deleted master index files replica did not replicate

2018-06-04 Thread Jeff Courtade
Hi, This I think is a very simple question. I have a Solr 4.3 master slave setup. Simple replication. The master and slave were both running and synchronized up to date. I went on the master and deleted the index files while Solr was running. Solr created new empty index files and continued to

Re: Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows

2018-06-04 Thread Walter Underwood
Yes, why are you doing this? A suggester is designed to have a smaller set of terms than the entire index. I would never expect a 130 million term suggester to work. I’m astonished that it works with 50 million terms. We typically have about 50 thousand terms in a suggester. Also, you haven’t

Parent product show in search result

2018-06-04 Thread Apurba Hazra
Hello, We are implementing Solr search for our website using Magento. Our requirement is that the search result page shows only the parent product, not all child products, if the parent exists; otherwise we have to show the child product. Will you please tell us how we can do that? Should we change se

Re: Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows

2018-06-04 Thread Erick Erickson
bq. I have 130 million documents and each document has unique document id. I want to build suggester on document id. Why do it this way? I'm supposing you want to have someone start typing in the doc ID then do autocomplete on it. For such a simple operation, it would be far easier and pretty cert

Re: Setting preferred replica for query/read

2018-06-04 Thread Zheng Lin Edwin Yeo
Hi, SOLR-8146 has not been updated since January last year, but I have just commented on it. So we need both to be updated in order to achieve the full functionality of setting a preferred replica for query/read? Currently, is there a way to achieve this by other means? Regards, Edwin On 4 June 2018

Re: Setting preferred replica for query/read

2018-06-04 Thread Ere Maijala
Hi, Well, SOLR-11982 adds server-side support for part of what SOLR-8146 aims to do (shards.preference=replica.location:[something]). It doesn't do regular expressions or snitches at the moment, though it would be easy to add. So, it looks to me like SOLR-8146 would need to be updated in this
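
The `shards.preference=replica.location:[something]` parameter mentioned above is passed as an ordinary request parameter. A minimal sketch of building such a query URL follows; the base URL, collection name, and the `local` location value are assumptions chosen for illustration, and the helper function is hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, collection: str, query: str, location: str) -> str:
    """Build a select URL that asks for replicas matching a location prefix."""
    params = {
        "q": query,
        # Parameter name taken from the message; the value is an assumption.
        "shards.preference": f"replica.location:{location}",
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection, for illustration only.
url = build_select_url("http://localhost:8983/solr", "mycollection", "*:*", "local")
print(url)
```

Whether the parameter is honored depends on the Solr version in use, so this only shows the shape of the request.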

Re: SolrCloud Collection Backup - Solr 5.5.4

2018-06-04 Thread Greenhorn Techie
Thanks Shawn for your detailed reply. It has helped to better my understanding. Below is my summarised understanding. In a SolrCloud setup with version less than 6.1, there is no ‘elegant’ way of handling collection backups and restore. Instead, have to use the manual backup and restore APIs using

Re: Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows

2018-06-04 Thread Yogendra Kumar Soni
I sent the log of the node to which I sent the request. I need to check the other nodes' logs. >>In SolrCloud an investigation does not isolate to a single Solr log : you >>see a timeout, i would recommend to check both the nodes involved. I monitored from the admin UI but could not find any clue at the time of failure.

Re: Solr Cloud 6.6.4 in Docker containers: collection creation fails

2018-06-04 Thread Ronja Koistinen
Hi, Thanks for your reply. I have done all this. I am able to resolve all the nodes from inside all the other containers on all the other nodes. I am able to "wget" the http://othernode:8984/solr URLs from inside the containers. -- Ronja Koistinen University of Helsinki On 04.06.2018 11:34, Jan

Re: Solr 7.3 suggest dictionary building fails in cloud mode with large number of rows

2018-06-04 Thread Alessandro Benedetti
Hi Yogendra, you mentioned you are using SolrCloud. In SolrCloud an investigation is not isolated to a single Solr log: if you see a timeout, I would recommend checking both of the nodes involved. When you say: "heap usage is around 10 GB - 12 GB per node.", do you refer to the effective usage by th

Setting preferred replica for query/read

2018-06-04 Thread Zheng Lin Edwin Yeo
Hi, Are there any similarities between these two requests in JIRA regarding setting a preferred replica? (SOLR-11982) Add support for preferReplicaTypes parameter (SOLR-8146) Allowing SolrJ CloudSolrClient to have preferred replica for query/read I am looking at setting one of the rep

Re: URL to call ZooKeeper instead of Solr directly in SolrCloud

2018-06-04 Thread Zheng Lin Edwin Yeo
Hi Shawn, Thanks for the info. Regards, Edwin On 4 June 2018 at 08:59, Shawn Heisey wrote: > On 6/3/2018 10:44 AM, Zheng Lin Edwin Yeo wrote: > >> I am running Solr in Cloud Mode, there is a fault tolerant ZK setup. I >> understand that we can use CloudSolrClient, and it will automatically >>

Re: Solr Cloud 6.6.4 in Docker containers: collection creation fails

2018-06-04 Thread Jan Høydahl
Hi Probably some networking issues. I would log in to each node at a time and verify connectivity to all others. Make sure that e.g. "node1" resolves correctly on all RHEL hosts as well as inside docker containers. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 1.

Re: Mysterious Solr crash

2018-06-04 Thread Nawab Zada Asad Iqbal
I am using 7.0.1 without SolrCloud (sorry for missing that detail earlier). I totally agree with you, Shawn. A crash after an OOM is not graceful like this and it usually has an incomplete logline in the end. I tried to only copy the log which seemed related to this issue, I will see if I find