How to auto-scale Solr and maintain even distribution of shard replicas across the whole cluster?

2020-02-19 Thread Christian Platta
Hi, I'm playing around with the autoscaling feature of Solr 7.7.1 and have the following scenario to solve: - One collection with two shards - I want to activate autoscaling to achieve the following behavior: * Every time a new node comes up, it should get a new replica automatically through t

Function query scale

2019-03-10 Thread Vincenzo D'Amore
uct(termfreq(taxonomy,accessories),-5.0) product(termfreq(taxonomy,women),2.0) product(termfreq(taxonomy,most_popular),5.0) [...] This leads to the main function that sums the results of all the previous functions and then scales them to a value between 1 and 100: def(scale(sum(1.0,product(...),prod

Re: Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Deepak Goel
oes exactly happen? Can you please give a bit more about this? > Our key question: Scale up (fewer instances to manage) or Scale out (more > instances to manage) and > do we switch to compute optimized instances (the answer given our usage I > assume is probably) > > Is the load

Re: Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Shawn Heisey
l affect what percentage is needed. What precisely are you looking at to determine that the machine is CPU-bound?  Some of the things that people assume are evidence of CPU problems are actually evidence of I/O problems caused by not having enough memory. Our key question: Scale up (fewer instances

Re: Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Kelly, Frank
> Our EBS traffic bandwidth seems to work great so searches on disk are >>pretty fast. >> Now though we seem CPU bound and if/ when Solr CPU gets pegged for too >>long replication falls behind and then starts to recover which causes >>more usage and then eventually shards go “Do

Re: Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Erick Erickson
are pretty > fast. > Now though we seem CPU bound and if/ when Solr CPU gets pegged for too long > replication falls behind and then starts to recover which causes more usage > and then eventually shards go “Down”. > > Our key question: Scale up (fewer instances to manage) or Scale

Thoughts on scaling strategy for Solr deployed on AWS EC2 instances - Scale up / out and which instance type?

2018-05-21 Thread Kelly, Frank
hen starts to recover which causes more usage and then eventually shards go “Down”. Our key question: Scale up (fewer instances to manage) or Scale out (more instances to manage) and do we switch to compute optimized instances (the answer given our usage I assume is probably) Appreciate

Solr scale function scale single doc to min value

2018-02-01 Thread Aman Deep Singh
Hi, I'm using the scale function and facing an issue where my result set contains only one result or multiple results with the same value. In this case scale is sending data at the min level instead of the high value. Any idea how I can achieve the high value in case only one result is present or multiple re
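
The behaviour described in this question is what a plain linear min-max rescale produces when every input value is the same. Below is a minimal Python sketch of that arithmetic, assuming the usual linear formula and assuming the degenerate case is mapped to the lower bound. The helper name `scale_values` is hypothetical and is not part of Solr.

```python
def scale_values(values, min_target, max_target):
    """Linearly rescale values so the smallest maps to min_target
    and the largest maps to max_target."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # One value, or all values equal: there is no range to stretch,
        # so every value lands on the lower bound (assumed behaviour).
        return [float(min_target)] * len(values)
    factor = (max_target - min_target) / span
    return [min_target + (v - lo) * factor for v in values]


print(scale_values([2.0, 4.0, 6.0], 1, 100))  # [1.0, 50.5, 100.0]
print(scale_values([7.0], 1, 100))            # [1.0]
```

Under these assumptions, a caller who wants the high value for a single-result set has to handle that case outside the rescale itself.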

Re: solr 6 at scale

2017-05-25 Thread Toke Eskildsen
On Thu, 2017-05-25 at 15:56 -0700, Nawab Zada Asad Iqbal wrote: > I have 31 machine cluster with 3 shards on each (93 shards). Each > machine has 250~GB ram and 3TB SSD for search index (there is another > drive for OS and stuff). One solr process runs for each shard with > 48G heap. So we have 3 l

Re: solr 6 at scale

2017-05-25 Thread Nawab Zada Asad Iqbal
Hi Toke, I don't have any blog, but here is a high level idea: I have 31 machine cluster with 3 shards on each (93 shards). Each machine has 250~GB ram and 3TB SSD for search index (there is another drive for OS and stuff). One solr process runs for each shard with 48G heap. So we have 3 large fi

Re: solr 6 at scale

2017-05-25 Thread Bram Van Dam
>>> It is relatively easy to downgrade to an earlier release within the >>> same major version. We have not switched to 6.5.1 simply because we >>> have no pressing need for it - Solr 6.3 works well for us. > >> That strikes me as a little bit dangerous, unless your indexes are very >> static. Th

Re: solr 6 at scale

2017-05-24 Thread Toke Eskildsen
Nawab Zada Asad Iqbal wrote: > @Toke, I stumbled upon your page last week but it seems that your huge > index doesn't receive a lot of query traffic. It switches between two kinds of usage: Everyday use is very low traffic by researchers using it interactively: 1-2 simultaneous queries, with fa

Re: solr 6 at scale

2017-05-24 Thread Walter Underwood
n 30 machines. > > > I look forward to hear more scale stories. > Nawab > > On Wed, May 24, 2017 at 7:58 AM, Toke Eskildsen wrote: > >> Shawn Heisey wrote: >>> On 5/24/2017 3:44 AM, Toke Eskildsen wrote: >>>> It is relatively easy to downgrade t

Re: solr 6 at scale

2017-05-24 Thread Nawab Zada Asad Iqbal
ds on 30 machines. I look forward to hear more scale stories. Nawab On Wed, May 24, 2017 at 7:58 AM, Toke Eskildsen wrote: > Shawn Heisey wrote: > > On 5/24/2017 3:44 AM, Toke Eskildsen wrote: > >> It is relatively easy to downgrade to an earlier release within the > >

Re: solr 6 at scale

2017-05-24 Thread Toke Eskildsen
Shawn Heisey wrote: > On 5/24/2017 3:44 AM, Toke Eskildsen wrote: >> It is relatively easy to downgrade to an earlier release within the >> same major version. We have not switched to 6.5.1 simply because we >> have no pressing need for it - Solr 6.3 works well for us. > That strikes me as a litt

Re: solr 6 at scale

2017-05-24 Thread Shawn Heisey
On 5/24/2017 3:44 AM, Toke Eskildsen wrote: > It is relatively easy to downgrade to an earlier release within the > same major version. We have not switched to 6.5.1 simply because we > have no pressing need for it - Solr 6.3 works well for us. That strikes me as a little bit dangerous, unless yo

Re: solr 6 at scale

2017-05-24 Thread Toke Eskildsen
On Tue, 2017-05-23 at 17:27 -0700, Nawab Zada Asad Iqbal wrote: > Anyone using solr.6.x for multi-terabytes index size: how did you > decide which version to upgrade to? We are still stuck with 4.10 for our 70TB+ (split in 83 shards) index, due to some custom hacks that have not yet been ported. If

Re: solr 6 at scale

2017-05-23 Thread Erick Erickson
I'll quibble a little with Walter and say that 6.4.2 fixes the perf problem in 6.4.0 and 6.4.1. Which doesn't change his recommendation at all, I'd go with 6.5.1. Best, Erick On Tue, May 23, 2017 at 5:49 PM, Walter Underwood wrote: > We are running 6.5.1 in a 16 node cluster, four shards and fou

Re: solr 6 at scale

2017-05-23 Thread Walter Underwood
We are running 6.5.1 in a 16 node cluster, four shards and four replicas. It is performing brilliantly. Our index is 18 million documents, but we have very heavy queries. Students are searching for homework help, so they paste in the entire problem. We truncate queries at 40 terms to limit the

solr 6 at scale

2017-05-23 Thread Nawab Zada Asad Iqbal
Hi all, I am planning to upgrade my solr.4.x installation to a recent stable version. Should I get the latest 6.5.1 bits or will a little older release be better in terms of stability? I am curious if there is a way to see solr.6.x adoption in large companies. I have talked to a few people and they ar

Re: scale

2016-06-02 Thread John Blythe
sure. the processes we run to do linkage take hours. we're processing ~600k records, bouncing our users data up against a few data sources that act as 'sources of truth' for us for the sake of this linkage. we get the top 3 results and run some quick checks on it algorithmically to determine if we

Re: scale

2016-06-02 Thread Erick Erickson
Without having a lot more data it's hard to say anything helpful. _What_ is slow? What does "data linkage" mean exactly? Etc. Best, Erick On Thu, Jun 2, 2016 at 9:33 AM, John Blythe wrote: > hi all, > > having lots of processing happening using multiple solr cores to do some > data linkage with

scale

2016-06-02 Thread John Blythe
hi all, having lots of processing happening using multiple solr cores to do some data linkage with our customers' transactional data. it runs pretty slowly at the moment. we were wondering if there were some solr or jetty tunings that we could implement to help make it more powerful and efficient.

Re: shell script or script in any language to scale a replica solr node with some configs from zookeeper and the remaining from svn/git

2015-02-03 Thread Rajesh Hazari
we have already started using this toolkit, we have explored it completely, Do we have any sample script in python to get the config file or other files from svn and deploy in tomcat? *Thanks,* *Rajesh**.* On Mon, Feb 2, 2015 at 3:32 PM, Anshum Gupta wrote: > Solr scale toolkit should b

Re: shell script or script in any language to scale a replica solr node with some configs from zookeeper and the remaining from svn/git

2015-02-02 Thread Anshum Gupta
Solr scale toolkit should be a good option for you when it comes to deploying/managing Solr nodes in a cluster. It has a lot of support for stuff like spinning up new nodes, stopping, patching, rolling restart etc. About not knowing python, as is mentioned in the README, you don't really ne

shell script or script in any language to scale a replica solr node with some configs from zookeeper and the remaining from svn/git

2015-02-02 Thread Rajesh Hazari
o me as we do not have any experienced developer with these skills https://github.com/LucidWorks/solr-scale-tk and http://lucidworks.com/blog/introducing-the-solr-scale-toolkit/ *Thanks,* *Rajesh.*

Large scale Update of solr indexed documents

2014-12-17 Thread atawfik
. The other challenge is that Solr has around 5 million documents. The solution needs to be scalable as well. Any ideas or thoughts are very much welcome. Ameer -- View this message in context: http://lucene.472066.n3.nabble.com/Large-scale-Update-of-solr-indexed-documents-tp4174695.html Sent

Re: SolrCloud Scale Struggle

2014-08-11 Thread Shawn Heisey
On 8/10/2014 11:07 PM, anand.mahajan wrote: > Thank you for your suggestions. With the autoCommit (every 10 mins) and > softCommit (every 10 secs) frequencies reduced things work much better now. > The CPU usages has gone down considerably too (by about 60%) and the > read/write throughput is showi

Re: SolrCloud Scale Struggle

2014-08-10 Thread anand.mahajan
yet) Thanks, Anand -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Scale-Struggle-tp4150592p4152239.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud Scale Struggle

2014-08-10 Thread rulinma
should not autoCommit openSearcher too freq. 360 true 1000 100 1 -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Scale-Struggle-tp4150592p4152229.html Sent from the Solr - User mailing list archive

Re: SolrCloud Scale Struggle

2014-08-02 Thread Shawn Heisey
On 8/2/2014 2:46 PM, anand.mahajan wrote: > Also, since there are already 18 JVMs per machine - How do I go about > merging these existing cores under just 1 JVM? Would it be that I'd need to > create 1 Solr instance with 18 cores inside and then migrate data from these > separate JVMs into the new

Re: SolrCloud Scale Struggle

2014-08-02 Thread anand.mahajan
at go the same shard. Will splitting these up with the existing set of hardware help at all? -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Scale-Struggle-tp4150592p4150811.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud Scale Struggle

2014-08-02 Thread anand.mahajan
existing cores under just 1 JVM? Would it be that I'd need to create 1 Solr instance with 18 cores inside and then migrate data from these separate JVMs into the new instance? -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Scale-Struggle-tp4150592p4150810.html

Re: SolrCloud Scale Struggle

2014-08-02 Thread Bill Bell
uld go? >> Is there a pattern / rule that Solr follows when it creates replicas for >> split shards? >> >> 6. I read somewhere that creating a Core would cost the OS one thread and a >> file handle. Since a core represents an index in its entirety would it not be >

Re: SolrCloud Scale Struggle

2014-08-02 Thread Bill Bell
handle. Since a core represents an index in its entirety would it not be > allocated the configured number of write threads? (The default that is 8) > > 7. The Zookeeper cluster is deployed on the same boxes as the Solr instance > - Would separating the ZK cluster out help? > > Sorr

Re: SolrCloud Scale Struggle

2014-08-01 Thread Shawn Heisey
On 8/1/2014 4:19 AM, anand.mahajan wrote: > My current deployment : > i) I'm using Solr 4.8 and have set up a SolrCloud with 6 dedicated machines > - 24 Core + 96 GB RAM each. > ii)There are over 190M docs in the SolrCloud at the moment (for all > replicas its consuming overall disk 2340GB which

Re: SolrCloud Scale Struggle

2014-08-01 Thread anand.mahajan
Node cluster? (Sorry if I'm deviating here a bit from the core problem I'm trying to fix - but if DSE could work with a very minimal time and effort requirement - I won't mind trying it out.) -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Scale-Struggle-t

Re: SolrCloud Scale Struggle

2014-08-01 Thread Shalin Shekhar Mangar
> > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/SolrCloud-Scale-Struggle-tp4150592p4150615.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Regards, Shalin Shekhar Mangar.

RE: SolrCloud Scale Struggle

2014-08-01 Thread Doug Turnbull
y Windows Phone From: anand.mahajan Sent: 8/1/2014 9:40 AM To: solr-user@lucene.apache.org Subject: Re: SolrCloud Scale Struggle Oops - my bad - It's autoSoftCommit that is set after every doc and not an autoCommit. Following snippet from the solrconfig - 1 true

Re: SolrCloud Scale Struggle

2014-08-01 Thread anand.mahajan
? -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Scale-Struggle-tp4150592p4150615.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud Scale Struggle

2014-08-01 Thread Shalin Shekhar Mangar
ecommended practice if only because a slow ZK can cause shards to go into recovery and leader failure. I doubt it will make things faster in your case. However, if you can, you should move ZK instances to separate machines. > > Sorry for the long thread _ I thought of asking these all at once rather > than posting separate ones. > > Thanks, > Anand > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/SolrCloud-Scale-Struggle-tp4150592.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Regards, Shalin Shekhar Mangar.

SolrCloud Scale Struggle

2014-08-01 Thread anand.mahajan
cene.472066.n3.nabble.com/SolrCloud-Scale-Struggle-tp4150592.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Scale Toolkit Access Denied Error

2014-06-09 Thread Mark Gershman
ami-1e6b9d76) at AWS is > not > >> : accessible by my AWS credentials. Is this an AMI permissioning issue > or > >> is > >> : it a problem with my particular account or how it is configured at > AWS. > >> I > >> : did not experience this specif

Re: Solr Scale Toolkit Access Denied Error

2014-06-07 Thread Timothy Potter
) at AWS is not >> : accessible by my AWS credentials. Is this an AMI permissioning issue or >> is >> : it a problem with my particular account or how it is configured at AWS. >> I >> : did not experience this specific problem when working with the previous >>

Re: Solr Scale Toolkit Access Denied Error

2014-06-06 Thread Mark Gershman
ible by my AWS credentials. Is this an AMI permissioning issue or > is > : it a problem with my particular account or how it is configured at AWS. > I > : did not experience this specific problem when working with the previous > : iteration of the Solr Scale Toolkit back toward

Re: Solr Scale Toolkit Access Denied Error

2014-06-06 Thread Chris Hostetter
previous : iteration of the Solr Scale Toolkit back toward the latter part of May. It : appears that the AMI was updated from ami-96779efe to ami-1e6b9d76 with the : newest version of the toolkit. I'm not much of an AWS expert, but i seem to recall that if you don't have your AWS secu

Solr Scale Toolkit Access Denied Error

2014-06-06 Thread Mark Gershman
I've been attempting to experiment with the recently updated Solr Scale Tool Kit mentioned here: http://searchhub.org/2014/06/03/introducing-the-solr-scale-toolkit/ After making the very well documented configuration changes at AWS and installing Python, I was able to use the toolkit to co

Re: TB scale

2014-04-26 Thread Walter Underwood
I think Hathi Trust has a few terabytes of index. They do full-text search on 10 million books. http://www.hathitrust.org/blogs/Large-scale-Search wunder On Apr 26, 2014, at 8:36 AM, Toke Eskildsen wrote: >> Anyone with experience, suggestions or lessons learned in the 10 -100 TB >

RE: TB scale

2014-04-26 Thread Toke Eskildsen
> Anyone with experience, suggestions or lessons learned in the 10 -100 TB > scale they'd like to share? > Researching optimum design for a Solr Cloud with, say, about 20TB index. We're building a web archive with a projected index size of 20TB (distributed in 20 shards). So

Re: TB scale

2014-04-25 Thread Shawn Heisey
On 4/25/2014 1:48 PM, Ed Smiley wrote: > Anyone with experience, suggestions or lessons learned in the 10 -100 TB > scale they'd like to share? > Researching optimum design for a Solr Cloud with, say, about 20TB index. You've gotten some good information already in the rep

Re: TB scale

2014-04-25 Thread Yonik Seeley
How many documents? That can be just as important (often more important) than total index size. Some other details, like the types of requests, would be helpful (i.e. what the index will be used for... the latency requirements of requests, if you will be faceting, etc). -Yonik http://heliosearch.

Re: TB scale

2014-04-25 Thread Ed Smiley
need to provide >a lot more detail to get useful help. > >Otis >-- >Performance Monitoring * Log Analytics * Search Analytics >Solr & Elasticsearch Support * http://sematext.com/ > > >On Fri, Apr 25, 2014 at 3:48 PM, Ed Smiley wrote: > >> Anyone with experien

Re: TB scale

2014-04-25 Thread Otis Gospodnetic
ey wrote: > Anyone with experience, suggestions or lessons learned in the 10 -100 TB > scale they'd like to share? > Researching optimum design for a Solr Cloud with, say, about 20TB index. > - > Thanks > > Ed Smiley, Senior Software Architect, Ebooks > ProQuest | 161 Eve

Re: TB scale

2014-04-25 Thread Jack Krupansky
? -- Jack Krupansky -Original Message- From: Ed Smiley Sent: Friday, April 25, 2014 3:48 PM To: solr-user@lucene.apache.org Subject: TB scale Anyone with experience, suggestions or lessons learned in the 10 -100 TB scale they'd like to share? Researching optimum design for a Solr

TB scale

2014-04-25 Thread Ed Smiley
Anyone with experience, suggestions or lessons learned in the 10 -100 TB scale they'd like to share? Researching optimum design for a Solr Cloud with, say, about 20TB index. - Thanks Ed Smiley, Senior Software Architect, Ebooks ProQuest | 161 Evelyn Ave. | Mountain View, CA 94041 USA | +

Re: SolrException Error when indexing new documents at scale in SolrCloud -

2014-01-16 Thread Erick Erickson
ect it to simply crash like it's doing. > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/SolrException-Error-when-indexing-new-documents-at-scale-in-SolrCloud-tp4111551p4111680.html > Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrException Error when indexing new documents at scale in SolrCloud -

2014-01-16 Thread cwhi
w after some threshold where things don't fit in the cache, but I'd never expect it to simply crash like it's doing. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrException-Error-when-indexing-new-documents-at-scale-in-SolrCloud-tp4111551p4111680.html Sent f

Re: SolrException Error when indexing new documents at scale in SolrCloud -

2014-01-15 Thread Shawn Heisey
On 1/15/2014 3:10 PM, cwhi wrote: Thanks for the quick reply. I did notice the exception you pointed out and had some thoughts about it maybe being the client library I'm using to connect to Solr (C# SolrNet) disconnecting too early, but that doesn't explain it eventually running out of memory a

Re: SolrException Error when indexing new documents at scale in SolrCloud -

2014-01-15 Thread cwhi
http://lucene.472066.n3.nabble.com/SolrException-Error-when-indexing-new-documents-at-scale-in-SolrCloud-tp4111551p4111561.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrException Error when indexing new documents at scale in SolrCloud -

2014-01-15 Thread Shawn Heisey
On 1/15/2014 2:43 PM, cwhi wrote: I have a SolrCloud installation with about 2 million documents indexed in it. It's been buzzing along without issue for the past 8 days, but today started throwing errors on document adds that eventually resulted in out of memory exceptions. There is nothing fun

SolrException Error when indexing new documents at scale in SolrCloud -

2014-01-15 Thread cwhi
sas0tl/solr.log Also, a short snippet of the first exception is available on pastebin at http://pastebin.com/pWZrkGEr Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/SolrException-Error-when-indexing-new-documents-at-scale-in-SolrCloud-tp4111551.html Sent from the Solr

Re: How to programatically unload a shard from a single server to horizontally scale on SolrCloud

2013-12-06 Thread cwhit
p://lucene.472066.n3.nabble.com/How-to-programatically-unload-a-shard-from-a-single-server-to-horizontally-scale-on-SolrCloud-tp4105343p4105345.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to programatically unload a shard from a single server to horizontally scale on SolrCloud

2013-12-06 Thread michael.boom
e or you can do it in your script. http://wiki.apache.org/solr/CoreAdmin#UNLOAD - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-programatically-unload-a-shard-from-a-single-server-to-horizontally-scale-on-SolrCloud-tp4105343p4105344.html Sent from the Solr - User mailing list archive at Nabble.com.

How to programatically unload a shard from a single server to horizontally scale on SolrCloud

2013-12-06 Thread cwhit
bshard from the new machine. At this point, the data should be more evenly distributed, which will help us continue to scale. This also seems like an easily scriptable process, which is what I'm trying to do. My question is simple. I can call collections?action=SPLITSHARD to split the sha

Re: SolrCloud. Scale-test by duplicating same index to the shards and make it behave each index is different (uniqueId).

2013-10-03 Thread Otis Gospodnetic
Hi, I don't know. But, unless something outside Solr is a bottleneck, it may be wise to see if you can speed up indexing. Maybe we can help here... Otis Solr & ElasticSearch Support http://sematext.com/ On Oct 1, 2013 9:29 AM, "Thomas Egense" wrote: > Hello everyone, > I have a small challenge

SolrCloud. Scale-test by duplicating same index to the shards and make it behave each index is different (uniqueId).

2013-10-01 Thread Thomas Egense
Hello everyone, I have a small challenge performance testing a SolrCloud setup. I have 10 shards, and each shard is supposed to have index-size ~200GB. However I only have a single index of 200GB because it will take too long to build another index with different data, and I hope to somehow use th

Re: Search statistics in category scale

2013-09-26 Thread Otis Gospodnetic
Does Solr provide a way to collect such data and somehow receive it? > > > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Search-statistics-in-category-scale-tp4091734.html > Sent from the Solr - User mailing list archive at Nabble.com.

Search statistics in category scale

2013-09-24 Thread Marina
ry and number of items found. Does Solr provide a way to collect such data and somehow receive it? -- View this message in context: http://lucene.472066.n3.nabble.com/Search-statistics-in-category-scale-tp4091734.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0

2013-05-22 Thread Shawn Heisey
On 5/22/2013 11:25 AM, Justin Babuscio wrote: On your overflow theory, why would this impact the client? Is it possible that a write attempt to Solr would block indefinitely while the Solr server is running wild or in a bad state due to the overflow? That's the general notion. I could be comp

Re: Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0

2013-05-22 Thread Justin Babuscio
u, and tells you that everything has > succeeded even when it doesn't. > > The one advantage that SUSS/CUSS has over its Http sibling is that it is > multi-threaded, so it can send updates concurrently. You seem to know > enough about how it works, so I'll just say tha

Re: Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0

2013-05-22 Thread Shawn Heisey
concurrently. You seem to know enough about how it works, so I'll just say that you don't need additional complexity that is not under your control and refuses to throw exceptions when an error occurs. You already have a large-scale concurrent and multi-threaded indexing setup, so

Large-scale Solr publish - hanging at blockUntilFinished indefinitely - stuck on SocketInputStream.socketRead0

2013-05-22 Thread Justin Babuscio
*Problem:* We periodically rebuild our Solr index from scratch. We have built a custom publisher that horizontally scales to increase write throughput. On a given rebuild, we will have ~60 JVMs running with 5 threads that are actively publishing to all Solr masters. For each thread, we instanti

Re: scale with 3 shards on single server

2012-11-08 Thread Erick Erickson
Yep, that's the usual process for growth planning. Best Erick On Wed, Nov 7, 2012 at 4:01 AM, SuoNayi wrote: > Hi all, >Because we cannot add or remove shard when solrcloud cluster has been > set up, > so we have to predict a precise shard size at first, says we need 3 shards. > but now we

Re: [External] Re: how to scale from 1 server to 3 servers with 3 shards

2012-11-08 Thread Greene, Daniel [USA]
ard on the "master", and the new server would be promoted automatically to the shard leader. Sent from my Verizon Wireless 4GLTE smartphone - Reply message - From: "Jeff Rhines" To: "solr-user@lucene.apache.org" Subject: [External] Re: how to scale from 1 ser

Re: how to scale from 1 server to 3 servers with 3 shards

2012-11-08 Thread Jeff Rhines
It's my understanding that your strategy is correct, although I expect that zookeeper would need to be updated somehow with the new second and third shards, no? On Nov 8, 2012, at 2:36 AM, SuoNayi wrote: > Hi all, > Because it's unable to add or remove shard after solrcloud cluster is > init

scale with 3 shards on single server

2012-11-07 Thread SuoNayi
Hi all, Because we cannot add or remove shards once a SolrCloud cluster has been set up, we have to predict a precise shard size at first, say 3 shards. But now we do not have enough servers to set up the cluster with one shard per server. In this situation, I have to set up the cluster

Re: How can a distributed Solr setup scale to TB-data, if URL limitations are 4000 for distributed shard search?

2012-01-19 Thread Otis Gospodnetic
Hi Daniel, - Original Message - > From: Daniel Bruegge > To: solr-user@lucene.apache.org; Otis Gospodnetic > Cc: > Sent: Thursday, January 19, 2012 5:49 AM > Subject: Re: How can a distributed Solr setup scale to TB-data, if URL > limitations are 4000 for distri

Re: How can a distributed Solr setup scale to TB-data, if URL limitations are 4000 for distributed shard search?

2012-01-19 Thread Daniel Bruegge
On Thu, Jan 19, 2012 at 4:51 AM, Otis Gospodnetic < otis_gospodne...@yahoo.com> wrote: > > Huge is relative. ;) > Huge Solr clusters also often have huge hardware. Servers with 16 cores > and 32 GB RAM are becoming very common, for example. > Another thing to keep in mind is that while lots of orga

Re: How can a distributed Solr setup scale to TB-data, if URL limitations are 4000 for distributed shard search?

2012-01-18 Thread Otis Gospodnetic
Hi Daniel, > > From: Daniel Bruegge >Subject: Re: How can a distributed Solr setup scale to TB-data, if URL >limitations are 4000 for distributed shard search? > >But you can read so often about huge solr clusters and I am wondering how >th

Re: How can a distributed Solr setup scale to TB-data, if URL limitations are 4000 for distributed shard search?

2012-01-18 Thread Darren Govoni
Try changing the URI/HTTP/GET size limitation on your app server. On 01/18/2012 05:59 PM, Daniel Bruegge wrote: Hi, I am just wondering how I can 'grow' a distributed Solr setup to an index size of a couple of terabytes, when one of the distributed Solr limitations is max. 4000 characters in UR

Re: How can a distributed Solr setup scale to TB-data, if URL limitations are 4000 for distributed shard search?

2012-01-18 Thread Daniel Bruegge
But you can read so often about huge solr clusters and I am wondering how they do this. Because I also read often, that the Index size of one shard should fit into RAM. Or at least the heap size should be as big as the index size. So I see a lot of limitations hardware-wise. Or am I on the totally

Re: How can a distributed Solr setup scale to TB-data, if URL limitations are 4000 for distributed shard search?

2012-01-18 Thread Mark Miller
You can raise the limit to a point. On Jan 18, 2012, at 5:59 PM, Daniel Bruegge wrote: > Hi, > > I am just wondering how I can 'grow' a distributed Solr setup to an index > size of a couple of terabytes, when one of the distributed Solr limitations > is max. 4000 characters in URI limitation. Se

How can a distributed Solr setup scale to TB-data, if URL limitations are 4000 for distributed shard search?

2012-01-18 Thread Daniel Bruegge
Hi, I am just wondering how I can 'grow' a distributed Solr setup to an index size of a couple of terabytes, when one of the distributed Solr limitations is max. 4000 characters in URI limitation. See: *The number of shards is limited by number of characters allowed for GET > method's URI; most W
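
The limit raised in this question applies to the length of a GET request URI. One commonly used way around it, beyond raising the container limit as the replies suggest, is to send the same parameters in a POST body. The Python sketch below only builds such a request and does not send it. The host names, the core name `core0`, and the helper `build_distributed_query` are hypothetical, and the sketch assumes the target server accepts form-encoded query parameters over POST.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_distributed_query(base_url, query, shards):
    """Build a POST request carrying the shards list in the body
    instead of the URI, so URI length limits do not apply to it."""
    params = {"q": query, "shards": ",".join(shards), "wt": "json"}
    body = urlencode(params).encode("utf-8")
    # Supplying a data payload makes urllib issue a POST rather than a GET.
    return Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical shard addresses, for illustration only.
shards = [f"solr{i}.example.com:8983/solr/core0" for i in range(1, 201)]
req = build_distributed_query(
    "http://solr1.example.com:8983/solr/core0/select", "*:*", shards
)
print(req.get_method(), len(req.data))
```

The request could then be passed to `urllib.request.urlopen`; the body here is well over 4000 bytes while the URI itself stays short.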

AW: large scale indexing issues / single threaded bottleneck

2011-11-03 Thread sebastian.reese
Sent: Thursday, 3 November 2011 14:00 To: 'solr-user@lucene.apache.org' Subject: RE: large scale indexing issues / single threaded bottleneck Shishir, we have 35 million "documents", and should be doing about 5000-1 new "documents" a day, but with very small "

RE: large scale indexing issues / single threaded bottleneck

2011-11-03 Thread Jaeger, Jay - DOT
November 01, 2011 10:58 PM To: solr-user@lucene.apache.org Subject: RE: large scale indexing issues / single threaded bottleneck Roman, How frequently do you update your index? I have a need to do real time add/delete to SOLR documents at a rate of approximately 20/min. The total number of documents

RE: large scale indexing issues / single threaded bottleneck

2011-11-01 Thread Roman Alekseenkov
> The total number of documents are in the range of 4 million. Will there > be any performance issues? > > Thanks, > Shishir > -- View this message in context: http://lucene.472066.n3.nabble.com/large-scale-indexing-issues-single-threaded-bottleneck-tp3461815p3472901.html Sen

RE: large scale indexing issues / single threaded bottleneck

2011-11-01 Thread Awasthi, Shishir
Alekseenkov [mailto:ralekseen...@gmail.com] Sent: Sunday, October 30, 2011 6:11 PM To: solr-user@lucene.apache.org Subject: Re: large scale indexing issues / single threaded bottleneck Guys, thank you for all the replies. I think I have figured out a partial solution for the problem on Friday

Re: large scale indexing issues / single threaded bottleneck

2011-10-31 Thread Kiril Menshikov
Yonik, Adding overwrite=false doesn't help. XMLLoader doesn't check this HTTP parameter. Instead it checks the attribute of the same name in the XML tag. -Kiril -- View this message in context: http://lucene.472066.n3.nabble.com/large-scale-indexing-issues-single-threaded-bottleneck-tp34618
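To make the point about the XML attribute concrete, here is a minimal sketch that builds a Solr XML update message with `overwrite="false"` set on the `<add>` element rather than passed as an HTTP parameter. The function name `build_add_message` and the sample fields are hypothetical. The message is only constructed, not posted to a server.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs, overwrite=False):
    """Serialize documents as an <add> update message carrying the overwrite attribute."""
    add = ET.Element("add", {"overwrite": "true" if overwrite else "false"})
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical document; only safe when no document with the same ID already exists.
message = build_add_message([{"id": "doc-1", "title": "hello"}])
print(message)
```

Skipping the overwrite check is only appropriate when the IDs are known to be new, as noted elsewhere in the thread.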

Re: large scale indexing issues / single threaded bottleneck

2011-10-30 Thread Roman Alekseenkov
"overwrite=false" didn't help, but the hack did. Once again, thank you for the answers and recommendations Roman -- View this message in context: http://lucene.472066.n3.nabble.com/large-scale-indexing-issues-single-threaded-bottleneck-tp3461815p3466523.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: large scale indexing issues / single threaded bottleneck

2011-10-29 Thread Nagendra Nagarajayya
lr-ra.tgels.org http://rankingalgorithm.tgels.org On 10/28/2011 11:38 AM, Roman Alekseenkov wrote: Hi everyone, I'm looking for some help with Solr indexing issues on a large scale. We are indexing few terabytes/month on a sizeable Solr cluster (8 masters / serving writes, 16 slaves / serving reads)

Re: large scale indexing issues / single threaded bottleneck

2011-10-29 Thread Yonik Seeley
On Sat, Oct 29, 2011 at 6:35 AM, Michael McCandless wrote: > I saw a mention somewhere that you can tell Solr to use > IW.addDocument (not IW.updateDocument) when you add a document if you > are certain it's not replacing a previous document with the same ID Right - adding overwrite=false to

Re: large scale indexing issues / single threaded bottleneck

2011-10-29 Thread Michael McCandless
On Fri, Oct 28, 2011 at 3:27 PM, Simon Willnauer wrote: > one more thing, after somebody (thanks robert) pointed me at the > stacktrace it seems kind of obvious what the root cause of your > problem is. Its solr :) Solr closes the IndexWriter on commit which is > very wasteful since you basically

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Jason Rutherglen
> abstract away the encoding of the index Robert, this is what you wrote. "Abstract away the encoding of the index" means pluggable, otherwise it's not abstract and / or it's a flawed design. Sounds like it's the latter.

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Robert Muir
On Fri, Oct 28, 2011 at 8:10 PM, Jason Rutherglen wrote: >> Otherwise we have "flexible indexing" where "flexible" means "slower >> if you do anything but the default". > > The other encodings should exist as modules since they are pluggable. > 4.0 can ship with the existing codec.  4.1 with addit

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Jason Rutherglen
> Otherwise we have "flexible indexing" where "flexible" means "slower > if you do anything but the default". The other encodings should exist as modules since they are pluggable. 4.0 can ship with the existing codec. 4.1 with additional codecs and the bulk postings at a later time. Otherwise it

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Robert Muir
On Fri, Oct 28, 2011 at 5:03 PM, Jason Rutherglen wrote: > +1 I suggested it should be backported a while back.  Or that Lucene > 4.x should be released.  I'm not sure what is holding up Lucene 4.x at > this point, bulk postings is only needed useful for PFOR. This is not true, most modern index

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Jason Rutherglen
illnauer wrote: > On Fri, Oct 28, 2011 at 9:17 PM, Simon Willnauer > wrote: >> Hey Roman, >> >> On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov >> wrote: >>> Hi everyone, >>> >>> I'm looking for some help with Solr indexing issues on a

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Simon Willnauer
On Fri, Oct 28, 2011 at 9:17 PM, Simon Willnauer wrote: > Hey Roman, > > On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov > wrote: >> Hi everyone, >> >> I'm looking for some help with Solr indexing issues on a large scale. >> >> We are indexing fe

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Simon Willnauer
Hey Roman, On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov wrote: > Hi everyone, > > I'm looking for some help with Solr indexing issues on a large scale. > > We are indexing few terabytes/month on a sizeable Solr cluster (8 > masters / serving writes, 16 slaves

Re: large scale indexing issues / single threaded bottleneck

2011-10-28 Thread Roman Alekseenkov
ndexing issues on a large scale. > > We are indexing few terabytes/month on a sizeable Solr cluster (8 > masters / serving writes, 16 slaves / serving reads). After certain > amount of tuning we got to the point where a single Solr instance can > handle index size of 100GB without
