Hi,
I'm playing around with the autoscaling feature of Solr 7.7.1 and have the
following scenario to solve:
- One collection with two shards
- I want to activate autoscaling to achieve the following behavior:
* Every time a new node comes up, it should get a new replica automatically
through t
product(termfreq(taxonomy,accessories),-5.0)
product(termfreq(taxonomy,women),2.0)
product(termfreq(taxonomy,most_popular),5.0)
[...]
This leads to the main function, which sums the results of all the
previous functions and then scales them to a value between 1 and 100:
def(scale(sum(1.0,product(...),prod
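A complete example of that shape (illustrative weights and terms only, not the poster's exact function) would be:

    scale(sum(1.0,
          product(termfreq(taxonomy,accessories),-5.0),
          product(termfreq(taxonomy,women),2.0),
          product(termfreq(taxonomy,most_popular),5.0)),
        1, 100)

Here scale(x,min,max) maps the summed value linearly onto the 1..100 range.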
oes exactly happen? Can you please give a bit more detail about
this?
> Our key question: Scale up (fewer instances to manage) or Scale out (more
> instances to manage) and
> do we switch to compute optimized instances (the answer given our usage I
> assume is probably)
>
>
Is the load
will affect what percentage is needed.
What precisely are you looking at to determine that the machine is
CPU-bound? Some of the things that people assume are evidence of CPU
problems are actually evidence of I/O problems caused by not having
enough memory.
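One generic way (my suggestion, not from the original reply) to tell the two apart on Linux is to watch user CPU against I/O wait while the load is running:

    vmstat 5
    # high "us" with low "wa": the CPU itself is the bottleneck
    # high "wa": threads are stalled on disk, often a sign that too little RAM
    #            is left for the OS page cache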
Our key question: Scale up (fewer instances
> Our EBS traffic bandwidth seems to work great so searches on disk are
>>pretty fast.
>> Now though we seem CPU bound and if/when Solr CPU gets pegged for too
>>long replication falls behind and then starts to recover which causes
>>more usage and then eventually shards go “Do
are pretty
> fast.
> Now though we seem CPU bound and if/when Solr CPU gets pegged for too long
> replication falls behind and then starts to recover which causes more usage
> and then eventually shards go “Down”.
>
> Our key question: Scale up (fewer instances to manage) or Scale
then starts to recover which causes more usage and
then eventually shards go “Down”.
Our key question: Scale up (fewer instances to manage) or Scale out (more
instances to manage) and
do we switch to compute optimized instances (the answer given our usage I
assume is probably)
Appreciate
Hi, I'm using the scale function and facing an issue: when my result set
contains only one result, or multiple results with the same value, scale
returns data at the min level instead of the high value. Any idea how I can
achieve the high value when only one result is present or multiple
re
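For illustration (my own reading, not from the thread): when the values being scaled are all identical, the observed minimum equals the observed maximum, so scale(x,1,100) has no range to spread over and returns the configured minimum (1) for every document - which is the behaviour described above. The same happens with a single result, since its value is both the minimum and the maximum.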
On Thu, 2017-05-25 at 15:56 -0700, Nawab Zada Asad Iqbal wrote:
> I have a 31-machine cluster with 3 shards on each (93 shards). Each
> machine has ~250GB RAM and a 3TB SSD for the search index (there is another
> drive for the OS and stuff). One Solr process runs for each shard with a
> 48GB heap. So we have 3 l
Hi Toke,
I don't have any blog, but here is a high level idea:
I have a 31-machine cluster with 3 shards on each (93 shards). Each machine
has ~250GB RAM and a 3TB SSD for the search index (there is another drive for
the OS and stuff). One Solr process runs for each shard with a 48GB heap. So
we have 3 large fi
>>> It is relatively easy to downgrade to an earlier release within the
>>> same major version. We have not switched to 6.5.1 simply because we
>>> have no pressing need for it - Solr 6.3 works well for us.
>
>> That strikes me as a little bit dangerous, unless your indexes are very
>> static. Th
Nawab Zada Asad Iqbal wrote:
> @Toke, I stumbled upon your page last week but it seems that your huge
> index doesn't receive a lot of query traffic.
It switches between two kinds of usage:
Everyday use is very low traffic by researchers using it interactively: 1-2
simultaneous queries, with fa
n 30 machines.
>
>
> I look forward to hearing more scale stories.
> Nawab
>
> On Wed, May 24, 2017 at 7:58 AM, Toke Eskildsen wrote:
>
>> Shawn Heisey wrote:
>>> On 5/24/2017 3:44 AM, Toke Eskildsen wrote:
>>>> It is relatively easy to downgrade t
ds on 30 machines.
I look forward to hearing more scale stories.
Nawab
On Wed, May 24, 2017 at 7:58 AM, Toke Eskildsen wrote:
> Shawn Heisey wrote:
> > On 5/24/2017 3:44 AM, Toke Eskildsen wrote:
> >> It is relatively easy to downgrade to an earlier release within the
> >
Shawn Heisey wrote:
> On 5/24/2017 3:44 AM, Toke Eskildsen wrote:
>> It is relatively easy to downgrade to an earlier release within the
>> same major version. We have not switched to 6.5.1 simply because we
>> have no pressing need for it - Solr 6.3 works well for us.
> That strikes me as a litt
On 5/24/2017 3:44 AM, Toke Eskildsen wrote:
> It is relatively easy to downgrade to an earlier release within the
> same major version. We have not switched to 6.5.1 simply because we
> have no pressing need for it - Solr 6.3 works well for us.
That strikes me as a little bit dangerous, unless yo
On Tue, 2017-05-23 at 17:27 -0700, Nawab Zada Asad Iqbal wrote:
> Anyone using solr.6.x for multi-terabytes index size: how did you
> decide which version to upgrade to?
We are still stuck with 4.10 for our 70TB+ (split in 83 shards) index,
due to some custom hacks that have not yet been ported. If
I'll quibble a little with Walter and say that 6.4.2 fixes the perf
problem in 6.4.0 and 6.4.1, which doesn't change his recommendation at
all; I'd go with 6.5.1.
Best,
Erick
On Tue, May 23, 2017 at 5:49 PM, Walter Underwood wrote:
> We are running 6.5.1 in a 16 node cluster, four shards and fou
We are running 6.5.1 in a 16 node cluster, four shards and four replicas. It is
performing brilliantly.
Our index is 18 million documents, but we have very heavy queries. Students are
searching for homework help, so they paste in the entire problem. We truncate
queries at 40 terms to limit the
Hi all,
I am planning to upgrade my Solr 4.x installation to a recent stable
version. Should I get the latest 6.5.1 bits, or will a slightly older release
be better in terms of stability?
I am curious if there is a way to see Solr 6.x adoption in large companies. I
have talked to a few people and they ar
sure.
the processes we run to do linkage take hours. we're processing ~600k
records, bouncing our users' data up against a few data sources that act as
'sources of truth' for us for the sake of this linkage. we get the top 3
results and run some quick checks on them algorithmically to determine if we
Without having a lot more data it's hard to say anything helpful.
_What_ is slow? What does "data linkage" mean exactly? Etc.
Best,
Erick
On Thu, Jun 2, 2016 at 9:33 AM, John Blythe wrote:
> hi all,
>
> having lots of processing happening using multiple solr cores to do some
> data linkage with
hi all,
having lots of processing happening using multiple solr cores to do some
data linkage with our customers' transactional data. it runs pretty slowly
at the moment. we were wondering if there were some solr or jetty tunings
that we could implement to help make it more powerful and efficient.
we have already started using this toolkit and have explored it completely.
Do we have any sample script in Python to get the config file or other
files from SVN and deploy them in Tomcat?
*Thanks,*
*Rajesh**.*
On Mon, Feb 2, 2015 at 3:32 PM, Anshum Gupta wrote:
> Solr scale toolkit should b
Solr scale toolkit should be a good option for you when it comes to
deploying/managing Solr nodes in a cluster.
It has a lot of support for stuff like spinning up new nodes, stopping,
patching, rolling restart etc.
About not knowing python, as is mentioned in the README, you don't really
ne
o me as we
do not have any experienced developer with these skills
https://github.com/LucidWorks/solr-scale-tk
and
http://lucidworks.com/blog/introducing-the-solr-scale-toolkit/
*Thanks,*
*Rajesh.*
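As an aside on the SVN question above, a tiny hypothetical Python sketch (the URL and paths are placeholders, and this is not part of the toolkit) of fetching a config file from SVN and dropping it into a Tomcat-hosted Solr home:

    import shutil
    import subprocess

    SVN_URL = "https://svn.example.com/repo/solr/conf/solrconfig.xml"  # placeholder
    DEST = "/opt/solr/home/collection1/conf/solrconfig.xml"            # placeholder solr.home path

    # Export the file straight from the repository (no working copy needed),
    # then copy it into the live config directory.
    subprocess.check_call(["svn", "export", "--force", SVN_URL, "/tmp/solrconfig.xml"])
    shutil.copy("/tmp/solrconfig.xml", DEST)

A core reload or Tomcat restart would still be needed for Solr to pick up the new file.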
.
The other challenge is that Solr has around 5 million documents. The
solution needs to be scalable as well.
Any ideas or thoughts are very much welcome.
Ameer
On 8/10/2014 11:07 PM, anand.mahajan wrote:
> Thank you for your suggestions. With the autoCommit (every 10 mins) and
> softCommit (every 10 secs) frequencies reduced things work much better now.
> The CPU usage has gone down considerably too (by about 60%) and the
> read/write throughput is showi
yet)
Thanks,
Anand
should not autoCommit with openSearcher enabled too frequently.
360
true
1000
100
1
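For context, the kind of solrconfig.xml commit settings being discussed usually looks like this (illustrative values matching the "hard commit every 10 minutes, soft commit every 10 seconds" setup mentioned earlier in the thread, not the poster's exact snippet):

    <autoCommit>
      <maxTime>600000</maxTime>          <!-- hard commit every 10 minutes -->
      <openSearcher>false</openSearcher> <!-- don't open a new searcher on every hard commit -->
    </autoCommit>
    <autoSoftCommit>
      <maxTime>10000</maxTime>           <!-- soft commit every 10 seconds for visibility -->
    </autoSoftCommit>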
On 8/2/2014 2:46 PM, anand.mahajan wrote:
> Also, since there are already 18 JVMs per machine - How do I go about
> merging these existing cores under just 1 JVM? Would it be that I'd need to
> create 1 Solr instance with 18 cores inside and then migrate data from these
> separate JVMs into the new
at
go to the same shard. Will splitting these up with the existing set of hardware
help at all?
existing cores under just 1 JVM? Would it be that I'd need to
create 1 Solr instance with 18 cores inside and then migrate data from these
separate JVMs into the new instance?
uld go?
>> Is there a pattern / rule that Solr follows when it creates replicas for
>> split shards?
>>
>> 6. I read somewhere that creating a Core would cost the OS one thread and a
>> file handle. Since a core represents an index in its entirety, would it not be
>&
handle. Since a core represents an index in its entirety, would it not be
> allocated the configured number of write threads? (The default is 8.)
>
> 7. The Zookeeper cluster is deployed on the same boxes as the Solr instance
> - Would separating the ZK cluster out help?
>
> Sorr
On 8/1/2014 4:19 AM, anand.mahajan wrote:
> My current deployment :
> i) I'm using Solr 4.8 and have set up a SolrCloud with 6 dedicated machines
> - 24 Core + 96 GB RAM each.
> ii)There are over 190M docs in the SolrCloud at the moment (for all
> replicas it's consuming 2340GB of disk overall, which
Node cluster? (Sorry if I'm deviating
here a bit from the core problem I'm trying to fix - but if DSE could work
with a very minimal time and effort requirement - I won't mind trying it
out.)
--
Regards,
Shalin Shekhar Mangar.
From: anand.mahajan
Sent: 8/1/2014 9:40 AM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud Scale Struggle
Oops - my bad - it's autoSoftCommit that is set after every doc and not an
autoCommit.
Following snippet from the solrconfig -
1
true
?
ecommended practice if only because a slow ZK can cause shards to go
into recovery and leader failure. I doubt it will make things faster in
your case. However, if you can, you should move ZK instances to separate
machines.
>
> Sorry for the long thread - I thought of asking these all at once rather
> than posting separate ones.
>
> Thanks,
> Anand
--
Regards,
Shalin Shekhar Mangar.
ami-1e6b9d76) at AWS is
> not
> >> : accessible by my AWS credentials. Is this an AMI permissioning issue
> or
> >> is
> >> : it a problem with my particular account or how it is configured at
> AWS.
> >> I
> >> : did not experience this specif
) at AWS is not
>> : accessible by my AWS credentials. Is this an AMI permissioning issue or
>> is
>> : it a problem with my particular account or how it is configured at AWS.
>> I
>> : did not experience this specific problem when working with the previous
>>
ible by my AWS credentials. Is this an AMI permissioning issue or
> is
> : it a problem with my particular account or how it is configured at AWS.
> I
> : did not experience this specific problem when working with the previous
> : iteration of the Solr Scale Toolkit back toward
previous
: iteration of the Solr Scale Toolkit back toward the latter part of May. It
: appears that the AMI was updated from ami-96779efe to ami-1e6b9d76 with the
: newest version of the toolkit.
I'm not much of an AWS expert, but i seem to recall that if you don't
have your AWS secu
I've been attempting to experiment with the recently updated Solr Scale
Tool Kit mentioned here:
http://searchhub.org/2014/06/03/introducing-the-solr-scale-toolkit/
After making the very well documented configuration changes at AWS and
installing Python, I was able to use the toolkit to co
I think Hathi Trust has a few terabytes of index. They do full-text search on
10 million books.
http://www.hathitrust.org/blogs/Large-scale-Search
wunder
On Apr 26, 2014, at 8:36 AM, Toke Eskildsen wrote:
>> Anyone with experience, suggestions or lessons learned in the 10 -100 TB
>> scale they'd like to share?
> Anyone with experience, suggestions or lessons learned in the 10 -100 TB
> scale they'd like to share?
> Researching optimum design for a Solr Cloud with, say, about 20TB index.
We're building a web archive with a projected index size of 20TB (distributed
in 20 shards). So
On 4/25/2014 1:48 PM, Ed Smiley wrote:
> Anyone with experience, suggestions or lessons learned in the 10 -100 TB
> scale they'd like to share?
> Researching optimum design for a Solr Cloud with, say, about 20TB index.
You've gotten some good information already in the rep
How many documents? That can be just as important (often more
important) than total index size.
Some other details, like the types of requests, would be helpful (i.e.
what the index will be used for... the latency requirements of
requests, if you will be faceting, etc).
-Yonik
http://heliosearch.
need to provide
>a lot more detail to get useful help.
>
>Otis
>--
>Performance Monitoring * Log Analytics * Search Analytics
>Solr & Elasticsearch Support * http://sematext.com/
>
>
>On Fri, Apr 25, 2014 at 3:48 PM, Ed Smiley wrote:
>
>> Anyone with experien
ey wrote:
> Anyone with experience, suggestions or lessons learned in the 10 -100 TB
> scale they'd like to share?
> Researching optimum design for a Solr Cloud with, say, about 20TB index.
> -
> Thanks
>
> Ed Smiley, Senior Software Architect, Ebooks
> ProQuest | 161 Eve
?
-- Jack Krupansky
-Original Message-
From: Ed Smiley
Sent: Friday, April 25, 2014 3:48 PM
To: solr-user@lucene.apache.org
Subject: TB scale
Anyone with experience, suggestions or lessons learned in the 10 -100 TB
scale they'd like to share?
Researching optimum design for a Solr
Anyone with experience, suggestions or lessons learned in the 10 -100 TB scale
they'd like to share?
Researching optimum design for a Solr Cloud with, say, about 20TB index.
-
Thanks
Ed Smiley, Senior Software Architect, Ebooks
ProQuest | 161 Evelyn Ave. | Mountain View, CA 94041 USA | +
ect it to simply crash like it's doing.
w after some threshold where things don't
fit in the cache, but I'd never expect it to simply crash like it's doing.
On 1/15/2014 3:10 PM, cwhi wrote:
Thanks for the quick reply. I did notice the exception you pointed out and
had some thoughts about it maybe being the client library I'm using to
connect to Solr (C# SolrNet) disconnecting too early, but that doesn't
explain it eventually running out of memory a
On 1/15/2014 2:43 PM, cwhi wrote:
I have a SolrCloud installation with about 2 million documents indexed in it.
It's been buzzing along without issue for the past 8 days, but today started
throwing errors on document adds that eventually resulted in out of memory
exceptions. There is nothing fun
sas0tl/solr.log
Also, a short snippet of the first exception is available on pastebin at
http://pastebin.com/pWZrkGEr
Thanks
e or you can do it in your script.
http://wiki.apache.org/solr/CoreAdmin#UNLOAD
-
Thanks,
Michael
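For example, unloading a core through that CoreAdmin API is a single HTTP call (hypothetical core name):

    curl "http://localhost:8983/solr/admin/cores?action=UNLOAD&core=collection1_shard2_replica1"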
bshard
from the new machine.
At this point, the data should be more evenly distributed, which will help
us continue to scale. This also seems like an easily scriptable process,
which is what I'm trying to do.
My question is simple. I can call collections?action=SPLITSHARD to split
the sha
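For reference, the split itself is one Collections API call (hypothetical collection and shard names):

    curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1"

After a successful split the parent shard is marked inactive but its core is not removed automatically, which is where the UNLOAD call above comes in.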
Hi,
I don't know. But, unless something outside Solr is a bottleneck, it may
be wise to see if you can speed up indexing. Maybe we can help here...
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Oct 1, 2013 9:29 AM, "Thomas Egense" wrote:
> Hello everyone,
> I have a small challenge
Hello everyone,
I have a small challenge performance testing a SolrCloud setup. I have 10
shards, and each shard is supposed to have index-size ~200GB. However I
only have a single index of 200GB because it will take too long to build
another index with different data, and I hope to somehow use th
Does Solr provide a way to collect such data and somehow retrieve it?
ry and number of items found.
Does Solr provide a way to collect such data and somehow retrieve it?
On 5/22/2013 11:25 AM, Justin Babuscio wrote:
On your overflow theory, why would this impact the client? Is it possible
that a write attempt to Solr would block indefinitely while the Solr server
is running wild or in a bad state due to the overflow?
That's the general notion. I could be comp
u, and tells you that everything has
> succeeded even when it doesn't.
>
> The one advantage that SUSS/CUSS has over its Http sibling is that it is
> multi-threaded, so it can send updates concurrently. You seem to know
> enough about how it works, so I'll just say tha
concurrently. You seem to know
enough about how it works, so I'll just say that you don't need
additional complexity that is not under your control and refuses to
throw exceptions when an error occurs. You already have a large-scale
concurrent and multi-threaded indexing setup, so
*Problem:*
We periodically rebuild our Solr index from scratch. We have built a
custom publisher that horizontally scales to increase write throughput. On
a given rebuild, we will have ~60 JVMs running with 5 threads that are
actively publishing to all Solr masters.
For each thread, we instanti
Yep, that's the usual process for growth planning.
Best
Erick
On Wed, Nov 7, 2012 at 4:01 AM, SuoNayi wrote:
> Hi all,
>Because we cannot add or remove shard when solrcloud cluster has been
> set up,
> so we have to predict a precise shard size at first, says we need 3 shards.
> but now we
ard on the "master", and the new
server would be promoted automatically to the shard leader.
Sent from my Verizon Wireless 4GLTE smartphone
- Reply message -
From: "Jeff Rhines"
To: "solr-user@lucene.apache.org"
Subject: [External] Re: how to scale from 1 ser
It's my understanding that your strategy is correct, although I expect that
zookeeper would need to be updated somehow with the new second and third
shards, no?
On Nov 8, 2012, at 2:36 AM, SuoNayi wrote:
> Hi all,
> Because it's unable to add or remove shards after the SolrCloud cluster is
> init
Hi all,
Because we cannot add or remove shards once the SolrCloud cluster has been set
up, we have to predict a precise shard count at first - say we need 3 shards -
but right now we do not have enough servers to set up the cluster with one
shard per server.
In this situation, I have to set up the cluster
Hi Daniel,
- Original Message -
> From: Daniel Bruegge
> To: solr-user@lucene.apache.org; Otis Gospodnetic
> Cc:
> Sent: Thursday, January 19, 2012 5:49 AM
> Subject: Re: How can a distributed Solr setup scale to TB-data, if URL
> limitations are 4000 for distri
On Thu, Jan 19, 2012 at 4:51 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
>
> Huge is relative. ;)
> Huge Solr clusters also often have huge hardware. Servers with 16 cores
> and 32 GB RAM are becoming very common, for example.
> Another thing to keep in mind is that while lots of orga
Hi Daniel,
>
> From: Daniel Bruegge
>Subject: Re: How can a distributed Solr setup scale to TB-data, if URL
>limitations are 4000 for distributed shard search?
>
>But you can read so often about huge solr clusters and I am wondering how
>th
Try changing the URI/HTTP/GET size limitation on your app server.
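With the Jetty that ships with Solr, for example, that usually means raising the request header size in jetty.xml (the exact property name varies with the Jetty version; the value here is illustrative):

    <Set name="requestHeaderSize">65536</Set>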
On 01/18/2012 05:59 PM, Daniel Bruegge wrote:
Hi,
I am just wondering how I can 'grow' a distributed Solr setup to an index
size of a couple of terabytes, when one of the distributed Solr limitations
is max. 4000 characters in UR
But you can read so often about huge Solr clusters and I am wondering how
they do this. Because I also read often that the index size of one shard
should fit into RAM, or at least the heap size should be as big as the
index size. So I see a lot of limitations hardware-wise. Or am I on the
totally
You can raise the limit to a point.
On Jan 18, 2012, at 5:59 PM, Daniel Bruegge wrote:
> Hi,
>
> I am just wondering how I can 'grow' a distributed Solr setup to an index
> size of a couple of terabytes, when one of the distributed Solr limitations
> is max. 4000 characters in URI limitation. Se
Hi,
I am just wondering how I can 'grow' a distributed Solr setup to an index
size of a couple of terabytes, when one of the distributed Solr limitations
is max. 4000 characters in URI limitation. See:
*The number of shards is limited by number of characters allowed for GET
> method's URI; most W
Sent: Thursday, 3 November 2011 14:00
To: 'solr-user@lucene.apache.org'
Subject: RE: large scale indexing issues / single threaded bottleneck
Shishir, we have 35 million "documents", and should be doing about 5000-1
new "documents" a day, but with very small &qu
November 01, 2011 10:58 PM
To: solr-user@lucene.apache.org
Subject: RE: large scale indexing issues / single threaded bottleneck
Roman,
How frequently do you update your index? I have a need to do real time
add/delete to SOLR documents at a rate of approximately 20/min.
The total number of documents
> The total number of documents is in the range of 4 million. Will there
> be any performance issues?
>
> Thanks,
> Shishir
>
Alekseenkov [mailto:ralekseen...@gmail.com]
Sent: Sunday, October 30, 2011 6:11 PM
To: solr-user@lucene.apache.org
Subject: Re: large scale indexing issues / single threaded bottleneck
Guys, thank you for all the replies.
I think I have figured out a partial solution for the problem on Friday
Yonik,
Adding overwrite=false doesn't help. XMLLoader doesn't check this HTTP
parameter. Instead it checks an attribute in the XML tag with the same name.
-Kiril
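For reference, the attribute form being referred to sits on the <add> element of the XML update message, e.g.:

    <add overwrite="false">
      <doc>
        <field name="id">doc-1</field>
      </doc>
    </add>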
"overwrite=false" didn't help, but the hack did.
Once again, thank you for the answers and recommendations
Roman
lr-ra.tgels.org
http://rankingalgorithm.tgels.org
On 10/28/2011 11:38 AM, Roman Alekseenkov wrote:
Hi everyone,
I'm looking for some help with Solr indexing issues on a large scale.
We are indexing a few terabytes/month on a sizeable Solr cluster (8
masters / serving writes, 16 slaves / serving reads)
On Sat, Oct 29, 2011 at 6:35 AM, Michael McCandless
wrote:
> I saw a mention somewhere that you can tell Solr to use
> IW.addDocument (not IW.updateDocument) when you add a document if you
> are certain it's not replacing a previous document with the same ID
Right - adding overwrite=false to
On Fri, Oct 28, 2011 at 3:27 PM, Simon Willnauer
wrote:
> one more thing, after somebody (thanks robert) pointed me at the
> stacktrace it seems kind of obvious what the root cause of your
> problem is. It's Solr :) Solr closes the IndexWriter on commit, which is
> very wasteful since you basically
> abstract away the encoding of the index
Robert, this is what you wrote. "Abstract away the encoding of the
index" means pluggable, otherwise it's not abstract and / or it's a
flawed design. Sounds like it's the latter.
On Fri, Oct 28, 2011 at 8:10 PM, Jason Rutherglen
wrote:
>> Otherwise we have "flexible indexing" where "flexible" means "slower
>> if you do anything but the default".
>
> The other encodings should exist as modules since they are pluggable.
> 4.0 can ship with the existing codec. 4.1 with addit
> Otherwise we have "flexible indexing" where "flexible" means "slower
> if you do anything but the default".
The other encodings should exist as modules since they are pluggable.
4.0 can ship with the existing codec. 4.1 with additional codecs and
the bulk postings at a later time.
Otherwise it
On Fri, Oct 28, 2011 at 5:03 PM, Jason Rutherglen
wrote:
> +1 I suggested it should be backported a while back. Or that Lucene
> 4.x should be released. I'm not sure what is holding up Lucene 4.x at
> this point, bulk postings is only really useful for PFOR.
This is not true, most modern index
illnauer
wrote:
> On Fri, Oct 28, 2011 at 9:17 PM, Simon Willnauer
> wrote:
>> Hey Roman,
>>
>> On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov
>> wrote:
>>> Hi everyone,
>>>
>>> I'm looking for some help with Solr indexing issues on a
On Fri, Oct 28, 2011 at 9:17 PM, Simon Willnauer
wrote:
> Hey Roman,
>
> On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov
> wrote:
>> Hi everyone,
>>
>> I'm looking for some help with Solr indexing issues on a large scale.
>>
>> We are indexing fe
Hey Roman,
On Fri, Oct 28, 2011 at 8:38 PM, Roman Alekseenkov
wrote:
> Hi everyone,
>
> I'm looking for some help with Solr indexing issues on a large scale.
>
> We are indexing few terabytes/month on a sizeable Solr cluster (8
> masters / serving writes, 16 slaves
ndexing issues on a large scale.
>
> We are indexing few terabytes/month on a sizeable Solr cluster (8
> masters / serving writes, 16 slaves / serving reads). After certain
> amount of tuning we got to the point where a single Solr instance can
> handle index size of 100GB without