Hey Yago,
12 T is very impressive.
Can you also share some numbers about the shards, replicas, machine
count/specs and docs/second for your case?
I assume you are not running a single 12 TB index either, so some
details on that would be really helpful too.
https://lucidworks.com/blog/2014/06/
Erick / Alex,
I want to thank you both. Your hints got me to understand Solr a bit better. I
ended up using the reversed-wildcard filter, and it speeds up queries a lot. That's
what I was expecting from Solr... I also no longer see the huge memory
hog.
The only down-side I can think of is, you need to r
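For readers finding this thread later, a minimal sketch of the kind of field type involved (the type name and analyzer chain are illustrative, modeled on the stock example schema, not the poster's actual config): ReversedWildcardFilterFactory is added on the index side only, so leading-wildcard queries like *ches can be rewritten against the reversed tokens:

  <fieldType name="text_rev" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- indexes reversed forms (and, with withOriginal=true, the originals too) -->
      <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
              maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Note that switching a field to a type like this requires re-indexing, and the reversed forms increase the number of indexed terms for that field.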
Please follow the instructions here:
http://lucene.apache.org/solr/resources.html
See the "unsubscribe" section. And if it doesn't work first time, try
the "problems" link.
You _must_ use the exact e-mail address you subscribed with; a common mistake
is to have the e-mail forwarded and use the address y
In my company we have a SolrCloud cluster with 12 TB.
My advice:
Be generous with CPU; you will need it at some point (very important if you have no
control over the kind of queries hitting the cluster: clients are greedy, they want
all results at the same time).
SSD and memory (as many as you can afford i
Solr is RAM hungry. Make sure that you have enough RAM to hold most of the
index of a core in RAM.
You should also consider using really good SSDs.
That would be a good start. Like others said, test and verify your setup.
--Pushkar Raste
On Sep 23, 2016 4:58 PM, "Jeffery Yuan" wrote
Please, I need to unsubscribe from this mailing list. Thanks.
--
*Khalid Galal*
Senior Software Engineer
---
*SiliconExpert Tech. - An Arrow company*
3375 Scott Blvd suite 406
Santa Clara, CA 95054
---
*E-mail (Personal): **khalid.mtw...@gmail.com *
*E-mail (Work): **khalid_ga...@silicone
Some anecdotal information: Alfresco is a document management system that
uses Solr. We did scale testing with documents meant to simulate typical
office documents. We found that, with larger documents, 50 million documents
and 500 GB of index size per shard provided acceptable performance.
But you wi
On 9/23/2016 2:33 PM, Jeffery Yuan wrote:
> In our application, every day there is about 800 MB of raw data; we are
> going to store this data for 5 years, so that's about 1 or 2 TB of data.
As long as the filesystem can do it, Solr can handle that much data.
Getting good performance with that much da
Thanks so much for your prompt reply.
We are definitely going to use SolrCloud.
I am just wondering whether SolrCloud can scale even at the TB level, and
what kind of hardware configuration it would need.
Thanks.
You can only put 2 billion documents in one core, so I would suggest using
SolrCloud. You need to calculate how many Solr documents your data will produce
and then decide how many shards to go with. You can find many useful resources
on the web; I'll just provide one here:
http://www.slideshare.net/anshumg/best-practices-f
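To make the calculation concrete, a hedged illustration (collection name, config name and counts are invented, not a recommendation): if you expect on the order of a billion documents, you stay well below the 2-billion-per-core limit, with headroom for growth, by spreading them over several shards at creation time:

  http://localhost:8983/solr/admin/collections?action=CREATE&name=mydata&numShards=8&replicationFactor=2&collection.configName=myconf

Then index a representative sample and measure query latency and per-shard index size before settling on the final shard count.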
Hi, Dear all:
In our application, every day there is about 800 MB of raw data; we are going
to store this data for 5 years, so that's about 1 or 2 TB of data.
I am wondering whether Solr can support this much data?
Usually, how much data do we store per node, and how many nodes can we have in
SolrCloud?
Been there, done that
You might be glad to know that there are a couple of tickets to reduce
the verbosity of the logs (or, more accurately, move some of the
logging to DEBUG level and allow a switch at startup) that should make
staring at logs less of a chore.
One other signal that a qu
Oh, never mind. Apparently staring at logs has led to blindness...I do see
the "master" query with the full elapsed time and hit count, and indeed,
there is a parameter "_" with some tracking number which links all the
queries together.
On Thu, Sep 22, 2016 at 7:32 PM, Elaine Cario wrote:
> W
Have you evaluated whether the "mm" parameter might help?
https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser#TheDisMaxQueryParser-Themm(MinimumShouldMatch)Parameter
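To illustrate, borrowing the ngram field and the number terms from the original question (the other parameters are just an example): with the dismax/edismax parsers, mm controls how many of the optional term clauses must match, so

  q=750 500 000 000 000 000&defType=edismax&qf=ngram&mm=100%

requires every term to be present, while a value such as mm=4 or mm=75% tolerates a few missing terms. The cwiki page above describes the full syntax, including conditional forms like 2<75%.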
-Original Message-
From: preeti kumari [mailto:preeti.bg...@gmail.com]
Sent: Friday, September 23, 20
But if "SEF" and "OFF" are known to be searched for and especially if
they are well-delimited, they could just be pulled-out into a separate
field and just checked with an FQ.
In the end, there may be no need for either EdgeNGram or wildcards.
Just twisting the data during _indexing_ to represent
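A minimal sketch of that indexing-time approach (the field name is hypothetical; the extraction itself would happen in your indexing pipeline): declare a plain string field in the schema,

  <field name="doc_code" type="string" indexed="true" stored="false"/>

populate it with the extracted SEF/OFF value when each document is indexed, and then filter with fq=doc_code:SEF (or fq=doc_code:OFF). The filter is cached in the filterCache, and neither EdgeNGrams nor wildcards are needed for that part of the query.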
Hi,
Thanks Shawn and Erick, that helps me.
On Thu, Sep 22, 2016 at 8:26 PM, Erick Erickson
wrote:
> Not only will optimize not help, even re-indexing all
> the docs to the current collection will leave the
> meta-data in the index about the removed fields. For
> 50 fields that likely won't matt
Hi All,
I am trying to migrate from FAST ESP to the Solr search engine.
I am trying to implement mode="ONEAR" from FAST in Solr.
Please let me know if anyone has any idea about this.
ngram:string("750 500 000 000 000 000",mode="ONEAR")
In Solr we are splitting the field into "750 500 000 000 000 000
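One rough analogue, not an exact drop-in for FAST's ONEAR semantics: a sloppy phrase query, which becomes tolerant of term order once the slop is large enough, e.g.

  q=ngram:"750 500 000 000 000 000"~6

The surround query parser ({!surround}) also exposes explicit ordered/unordered proximity operators (W and N) if finer control is needed; either way, compare the results against what ONEAR returned in ESP before relying on it.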
Hi everyone,
I'm looking at using two different implementations of spell checking
together, DirectSolrSpellChecker and FileBasedSpellChecker, but I get the
following error:
msg: "All checkers need to use the same Analyzer.",
trace: "java.lang.IllegalArgumentException: All checkers need to use the
This is happening in 5.3.1
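A hedged sketch of one way to line them up, assuming the exception comes from the two checkers being built over field types with different analyzers (field and file names below are placeholders): point both checkers at the same field type in solrconfig.xml, e.g.

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">text_general</str>
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <!-- assumed: the "text" field uses the text_general field type -->
      <str name="field">text</str>
    </lst>
    <lst name="spellchecker">
      <str name="name">file</str>
      <str name="classname">solr.FileBasedSpellChecker</str>
      <str name="sourceLocation">spellings.txt</str>
      <str name="characterEncoding">UTF-8</str>
      <str name="fieldType">text_general</str>
    </lst>
  </searchComponent>

and request both with spellcheck.dictionary=default&spellcheck.dictionary=file. As far as I can tell, the check that throws this error compares the analyzers the individual checkers were configured with, so they need to resolve to the same analysis chain.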
This metric is interesting for knowing the minimal memory footprint of a core (data
structures and caches).
I agree with Shawn that if Solr doesn't support the metric it should be removed
from the admin UI, but I insist on the fact that it's useful to plot memory
consumption in