Hi All, I am testing a SolrCloud with many collections. The version is 5.2.1
and I installed 3 machines, each one with 4 cores and 8 GB RAM. Then I created
collections with 3 shards and a replication factor of 2. That gives me 2 cores
per collection on each machine. I reached almost 900 collections a
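For reference, a layout like the one described is normally created through the
Collections API. A hedged sketch (the host, port, and collection name are
placeholders, not taken from this post):

```shell
# Hypothetical example: create a 3-shard collection with replicationFactor=2
# on a 3-node SolrCloud (Solr 5.x Collections API). Names are placeholders.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=testcoll&numShards=3&replicationFactor=2&maxShardsPerNode=2"
```

With 3 shards x 2 replicas = 6 cores spread over 3 machines, each machine ends
up hosting 2 cores per collection, matching the description above.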
If I understood your question correctly, that's what I am suggesting to try.
Note that, as I mentioned earlier, this ignores all the complexity
of similarity, ranking, etc. that Solr offers. But it does not seem
that you need it in your particular case, as you are just searching for
presence/absence o
1. Keep the number of collections down to the low hundreds max. Preferably
no more than a few dozen or a hundred.
2. 8 GB is too small to be useful. 16 GB minimum.
3. If you need large numbers of machines, organize them as separate
clusters.
4. Figure on 100 to 200 million documents per Solr server. E.g.,
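The sizing guideline in point 4 can be sanity-checked with quick shell
arithmetic. The total document count here is illustrative, not from the
thread:

```shell
# Rough capacity estimate: how many Solr servers for a given corpus,
# assuming 100-200 million documents per server (the guideline above).
total_docs=1000000000          # illustrative: 1 billion documents
docs_per_server=150000000      # midpoint of the 100-200M guideline
# Ceiling division: round up so the last partial server still counts.
servers=$(( (total_docs + docs_per_server - 1) / docs_per_server ))
echo "$servers"                # prints 7
```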
yura last wrote:
> Hi All, I am testing a SolrCloud with many collections. The version is 5.2.1
> and I installed 3 machines, each one with 4 cores and 8 GB RAM. Then I
> created collections with 3 shards and a replication factor of 2. That gives
> me 2 cores per collection on each machine. I reached a
The JavaDoc says that the PhoneticFilterFactory will "inject" tokens with
an offset of 0 into the stream. I'm assuming this means an offset of 0
from the token that it is analyzing; is that right? I am trying to
collapse some of my schema: I currently have a text field that I use for
general purp
From the "teaching to fish" category of advice (since I don't know the
actual answer):
Did you try the "Analysis" screen in the Admin UI? If you check the "Verbose
output" box, you will see all the offsets and can easily confirm the
detailed behavior for yourself.
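If you prefer the command line, the same per-token detail is exposed by
Solr's field analysis handler. A sketch; the collection name, field type, and
input value are placeholders for whatever is in your schema:

```shell
# Hypothetical: ask Solr to run the analysis chain for a field type and
# return per-token details (offsets, positions) as JSON.
curl "http://localhost:8983/solr/mycollection/analysis/field?analysis.fieldtype=text_general&analysis.fieldvalue=hello+world&wt=json"
```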
Regards,
Alex.
Solr Analyzers,
Hi, I have tried various options to speed up percentile calculation for
facets, but the internal Solr cache only speeds up my queries from 22 to 19
seconds.
I'm using the new JSON facets: http://yonik.com/json-facet-api/
Any tips for caching stats?
-Håvard
I am using SolrCloud
My initial requirements are:
1) There are about 6000 clients
2) The number of documents from each client is about 50 (average
document size is about 400 bytes)
3) I have to wipe the index/collection every night and create new
Any thoughts/ideas/suggestions on:
1) Ho
You have to provide a lot more info about your problem, including
what you've tried, what your data looks like, etc.
You might review:
http://wiki.apache.org/solr/UsingMailingLists
Best,
Erick
On Sat, Aug 15, 2015 at 10:27 AM, Håvard Wahl Kongsgård
wrote:
> Hi, I have tried various options to s
This is beyond my direct area of expertise, but one way to look at
this would be:
1) Create new collections offline. Down to each of the 6000 clients
having its own private collection (embedded SolrJ/server). Or some
sort of mini-hubs, e.g. a server per N clients.
2) Bring those collections into ce
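One common way to realize steps 1-2 with the Collections API is to build the
new collection offline and then repoint an alias, so clients never see a
partially built index. A hedged sketch; all collection and alias names are
placeholders, and the indexing step is elided:

```shell
# Hypothetical nightly rebuild: create tomorrow's collection, index into it,
# then atomically switch the "clients" alias and drop yesterday's collection.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=clients_20150816&numShards=2&replicationFactor=2"
# ... index the night's documents into clients_20150816 ...
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=clients&collections=clients_20150816"
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=clients_20150815"
```

Queries always go to the alias, so the swap is invisible to clients.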
I'm somewhat puzzled that there is no built-in security. I can't imagine
anybody is running a public-facing Solr server with the admin page wide
open?
I've searched and haven't found any solutions that work out of the box.
I've tried the solutions here to no avail.
https://wiki.apache.org/solr/Solr
No one runs a public-facing Solr server. Just like no one runs a public-facing
MySQL server.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
On Aug 15, 2015, at 4:15 PM, Scott Derrick wrote:
> I'm somewhat puzzled that there is no built-in security. I can'
Walter,
actually that explains it perfectly! I will move behind my apache server...
thanks,
Scott
On 8/15/2015 6:15 PM, Walter Underwood wrote:
> No one runs a public-facing Solr server. Just like no one runs a
> public-facing MySQL server.
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://
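Putting Solr behind Apache usually means proxying only the search paths and
refusing everything else. A minimal sketch for httpd 2.4 with mod_proxy
loaded; the paths, port, and collection name are assumptions, not from the
thread:

```apache
# Hypothetical httpd 2.4 fragment: proxy search traffic to Solr but deny
# the admin UI and Collections API outright.
ProxyPass        "/solr/" "http://localhost:8983/solr/"
ProxyPassReverse "/solr/" "http://localhost:8983/solr/"
<Location "/solr/admin">
    Require all denied
</Location>
```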
On 8/15/2015 2:03 PM, Troy Edwards wrote:
> I am using SolrCloud
>
> My initial requirements are:
>
> 1) There are about 6000 clients
> 2) The number of documents from each client is about 50 (average
> document size is about 400 bytes)
> 3) I have to wipe the index/collection every night
Scott:
You'd better not even let them access Solr directly.
http://server:port/solr/admin/collections?action=DELETE&name=collection
Try it sometime on a collection that's not important ;)
But as Walter said, that'd be similar to allowing end users
unrestricted access to
a SQL database, t
Piling on here. At the scale you're talking, I suspect you'll not only have
a bunch of servers, you'll really have a bunch of completely separate
"Solr Clouds", complete with their own ZooKeepers etc., partly for
administrative reasons and partly for stability.
Not sure that'll be true, mind you, bu
Troy Edwards wrote:
> 1) There are about 6000 clients
> 2) The number of documents from each client is about 50 (average
> document size is about 400 bytes)
So roughly 3 billion documents / 1TB index size. So at least 2 shards, due to
the 2 billion limit in Lucene. If you want more advice t
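The shard math above can be reproduced with quick arithmetic, taking the
thread's ~3 billion document estimate at face value (that figure is quoted,
not independently checked):

```shell
# Minimum shard count from Lucene's per-index document ceiling,
# using ceiling division so any remainder still needs a shard.
total_docs=3000000000
lucene_limit=2147483647        # Lucene hard cap: 2^31 - 1 docs per index
shards=$(( (total_docs + lucene_limit - 1) / lucene_limit ))
echo "$shards"                 # prints 2
```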