On 05.03.2014 11:51, Toke Eskildsen wrote:
On Wed, 2014-03-05 at 09:59 +0100, Angel Tchorbadjiiski wrote:
On 04.03.2014 11:20, Toke Eskildsen wrote:
Angel Tchorbadjiiski [angel.tchorbadjii...@antibodies-online.com] wrote:

[Single shard / 2 cores Solr 4.6.1, 65M docs / 50GB, 20 facet fields]

The OS in use is a 64bit linux with an OpenJDK 1.7 Java with 48G RAM.

I did not see your memory allocation anywhere. What is your Xmx?
At the moment I don't use it. The instance allocates 12G without the
parameter set.

I have little experience with OpenJDK, but a quick search suggests that
the default Xmx is physical memory/4 which is indeed 12G for your
machine. I strongly recommend that you set Xmx explicitly instead as the
value should be tuned to your concrete Solr deployment and not whatever
amount of RAM your machine happens to have.
Yes, that is right. I now have both -Xms and -Xmx set.

Shawn suggests facets to be the culprit and I find it a fair suggestion.
As both cores are used primarily for faceting, this sounds very reasonable:).

The stack trace you pasted did not state exactly what caused the OOM -
was it a complete trace? You might find better information in the Solr
log. If it is facet related, it is probably during uninversion.
The traces were mostly exactly the same as the included one, but I'll have a closer look at the logs again.


A gotcha in Solr faceting is that it is quite often the number of
documents, rather than the number of facets or facet values, that
requires a lot of memory.

Extremely loose numbers: Field-faceting on 65M documents (wildly
guessing 5000 unique values and 2 references/doc) is somewhere around
65M*log2(65M*2) + 2*65M*log2(5000) bits ~= 65M*28 + 130M*13 ~= 400MB.

As Solr does not come with facet structure collapsing, each facet is
independent of the others, so with the above estimate, your 20 facet
fields will use about 8GB for faceting.
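
For anyone who wants to plug in their own numbers, the back-of-envelope
estimate above can be sketched in a few lines of Python. This is only an
illustration of the arithmetic, not how Solr actually lays out its facet
structures: the function name facet_bits and its parameters are made up
for this example, the 5000 unique values and 2 references/doc are the
same wild guesses as above, and bit widths are rounded up to whole bits,
so the result lands slightly off the rounded figures in the text.

```python
import math

def facet_bits(num_docs, refs_per_doc, unique_values):
    """Loose estimate of bits needed to field-facet one field.

    Hypothetical helper for illustration only; it mirrors the rough
    formula in the mail, not Solr's real internal structures.
    """
    total_refs = num_docs * refs_per_doc
    # One pointer per document into the list of references.
    doc_pointer_bits = num_docs * math.ceil(math.log2(total_refs))
    # One value ordinal per reference.
    ref_bits = total_refs * math.ceil(math.log2(unique_values))
    return doc_pointer_bits + ref_bits

# Assumed inputs: 65M docs, 2 references/doc, 5000 unique values.
per_field_mb = facet_bits(65_000_000, 2, 5000) / 8 / 1e6

# 20 independent facet fields, as no structures are shared between them.
total_gb = 20 * per_field_mb / 1000

print(f"per field: ~{per_field_mb:.0f} MB, 20 fields: ~{total_gb:.1f} GB")
```

Doubling the document count roughly doubles the result, while doubling
the number of unique values adds only one bit per reference, which is the
point of the gotcha mentioned above.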

Thank you for the example:-).

I had a look at the inverted index in Solr and the problems resulting from it with respect to the memory needed. It seems that DocValues fields will be the solution to the problem.


Before DocValues became the answer, I wrote a bit about it here:
http://sbdevel.wordpress.com/2013/04/16/you-are-faceting-itwrong/

I'll have a look at the post, thank you very much.

Cheers
Angel
