Is this a test index? Do you rewrite documents with the same ids? Did you
try to optimize the index?
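
Rewriting documents with the same ids leaves deleted documents in the index until segments are merged, so one quick check is to compare maxDoc with numDocs. Below is a minimal Python sketch, assuming the core exposes the Luke request handler at /admin/luke and that its JSON response carries "numDocs" and "maxDoc" under "index". The helper names and the base URL in the usage note are hypothetical.

```python
import json
from urllib.request import urlopen


def deleted_ratio(luke_response):
    """Fraction of documents in the index that are marked as deleted.

    Expects the parsed JSON of a Luke handler response, i.e. a dict with
    an "index" section holding "numDocs" and "maxDoc" (assumed layout).
    """
    index = luke_response["index"]
    max_doc = index["maxDoc"]
    num_docs = index["numDocs"]
    if max_doc == 0:
        return 0.0
    return (max_doc - num_docs) / max_doc


def fetch_luke(core_url):
    """Fetch index-level statistics from a core (core_url is an assumption)."""
    with urlopen(core_url + "/admin/luke?numTerms=0&wt=json") as resp:
        return json.load(resp)
```

For example, deleted_ratio(fetch_luke("http://localhost:8983/solr/collection1")) would report the share of deleted documents for a core at that (assumed) address. A high value suggests that an optimize or merge could reclaim space.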
Emir
--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/
On 22.07.2015 13:10, Daniel Holmes wrote:
Upayavira, the number of docs in that case is 140275. The Solr memory is 30 GB.
Yes, Emir, I need most of them to be saved.
Alessandro, I don't know whether it is usual for indexing to use more than 3x
the document size on disk, and presumably it will keep growing exponentially
as the crawl continues... It's so suboptimal, I think.
On Wed, Jul 22, 2015 at 3:16 PM, Alessandro Benedetti <
benedetti.ale...@gmail.com> wrote:
"In one case for instance my segments size is 8.4G while index size is
28G!!! It seems unusual…"
The index is a collection of index segments plus a little overhead.
So, do you simply mean you have 4 segments?
Where is the problem anyway?
You are also storing content, which usually is a big part of the index.
As Upaya said, I am curious to know why you are so surprised!
Cheers
2015-07-22 11:27 GMT+01:00 Daniel Holmes <noora.sa...@gmail.com>:
Hi All
I have a problem with index size in Solr 4.7.2. My OS is Ubuntu 14.10
64-bit.
My fields are:
<field name="id" type="string" stored="true" indexed="true"/>
<field name="segment" type="string" stored="true" indexed="false"/>
<field name="url" type="url_text" stored="true" indexed="true"
required="true"/>
<field name="outlink" type="url_text" stored="true" indexed="true"
required="true"/>
<field name="content" type="text_general" stored="true" indexed="true"/>
<field name="title" type="text_general" stored="true" indexed="true"/>
<field name="host" type="url" stored="false" indexed="true"/>
<field name="segment" type="string" stored="true" indexed="false"/>
<field name="boost" type="float" stored="true" indexed="false"/>
<field name="digest" type="string" stored="true" indexed="false"/>
<field name="tstamp" type="date" stored="true" indexed="false"/>
In one case for instance my segments size is 8.4G while index size is
28G!!! It seems unusual...
What suggestions do you have to reduce index size?
Is there any way to check disk usage details in cores? e.g. stop words,
stored docs, etc.
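
One rough way to see where the disk space goes is to total the files in the core's index directory by file extension. The following is a minimal Python sketch, not a Solr feature. The extension-to-category mapping is an assumption based on the usual Lucene 4.x file naming, and the function names and the path in the usage note are hypothetical.

```python
import os
from collections import defaultdict

# Assumed mapping from Lucene 4.x file extensions to rough categories.
CATEGORIES = {
    "fdt": "stored fields", "fdx": "stored fields",
    "tim": "term dictionary", "tip": "term dictionary",
    "doc": "postings", "pos": "postings", "pay": "postings",
    "frq": "postings", "prx": "postings",
    "nvd": "norms", "nvm": "norms",
    "dvd": "doc values", "dvm": "doc values",
    "tvd": "term vectors", "tvx": "term vectors",
    "cfs": "compound segment", "cfe": "compound segment",
}


def index_usage(index_dir):
    """Return {category: total bytes} for the files directly inside index_dir."""
    totals = defaultdict(int)
    for entry in os.scandir(index_dir):
        if not entry.is_file():
            continue
        # Files without a dot (e.g. segments_N) fall into "other".
        ext = entry.name.rsplit(".", 1)[-1] if "." in entry.name else ""
        totals[CATEGORIES.get(ext, "other")] += entry.stat().st_size
    return dict(totals)


def report(index_dir):
    """Print categories from largest to smallest, in MiB."""
    usage = index_usage(index_dir)
    for category, size in sorted(usage.items(), key=lambda kv: -kv[1]):
        print(f"{category:18s} {size / 1024 ** 2:10.1f} MiB")
```

Calling report("/path/to/core/data/index") (a placeholder path) lists the categories by size. If "stored fields" dominates, the stored content field is the likely reason for the size. Compound segment files (.cfs) bundle several file types, so they cannot be broken down this way.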
--
--------------------------
Benedetti Alessandro
Visiting card - http://about.me/alessandro_benedetti
Blog - http://alexbenedetti.blogspot.co.uk
"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"
William Blake - Songs of Experience -1794 England