Glad to hear it. Incidentally, lowering maxBufferedDocs will reduce peak memory consumption during indexing, at a cost of slower indexing throughput.
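
For reference, a minimal sketch of where that setting lives, assuming the Solr 1.x solrconfig.xml layout. The value 500 is purely illustrative, not a recommendation, and a matching entry under <mainIndex>, if present, may also need changing:

```xml
<!-- solrconfig.xml fragment (assumed Solr 1.x layout) -->
<indexDefaults>
  <!-- Illustrative value only: fewer buffered docs means lower peak
       memory during indexing, but more frequent flushes to disk -->
  <maxBufferedDocs>500</maxBufferedDocs>
</indexDefaults>
```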

-Mike

On 11-May-08, at 3:41 AM, Tracy Flynn wrote:

Thanks for the replies.

For a completely different reason, I happened to look at the memory stats for all processes, including the Solr instances. I noticed that the slow Solr instance was maxing out, using more virtual memory than it had been allocated. After boosting the maximum heap space and restarting, everything started running at 4x-5x the previous speed, which is the rate I reasonably thought it should.

Tracy

On May 9, 2008, at 8:02 AM, Tracy Flynn wrote:

Hi,

I'm starting to see a significant slowdown in loading performance after I have loaded about 400K documents. The load rate drops from nearly 40 docs/sec to 20-25 docs/sec.

Am I correct in assuming that, during indexing operations, Lucene/Solr tries to hold as much of the index in memory as possible? If so, does the slowdown indicate a need to increase the JVM heap space?

Any ideas or help would be appreciated.

Regards,

Tracy

---------------------------------------------------------------------------------------------------------------------

Details

Documents loaded as XML via POST command in batches of 1000, commit after each batch
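
For illustration, one batch in Solr's XML update format might look like the sketch below. The field names come from the schema further down, while the values are made up. The commit is sent as its own POST after the batch:

```xml
<!-- First POST body: one batch of documents (values are hypothetical) -->
<add>
  <doc>
    <field name="document_id">doc-0001</field>
    <field name="document_type">article</field>
    <field name="title">Example title</field>
    <field name="body">Example body text</field>
  </doc>
  <!-- further doc elements follow, up to 1000 per batch -->
</add>

<!-- Second POST body, sent separately after the batch -->
<commit/>
```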

Total current documents ~ 450,000
Avg document size: 4KB
One indexed text field contains 3KB or so. (body field below - standard type 'text')

Dual Xeon 3 GHz, 4 GB memory

SOLR JVM Startup options

java -Xms256m -Xmx1000m -jar start.jar


Relevant portion of the schema follows


<field name="document_id" type="string" indexed="true" stored="true" required="true"/>
<field name="language" type="string" indexed="true" stored="true" required="false"/>
<field name="languages" type="string" indexed="true" stored="true" required="false"/>
<!-- The value specified for folding_id must be a field of type "integer" -
      type "sint" does not work -->
<field name="folding_id" type="integer" indexed="true" stored="true" required="false" default="0"/>
<field name="document_type" type="string" indexed="true" stored="true" required="true"/>
<field name="title" type="text" indexed="true" stored="true" required="false"/>
<field name="body" type="text" indexed="true" stored="true" required="false" compressed="true"/>
<field name="teaser" type="text" indexed="no" stored="true" required="false"/>
<field name="articles_in_category" type="sint" indexed="true" stored="true" required="false" default="0"/>
<field name="pen_name" type="text" indexed="true" stored="true" required="false"/>
<field name="article_id" type="sint" indexed="true" stored="true" required="false" default="0"/>
<field name="article_status_id" type="sint" indexed="true" stored="true" required="false" default="0"/>
<field name="user_id" type="sint" indexed="true" stored="true" required="false" default="0"/>
<field name="user_name" type="text" indexed="true" stored="true" required="false"/>
<field name="user_email" type="text" indexed="true" stored="true" required="false"/>
<field name="channel_context" type="sint" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="category_id" type="sint" indexed="true" stored="true" required="false" default="0"/>
<field name="category_status_id" type="sint" indexed="true" stored="true" required="false" default="0"/>
<field name="category_title" type="text" indexed="true" stored="true" required="false"/>
<field name="category_keywords" type="text" indexed="true" stored="true" required="false" multiValued="true"/>
<field name="category_type" type="text" indexed="true" stored="true" required="false"/>
<field name="channel_id" type="sint" indexed="true" stored="true" required="false" default="0"/>
<field name="channel_title" type="text" indexed="true" stored="true" required="false"/>
<field name="helium_rank" type="sint" indexed="false" stored="true" required="false" default="0"/>
<field name="helium_rank_percentile" type="sfloat" indexed="false" stored="true" required="false"/>
<field name="helium_scaled_rank_boost" type="sfloat" indexed="true" stored="true" required="false"/>
<field name="helium_scaled_rank_boost_string" type="string" indexed="true" stored="true" required="false"/>
<!--
<field name="title_popularity" type="sint" indexed="true" stored="true" default="0"/>
<field name="title_recent_popularity" type="sint" indexed="true" stored="true" default="0"/>
<field name="title_views_measure" type="sint" indexed="true" stored="true" default="0"/>
<field name="title_recent_earnings_measure" type="sint" indexed="true" stored="true" default="0"/>
<field name="title_earnings_measure" type="sint" indexed="true" stored="true" default="0"/>
-->
<field name="created_date" type="date" indexed="true" stored="true" required="false"/>



