Thanks Otis.

Our use case doesn't require any sorting or faceting. I'm wondering if
I've configured anything wrong.

I have a total of 25 fields (15 are indexed and stored, the other 10 are
only stored). All my fields are basic data types, which I thought are not
sorted. My id field is the unique key.

Is there any field here that might be getting sorted?

 <field name="id" type="long" indexed="true" stored="true"
required="true" omitNorms="true" compressed="false"/>

   <field name="atmps" type="integer" indexed="false" stored="true"
compressed="false"/>
   <field name="bcid" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="cmpcd" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="ctry" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="dlt" type="date" indexed="false" stored="true"
default="NOW/HOUR"  compressed="false"/>
   <field name="dmn" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="eaddr" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="emsg" type="string" indexed="false" stored="true"
compressed="false"/>
   <field name="erc" type="string" indexed="false" stored="true"
compressed="false"/>
   <field name="evt" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="from" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="lfid" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="lsid" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="prsid" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="rc" type="string" indexed="false" stored="true"
compressed="false"/>
   <field name="rmcd" type="string" indexed="false" stored="true"
compressed="false"/>
   <field name="rmscd" type="string" indexed="false" stored="true"
compressed="false"/>
   <field name="scd" type="string" indexed="true" stored="true"
omitNorms="true" compressed="false"/>
   <field name="sip" type="string" indexed="false" stored="true"
compressed="false"/>
   <field name="ts" type="date" indexed="true" stored="false"
default="NOW/HOUR" omitNorms="true"/>


   <!-- catchall field, containing all other searchable text fields (implemented
        via copyField further on in this schema) -->
   <field name="all" type="text_ws" indexed="true" stored="false"
omitNorms="true" multiValued="true"/>
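
In case it helps anyone double-check the list above, here is a minimal
sketch in Python that groups fields by their indexed/stored flags. The
embedded snippet is a shortened, hypothetical excerpt of the fields above
(wrapped in a <fields> element), and summarize_fields is just an
illustrative helper name. It only reports the flags as declared; it cannot
tell whether a field is actually sorted on, since that depends on the
queries being sent.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical excerpt of the schema; in practice the full
# schema.xml contents would be passed in instead.
SCHEMA_SNIPPET = """
<fields>
  <field name="id" type="long" indexed="true" stored="true"
         required="true" omitNorms="true"/>
  <field name="atmps" type="integer" indexed="false" stored="true"/>
  <field name="ts" type="date" indexed="true" stored="false"
         default="NOW/HOUR" omitNorms="true"/>
  <field name="all" type="text_ws" indexed="true" stored="false"
         omitNorms="true" multiValued="true"/>
</fields>
"""


def summarize_fields(schema_xml):
    """Group field names by their (indexed, stored) flags."""
    groups = {}
    root = ET.fromstring(schema_xml)
    for field in root.iter("field"):
        # Assumption for this sketch: a missing attribute counts as
        # "false". In a real schema the default comes from the fieldType.
        indexed = field.get("indexed", "false") == "true"
        stored = field.get("stored", "false") == "true"
        groups.setdefault((indexed, stored), []).append(field.get("name"))
    return groups


summary = summarize_fields(SCHEMA_SNIPPET)
for (indexed, stored), names in sorted(summary.items()):
    print(f"indexed={indexed} stored={stored}: {', '.join(names)}")
```

Only fields with indexed="true" can be sorted or faceted on at all, so the
stored-only group can be ruled out right away.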

Thanks,
-vivek

On Wed, May 13, 2009 at 1:10 PM, Otis Gospodnetic
<otis_gospodne...@yahoo.com> wrote:
>
> Hi,
> Some answers:
> 1) .tii files in the Lucene index.  When you sort, all distinct values for
> the field(s) used for sorting are loaded too.  Similarly for facet fields.
> Also the Solr caches.
> 2) ramBufferSizeMB dictates, more or less, how much Lucene/Solr will consume 
> during indexing.  There is no need to commit every 50K docs unless you want 
> to trigger snapshot creation.
> 3) see 1) above
>
> 1.5 billion docs per instance where each doc is cca 1KB?  I doubt that's 
> going to fly. :)
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> ----- Original Message ----
>> From: vivek sar <vivex...@gmail.com>
>> To: solr-user@lucene.apache.org
>> Sent: Wednesday, May 13, 2009 3:04:46 PM
>> Subject: Solr memory requirements?
>>
>> Hi,
>>
>>   I'm pretty sure this has been asked before, but I couldn't find a
>> complete answer in the forum archive. Here are my questions,
>>
>> 1) When Solr starts up, what does it load into memory? Let's say
>> I have 4 cores, each 50G in size. When Solr comes up, how much
>> of that would be loaded into memory?
>>
>> 2) How much memory is required at index time? If I'm committing
>> 50K records at a time (1 record = 1KB) using solrj, how much memory do
>> I need to give to Solr?
>>
>> 3) Is there a minimum memory requirement by Solr to maintain a certain
>> size index? Is there any benchmark on this?
>>
>> Here are some of my configuration settings from solrconfig.xml,
>>
>> 1) 64
>> 2) All the caches (under query tag) are commented out
>> 3) Few others,
>>       a)  true    ==>
>> would this require memory?
>>       b)  50
>>       c) 200
>>       d)
>>       e) false
>>       f)  2
>>
>> The problem we are having is the following:
>>
>> I've given Solr 6G of RAM. As the total index size (all cores
>> combined) starts growing, Solr's memory consumption goes up. With 800
>> million documents, I see Solr already taking up all the memory at
>> startup. After that, commits and searches all become slow. We
>> will have a distributed setup with multiple Solr instances (around
>> 8) on four boxes, but our requirement is for each Solr instance to
>> maintain at least around 1.5 billion documents.
>>
>> We are trying to see if we can somehow reduce the Solr memory
>> footprint. If someone can point out which parameters affect memory
>> and what effect each one has, we can then decide whether we want that
>> parameter or not. I'm not sure if there is any minimum memory
>> requirement for Solr to be able to maintain large indexes. I've used
>> Lucene before and that didn't require anything by default - it used up
>> memory only during index and search times - not otherwise.
>>
>> Any help is very much appreciated.
>>
>> Thanks,
>> -vivek
>
>
