Hi,

1) Solr has various types of caches. We can specify how many entries each
cache can hold at a time.
       e.g. if windowsize=50 (queryResultWindowSize in solrconfig.xml)
           50 results will be cached in the queryResultCache.
           If the user then asks for results beyond the first 50 documents,
           a new request is sent to the server & the server loads the next
           50 results into the cache.
       http://wiki.apache.org/solr/SolrCaching
       Yes, Solr looks into the cache to retrieve the fields to be returned.
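
       For reference, the window size mentioned above lives in the <query>
       section of solrconfig.xml. A minimal sketch, assuming the element
       names from the stock example solrconfig.xml; the sizes are placeholder
       values, not tuned recommendations:

        <query>
          <!-- caches ordered lists of document ids per query; sizes are
               illustrative only -->
          <queryResultCache class="solr.LRUCache"
                            size="512"
                            initialSize="512"
                            autowarmCount="0"/>
          <!-- number of results collected and cached per query window -->
          <queryResultWindowSize>50</queryResultWindowSize>
        </query>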

2) Yes, we can have different tokenizers or filters for index & search. We
need not create a different fieldType. We configure the index & query
analyzer sections of the same fieldType (datatype) differently.

   e.g.

        <fieldType name="textSpell" class="solr.TextField"
                   positionIncrementGap="100" stored="false" multiValued="true">
          <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <!--<filter class="solr.SynonymFilterFactory"
                 synonyms="Synonyms.txt" ignoreCase="true" expand="false"/>-->
            <filter class="solr.StopFilterFactory" ignoreCase="true"
                    words="stopwords.txt"/>
            <filter class="solr.StandardFilterFactory"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
          </analyzer>
          <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StandardFilterFactory"/>
            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
          </analyzer>
        </fieldType>
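
   A field in schema.xml then picks up both analyzer chains just by
   referencing the fieldType by name. A small sketch; the field name
   "spell" is only an illustrative placeholder, not something from your
   index:

        <!-- hypothetical field; the index analyzer runs when documents are
             added, the query analyzer when this field is searched -->
        <field name="spell" type="textSpell" indexed="true"/>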



Regards,
Abhay

On Tue, Sep 15, 2009 at 6:41 PM, Shashikant Kore <shashik...@gmail.com> wrote:

> Hi,
>
> I am familiar with Lucene and trying out Solr.
>
> I have an index which was created outside Solr. The index is fairly
> simple with two fields - document_id & content. The query result needs
> to return all the document IDs. The result need not be ordered by the
> score. For this, in Lucene, I use a custom hit collector with search to
> get results quickly. The index has a few million documents and queries
> returning hundreds of thousands of documents are not uncommon. So, the
> speed is crucial here.
>
> Since retrieving the document_id for each document is slow, I am using
> FieldCache to store the values of document_id. For all the results
> collected (in a bitset) with hit collector, document_id field is
> retrieved from the fieldcache.
>
> 1. How can I effectively disable scoring? I have read that
> ConstantScoreQuery is quite fast, but from the code, I see that it is
> used only for wildcard queries. How can I use ConstantScoreQuery for
> all the queries (boolean, term, phrase, ..)?  Also, is
> ConstantScoreQuery as fast as a custom hit collector?
>
> 2. How can Solr take advantage of the fieldcache while returning the
> field document_id? The documentation says, fieldcache can be
> explicitly auto warmed with Solr.  If fieldcache is available and
> initialized at the beginning, will solr look into the cache to
> retrieve the fields to be returned?
>
> 3. If there is an additional field for stemmed_content on which search
> needs to use different analyzer, I suppose, that could be specified by
> fieldType attribute in the schema.
>
> Thank you,
>
> --shashi
>
