I use it with Rails, and there's an excellent Ruby gem called "sunspot"
that does all of the hard work.  Nevertheless, I have dug into the
configuration to better understand it, and the schema was an area of
interest.

The sunspot authors created a simple schema that can hold just about
anything.  The secret is dynamic field definitions.  Trimmed down to a
few representative entries, it looks like this:

<schema name="sunspot" version="1.0">
  <types>
    <fieldType name="string" class="solr.StrField" omitNorms="true"/>
    <fieldType name="tdouble" class="solr.TrieDoubleField"
               omitNorms="true"/>
    <fieldType name="text" class="solr.TextField" omitNorms="false">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.PorterStemFilterFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="false"/>
      </analyzer>
    </fieldType>
    <!-- etc. -->
  </types>
  <fields>
    <field name="id" stored="true" type="string" multiValued="false"
           indexed="true"/>
    <dynamicField name="*_s" stored="false" type="string"
                  multiValued="false" indexed="true"/>
    <dynamicField name="*_text" stored="false" type="text"
                  multiValued="false" indexed="true"/>
    <dynamicField name="*_texts" stored="true" type="text"
                  multiValued="true" indexed="true"/>
    <dynamicField name="*_textv" stored="false" termVectors="true"
                  type="text" multiValued="true" indexed="true"/>
    <dynamicField name="*_textsv" stored="true" termVectors="true"
                  type="text" multiValued="true" indexed="true"/>
    <dynamicField name="*_et" stored="false" termVectors="true"
                  type="tdouble" multiValued="false" indexed="true"/>
    <dynamicField name="*_etm" stored="false" termVectors="true"
                  type="tdouble" multiValued="true" indexed="true"/>
    <dynamicField name="*_ets" stored="true" termVectors="true"
                  type="tdouble" multiValued="false" indexed="true"/>
    <dynamicField name="*_etms" stored="true" termVectors="true"
                  type="tdouble" multiValued="true" indexed="true"/>
    <!-- etc. -->
    <field name="_version_" type="string" indexed="true" stored="true"
           multiValued="false" />
  </fields>

  <uniqueKey>id</uniqueKey>
  <solrQueryParser defaultOperator="AND"/>
</schema>

The point of this is that you can then store a field and just add a suffix
to its name to define the type.  So, if you have an item description that
you want to index, the field would be "item_description_text".  It would
automatically be treated as the "text" type defined in the <types> section,
with the stored and multiValued settings of the matching dynamicField.
They have added types for floats, integers, text, strings (which are
matched as-is, without tokenizing or stemming), dates/times, booleans, and
location coordinates.
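
To make that concrete, here is roughly what a document sent to Solr's XML
update handler could look like against the schema above.  This is only a
sketch: the id value and the "category_s" and "price_et" field names are
made up for illustration, and sunspot generates its own field names and
extra bookkeeping fields that I have left out.

```xml
<add>
  <doc>
    <!-- "id" is the one explicitly declared field (the uniqueKey) -->
    <field name="id">Item 42</field>
    <!-- matches *_text: tokenized, lowercased, and stemmed -->
    <field name="item_description_text">A sturdy oak bookshelf</field>
    <!-- matches *_s (hypothetical field): indexed as one exact string -->
    <field name="category_s">furniture</field>
    <!-- matches *_et (hypothetical field): indexed as a tdouble -->
    <field name="price_et">129.95</field>
  </doc>
</add>
```

Note that all three suffixed fields match patterns declared with
stored="false", so they can be searched but are not returned in results.
As far as I understand it, sunspot takes the ids that come back and loads
the actual records from the database.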

This is a really nice way to store the information as you have no need to
define a concrete schema up front.

Michael
-- 
Michael Darrin Chaney, Sr.
[email protected]
http://www.michaelchaney.com/

-- 
-- 
You received this message because you are subscribed to the Google Groups 
"NLUG" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/nlug-talk?hl=en
