Re: Can you parse the contents of a field to populate other fields?

George Everitt Wed, 07 Nov 2007 21:42:56 -0800

I'm not sure I fully understand your ultimate goal or Yonik's response. However, in the past I've been able to represent hierarchical data as a simple enumeration of delimited paths:


<field name="taxonomy">root</field>
<field name="taxonomy">root/region</field>
<field name="taxonomy">root/region/north america</field>
<field name="taxonomy">root/region/south america</field>
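
Generating those values before indexing is mechanical. Here is a minimal Python sketch, assuming "/" as the delimiter; the helper name `expand_path` is hypothetical and not part of Solr:

```python
def expand_path(path, sep="/"):
    """Return every ancestor prefix of a delimited path, shortest first."""
    # Drop empty segments so a leading or trailing separator is harmless.
    parts = [p for p in path.split(sep) if p]
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


# Each returned string would become one value of the multi-valued field.
print(expand_path("root/region/north america"))
# ['root', 'root/region', 'root/region/north america']
```

A document that lives in several nodes would simply take the union of the expansions of each of its paths.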

Then, at response time, you can walk the result facet and build a hierarchy with counts that can be put into a tree view. The tree can be any arbitrary depth, and documents can live in any combination of nodes on the tree.
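
Walking the facet into a tree could look roughly like this. It is only a sketch: it assumes the facet counts have already been read from the response into a plain dict of path to count, the counts shown are made up for illustration, and `build_tree` and `render` are hypothetical helper names:

```python
def build_tree(facet_counts, sep="/"):
    """Turn {path: count} into nested {name: {"count": n, "children": {...}}}."""
    tree = {}
    for path, count in facet_counts.items():
        level = tree
        node = None
        for part in path.split(sep):
            # Create intermediate nodes on demand so input order does not matter.
            node = level.setdefault(part, {"count": 0, "children": {}})
            level = node["children"]
        node["count"] = count  # the last node visited is the one this path names
    return tree


def render(tree, depth=0):
    """Return indented 'name (count)' lines, children sorted by name."""
    lines = []
    for name in sorted(tree):
        node = tree[name]
        lines.append("  " * depth + f"{name} ({node['count']})")
        lines.extend(render(node["children"], depth + 1))
    return lines


# Illustrative counts only, not real results.
counts = {
    "root": 12,
    "root/region": 9,
    "root/region/north america": 5,
    "root/region/south america": 4,
}
print("\n".join(render(build_tree(counts))))
```

The nested dict maps directly onto most tree-view widgets; `render` is only there to show the shape.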

In addition, you can represent any arbitrary name-value pair (attribute/tuple) as a two-level tree. That way, you can put any combination of attributes in the facet and parse them out at results-list time. For example, you might be indexing computer hardware. Memory, Bus Speed and Resolution may be valid for some objects but not for others. Just put them in a facet and specify a separator:


<field name="attribute">memory:1GB</field>
<field name="attribute">busspeed:133Mhz</field>
<field name="attribute">voltage:110/220</field>
<field name="attribute">manufacturer:Shiangtsu</field>

When you do a facet query, you can easily display the categories appropriate to the object, and do facet selections like "show me all green things" and "show me all size 4 things".
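
Parsing the attribute facet back out at results-list time might be sketched as follows. This assumes ":" as the separator and that the facet counts are already available as a dict of term to count; the counts are invented for illustration and `group_attributes` is a hypothetical helper:

```python
from collections import defaultdict


def group_attributes(facet_counts, sep=":"):
    """Group {"name:value": count} into {name: {value: count}}."""
    grouped = defaultdict(dict)
    for term, count in facet_counts.items():
        # Split on the first separator only, so values may contain it too.
        name, _, value = term.partition(sep)
        grouped[name][value] = count
    return dict(grouped)


# Illustrative counts only, not real results.
counts = {
    "memory:1GB": 7,
    "memory:2GB": 3,
    "busspeed:133Mhz": 4,
    "voltage:110/220": 10,
}
for name, values in group_attributes(counts).items():
    print(name, values)
```

Only the attribute names that actually occur in the result set show up, which is what lets the display adapt to the kind of object being listed.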



Even if that's not your goal, this might help someone else.


George Everitt







On Nov 7, 2007, at 3:15 PM, Kristen Roth wrote:

So, I think I have things set up correctly in my schema, but it doesn't appear that any logic is being applied to my Category_# fields - they are being populated with the full string copied from the Category field (facet1::facet2::facet3...facetn) instead of just facet1, facet2, etc.

I have several different field types, each with a different regex to
match a specific part of the input string.  In this example, I'm
matching facet1 in input string facet1::facet2::facet3...facetn

   <fieldtype name="cat1str" class="solr.TextField">
        <analyzer type="index">
            <tokenizer class="solr.PatternTokenizerFactory"
                       pattern="^([^:]+)" group="1"/>
        </analyzer>
   </fieldtype>
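
As a quick sanity check of the pattern itself, the regular expression can be tried outside Solr. The sketch below uses Python's `re` module as a stand-in for the Java regex engine the tokenizer uses (the two should agree for a pattern this simple); it says nothing about how Solr's analysis chain or copyField treats the value, and `first_facet` is a hypothetical helper:

```python
import re

# Same pattern as in the fieldtype above: one or more non-colon
# characters anchored at the start of the string, captured as group 1.
FIRST_SEGMENT = re.compile(r"^([^:]+)")


def first_facet(value):
    """Return the text captured by group 1, or None if nothing matches."""
    match = FIRST_SEGMENT.match(value)
    return match.group(1) if match else None


print(first_facet("facet1::facet2::facet3"))  # facet1
```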

I have copyfields set up for each Category_# field. Anything obviously wrong?

Thanks!
Kristen

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Wednesday, November 07, 2007 9:38 AM
To: solr-user@lucene.apache.org
Subject: Re: Can you parse the contents of a field to populate other
fields?

On 11/6/07, Kristen Roth <[EMAIL PROTECTED]> wrote:

Yonik - thanks so much for your help!  Just to clarify; where should the regex go for each field?


Each field should have a different FieldType (referenced by the "type"
XML attribute).  Each fieldType can have its own analyzer.  You can
use a different PatternTokenizer (which specifies a regex) for each
analyzer.

-Yonik
