On Wed, Jul 1, 2009 at 8:19 PM, Kevin MacClay <kevinmacc...@gmail.com> wrote:
> I'm interested to see the performance benefits of using "TrieRange" fields
> in Solr 1.4, but I am running into some problems giving them a try. When I
> retrieve facet counts against a TrieRange field, the values are garbled and
> the first three counts are erroneous. Am I doing something wrong?
>
> In schema.xml, I established an integer TrieField type:
>
> <fieldType name="tint" class="solr.TrieField" type="integer"
> omitNorms="true" positionIncrementGap="0" indexed="true" stored="false" />
>
> And I pointed our dynamic field to type "tint" instead of "sint".
>
> <dynamicField name="*_i" type="tint" indexed="true" stored="true"/>
>
> After re-indexing the existing data (deleting and re-inserting the
> documents) and running a query, these are the facet counts I receive:
>
> <lst name="column3_i">
> <int name="p#2;#0;#0;">22379</int>
> <int name="x#1;#0;">22379</int>
> <int name="h#4;#0;#0;#0;">22371</int>
> <int name="`#8;#0;#0;#0;#0;">21613</int>
> <int name="`#8;#0;#0;#0;#8;">138</int>
> <int name="`#8;#0;#0;#0;#16;">97</int>
> <int name="`#8;#0;#0;#0;#1;">89</int>
> <int name="`#8;#0;#0;#0;#4;">78</int>
> <int name="`#8;#0;#0;#0;(">67</int>
> <int name="`#8;#0;#0;#0;#2;">58</int>
> <int name="`#8;#0;#0;#0;#24;">49</int>
> <int name="`#8;#0;#0;#0;#3;">22</int>
> <int name="`#8;#0;#0;#0; ">19</int>
> <int name="`#8;#0;#0;#0;P">18</int>
> <int name="`#8;#0;#0;#0;#5;">16</int>
> <int name="`#8;#0;#0;#0; ">14</int>
> <int name="`#8;#0;#0;#0;#6;">13</int>
> <int name="`#8;#0;#0;#0;x">11</int>
> <int name="`#8;#0;#0;#0; ">9</int>
> <int name="h#4;#0;#0;#1;">8</int>
> <int name="`#8;#0;#0;#0;#7;">6</int>
> <int name="`#8;#0;#0;#0;#12;">6</int>
> <int name="`#8;#0;#0;#0;8">6</int>
> <int name="`#8;#0;#0;#1; ">6</int>
> <int name="`#8;#0;#0;#0;p">5</int>
> </lst>
>
> These are the counts I am expecting:
>
> <lst name="column3_i">
> <int name="0">21613</int>
> <int name="8">138</int>
> <int name="16">97</int>
> <int name="1">89</int>
> <int name="4">78</int>
> <int name="40">67</int>
> <int name="2">58</int>
> <int name="24">49</int>
> <int name="3">22</int>
> <int name="32">19</int>
> <int name="80">18</int>
> <int name="5">16</int>
> <int name="13">14</int>
> <int name="6">13</int>
> <int name="120">11</int>
> <int name="10">9</int>
> <int name="7">6</int>
> <int name="12">6</int>
> <int name="56">6</int>
> <int name="160">6</int>
> <int name="112">5</int>
> <int name="240">4</int>
> <int name="14">3</int>
> <int name="280">3</int>
> <int name="480">3</int>
> </lst>
>
> Notice that the TrieField counts are correct after the first three. I also
> tried setting stored="true" in <fieldType> but with no effect.
>

Unfortunately, Trie fields cannot be used for faceting. Faceting works on
indexed tokens and trie stores multiple tokens (in its own encoding) into the
index. This is the reason behind the garbled characters.

I think faceting can be done with some changes in the current code. For now,
I suggest you use normal integer/long fields for faceting and use trie fields
for range searches.

--
Regards,
Shalin Shekhar Mangar.
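
To illustrate why the extra terms show up, here is a minimal Python sketch of a prefix-coded integer encoding that reproduces the terms in the facet output quoted above. It is an approximation for illustration, not code taken from Solr or Lucene: the names `SHIFT_START_INT`, `int_to_prefix_coded`, `trie_terms` and `show` are hypothetical, and the precision step of 8 and the 0x60 base character are assumptions inferred from the leading characters (`` ` ``, `h`, `p`, `x`) in that output.

```python
# Assumed base for the first character of an int term: '`' (0x60) at shift 0.
SHIFT_START_INT = 0x60


def int_to_prefix_coded(value, shift):
    """Encode a signed 32-bit int with its lowest `shift` bits dropped."""
    # Flip the sign bit so that encoded terms sort in numeric order.
    sortable = (value ^ 0x80000000) & 0xFFFFFFFF
    sortable >>= shift
    # The remaining bits are written as 7-bit characters, most significant first.
    n_chars = (31 - shift) // 7 + 1
    chars = [0] * n_chars
    for i in range(n_chars - 1, -1, -1):
        chars[i] = sortable & 0x7F
        sortable >>= 7
    return chr(SHIFT_START_INT + shift) + "".join(chr(c) for c in chars)


def trie_terms(value, precision_step=8):
    """All terms indexed for one value: full precision plus coarser prefixes."""
    return [int_to_prefix_coded(value, s) for s in range(0, 32, precision_step)]


def show(term):
    """Render control characters roughly the way the facet output does (#N;)."""
    return "".join(c if ord(c) >= 32 else "#%d;" % ord(c) for c in term)


if __name__ == "__main__":
    for term in trie_terms(0):
        print(show(term))
    # `#8;#0;#0;#0;#0;
    # h#4;#0;#0;#0;
    # p#2;#0;#0;
    # x#1;#0;
```

Under these assumptions every document contributes one full-precision term (the ones starting with `` ` ``) plus three coarser terms. Nearly all values in the sample fall into the same coarse buckets, which would explain the three large counts at the top of the list, while the `` ` `` terms carry the expected per-value counts (for example, `show(int_to_prefix_coded(40, 0))` gives `` `#8;#0;#0;#0;( ``, the term with count 67).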