Right, but the response in the doc when you make a request is almost,
but not quite totally, unrelated to how facet values are tallied. It's
all about what tokens are actually in your index, which you can see in
the "schema browser"...
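
To illustrate, here is a minimal Python sketch (not Solr code) of why the stored value you see in a doc can look fine while the facet counts differ. The sample values and the two split functions are made up for illustration; they stand in for two different index-time analysis chains and are not taken from your schema.

```python
from collections import Counter
import re

# Hypothetical stored values for a multi-valued field (illustration only).
STORED_VALUES = ["4 5 1", "4,1", "5,1", "1"]


def split_on_non_digits(value):
    """Stand-in for an analysis chain that breaks on anything that is not a digit."""
    return re.findall(r"\d+", value)


def split_on_whitespace(value):
    """Stand-in for an analysis chain that breaks on whitespace only."""
    return value.split()


def facet_counts(stored_values, analyze):
    """Tally documents per indexed token, which is what field faceting counts."""
    counts = Counter()
    for value in stored_values:
        # A document counts once for each distinct token it produced.
        counts.update(set(analyze(value)))
    return counts


print(facet_counts(STORED_VALUES, split_on_non_digits))
print(facet_counts(STORED_VALUES, split_on_whitespace))
```

The stored values are identical in both runs, but only the whitespace-only chain produces "4,1" and "5,1" as facet values, which is why the indexed terms are the thing to compare.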

Let me know what the results are.
Erick

On Wed, Apr 9, 2014 at 11:40 AM, Jean-Sebastien Vachon
<jean-sebastien.vac...@wantedanalytics.com> wrote:
> Thanks Erick, I will check this as soon as I can.
>
> In the meantime, here is a sample query and how the field looks in our index.
> It looks good to me (at least, that's what shows up in our other, older
> indexes as well).
>
> http://10.0.5.227:8201/solr/Current/select?q=*:*&fl=ad_job_type_id&fq=ad_job_type_id:[*%20TO%20*]&facet=on&facet.field=ad_job_type_id&rows=1
>
> <result name="response" numFound="12204004" start="0" maxScore="1.0">
>  <doc>
>    <arr name="ad_job_type_id">
>        <str>4 5 1</str>
>     </arr>
>   </doc>
> </result>
>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: April-09-14 2:21 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Were changes made to facetting on multivalued fields recently?
>>
>> That is...um...very strange. It looks to me like you have somehow indexed a
>> bunch of new values. I'm guessing here, but it's suspicious that you have a
>> value "4,1". Should that have been indexed as "4" and "1" as separate tokens?
>>
>> So here's what I'd do:
>> 1> Take a look at the solr/admin/schema browser output for that field
>> in the two versions. I suspect you'll see 7 values in 4.6 and a bazillion
>> in 4.7.1.
>> 2> If <1> is true, take a look at the admin/analysis page for the
>> field in question and see some sample index-time inputs, especially for the
>> theoretical "4,1" entries. I suspect that 4.6 will break these up into two
>> tokens and 4.7.1 won't.
>> 3> If <2> is true, take a very careful look at the index-time analysis
>> chains in the two versions. I bet they're different and that accounts for
>> your observations.
>> 4> Try 1-3, discover I'm totally off base, and paste the schema.xml
>> definitions for the field in question in both 4.6 and 4.7.1 to this thread
>> and we can take a look.
>>
>> This should not have changed between 4.6 and 4.7.1, at least not
>> intentionally.
>>
>> Best,
>> Erick
>>
>> On Wed, Apr 9, 2014 at 11:04 AM, Jean-Sebastien Vachon <jean-
>> sebastien.vac...@wantedanalytics.com> wrote:
>> > Hi All,
>> >
>> > We just discovered that the response from Solr (4.7.1) when faceting on
>> > one of our multi-valued fields has changed considerably.
>> >
>> > In the past (4.6.1 and prior versions as well) we used to have
>> > something like this: (there are 7 possible values for this attribute)
>> >
>> > <lst name="facet_counts">
>> > <lst name="facet_queries"/>
>> > <lst name="facet_fields">
>> > <lst name="ad_job_type_id">
>> > <int name="1">11454652</int>
>> > <int name="4">11387070</int>
>> > <int name="5">2095603</int>
>> > <int name="3">809992</int>
>> > <int name="2">567244</int>
>> > <int name="6">139389</int>
>> > <int name="7">4120</int>
>> > </lst>
>> > </lst>
>> > <lst name="facet_dates"/>
>> > </lst>
>> >
>> > And now with 4.7.1 we are getting this:
>> > <lst name="facet_counts">
>> > <lst name="facet_queries"/>
>> > <lst name="facet_fields">
>> > <lst name="ad_job_type_id">
>> > <int name="1">10954552</int>
>> > <int name="4">10884418</int>
>> > <int name="5">2000530</int>
>> > <int name="3">784491</int>
>> > <int name="2">535935</int>
>> > <int name="4,1">134826</int>
>> > <int name="5,1">11770</int>
>> > ... there are too many values to list them all ...
>> >
>> > I checked the Change log for 4.7.1 and only saw an optimization made
>> > for https://issues.apache.org/jira/browse/SOLR-5512
>> >
>> > Is there any new configuration directive that we should be aware of?
>> >
>> > Thanks
>>
