Hi Yonik,

As you asked, here are the code snippet and the actual Solr query. Please
have a look. I have included only 104 metrics here, but in the same way we
can go up to 3600 rollups.

I have also looked into the JSON Facet API. If I have to use facets, I would
have to define 3600 facets in a single query, and I guess that would also be
slow. Also, is there a maximum limit on the number of facets we can define
in a single query?
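
For reference, the JSON Facet API accepts plain aggregation functions such
as sum(field) as top-level entries of json.facet, so the sums could
possibly be sent as one flat object with one entry per metric rather than
as separate bucketing facets. Below is a minimal sketch of building that
parameter. The class and method names (JsonFacetSketch, buildSumFacets) are
hypothetical, the metric names are assumed to be plain field names that need
no JSON escaping, and whether this is faster than the stats component has
not been measured here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JsonFacetSketch {

    // Builds a json.facet value with one sum() aggregation per metric.
    // Assumption: metric names are simple field names (no quotes or
    // backslashes), so no JSON escaping is performed.
    public static String buildSumFacets(List<String> metrics) {
        return metrics.stream()
                .map(m -> "\"" + m + "\":\"sum(" + m + ")\"")
                .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        String facets = buildSumFacets(Arrays.asList("metric_1", "metric_2"));
        // Prints the string that would be sent as the json.facet parameter.
        System.out.println(facets);
    }
}
```

With SolrJ, the resulting string would be attached with
query.set("json.facet", facets) in place of the setGetFieldStatistics calls.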

Code snippet:

private SolrQuery buildQuery(Integer variable1, List<String> metrics) {
    SolrQuery query = new SolrQuery();
    query.set("q", "variable1:" + variable1);
    query.setRows(0);
    metrics.forEach(
            metric -> query.setGetFieldStatistics("{!sum=true }" + metric)
    );
    return query;
}


The generated query:

{! q=variable1:290 rows=0 stats=true stats.field='{!sum=true
}metric_1' stats.field='{!sum=true }metric_2' stats.field='{!sum=true
}metric_3' stats.field='{!sum=true }metric_4' stats.field='{!sum=true
}metric_5' stats.field='{!sum=true }metric_6' stats.field='{!sum=true
}metric_7' stats.field='{!sum=true }metric_8' stats.field='{!sum=true
}metric_9' stats.field='{!sum=true }metric_10' stats.field='{!sum=true
}metric_11' stats.field='{!sum=true }metric_12'
stats.field='{!sum=true }metric_13' stats.field='{!sum=true
}metric_14' stats.field='{!sum=true }metric_15'
stats.field='{!sum=true }metric_16' stats.field='{!sum=true
}metric_17' stats.field='{!sum=true }metric_18'
stats.field='{!sum=true }metric_19' stats.field='{!sum=true
}metric_20' stats.field='{!sum=true }metric_21'
stats.field='{!sum=true }metric_22' stats.field='{!sum=true
}metric_23' stats.field='{!sum=true }metric_24'
stats.field='{!sum=true }metric_25' stats.field='{!sum=true
}metric_26' stats.field='{!sum=true }metric_27'
stats.field='{!sum=true }metric_28' stats.field='{!sum=true
}metric_29' stats.field='{!sum=true }metric_30'
stats.field='{!sum=true }metric_31' stats.field='{!sum=true
}metric_32' stats.field='{!sum=true }metric_33'
stats.field='{!sum=true }metric_34' stats.field='{!sum=true
}metric_35' stats.field='{!sum=true }metric_36'
stats.field='{!sum=true }metric_37' stats.field='{!sum=true
}metric_38' stats.field='{!sum=true }metric_39'
stats.field='{!sum=true }metric_40' stats.field='{!sum=true
}metric_41' stats.field='{!sum=true }metric_42'
stats.field='{!sum=true }metric_43' stats.field='{!sum=true
}metric_44' stats.field='{!sum=true }metric_45'
stats.field='{!sum=true }metric_46' stats.field='{!sum=true
}metric_47' stats.field='{!sum=true }metric_48'
stats.field='{!sum=true }metric_49' stats.field='{!sum=true
}metric_50' stats.field='{!sum=true }metric_51'
stats.field='{!sum=true }metric_52' stats.field='{!sum=true
}metric_53' stats.field='{!sum=true }metric_54'
stats.field='{!sum=true }metric_55' stats.field='{!sum=true
}metric_56' stats.field='{!sum=true }metric_57'
stats.field='{!sum=true }metric_58' stats.field='{!sum=true
}metric_59' stats.field='{!sum=true }metric_60'
stats.field='{!sum=true }metric_61' stats.field='{!sum=true
}metric_62' stats.field='{!sum=true }metric_63'
stats.field='{!sum=true }metric_64' stats.field='{!sum=true
}metric_65' stats.field='{!sum=true }metric_66'
stats.field='{!sum=true }metric_67' stats.field='{!sum=true
}metric_68' stats.field='{!sum=true }metric_69'
stats.field='{!sum=true }metric_70' stats.field='{!sum=true
}metric_71' stats.field='{!sum=true }metric_72'
stats.field='{!sum=true }metric_73' stats.field='{!sum=true
}metric_74' stats.field='{!sum=true }metric_75'
stats.field='{!sum=true }metric_76' stats.field='{!sum=true
}metric_77' stats.field='{!sum=true }metric_78'
stats.field='{!sum=true }metric_79' stats.field='{!sum=true
}metric_80' stats.field='{!sum=true }metric_81'
stats.field='{!sum=true }metric_82' stats.field='{!sum=true
}metric_83' stats.field='{!sum=true }metric_84'
stats.field='{!sum=true }metric_85' stats.field='{!sum=true
}metric_86' stats.field='{!sum=true }metric_87'
stats.field='{!sum=true }metric_88' stats.field='{!sum=true
}metric_89' stats.field='{!sum=true }metric_90'
stats.field='{!sum=true }metric_91' stats.field='{!sum=true
}metric_92' stats.field='{!sum=true }metric_93'
stats.field='{!sum=true }metric_94' stats.field='{!sum=true
}metric_95' stats.field='{!sum=true }metric_96'
stats.field='{!sum=true }metric_97' stats.field='{!sum=true
}metric_98' stats.field='{!sum=true }metric_99'
stats.field='{!sum=true }metric_100' stats.field='{!sum=true
}metric_101' stats.field='{!sum=true }metric_102'
stats.field='{!sum=true }metric_103' stats.field='{!sum=true
}metric_104'}




On Tue, Dec 12, 2017 at 10:21 AM, RAUNAK AGRAWAL <agrawal.rau...@gmail.com>
wrote:

> Hi Yonik,
>
> I will try the JSON Facet API and update here, but my hunch is that the
> querying mechanism is not the problem. Rather, the problem lies with the
> Solr aggregations.
>
> Thanks
>
> On Tue, Dec 12, 2017 at 6:31 AM, Yonik Seeley <ysee...@gmail.com> wrote:
>
>> I think the SolrJ below uses the old stats component.
>> Hopefully the JSON Facet API would be faster for this, but it's not
>> completely clear what the main query here looks like, and if it's the
>> source of any bottleneck rather than the aggregations.
>> What does the generated query string actually look like? (It may be
>> easiest just to pull this from the logs.)
>>
>> -Yonik
>>
>>
>> On Mon, Dec 11, 2017 at 7:32 PM, RAUNAK AGRAWAL
>> <agrawal.rau...@gmail.com> wrote:
>> > Hi,
>> >
>> > We have a use case where there are 4-5 dimensions and around 3500 metrics
>> > in a single document. We have indexed only 2 dimensions and stored all the
>> > metrics as doc_values so that we can run the aggregation queries.
>> >
>> > We have 6 million such documents and we are using SolrCloud (6.6) on a
>> > 6-node cluster with 8 vCores and 24 GB RAM each.
>> >
>> > On the same set of hardware, in Elasticsearch we were getting the response
>> > in about 10 ms, whereas in Solr we are getting the response in around
>> > 300-400 ms.
>> >
>> > This is how I am querying the data.
>> >
>> > private SolrQuery buildQuery(Integer variable1, List<Integer> groups,
>> >         List<String> metrics) {
>> >     SolrQuery query = new SolrQuery();
>> >     String groupQuery = " (" + groups.stream().map(g -> "group:" + g)
>> >             .collect(Collectors.joining(" OR ")) + ")";
>> >     String finalQuery = "variable1:" + variable1 + " AND " + groupQuery;
>> >     query.set("q", finalQuery);
>> >     query.setRows(0);
>> >     metrics.forEach(
>> >             metric -> query.setGetFieldStatistics("{!sum=true }" + metric)
>> >     );
>> >     return query;
>> > }
>> >
>> > Any help will be appreciated regarding this.
>> >
>> >
>> > Thanks,
>> >
>> > Raunak
>>
>
>
