On Fri, Jun 12, 2015 at 10:30 AM, kingofhypocrites <kingofhypocri...@gmail.com> wrote: > I am migrating a database from SQL Server to Cassandra. Currently I have a > setup as follows: > > - Log data in Cassandra > - Summarize data in Spark and put into Cassandra summary tables > - Query data in Solr > > Everything fits beautifully until I need to do stats on groups. I am hoping > to get this to work with Solr so I can stick to one database, but I am not > sure it's possible. > > If I had it in SQL Server, I could do it like so: > SELECT > site_id, > keyword, > SUM(visits) as visits, > CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate, > SUM(pageviews) as pageviews, > CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as > avg_pages_per_visit > FROM > report_all_keywords_daily > WHERE > site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608' > GROUP BY > site_id, keyword > ORDER BY visits DESC
This is the closest we can get with the JSON Facet API today: json.facet={ sites: { type : terms, field : site_id, sort : "visits desc", facet : { visits : "sum(visits)", bounces : "sum(bounces)", pageviews : "sum(pageviews)" } } } That doesn't take into account "keyword" when sorting the buckets. You could nest a ketword facet inside a site facet and thus calculate the stats for the top N keywords per site: json.facet={ sites: { type : terms, field : site_id, facet : { keywords: { type : terms, field : keyword, sort : "visits desc", facet : { visits : "sum(visits)", bounces : "sum(bounces)", pageviews : "sum(pageviews)" } } } } More info here: http://yonik.com/json-facet-api/ -Yonik