On 3/21/08, Chris Hostetter <[EMAIL PROTECTED]> wrote:
>
>
> : I am investigating to implement an aggregate average function for a
> : document and require help for the same.
>
>
> First off: please don't repost the same email with a different subject
> (on either solr list) just because you don't receive a reply in the first
> 24 hours.  The Solr community is very healthy and active and willing to
> help, but sometimes people get busy and not every question gets addressed
> right away (if you check the archives though, typically every question
> gets answered eventually -- the few threads that have 0 replies are
> usually reposts -- but you have to be a little patient)


Sorry for the repost on the dev mailing list. I thought I might have to
post on the dev list for this kind of question.

> Second: this sounds exactly like the question you asked a few days ago...
>
> http://www.nabble.com/sorting-on-aggregate-averages-to16095991.html
>
> ...did you look into the way the FacetComponent and SimpleFacets work as i
> suggested?  what you are asking is really much more related to faceting
> than to the Function queries (function queries are designed to give you
> one value per document)
>
> Third: something i didn't really catch the first time you asked this
> question was how few documents you expected to deal with per request....
>
> : The problem is that I have documents containing manufacturer, product,
> : rating (m,p,r)
> : and i want to find the top manufacturers for product (p1,p2,...) can be
> : around 10 to 20 products
> : so i need to compute average rating for each manufacturer for these
> : products(p1,p2,..) and sort by this average.
>
>
> If you are only going to query for 10-20 (or even 100) documents, then
> you'll have at most 10-20 (or 100) manufacturers and ratings.  You could
> iterate over these and compute the average directly ... this would be a
> lot easier and simpler to implement than trying to leverage the faceting
> code (or the FunctionQuery code ... like i said it really wasn't designed
> for anything like this)


Given the document schema mpr:

mpr(
    MID: INT,
    PID: INT,
    Rating: FLOAT
)

with the assumptions:
Cardinality of M is on the order of 10^3
Cardinality of P is on the order of 10^2
    ... the total number of records is on the order of 10^5

the SQL equivalent would be something like:

SELECT MID, AVG(Rating) as Average FROM mpr
    WHERE PID in (p1[,p2,...])
    GROUP BY MID
    ORDER BY Average DESC LIMIT 0, 10;
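
For illustration, here is a minimal sketch of the same query computed by
iterating over the matching records directly. It assumes the records are
available as plain (MID, PID, Rating) tuples; the function name
top_manufacturers and the sample data are hypothetical and not part of
Solr.

```python
from collections import defaultdict


def top_manufacturers(records, product_ids, limit=10):
    """Return up to `limit` (mid, average_rating) pairs, best average first."""
    wanted = set(product_ids)
    totals = defaultdict(lambda: [0.0, 0])  # mid -> [sum of ratings, count]
    for mid, pid, rating in records:
        if pid in wanted:                   # WHERE PID IN (...)
            totals[mid][0] += rating        # GROUP BY MID
            totals[mid][1] += 1
    averages = [(mid, s / n) for mid, (s, n) in totals.items()]
    averages.sort(key=lambda pair: pair[1], reverse=True)  # ORDER BY Average DESC
    return averages[:limit]                 # LIMIT 0, limit


# Hypothetical sample data: (MID, PID, Rating)
records = [
    (1, 10, 4.0),
    (1, 11, 2.0),
    (2, 10, 5.0),
    (2, 12, 1.0),  # PID 12 is not requested below, so this row is skipped
    (3, 11, 3.5),
]
print(top_manufacturers(records, [10, 11]))
# -> [(2, 5.0), (3, 3.5), (1, 3.0)]
```

With around 10^5 records in total this is a single linear pass, so the
cost is dominated by fetching the matching records rather than by the
averaging itself.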

I would also need to boost the values based on PIDs (some products have
more weight than others, effectively computing a weighted average).
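
A rough sketch of what the weighted variant could look like, assuming the
boosts are given as a hypothetical PID -> weight mapping and the weighted
average per manufacturer is taken as sum(weight * rating) / sum(weight).
The function name and sample data are made up for illustration.

```python
from collections import defaultdict


def top_manufacturers_weighted(records, product_weights, limit=10):
    """Rank manufacturers by the weighted average of their ratings.

    `product_weights` maps each requested PID to its weight; records for
    PIDs that are not in the mapping are ignored.
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # mid -> [sum(w * rating), sum(w)]
    for mid, pid, rating in records:
        weight = product_weights.get(pid)
        if weight is None:
            continue
        sums[mid][0] += weight * rating
        sums[mid][1] += weight
    # Skip manufacturers whose total weight is zero to avoid dividing by zero.
    averages = [(mid, wr / w) for mid, (wr, w) in sums.items() if w > 0]
    averages.sort(key=lambda pair: pair[1], reverse=True)
    return averages[:limit]


# Hypothetical sample data: (MID, PID, Rating)
weighted_records = [
    (1, 10, 4.0),
    (1, 11, 2.0),
    (2, 10, 5.0),
    (3, 11, 3.5),
]
# PID 10 counts twice as much as PID 11.
ranking = top_manufacturers_weighted(weighted_records, {10: 2.0, 11: 1.0})
print([mid for mid, _ in ranking])
# -> [2, 3, 1]
```

Setting every weight to 1.0 reduces this to the plain average from the
SQL query.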


To handle these queries I am planning to develop a custom request handler
plugin, in the most generic form possible so that it is useful in general.

> -Hoss
>
>
