On Mon, Mar 16, 2009 at 12:11 PM, karthik c <karthik...@gmail.com> wrote:

> Hi,
>
> We have a requirement to fetch a set of distinct values of a given field
> that match the given query. We also need to fetch the number of items
> associated with each field value. I figured out a way to do this for
> single-valued fields but am not able to get it to work for multi-valued
> fields.
>
> Long Story:
> Say you have an index of movies, I would like to get a unique set of
> directors matching a query (say "john") and also the number of movies
> directed by each of them. For this example, let's assume that "director" is a
> single-valued field.
>
> I came up with one approach to implement this: Search for the query string
> in the director field and then apply faceting on the same field (director).
> The search will limit the movie results to the ones directed by directors
> matching the query. Further, the faceting will provide a unique set of
> directors and also the count of movies associated with them. The query will
> look something like this:
>
> solr/Movie/select/?q=director:(john)&start=0&rows=0&facet=true&facet.field=raw_director
>
> This query works fine for single-valued fields. However, it does not work
> for multi-valued fields. Say we perform a similar search on the
> "actors" (multi-valued) field; the query will look like:
>
> solr/Movie/select/?q=actors:(john)&start=0&rows=0&facet=true&facet.field=raw_actors
>
> In this case, the search will again limit the movie results to the ones in
> which actors matching the query have acted. However, when faceting the
> results on "actors", the facet results will also contain other actors who
> have acted in the resulting movies. For example, say we are searching for
> actors:malkovich; this will return all movies in which John Malkovich has
> acted. When faceting is applied to these results, the facet results
> contain John Malkovich with the correct number of movies. But the facet
> results also contain other actors who have acted with John Malkovich. The
> facet results for the above query look something like this:
>
> <lst name="facet_fields">
>    <lst name="raw_actors">
>        <int name="John Malkovich">49</int>
>        <int name="Catherine Deneuve">4</int>
>        <int name="John Cusack">3</int>
>        <int name="Angelina Jolie">2</int>
>        <int name="Evangeline Lilly">2</int>
>        <int name="Glenne Headly">2</int>
>        <int name="Jeremy Irons">2</int>
>        <int name="Ray Winstone">2</int>
>    </lst>
> </lst>
>
> The other actors in the above results are obviously not what we expect to
> see, since they do not match the original query (i.e. malkovich).
>

Note that a document in your index represents a movie. You are actually
searching for movies and not actors. Looking from that perspective, the
results are correct.
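
To see why the co-stars show up, here is a minimal Python sketch (plain
Python, not Solr code) that mimics the behaviour described above: the query
selects movie documents, and faceting then counts every value of the
multi-valued field in those documents. The movie ids, the toy data, and the
helper names search() and facet() are made up for illustration.

```python
from collections import Counter

# Hypothetical toy index: one document per movie, "actors" is multi-valued.
movies = [
    {"id": "m1", "actors": ["John Malkovich", "John Cusack"]},
    {"id": "m2", "actors": ["John Malkovich", "John Cusack"]},
    {"id": "m3", "actors": ["John Malkovich", "Catherine Deneuve"]},
    {"id": "m4", "actors": ["Catherine Deneuve"]},
]


def search(docs, field, term):
    """Return the documents in which any value of `field` contains `term`."""
    term = term.lower()
    return [d for d in docs if any(term in v.lower() for v in d[field])]


def facet(docs, field):
    """Count every value of `field` across the given documents."""
    return Counter(v for d in docs for v in d[field])


hits = search(movies, "actors", "malkovich")  # matches movies m1, m2, m3
counts = facet(hits, "actors")

for name, count in counts.most_common():
    print(name, count)
```

The match happens at the movie level, so every actor in a matching movie is
counted, not only the actor whose name matched the query.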

You may need to re-think your schema. Make a document represent what you
want to search. Perhaps have different types of documents for 'actors',
'movies' etc.
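
As a rough sketch of that idea, assuming you can derive actor documents from
your movie data at index time, the snippet below builds one document per
actor with a precomputed movie count and then searches those documents by
name. It is plain Python, not Solr code; the field names (name, movie_ids,
movie_count) and the helpers build_actor_docs() and search_actors() are
hypothetical.

```python
from collections import defaultdict

# Hypothetical source data: one record per movie with a multi-valued "actors" list.
movies = [
    {"id": "m1", "actors": ["John Malkovich", "John Cusack"]},
    {"id": "m2", "actors": ["John Malkovich", "John Cusack"]},
    {"id": "m3", "actors": ["John Malkovich", "Catherine Deneuve"]},
    {"id": "m4", "actors": ["Catherine Deneuve"]},
]


def build_actor_docs(movie_docs):
    """Turn movie records into one document per actor."""
    movie_ids_by_actor = defaultdict(list)
    for movie in movie_docs:
        for actor in movie["actors"]:
            movie_ids_by_actor[actor].append(movie["id"])
    return [
        {"name": name, "movie_ids": ids, "movie_count": len(ids)}
        for name, ids in sorted(movie_ids_by_actor.items())
    ]


def search_actors(actor_docs, term):
    """Return {actor name: movie count} for actors whose name contains `term`."""
    term = term.lower()
    return {
        doc["name"]: doc["movie_count"]
        for doc in actor_docs
        if term in doc["name"].lower()
    }


actor_docs = build_actor_docs(movies)
print(search_actors(actor_docs, "john"))
```

Because each document is now an actor, only actors whose own name matches the
query come back, each with the number of movies they appear in.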

-- 
Regards,
Shalin Shekhar Mangar.
