Hi guys,

 

Question:

 

What is the best way to create a Solr schema that supports a
multiValued field where each value is a two-item pair of event category
and date? I want faceted search, facet counts and date-range queries on
both the category and the date.

 

Details:

 

This is a person database where a Person can have details about them
(like an address) and a Person has many "Events".  Events have a category
(type of event) and a date for when that event occurred.  At the bottom
you will see a simple diagram showing the relationship.  Briefly, a
Person has many Events, and each Event has a single category and a
single Person.

 

What I would like to be able to do is:

 

Have a facet that shows all of the event categories, with a 'sub-facet'
that shows category + date.  For example, if a category was "Attended
Conference" and the date was 2008-09-08, I'd be able to show a count for
"Attended Conference", then have a tree-type control that breaks the
count down by year, for example:

 

+ Attended Conference (1038)
|
+--- 2010 (100)
+--- 2009 (134)
+--- 2008 (234)
|
+ Another Event Category (23432)
|
+--- 2010 (234)
+--- 2009 (245)

 

Etc.
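
To pin down the counts the tree above is asking for, here is a minimal Python sketch that computes them from raw rows. It is not Solr code. The sample rows are invented, and it assumes each count is the number of distinct persons (which is what a facet count would be if one Solr document represents one person); counting events instead would be an equally plausible reading.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (person_id, event_category, event_date)
events = [
    (1, "Attended Conference", date(2008, 9, 8)),
    (1, "Attended Conference", date(2009, 3, 1)),
    (2, "Attended Conference", date(2008, 5, 20)),
    (2, "Another Event Category", date(2010, 1, 15)),
    (3, "Another Event Category", date(2010, 7, 4)),
]


def facet_tree(rows):
    """Return {category: (person_count, {year: person_count})}."""
    by_category = defaultdict(set)
    by_category_year = defaultdict(lambda: defaultdict(set))
    for person_id, category, when in rows:
        # Sets ensure a person is counted once per category and per year.
        by_category[category].add(person_id)
        by_category_year[category][when.year].add(person_id)
    return {
        category: (
            len(persons),
            {year: len(p) for year, p in by_category_year[category].items()},
        )
        for category, persons in by_category.items()
    }


tree = facet_tree(events)
for category, (total, years) in tree.items():
    print(f"+ {category} ({total})")
    for year in sorted(years, reverse=True):
        print(f"+--- {year} ({years[year]})")
```

Note that a person with events in two different years is counted under both years but only once at the category level, so the year counts need not add up to the category total.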

 

For scale, I expect to have fewer than 100 event categories and fewer
than a million person_event records on fewer than 250,000 persons.  I
don't care very much about disk space, so if the index is 1 GB or 100 GB,
that's okay as long as the solution works (and it's fast! :-))

 

 

Solutions I looked at:

 

*       I looked at poly fields, but they seem to be fixed length and
appear to require all values to be the same type.  The typical use case
was latitude & longitude.  I don't think this will work because there
are a variable number of events attached to a person.
*       I looked at multiValued, but it didn't seem to permit two fields
to have a relationship, i.e. Event Category & Event Date.  It seemed to
me that they would need to be broken out into separate fields.  That's
not necessarily a bad thing, but it didn't seem ideal.
*       I thought about concatenating category & date to create a fake
field strictly for faceting purposes, but I believe that will break date
ranges.  E.g. EventCategoryId + "|" + Year = 1|2009 as a facet would
allow me to show counts for that event type by year.  Seems a bit
unwieldy to me...
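
To make the second and third options concrete, here is a rough Python sketch of how one person's events might be flattened into a single document that carries both the separate multiValued fields and the concatenated facet key. All field names (event_category_id, event_date, event_category_year) are hypothetical, and the code only builds a plain dict; it does not talk to Solr.

```python
from datetime import date


def person_to_doc(person_id, events):
    """Flatten one person's (category_id, date) events into a flat document.

    All field names here are hypothetical, chosen for illustration.
    """
    return {
        "id": str(person_id),
        # Separate multiValued fields: usable for plain category facets and
        # date-range queries, but the category/date pairing is lost.
        "event_category_id": sorted({str(c) for c, _ in events}),
        "event_date": sorted({d.isoformat() + "T00:00:00Z" for _, d in events}),
        # Concatenated "category|year" key that keeps the pairing,
        # intended for faceting only.
        "event_category_year": sorted({f"{c}|{d.year}" for c, d in events}),
    }


doc = person_to_doc(
    42,
    [(1, date(2009, 3, 1)), (1, date(2008, 9, 8)), (7, date(2009, 6, 30))],
)
print(doc)
```

With a field like event_category_year, faceting on it with Solr's facet.prefix parameter (for example "1|") should list the years for a single category, while range queries would still run against event_date. The catch is the one noted above: a category filter combined with a date range on the separate fields matches a person who has any event in that category and any event in that range, not necessarily the same event.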

 

What's the group's advice for handling this situation in the best way?

 

Thanks in advance, and as always, sorry if this question has already
been asked and answered a few times.  I googled for a few hours before
writing this, but things change so fast with Solr that any article older
than a year was suspect to me, and there are so many patches that
provide additional functionality...

 

Tim

 

 

 

 

Schema:

 
