If I did this, how would I run a facet search without getting duplicates?

Thanks,
Moazzam

On Mon, Jun 7, 2010 at 10:07 AM, Israel Ekpo <israele...@gmail.com> wrote:
> I think you need a 1:1 mapping between the consultant and the company, else
> how are you going to run your queries for let's say consultants that worked
> for Google or AOL between March 1999 and August 2004?
>
> If the mapping is 1:1, your life would be easier and you would not need to
> do extra parsing of the results you retrieved.
>
> Unfortunately, it looks like you are going to have a lot of records.
>
> With an RDBMS, it is easier to do joins but with Lucene and Solr you have to
> denormalize all the relationships.
>
> Hence, in this particular scenario, if you have 5 consultants who each worked
> for 4 distinct companies, you will have to send 20 documents to Solr.
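
As a rough illustration of the denormalization described above, the following Python sketch flattens one consultant record with several employers into one document per consultant/company pair. The field names follow the schema quoted further down in this thread; the input structure, the `denormalize` helper, the company ids and the dates are assumptions made up for the example, not part of anyone's actual setup.

```python
def denormalize(consultants):
    """Yield one flat document per (consultant, company) pair."""
    for person in consultants:
        for job in person["jobs"]:
            yield {
                # Unique key: consultant id and company id joined by "_"
                "consultant_id_company_id": f"{person['id']}_{job['company_id']}",
                "consultant_firstname": person["first"],
                "consultant_lastname": person["last"],
                "company": job["company"],
                "start_date": job["start"],
                "end_date": job["end"],
            }


# Hypothetical input: one consultant with two employers (dates are invented).
consultants = [
    {
        "id": 123,
        "first": "tony",
        "last": "marjo",
        "jobs": [
            {"company_id": 4, "company": "Google",
             "start": "2001-01-01T00:00:00Z", "end": "2003-01-01T00:00:00Z"},
            {"company_id": 1, "company": "AOL",
             "start": "2003-01-01T00:00:00Z", "end": "2005-01-01T00:00:00Z"},
        ],
    },
]

docs = list(denormalize(consultants))
for doc in docs:
    print(doc["consultant_id_company_id"], doc["company"])
```

The document count is the total number of consultant/company pairs, so 5 consultants with 4 employers each would produce 20 documents.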
>
> On Mon, Jun 7, 2010 at 10:15 AM, Moazzam Khan <moazz...@gmail.com> wrote:
>
>> Thanks for the replies guys.
>>
>>
>> I am currently storing consultants like this ..
>>
>> <doc>
>>  <id>123</id>
>>  <FirstName>tony</FirstName>
>>  <LastName>marjo</LastName>
>>  <Company>Google</Company>
>>  <Company>AOL</Company>
>> </doc>
>>
>> I have a few multivalued fields, so if I do it the way Israel
>> suggested, I will have tons of records. Do you think it would be
>> better if I did this instead?
>>
>>
>> <doc>
>>  <id>123</id>
>>  <FirstName>tony</FirstName>
>>  <LastName>marjo</LastName>
>>  <Company>Google_StartDate_EndDate</Company>
>>  <Company>AOL_StartDate_EndDate</Company>
>> </doc>
>>
>> Or is what you guys said better?
>>
>> Thanks for all the help.
>>
>> Moazzam
>>
>>
>> On Mon, Jun 7, 2010 at 1:10 AM, Lance Norskog <goks...@gmail.com> wrote:
>> > And for 'present', you would pick some time far in the future:
>> > 2100-01-01T00:00:00Z
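
A minimal sketch of that sentinel idea, assuming end dates arrive either as the literal string "present" or as `YYYY-MM-DD` (the format used in the original question). The helper name `to_solr_end_date` and the constant `FAR_FUTURE` are hypothetical names chosen for this example.

```python
from datetime import datetime

# Sentinel suggested above for jobs that have not ended yet.
FAR_FUTURE = "2100-01-01T00:00:00Z"


def to_solr_end_date(value):
    """Map 'present' to the far-future sentinel, else format as a UTC timestamp."""
    if value.strip().lower() == "present":
        return FAR_FUTURE
    # strptime raises ValueError on malformed input instead of indexing garbage.
    parsed = datetime.strptime(value.strip(), "%Y-%m-%d")
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_solr_end_date("present"))
print(to_solr_end_date("2000-01-01"))
```

With the sentinel in place, an open-ended job still satisfies range queries on the end date, at the cost of having to translate the sentinel back to "present" when displaying results.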
>> >
>> > On 6/5/10, Israel Ekpo <israele...@gmail.com> wrote:
>> >> You need to make each document added to the index a 1 to 1 mapping for
>> >> each company and consultant combo
>> >>
>> >> <schema>
>> >>
>> >> <fields>
>> >>     <!-- Concatenation of company and consultant id -->
>> >>     <field name="consultant_id_company_id" type="string" indexed="true"
>> >> stored="true" required="true"/>
>> >>     <field name="consultant_firstname" type="string" indexed="true"
>> >> stored="true" multiValued="false"/>
>> >>     <field name="consultant_lastname" type="string" indexed="true"
>> >> stored="true" multiValued="false"/>
>> >>
>> >>     <!-- The name of the company the consultant worked for -->
>> >>     <field name="company" type="text" indexed="true" stored="true"
>> >> multiValued="false"/>
>> >>     <field name="start_date" type="tdate" indexed="true" stored="true"
>> >> multiValued="false"/>
>> >>     <field name="end_date" type="tdate" indexed="true" stored="true"
>> >> multiValued="false"/>
>> >> </fields>
>> >>
>> >> <defaultSearchField>text</defaultSearchField>
>> >>
>> >> <copyField source="consultant_firstname" dest="text"/>
>> >> <copyField source="consultant_lastname" dest="text"/>
>> >> <copyField source="company" dest="text"/>
>> >>
>> >> </schema>
>> >>
>> >> <!--
>> >>
>> >> So for instance, you have 2 consultants
>> >>
>> >> Michael Davis and Tom Anderson, who between them worked for AOL,
>> >> Microsoft, Yahoo, Google and Facebook.
>> >>
>> >> Michael Davis = 1
>> >> Tom Anderson = 2
>> >>
>> >> AOL = 1
>> >> Microsoft = 2
>> >> Yahoo = 3
>> >> Google = 4
>> >> Facebook = 5
>> >>
>> >> This is how you would add the documents to the index
>> >>
>> >> -->
>> >>
>> >> <doc>
>> >>     <consultant_id_company_id>1_1</consultant_id_company_id>
>> >>     <consultant_firstname>Michael</consultant_firstname>
>> >>     <consultant_lastname>Davis</consultant_lastname>
>> >>     <company>AOL</company>
>> >>     <start_date>2006-02-13T15:26:37Z</start_date>
>> >>     <end_date>2008-02-13T15:26:37Z</end_date>
>> >> </doc>
>> >>
>> >> <doc>
>> >>     <consultant_id_company_id>1_4</consultant_id_company_id>
>> >>     <consultant_firstname>Michael</consultant_firstname>
>> >>     <consultant_lastname>Davis</consultant_lastname>
>> >>     <company>Google</company>
>> >>     <start_date>2006-02-13T15:26:37Z</start_date>
>> >>     <end_date>2009-02-13T15:26:37Z</end_date>
>> >> </doc>
>> >>
>> >> <doc>
>> >>     <consultant_id_company_id>2_3</consultant_id_company_id>
>> >>     <consultant_firstname>Tom</consultant_firstname>
>> >>     <consultant_lastname>Anderson</consultant_lastname>
>> >>     <company>Yahoo</company>
>> >>     <start_date>2001-01-13T15:26:37Z</start_date>
>> >>     <end_date>2009-02-13T15:26:37Z</end_date>
>> >> </doc>
>> >>
>> >> <doc>
>> >>     <consultant_id_company_id>2_4</consultant_id_company_id>
>> >>     <consultant_firstname>Tom</consultant_firstname>
>> >>     <consultant_lastname>Anderson</consultant_lastname>
>> >>     <company>Google</company>
>> >>     <start_date>1999-02-13T15:26:37Z</start_date>
>> >>     <end_date>2010-02-13T15:26:37Z</end_date>
>> >> </doc>
>> >>
>> >>
>> >> Then you can search as
>> >>
>> >> q=company:X AND start_date:[Y TO *] AND end_date:[* TO Z]
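
The query pattern above can be assembled programmatically. The sketch below is only an illustration: `tenure_query` is a hypothetical helper, and the `overlap` flag is an addition not proposed in the thread. By default it reproduces the pattern as written, which matches jobs that start and end inside the window. With `overlap=True` it instead matches any job that intersects the window, which may be closer to "worked there during July".

```python
from urllib.parse import urlencode


def tenure_query(company, window_start, window_end, overlap=False):
    """Build the q string for a company plus a date window."""
    if overlap:
        # Job started before the window ends and ended after it starts.
        dates = (f"start_date:[* TO {window_end}] "
                 f"AND end_date:[{window_start} TO *]")
    else:
        # Pattern from the thread: job lies entirely inside the window.
        dates = (f"start_date:[{window_start} TO *] "
                 f"AND end_date:[* TO {window_end}]")
    return f'company:"{company}" AND {dates}'


q = tenure_query("Google", "1999-03-01T00:00:00Z", "2004-08-31T23:59:59Z")
params = urlencode({"q": q})  # URL-encoded form for the request
print(q)
print(params)
```

Quoting the company name keeps multi-word names such as "Bear Stearns" together as a phrase.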
>> >>
>> >> On Fri, Jun 4, 2010 at 4:58 PM, Moazzam Khan <moazz...@gmail.com> wrote:
>> >>
>> >>> Hi guys,
>> >>>
>> >>>
>> >>> I have a list of consultants and the users (people who work for the
>> >>> company) are supposed to be able to search for consultants based on
>> >>> the time frame they worked for, for a company. For example, I should
>> >>> be able to search for all consultants who worked for Bear Stearns in
>> >>> the month of July. What is the best way of accomplishing this?
>> >>>
>> >>> I was thinking of formatting the document like this
>> >>>
>> >>> <company>
>> >>>   <name> Bear Stearns</name>
>> >>>   <startDate>2000-01-01</startDate>
>> >>>   <endDate>present</endDate>
>> >>> </company>
>> >>> <company>
>> >>>   <name> AIG</name>
>> >>>   <startDate>1999-01-01</startDate>
>> >>>   <endDate>2000-01-01</endDate>
>> >>> </company>
>> >>>
>> >>> Is this possible?
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Moazzam
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> "Good Enough" is not good enough.
>> >> To give anything less than your best is to sacrifice the gift.
>> >> Quality First. Measure Twice. Cut Once.
>> >> http://www.israelekpo.com/
>> >>
>> >
>> >
>> > --
>> > Lance Norskog
>> > goks...@gmail.com
>> >
>>
>
>
>
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
> http://www.israelekpo.com/
>
