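I think you need a 1:1 mapping between the consultant and the company, that is, one document per consultant/company pair. Otherwise, how are you going to run your queries for, let's say, consultants that worked for Google or AOL between March 1999 and August 2004? If you pack the dates into a multi-valued Company string, Solr cannot run a date range query against them.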
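To make that concrete, here is a rough Python sketch of the idea, not a definitive implementation. It flattens consultant records into one document per consultant/company pair and builds the query string for the Google/AOL example. The field names come from the schema quoted further down; the input layout and the names `flatten` and `overlap_query` are made up for illustration. I am also assuming that "worked between" means the engagement overlaps the window at all, which is why the range bounds are crossed (start before the window ends, end after the window starts).

```python
# Hypothetical source records: one per consultant, each with a list of
# engagements. The values mirror the sample documents quoted below.
consultants = [
    {"id": 1, "firstname": "Michael", "lastname": "Davis", "engagements": [
        {"company_id": 1, "company": "AOL",
         "start": "2006-02-13T15:26:37Z", "end": "2008-02-13T15:26:37Z"},
        {"company_id": 4, "company": "Google",
         "start": "2006-02-13T15:26:37Z", "end": "2009-02-13T15:26:37Z"},
    ]},
    {"id": 2, "firstname": "Tom", "lastname": "Anderson", "engagements": [
        {"company_id": 3, "company": "Yahoo",
         "start": "2001-01-13T15:26:37Z", "end": "2009-02-13T15:26:37Z"},
        {"company_id": 4, "company": "Google",
         "start": "1999-02-13T15:26:37Z", "end": "2010-02-13T15:26:37Z"},
    ]},
]


def flatten(consultants):
    """Yield one document (as a dict) per consultant/company pair."""
    for c in consultants:
        for e in c["engagements"]:
            yield {
                # Concatenation of consultant id and company id, as in the schema.
                "consultant_id_company_id": f'{c["id"]}_{e["company_id"]}',
                "consultant_firstname": c["firstname"],
                "consultant_lastname": c["lastname"],
                "company": e["company"],
                "start_date": e["start"],
                "end_date": e["end"],
            }


def overlap_query(companies, window_start, window_end):
    """Build a query for engagements that overlap [window_start, window_end]."""
    company_clause = " OR ".join(f'company:"{name}"' for name in companies)
    return (
        f"({company_clause}) "
        f"AND start_date:[* TO {window_end}] "
        f"AND end_date:[{window_start} TO *]"
    )


docs = list(flatten(consultants))
query = overlap_query(["Google", "AOL"],
                      "1999-03-01T00:00:00Z", "2004-08-31T23:59:59Z")
print(len(docs))
print(query)
```

If you would rather match only engagements that fall entirely inside the window, swap the two bounds so that start_date is on or after the window start and end_date is on or before the window end.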
If the mapping is 1:1, your life will be easier and you will not need to do extra parsing of the results you retrieve. Unfortunately, it looks like you are going to have a lot of records. With an RDBMS it is easy to do joins, but with Lucene and Solr you have to denormalize all the relationships. Hence, in this particular scenario, if you have 5 consultants who each worked for 4 distinct companies, you will have to send 20 documents to Solr.

On Mon, Jun 7, 2010 at 10:15 AM, Moazzam Khan <moazz...@gmail.com> wrote:
> Thanks for the replies guys.
>
>
> I am currently storing consultants like this ..
>
> <doc>
> <id>123</id>
> <FirstName>tony</FirstName>
> <LastName>marjo</LastName>
> <Company>Google</Company>
> <Company>AOL</Company>
> </doc>
>
> I have a few multi valued fields so if I do it the way Israel
> suggested it, I will have tons of records. Do you think it will be
> better if I did this instead ?
>
>
> <doc>
> <id>123</id>
> <FirstName>tony</FirstName>
> <LastName>marjo</LastName>
> <Company>Google_StartDate_EndDate</Company>
> <Company>AOL_StartDate_EndDate</Company>
> </doc>
>
> Or is what you guys said better?
>
> Thanks for all the help.
>
> Moazzam
>
>
> On Mon, Jun 7, 2010 at 1:10 AM, Lance Norskog <goks...@gmail.com> wrote:
> > And for 'present', you would pick some time far in the future:
> > 2100-01-01T00:00:00Z
> >
> > On 6/5/10, Israel Ekpo <israele...@gmail.com> wrote:
> >> You need to make each document added to the index a 1 to 1 mapping for each
> >> company and consultant combo
> >>
> >> <schema>
> >>
> >> <fields>
> >> <!-- Concatenation of company and consultant id -->
> >> <field name="consultant_id_company_id" type="string" indexed="true"
> >> stored="true" required="true"/>
> >> <field name="consultant_firstname" type="string" indexed="true"
> >> stored="true" multiValued="false"/>
> >> <field name="consultant_lastname" type="string" indexed="true"
> >> stored="true" multiValued="false"/>
> >>
> >> <!-- The name of the company the consultant worked for -->
> >> <field name="company" type="text" indexed="true" stored="true"
> >> multiValued="false"/>
> >> <field name="start_date" type="tdate" indexed="true" stored="true"
> >> multiValued="false"/>
> >> <field name="end_date" type="tdate" indexed="true" stored="true"
> >> multiValued="false"/>
> >> </fields>
> >>
> >> <defaultSearchField>text</defaultSearchField>
> >>
> >> <copyField source="consultant_firstname" dest="text"/>
> >> <copyField source="consultant_lastname" dest="text"/>
> >> <copyField source="company" dest="text"/>
> >>
> >> </schema>
> >>
> >> <!--
> >>
> >> So for instance, you have 2 consultants
> >>
> >> Michael Davis and Tom Anderson who worked for AOL and Microsoft, Yahoo,
> >> Google and Facebook.
> >>
> >> Michael Davis = 1
> >> Tom Anderson = 2
> >>
> >> AOL = 1
> >> Microsoft = 2
> >> Yahoo = 3
> >> Google = 4
> >> Facebook = 5
> >>
> >> This is how you would add the documents to the index
> >>
> >> -->
> >>
> >> <doc>
> >> <consultant_id_company_id>1_1</consultant_id_company_id>
> >> <consultant_firstname>Michael</consultant_firstname>
> >> <consultant_lastname>Davis</consultant_lastname>
> >> <company>AOL</company>
> >> <start_date>2006-02-13T15:26:37Z</start_date>
> >> <end_date>2008-02-13T15:26:37Z</end_date>
> >> </doc>
> >>
> >> <doc>
> >> <consultant_id_company_id>1_4</consultant_id_company_id>
> >> <consultant_firstname>Michael</consultant_firstname>
> >> <consultant_lastname>Davis</consultant_lastname>
> >> <company>Google</company>
> >> <start_date>2006-02-13T15:26:37Z</start_date>
> >> <end_date>2009-02-13T15:26:37Z</end_date>
> >> </doc>
> >>
> >> <doc>
> >> <consultant_id_company_id>2_3</consultant_id_company_id>
> >> <consultant_firstname>Tom</consultant_firstname>
> >> <consultant_lastname>Anderson</consultant_lastname>
> >> <company>Yahoo</company>
> >> <start_date>2001-01-13T15:26:37Z</start_date>
> >> <end_date>2009-02-13T15:26:37Z</end_date>
> >> </doc>
> >>
> >> <doc>
> >> <consultant_id_company_id>2_4</consultant_id_company_id>
> >> <consultant_firstname>Tom</consultant_firstname>
> >> <consultant_lastname>Anderson</consultant_lastname>
> >> <company>Google</company>
> >> <start_date>1999-02-13T15:26:37Z</start_date>
> >> <end_date>2010-02-13T15:26:37Z</end_date>
> >> </doc>
> >>
> >>
> >> Then you can search as
> >>
> >> q=company:X AND start_date:[Y TO *] AND end_date:[* TO Z]
> >>
> >> On Fri, Jun 4, 2010 at 4:58 PM, Moazzam Khan <moazz...@gmail.com> wrote:
> >>
> >>> Hi guys,
> >>>
> >>>
> >>> I have a list of consultants and the users (people who work for the
> >>> company) are supposed to be able to search for consultants based on
> >>> the time frame they worked for, for a company.
> >>> For example, I should
> >>> be able to search for all consultants who worked for Bear Stearns in
> >>> the month of July. What is the best way of accomplishing this?
> >>>
> >>> I was thinking of formatting the document like this
> >>>
> >>> <company>
> >>> <name> Bear Stearns</name>
> >>> <startDate>2000-01-01</startDate>
> >>> <endDate>present</endDate>
> >>> </company>
> >>> <company>
> >>> <name> AIG</name>
> >>> <startDate>1999-01-01</startDate>
> >>> <endDate>2000-01-01</endDate>
> >>> </company>
> >>>
> >>> Is this possible?
> >>>
> >>> Thanks,
> >>>
> >>> Moazzam
> >>>
> >>
> >>
> >>
> >> --
> >> "Good Enough" is not good enough.
> >> To give anything less than your best is to sacrifice the gift.
> >> Quality First. Measure Twice. Cut Once.
> >> http://www.israelekpo.com/
> >>
> >
> >
> > --
> > Lance Norskog
> > goks...@gmail.com
> >
>

--
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.
http://www.israelekpo.com/