If I did this, how would I do a facet search without getting duplicates?

Thanks,
Moazzam

On Mon, Jun 7, 2010 at 10:07 AM, Israel Ekpo <israele...@gmail.com> wrote:
> I think you need a 1:1 mapping between the consultant and the company, else
> how are you going to run your queries for, let's say, consultants that worked
> for Google or AOL between March 1999 and August 2004?
>
> If the mapping is 1:1, your life would be easier and you would not need to
> do extra parsing of the results you retrieved.
>
> Unfortunately, it looks like you are going to have a lot of records.
>
> With an RDBMS, it is easier to do joins but with Lucene and Solr you have to
> denormalize all the relationships.
>
> Hence in this particular scenario, if you have 5 consultants that each worked
> for 4 distinct companies you will have to send 20 documents to Solr.
>
> On Mon, Jun 7, 2010 at 10:15 AM, Moazzam Khan <moazz...@gmail.com> wrote:
>
>> Thanks for the replies guys.
>>
>>
>> I am currently storing consultants like this ..
>>
>> <doc>
>>   <id>123</id>
>>   <FirstName>tony</FirstName>
>>   <LastName>marjo</LastName>
>>   <Company>Google</Company>
>>   <Company>AOL</Company>
>> </doc>
>>
>> I have a few multi valued fields so if I do it the way Israel
>> suggested it, I will have tons of records. Do you think it will be
>> better if I did this instead ?
>>
>>
>> <doc>
>>   <id>123</id>
>>   <FirstName>tony</FirstName>
>>   <LastName>marjo</LastName>
>>   <Company>Google_StartDate_EndDate</Company>
>>   <Company>AOL_StartDate_EndDate</Company>
>> </doc>
>>
>> Or is what you guys said better?
>>
>> Thanks for all the help.
>>
>> Moazzam
>>
>>
>> On Mon, Jun 7, 2010 at 1:10 AM, Lance Norskog <goks...@gmail.com> wrote:
>> > And for 'present', you would pick some time far in the future:
>> > 2100-01-01T00:00:00Z
>> >
>> > On 6/5/10, Israel Ekpo <israele...@gmail.com> wrote:
>> >> You need to make each document added to the index a 1 to 1 mapping for
>> >> each company and consultant combo
>> >>
>> >> <schema>
>> >>
>> >> <fields>
>> >>   <!-- Concatenation of consultant id and company id -->
>> >>   <field name="consultant_id_company_id" type="string" indexed="true"
>> >>          stored="true" required="true"/>
>> >>   <field name="consultant_firstname" type="string" indexed="true"
>> >>          stored="true" multiValued="false"/>
>> >>   <field name="consultant_lastname" type="string" indexed="true"
>> >>          stored="true" multiValued="false"/>
>> >>
>> >>   <!-- The name of the company the consultant worked for -->
>> >>   <field name="company" type="text" indexed="true" stored="true"
>> >>          multiValued="false"/>
>> >>   <field name="start_date" type="tdate" indexed="true" stored="true"
>> >>          multiValued="false"/>
>> >>   <field name="end_date" type="tdate" indexed="true" stored="true"
>> >>          multiValued="false"/>
>> >>
>> >>   <!-- Catch-all search field; target of the copyField rules below -->
>> >>   <field name="text" type="text" indexed="true" stored="false"
>> >>          multiValued="true"/>
>> >> </fields>
>> >>
>> >> <defaultSearchField>text</defaultSearchField>
>> >>
>> >> <copyField source="consultant_firstname" dest="text"/>
>> >> <copyField source="consultant_lastname" dest="text"/>
>> >> <copyField source="company" dest="text"/>
>> >>
>> >> </schema>
>> >>
>> >> <!--
>> >>
>> >> So for instance, you have 2 consultants
>> >>
>> >> Michael Davis and Tom Anderson who worked for AOL and Microsoft, Yahoo,
>> >> Google and Facebook.
>> >>
>> >> Michael Davis = 1
>> >> Tom Anderson = 2
>> >>
>> >> AOL = 1
>> >> Microsoft = 2
>> >> Yahoo = 3
>> >> Google = 4
>> >> Facebook = 5
>> >>
>> >> This is how you would add the documents to the index
>> >>
>> >> -->
>> >>
>> >> <doc>
>> >>   <consultant_id_company_id>1_1</consultant_id_company_id>
>> >>   <consultant_firstname>Michael</consultant_firstname>
>> >>   <consultant_lastname>Davis</consultant_lastname>
>> >>   <company>AOL</company>
>> >>   <start_date>2006-02-13T15:26:37Z</start_date>
>> >>   <end_date>2008-02-13T15:26:37Z</end_date>
>> >> </doc>
>> >>
>> >> <doc>
>> >>   <consultant_id_company_id>1_4</consultant_id_company_id>
>> >>   <consultant_firstname>Michael</consultant_firstname>
>> >>   <consultant_lastname>Davis</consultant_lastname>
>> >>   <company>Google</company>
>> >>   <start_date>2006-02-13T15:26:37Z</start_date>
>> >>   <end_date>2009-02-13T15:26:37Z</end_date>
>> >> </doc>
>> >>
>> >> <doc>
>> >>   <consultant_id_company_id>2_3</consultant_id_company_id>
>> >>   <consultant_firstname>Tom</consultant_firstname>
>> >>   <consultant_lastname>Anderson</consultant_lastname>
>> >>   <company>Yahoo</company>
>> >>   <start_date>2001-01-13T15:26:37Z</start_date>
>> >>   <end_date>2009-02-13T15:26:37Z</end_date>
>> >> </doc>
>> >>
>> >> <doc>
>> >>   <consultant_id_company_id>2_4</consultant_id_company_id>
>> >>   <consultant_firstname>Tom</consultant_firstname>
>> >>   <consultant_lastname>Anderson</consultant_lastname>
>> >>   <company>Google</company>
>> >>   <start_date>1999-02-13T15:26:37Z</start_date>
>> >>   <end_date>2010-02-13T15:26:37Z</end_date>
>> >> </doc>
>> >>
>> >>
>> >> Then you can search as
>> >>
>> >> q=company:X AND start_date:[Y TO *] AND end_date:[* TO Z]
>> >>
>> >> On Fri, Jun 4, 2010 at 4:58 PM, Moazzam Khan <moazz...@gmail.com> wrote:
>> >>
>> >>> Hi guys,
>> >>>
>> >>>
>> >>> I have a list of consultants and the users (people who work for the
>> >>> company) are supposed to be able to search for consultants based on
>> >>> the time frame they worked
>> >>> for a company. For example, I should
>> >>> be able to search for all consultants who worked for Bear Stearns in
>> >>> the month of July. What is the best way of accomplishing this?
>> >>>
>> >>> I was thinking of formatting the document like this
>> >>>
>> >>> <company>
>> >>>   <name>Bear Stearns</name>
>> >>>   <startDate>2000-01-01</startDate>
>> >>>   <endDate>present</endDate>
>> >>> </company>
>> >>> <company>
>> >>>   <name>AIG</name>
>> >>>   <startDate>1999-01-01</startDate>
>> >>>   <endDate>2000-01-01</endDate>
>> >>> </company>
>> >>>
>> >>> Is this possible?
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Moazzam
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> "Good Enough" is not good enough.
>> >> To give anything less than your best is to sacrifice the gift.
>> >> Quality First. Measure Twice. Cut Once.
>> >> http://www.israelekpo.com/
>> >>
>> >
>> >
>> > --
>> > Lance Norskog
>> > goks...@gmail.com
>> >
>
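
For anyone following along, the denormalization described in the quoted thread could be sketched roughly as below. This is only an illustration in Python: the `CONSULTANTS` structure and the helper names (`denormalize`, `containment_query`, `overlap_query`) are made up for this example and are not part of Solr. The containment query mirrors the one Israel gave; the overlap variant is my own assumption about what a "worked there at some point in July" search would need, not something stated in the thread.

```python
# Far-future stand-in for 'present', as Lance suggested in the thread.
PRESENT = "2100-01-01T00:00:00Z"

# Hypothetical in-memory source data, using the ids from Israel's example.
CONSULTANTS = [
    {"id": 1, "first": "Michael", "last": "Davis", "jobs": [
        {"company_id": 1, "company": "AOL",
         "start": "2006-02-13T15:26:37Z", "end": "2008-02-13T15:26:37Z"},
        {"company_id": 4, "company": "Google",
         "start": "2006-02-13T15:26:37Z", "end": "2009-02-13T15:26:37Z"},
    ]},
    {"id": 2, "first": "Tom", "last": "Anderson", "jobs": [
        {"company_id": 3, "company": "Yahoo",
         "start": "2001-01-13T15:26:37Z", "end": "2009-02-13T15:26:37Z"},
        {"company_id": 4, "company": "Google",
         "start": "1999-02-13T15:26:37Z", "end": None},  # None = still there
    ]},
]


def denormalize(consultants):
    """Yield one flat document per (consultant, company) pair."""
    for c in consultants:
        for job in c["jobs"]:
            yield {
                "consultant_id_company_id": f'{c["id"]}_{job["company_id"]}',
                "consultant_firstname": c["first"],
                "consultant_lastname": c["last"],
                "company": job["company"],
                "start_date": job["start"],
                # Open-ended engagements get the far-future date.
                "end_date": job["end"] or PRESENT,
            }


def containment_query(company, start, end):
    """Engagements that lie entirely inside [start, end] (the thread's query)."""
    return (f'company:"{company}" AND start_date:[{start} TO *] '
            f'AND end_date:[* TO {end}]')


def overlap_query(company, start, end):
    """Engagements that overlap [start, end] at all (assumed variant)."""
    return (f'company:"{company}" AND start_date:[* TO {end}] '
            f'AND end_date:[{start} TO *]')


if __name__ == "__main__":
    for doc in denormalize(CONSULTANTS):
        print(doc)
    print(overlap_query("Google", "2007-07-01T00:00:00Z", "2007-07-31T23:59:59Z"))
```

With two consultants holding two jobs each, this produces four documents, which is the same multiplication Israel describes (5 consultants x 4 companies = 20 documents). Note that the far-future end date only helps with the overlap form of the query; the containment form would exclude anyone still employed.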