Search and Entity structure

2012-10-25 Thread v vijith
Dear All,

Apologize for lengthy email 

SOLR Version: 4

Im a newbie to SOLR and have gone through tutorial but could not get a
solution. The below requirement doesnt seem to be impossible but I
think Im missing the obvious.

In my RDBMS, there is a Qualification table and an Employee table. An
employee can have many qualifications. The qualification can have
following attributes - GradeName and Grade. The search using sql query
to achieve my requirement is as below

select * from qualification a, employee b where a.empid= b.empid and
a.gradename='MBA' and b.grade='A';

This will return me the employee along with the dept who has the grade
as MBA and has grade of A.

Employee: 2 records
-
Empid: 1
Name: John
Location: California

Qualifications:
Gradeid: 1
Empid: 1
Name: MBA
Grade: B

Gradeid: 2
Empid: 1
Name: LEAD
Grade: A


Empid: 2
Name: George
Location: Nevada

Qualifications:
Gradeid: 3
Empid: 2
Name: MBA
Grade: A

Gradeid: 4
Empid: 2
Name: Graduate
Grade: C
-
With the above query, I get the following record as the output
Empid: 2
Name: George
Grade Name: MBA
Grade: A

Note that, I dont get Emp with empid: 1 as the qualification is MBA
and has grade B only.

I would like to get this done through SOLR.

Note that, I have simplified my problem using emp and qualification
tables so that its easy to understand.

Once I configured the Schema.xml and data-config.xml, I was able to
index the employee and qualification and the
I have an entity structure generated by SOLR like this


 John<./str>
California 


 MBA
 LEAD



 A
 B




 George<./str>
Nevada


 MBA
 Graduate



 A
 C



As you see in the above structure, the relation between qualifications
are lost. The linkage between MBA and  its corresponding Grade is
lost. So IF I do a search with Grade Name MBA and Grade as A, both the
docs will be returned.  This is incorrect data for me.

Is there a way by which I can get relation between the child entity
elements. In short how do I get the result as shown in the SQL above?

The answer probably is in the document or entity that is getting generated.

If the entity is as below, it might fix the same


 George<./str>
Nevada


 MBA


 A



 George<./str>
Nevada


 graduate


 C


Then, when I search, it will show one doc only.

Thanks
Viju


Re: Search and Entity structure

2012-10-25 Thread v vijith
Hi,

Dear All,

Apologize for lengthy email 

SOLR Version: 4

Im a newbie to SOLR and have gone through tutorial but could not get a
solution. The below requirement doesnt seem to be impossible but I
think Im missing the obvious.

In my RDBMS, there is a Qualification table and an Employee table. An
employee can have many qualifications. The qualification can have
following attributes - GradeName and Grade. The search using sql query
to achieve my requirement is as below

select * from qualification a, employee b where a.empid= b.empid and
a.gradename='MBA' and b.grade='A';

This will return me the employee along with the dept who has the grade
as MBA and has grade of A.

Employee: 2 records
-
Empid: 1
Name: John
Location: California

Qualifications:
Gradeid: 1
Empid: 1
Name: MBA
Grade: B

Gradeid: 2
Empid: 1
Name: LEAD
Grade: A


Empid: 2
Name: George
Location: Nevada

Qualifications:
Gradeid: 3
Empid: 2
Name: MBA
Grade: A

Gradeid: 4
Empid: 2
Name: Graduate
Grade: C
-
With the above query, I get the following record as the output
Empid: 2
Name: George
Grade Name: MBA
Grade: A

Note that, I dont get Emp with empid: 1 as the qualification is MBA
and has grade B only.

I would like to get this done through SOLR.

Note that, I have simplified my problem using emp and qualification
tables so that its easy to understand.

Once I configured the Schema.xml and data-config.xml, I was able to
index the employee and qualification and the
I have an entity structure generated by SOLR like this


 John<./str>
California 


 MBA
 LEAD



 A
 B




 George<./str>
Nevada


 MBA
 Graduate



 A
 C



As you see in the above structure, the relation between qualifications
are lost. The linkage between MBA and  its corresponding Grade is
lost. So IF I do a search with Grade Name MBA and Grade as A, both the
docs will be returned.  This is incorrect data for me.

Is there a way by which I can get relation between the child entity
elements. In short how do I get the result as shown in the SQL above?

The answer probably is in the document or entity that is getting generated.

If the entity is as below, it might fix the same


 George<./str>
Nevada


 MBA


 A



 George<./str>
Nevada


 graduate


 C


Then, when I search, it will show one doc only.

Thanks
Viju


Thanks
Viju

On Thu, Oct 25, 2012 at 8:18 PM, v vijith  wrote:
> Dear All,
>
> Apologize for lengthy email 
>
> SOLR Version: 4
>
> Im a newbie to SOLR and have gone through tutorial but could not get a
> solution. The below requirement doesnt seem to be impossible but I
> think Im missing the obvious.
>
> In my RDBMS, there is a Qualification table and an Employee table. An
> employee can have many qualifications. The qualification can have
> following attributes - GradeName and Grade. The search using sql query
> to achieve my requirement is as below
>
> select * from qualification a, employee b where a.empid= b.empid and
> a.gradename='MBA' and b.grade='A';
>
> This will return me the employee along with the dept who has the grade
> as MBA and has grade of A.
>
> Employee: 2 records
> -
> Empid: 1
> Name: John
> Location: California
>
> Qualifications:
> Gradeid: 1
> Empid: 1
> Name: MBA
> Grade: B
>
> Gradeid: 2
> Empid: 1
> Name: LEAD
> Grade: A
> 
>
> Empid: 2
> Name: George
> Location: Nevada
>
> Qualifications:
> Gradeid: 3
> Empid: 2
> Name: MBA
> Grade: A
>
> Gradeid: 4
> Empid: 2
> Name: Graduate
> Grade: C
> -
> With the above query, I get the following record as the output
> Empid: 2
> Name: George
> Grade Name: MBA
> Grade: A
>
> Note that, I dont get Emp with empid: 1 as the qualification is MBA
> and has grade B only.
>
> I would like to get this done through SOLR.
>
> Note that, I have simplified my problem using emp and qualification
> tables so that its easy to understand.
>
> Once I configured the Schema.xml and data-config.xml, I was able to
> index the employee and qualification and the
> I have an entity structure generated by SOLR like this
>
> 
>  John<./str>
> California 
>
> 
>  MBA
>  LEAD
> 
>
> 
>  A
>  B
> 
> 
>
> 
>  George<./str>
> Nevada
>
> 
>  MBA
>  Graduate
> 
>
> 
>  A
>  C
> 
> 
>
> As you see in the above structure, the relation between qualifications
> are lost. The linkage between MBA and  its corresponding Grade is
> lost. So IF I do a search with Grade Name MBA and Grade as A, both the
> docs will be returned.  This is incorrect data for me.
>
> Is there a way by which I can get relation between the child entity
> elements. In short how do I get the result as shown in the SQL above?
>
> The answer probably is in the document or entity that is getting generated.
>
> If the entity is as below, it might fix the same
>
> 
>  George<./str>
> Nevada
>
> 
>  MBA
> 
> 
>  A
> 
>
> 
>  George<./str>
> Nevada
>
> 
>  graduate
> 
> 
>  C
> 
>
> Then, when I search, it will show one doc only.
>
> Thanks
> Viju


Re: Search and Entity structure

2012-10-26 Thread v vijith
Thanks for the response. This workaround  would be difficult to
implement. Also Im finding it very difficult to understand that SOLR
doesnt provide this feature for searching.


On Fri, Oct 26, 2012 at 9:42 AM, adityab  wrote:
> Hi Vijith,
>
> See if this solution solves your problem. There might be other ways this is
> the one i have on top of my mind at this hour.
>
> You might be having and ID for each qualification. then have the relation
> using dotted notation.
> 1 = MBA, 2 = LEAD etc.
>
> 
> 1.A
> 2.B
> 
> 
>
> the format is like X.Y where X is QualificationID and Y is grade value. If
> you have ID for Grade value too then use it in Y instead of actual Grade
> value.
> So in solr when searching for MBA with A Grad you can query "q=grade:1.A" .
> This should give you the result.
>
>
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Search-and-Entity-structure-tp4015890p4015996.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Search and Entity structure

2012-10-26 Thread v vijith
The schema content that I have put in is

   
   
   
   
   
 EMPID

The dataconfig file is






With this as well, when I try, I get the entity as below -


3Viktor
2
George
C
4
PM
1John
B2LEAD


The issue is that, employee George has 2 qualifications but is not
shown in the result. This is due to unique id I believe. Can you
provide some help?



On Fri, Oct 26, 2012 at 8:46 PM, Gora Mohanty  wrote:
> On 25 October 2012 23:48, v vijith  wrote:
>> Dear All,
>>
>> Apologize for lengthy email 
>>
>> SOLR Version: 4
>>
>> Im a newbie to SOLR and have gone through tutorial but could not get a
>> solution. The below requirement doesnt seem to be impossible but I
>> think Im missing the obvious.
>>
>> In my RDBMS, there is a Qualification table and an Employee table. An
>> employee can have many qualifications. The qualification can have
>> following attributes - GradeName and Grade. The search using sql query
>> to achieve my requirement is as below
>>
>> select * from qualification a, employee b where a.empid= b.empid and
>> a.gradename='MBA' and b.grade='A';
>>
>> This will return me the employee along with the dept who has the grade
>> as MBA and has grade of A.
>>
>> Employee: 2 records
>> -
>> Empid: 1
>> Name: John
>> Location: California
>>
>> Qualifications:
>> Gradeid: 1
>> Empid: 1
>> Name: MBA
>> Grade: B
>>
>> Gradeid: 2
>> Empid: 1
>> Name: LEAD
>> Grade: A
>> 
>>
>> Empid: 2
>> Name: George
>> Location: Nevada
>>
>> Qualifications:
>> Gradeid: 3
>> Empid: 2
>> Name: MBA
>> Grade: A
>>
>> Gradeid: 4
>> Empid: 2
>> Name: Graduate
>> Grade: C
>
> Stop thinking of Solr in terms of RDBMS. Instead, flatten out your
> data. Thus, in your example, you could have a schema with the
> following fields:
> doc_id name location qualification grade
> doc_id is a unique identifier for Solr. If you want to retain Empid
> and Gradeid you could also add these.
>
> and the following entries
> 1 John California MBA B
> 2 John California Lead A
> 3 George Nevada MBA A
> 4 George Nevada Graduate C
>
> Searching for qualification:MBA and grade:A will then give you only
> record 3.
>
> Regards,
> Gora


Re: Search and Entity structure

2012-10-27 Thread v vijith
Hi,

If I write a query like this, is there a way I can achive the results
that I need

select * from employee a left outer join qualification b on a.empid = b.empid;

This will return 5 records, 1 per employee qualification. Can this be
indexed as is?

1, John, MBA, A
1, John, Lead, B
2, George, MBA, B
2, George, PM, C
3, Viktor, null, null

How do I put the schema so that this can be searched as is?

I would not prefer to put database specific functions in my solution.

On Sat, Oct 27, 2012 at 7:08 AM, Gora Mohanty  wrote:
> On 27 October 2012 01:20, v vijith  wrote:
> [...]
>> The dataconfig file is
>> 
>> 
>> 
>> 
>> 
> [...]
>
> The SELECT in the nested entity "qualification" should fetch
> all qualifications for the given employee. How to do that is
> database dependent, e.g., one would use something like
> group_concat() in mysql. After collecting multiple qualifications
> in a single string, one can use a transformer to break the
> string at the separator used in group_concat(), and populate
> the desired Solr field with the pieces.
>
> Depending on your expertise, it might be easier to do this
> through a Solr XML document, or SolrJ.
>
> Regards,
> Gora


Re: Search and Entity structure

2012-10-27 Thread v vijith
Indeed , this worked .

The fix that was required was related to the how the document is
represented. It depends on the unique key. For the same unique key, it
will always update the existing document. So to avoid it, I used the
oracle sequence to identify the record, it can be oracle row number or
if none are required, it can be UUID field from SOLR.

Now Im able to search for gradename:MBA AND grade:B with the relation
maintained.

The updated dataconfig is shown below





  



Schema

   
   
   
   
   

 someid



On Sat, Oct 27, 2012 at 10:49 AM, v vijith  wrote:
> Hi,
>
> If I write a query like this, is there a way I can achive the results
> that I need
>
> select * from employee a left outer join qualification b on a.empid = b.empid;
>
> This will return 5 records, 1 per employee qualification. Can this be
> indexed as is?
>
> 1, John, MBA, A
> 1, John, Lead, B
> 2, George, MBA, B
> 2, George, PM, C
> 3, Viktor, null, null
>
> How do I put the schema so that this can be searched as is?
>
> I would not prefer to put database specific functions in my solution.
>
> On Sat, Oct 27, 2012 at 7:08 AM, Gora Mohanty  wrote:
>> On 27 October 2012 01:20, v vijith  wrote:
>> [...]
>>> The dataconfig file is
>>> 
>>> 
>>> 
>>> 
>>> 
>> [...]
>>
>> The SELECT in the nested entity "qualification" should fetch
>> all qualifications for the given employee. How to do that is
>> database dependent, e.g., one would use something like
>> group_concat() in mysql. After collecting multiple qualifications
>> in a single string, one can use a transformer to break the
>> string at the separator used in group_concat(), and populate
>> the desired Solr field with the pieces.
>>
>> Depending on your expertise, it might be easier to do this
>> through a Solr XML document, or SolrJ.
>>
>> Regards,
>> Gora