Hi All,

I am new to Lucene / SOLR and am developing a POC as part of my research. My 
requirement and problem statement are below. I need help on how to index the 
data so that my POC has very good search functionality.

------------------------------------------------------------------
Requirement:
------------------------------------------------------------------

Assume my web application is an online book store that sells all categories of 
books, such as Computers, Social Studies, Physical Sciences, etc. Each of these 
categories has sub-categories. For example, Computers has sub-categories like 
Software Engineering, Java, SQL Server, etc.

I have a database table called Categories and it contains both Parent Category 
descriptions and also Child Category descriptions.

Data structure of Category table is:

Category_ID_Primay_Key  integer
Parent_Category_ID  integer
Category_Name varchar(100)
Category_Description varchar(1000)
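
To make the parent/child relationship concrete, here is a rough Java sketch of 
how I picture these rows in memory. The class and method names (CategoryTree, 
topLevelOf) are only illustrative, and I am assuming that a top-level category 
has a NULL Parent_Category_ID and that the parent links contain no cycles.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory view of the category rows (names are illustrative).
final class CategoryTree {

    // One table row; parentId == null marks a top-level category (assumption).
    static final class Category {
        final int id;
        final Integer parentId;
        final String name;
        final String description;

        Category(int id, Integer parentId, String name, String description) {
            this.id = id;
            this.parentId = parentId;
            this.name = name;
            this.description = description;
        }
    }

    private final Map<Integer, Category> byId = new HashMap<>();

    void add(Category category) {
        byId.put(category.id, category);
    }

    // Follows Parent_Category_ID links upward until a row without a parent
    // is reached. Assumes the links never form a cycle.
    Category topLevelOf(int categoryId) {
        Category current = byId.get(categoryId);
        if (current == null) {
            throw new IllegalArgumentException("Unknown category id: " + categoryId);
        }
        while (current.parentId != null) {
            Category parent = byId.get(current.parentId);
            if (parent == null) {
                throw new IllegalStateException("Missing parent row: " + current.parentId);
            }
            current = parent;
        }
        return current;
    }
}
```

With something like this, a match on a sub-category such as Java could be 
traced back to its top-level category (Computers).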


------------------------------------------------------------------
My Search UI:
------------------------------------------------------------------

My search page is very simple. It has a text field with a "Search" button.

------------------------------------------------------------------
User Action:
------------------------------------------------------------------

The user enters the search text below in the above text field and clicks the 
"Search" button.

"Books on Data Center"

------------------------------------------------------------------
What is my expected behavior:
------------------------------------------------------------------

Since the phrase "Data Center" is more relevant to computers, I should show 
books related to computers.

------------------------------------------------------------------
My Problem statement and Question to you all:
------------------------------------------------------------------

For better search in my web application, what kind of strategy should I 
follow, and how should I index the data in SOLR/Lucene accordingly?

In my Lucene index I may or may not have the phrase "data center". I should 
still be able to return relevant results for "data center".

One thought I have is as follows:

Modify the Category table by adding one more column to it:

Category_ID_Primay_Key  integer
Parent_Category_ID  integer
Category_Name varchar(100)
Category_Description varchar(1000)
Category_Description_Keywords varchar(8000)

Now take each word in Category_Description, find its synonyms, and store them 
in the Category_Description_Keywords column. After that, index the Category 
table records in SOLR/Lucene.
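
A minimal Java sketch of the expansion step I have in mind is below. The 
KeywordExpander class and its synonym map are hypothetical stand-ins for 
whatever synonym engine ends up being used. It is not Lucene/SOLR code, only 
the preprocessing I would run before indexing.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper that builds the Category_Description_Keywords value.
final class KeywordExpander {

    // Stand-in for a real synonym engine: lower-case word -> its synonyms.
    private final Map<String, List<String>> synonyms;

    KeywordExpander(Map<String, List<String>> synonyms) {
        this.synonyms = synonyms;
    }

    // Looks up each word of the description and returns the collected
    // synonyms as one space-separated string, without duplicates.
    String expand(String categoryDescription) {
        Set<String> keywords = new LinkedHashSet<>();
        String lowerCased = categoryDescription.toLowerCase(Locale.ROOT);
        for (String word : lowerCased.split("[^a-z0-9]+")) {
            if (word.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            List<String> found = synonyms.get(word);
            keywords.addAll(found != null ? found : Collections.<String>emptyList());
        }
        return String.join(" ", keywords);
    }
}
```

One limitation I can already see is that a word-by-word lookup like this would 
not cover multi-word phrases such as "data center".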

Below are my questions to you all:

Question 1:
I would like your feedback on the above approach, or on any other approach 
that would help my search return the most relevant results to the user.

Question 2:
Can you suggest the best Java-based open source or commercial synonym engines? 
I want a synonym engine that gives me all possible synonyms of a word.



Thanks in Advance,
Kishore Veleti A.V.K.
