Kishore,

Solr has a SynonymFilterFactory which might be of use to you (
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46)
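
To illustrate the idea behind synonym expansion, here is a minimal Python
sketch. It is not Solr's implementation: the synonym table, the mapping of
"data center" to "computers", and the expand_keywords function are all
hypothetical names made up for this example. In Solr itself the filter reads
its mappings from a synonyms file referenced in the field type's analyzer.

```python
# Hypothetical synonym table for illustration only; a real one would be
# curated by hand or loaded from a synonyms file.
SYNONYMS = {
    "data center": ["computers", "servers"],
    "datacenter": ["computers", "servers"],
}


def expand_keywords(text, synonyms=SYNONYMS, max_phrase_len=2):
    """Return the lowercased tokens of `text`, followed by the synonyms of
    any word or phrase (up to `max_phrase_len` words) found in it."""
    tokens = text.lower().split()
    expanded = list(tokens)
    # Check single words first, then longer phrases such as "data center".
    for n in range(1, max_phrase_len + 1):
        for i in range(len(tokens) - n + 1):
            phrase = " ".join(tokens[i:i + n])
            for synonym in synonyms.get(phrase, []):
                if synonym not in expanded:
                    expanded.append(synonym)
    return expanded


if __name__ == "__main__":
    print(expand_keywords("Books on Data Center"))
```

With a table like this, the query "Books on Data Center" also carries the
term "computers", so it can match the Computers category even if the index
never contains the words "data center".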


Regards,
Eswar

On Nov 18, 2007 10:39 PM, Kishore AVK. Veleti <[EMAIL PROTECTED]>
wrote:

> Hi All,
>
> I am new to Lucene / SOLR and am developing a POC as part of my research.
> My requirement and problem statement are below. I need help on how to index
> the data so that my POC has very good search functionality.
>
> ------------------------------------------------------------------
> Requirement:
> ------------------------------------------------------------------
>
> Assume my web application is an online book store that sells all
> categories of books, such as Computers, Social Studies, Physical Sciences, etc.
> Each of these categories has sub-categories. For example, Computers has
> sub-categories such as Software Engineering, Java, SQL Server, etc.
>
> I have a database table called Categories and it contains both Parent
> Category descriptions and also Child Category descriptions.
>
> Data structure of Category table is:
>
> Category_ID_Primay_Key  integer
> Parent_Category_ID  integer
> Category_Name varchar(100)
> Category_Description varchar(1000)
>
>
> ------------------------------------------------------------------
> My Search UI:
> ------------------------------------------------------------------
>
> My search page is very simple. It has a text field and a "Search" button.
>
> ------------------------------------------------------------------
> User Action:
> ------------------------------------------------------------------
>
> The user enters the search text below in the text field above and clicks
> the "Search" button.
>
> "Books on Data Center"
>
> ------------------------------------------------------------------
> What is my expected behavior:
> ------------------------------------------------------------------
>
> Since the phrase "Data Center" is most relevant to computers, I should show
> books related to computers.
>
> ------------------------------------------------------------------
> My Problem statement and Question to you all:
> ------------------------------------------------------------------
>
> What strategy should I use, and how should I index the data in SOLR/Lucene
> accordingly, to get better search in my web application?
>
> My Lucene index may or may not contain the words "data center". Still, I
> should be able to return results for "data center".
>
> One thought I have is as follows:
>
> Modify the Category table by adding one more column to it:
>
> Category_ID_Primay_Key  integer
> Parent_Category_ID  integer
> Category_Name varchar(100)
> Category_Description varchar(1000)
> Category_Description_Keywords varchar(8000)
>
> Now take each word in "Category_Description", find its synonyms, and
> store them in the Category_Description_Keywords column. After that,
> index the Category table records in SOLR/Lucene.
>
> Below are my questions to you all:
>
> Question 1:
> I need your feedback on the above approach, or on any other approach that
> would make my search return the most relevant results to the user.
>
> Question 2:
> Can you suggest the best Java-based synonym engines, open source or
> commercial? I want a synonym engine that gives me all possible
> synonyms of a word.
>
>
>
> Thanks in Advance,
> Kishore Veleti A.V.K.
>
