Kishore,

Solr has a SynonymFilterFactory which might be of use to you
(http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46)
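
As a rough, untested sketch of how that filter could be wired into schema.xml: the field type name "text_syn" and the sample synonyms.txt entry are placeholders I made up for illustration, and the choice to expand synonyms only at index time is an assumption based on multi-word entries such as "data center" being awkward to match at query time.

```xml
<!-- schema.xml: hypothetical field type; "text_syn" is a placeholder name -->
<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- synonyms.txt lives in the Solr conf directory. A sample line such as
         computers, data center, datacenter
         together with expand="true" indexes all three forms whenever any one
         of them appears in the field. -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <!-- no synonym filter here: the expansion already happened at index time -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With something like this on the category description field, the synonyms are applied during analysis, so a separate Category_Description_Keywords column may not be needed. You would still have to build the synonyms.txt entries yourself.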

Regards,
Eswar

On Nov 18, 2007 10:39 PM, Kishore AVK. Veleti <[EMAIL PROTECTED]> wrote:

> Hi All,
>
> I am new to Lucene / SOLR and developing a POC as part of research. Check
> below my requirement and problem statement. I need help on how I can index
> the data such that I have very good search functionality in my POC.
>
> ------------------------------------------------------------------
> Requirement:
> ------------------------------------------------------------------
>
> Assume my web application is an online book store and it sells all
> categories of books like Computers, Social Studies, Physical Sciences etc.
> Each of these categories has sub-categories. For example, Computers has
> sub-categories like Software Engineering, Java, SQL Server etc.
>
> I have a database table called Categories and it contains both Parent
> Category descriptions and Child Category descriptions.
>
> Data structure of the Category table is:
>
> Category_ID_Primay_Key   integer
> Parent_Category_ID       integer
> Category_Name            varchar(100)
> Category_Description     varchar(1000)
>
> ------------------------------------------------------------------
> My Search UI:
> ------------------------------------------------------------------
>
> My search page is very simple. We have a text field with a "Search" button.
>
> ------------------------------------------------------------------
> User Action:
> ------------------------------------------------------------------
>
> The user enters the search text below in the above text field and clicks
> on the "Search" button.
>
> "Books on Data Center"
>
> ------------------------------------------------------------------
> What is my expected behavior:
> ------------------------------------------------------------------
>
> Since the phrase "Data Center" is more relevant to computers, I should
> show books related to computers.
>
> ------------------------------------------------------------------
> My Problem statement and Question to you all:
> ------------------------------------------------------------------
>
> To have a better search in my web application, what kind of strategy
> should I have, and how should I index the data accordingly in SOLR/Lucene?
>
> In my Lucene index I may or may not have the word "data center". Still, I
> should be able to return results for "data center".
>
> One thought I have is as follows:
>
> Modify the Category table by adding one more column to it:
>
> Category_ID_Primay_Key          integer
> Parent_Category_ID              integer
> Category_Name                   varchar(100)
> Category_Description            varchar(1000)
> Category_Description_Keywords   varchar(8000)
>
> Now take each word in "Category_Description", find synonyms of it and
> store that data in the Category_Description_Keywords column. After doing
> that, index the Category table records in SOLR/Lucene.
>
> Below are my questions to you all:
>
> Question 1:
> I need your feedback on the above approach, or on any other approach that
> helps me make my search better so that it returns the most relevant
> results to the user.
>
> Question 2:
> Can you suggest the best Java-based open source or commercial synonym
> engines? I want a synonym engine that gives me all possible synonyms of a
> word.
>
> Thanks in Advance,
> Kishore Veleti A.V.K.