How many synonym sets do you have? I'm using about 600 sets with no problem. --wunder
On 11/19/07 8:23 PM, "climbingrose" <[EMAIL PROTECTED]> wrote:

> Correction for last message: you need to modify or extend
> SynonymFilterFactory instead of SynonymFilter. SynonymFilterFactory is
> responsible for initialising SynonymFilter and populating the list of
> synonyms. Have a look at the source code. I think it's pretty easy to
> understand. What you probably need to do is to add more parameters
> such as database host, username, password and the actual database in
> the init() method.
>
> On Nov 20, 2007 3:18 PM, climbingrose <[EMAIL PROTECTED]> wrote:
>> One approach is to extend SynonymFilter so that it reads synonyms from
>> a database instead of a file. SynonymFilter is just a Java class so you
>> can do whatever you want with it :D. From what I remember, the filter
>> initialises a list of all input synonyms and stores them in memory.
>> Therefore, you need to make sure that all the synonyms can fit into
>> memory at runtime.
>>
>>
>> On Nov 20, 2007 1:54 AM, Kishore AVK. Veleti <[EMAIL PROTECTED]>
>> wrote:
>>> Hi Eswar,
>>>
>>> Thanks for the update.
>>>
>>> I have gone through the link you provided below, and what I understood
>>> from it is that we need to have all possible synonyms in a text file.
>>> This file needs to be given as input for "SynonymFilterFactory" to work.
>>> If my understanding is right, then the approach may not suit my
>>> requirement. The reason is that I need to find synonyms of all the
>>> keywords in the category descriptions and store those synonyms in the
>>> input file mentioned above. The file may be too big.
>>>
>>> Let me know if my understanding is wrong.
>>>
>>>
>>> Thanks,
>>> Kishore Veleti A.V.K.
>>>
>>>
>>> -----Original Message-----
>>> From: Eswar K [mailto:[EMAIL PROTECTED]
>>> Sent: Monday, November 19, 2007 11:22 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Finding all possible synonyms for a word
>>>
>>> Kishore,
>>>
>>> Solr has a SynonymFilterFactory which might be of use to you (
>>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46)
>>>
>>>
>>> Regards,
>>> Eswar
>>>
>>> On Nov 18, 2007 10:39 PM, Kishore AVK. Veleti <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am new to Lucene / SOLR and developing a POC as part of research. Check
>>>> below my requirement and problem statement. I need help on how I can index
>>>> the data such that I have very good search functionality in my POC.
>>>>
>>>> ------------------------------------------------------------------
>>>> Requirement:
>>>> ------------------------------------------------------------------
>>>>
>>>> Assume my web application is an online book store and it sells all
>>>> categories of books like Computers, Social Studies, Physical Sciences etc.
>>>> Each of these categories has sub-categories. For example, Computers has
>>>> sub-categories like Software Engineering, Java, SQL Server etc.
>>>>
>>>> I have a database table called Categories and it contains both Parent
>>>> Category descriptions and Child Category descriptions.
>>>>
>>>> Data structure of the Category table is:
>>>>
>>>> Category_ID_Primay_Key integer
>>>> Parent_Category_ID integer
>>>> Category_Name varchar(100)
>>>> Category_Description varchar(1000)
>>>>
>>>>
>>>> ------------------------------------------------------------------
>>>> My Search UI:
>>>> ------------------------------------------------------------------
>>>>
>>>> My search page is very simple. We have a text field with a "Search" button.
>>>>
>>>> ------------------------------------------------------------------
>>>> User Action:
>>>> ------------------------------------------------------------------
>>>>
>>>> The user enters the search text below in the above text field and clicks
>>>> on the "Search" button.
>>>>
>>>> "Books on Data Center"
>>>>
>>>> ------------------------------------------------------------------
>>>> What is my expected behavior:
>>>> ------------------------------------------------------------------
>>>>
>>>> Since the phrase "Data Center" is more relevant to computers, I should
>>>> show books related to computers.
>>>>
>>>> ------------------------------------------------------------------
>>>> My Problem statement and Question to you all:
>>>> ------------------------------------------------------------------
>>>>
>>>> To have a better search in my web application, what kind of strategy
>>>> should I have, and how should I index the data accordingly in SOLR/Lucene?
>>>>
>>>> In my Lucene Index I may or may not have the word "data center". Still I
>>>> should be able to return "data center"
>>>>
>>>> One thought I have is as follows:
>>>>
>>>> Modify the Category table by adding one more column to it:
>>>>
>>>> Category_ID_Primay_Key integer
>>>> Parent_Category_ID integer
>>>> Category_Name varchar(100)
>>>> Category_Description varchar(1000)
>>>> Category_Description_Keywords varchar(8000)
>>>>
>>>> Now take each word in "Category_Description", find its synonyms and
>>>> store that data in the Category_Description_Keywords column. After doing
>>>> that, index the Category table records in SOLR/Lucene.
>>>>
>>>> Below are my questions to you all:
>>>>
>>>> Question 1:
>>>> I need your feedback on the above approach, or on any other approach
>>>> that would help me make my search better so that it returns the most
>>>> relevant results to the user.
>>>>
>>>> Question 2:
>>>> Can you suggest the best Java-based open source or commercial synonym
>>>> engines? I want a synonym engine that gives me all possible
>>>> synonyms of a word.
>>>>
>>>>
>>>> Thanks in Advance,
>>>> Kishore Veleti A.V.K.
>>>>
>>>
>>
>>
>> --
>> Regards,
>>
>> Cuong Hoang
>>
>
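
The suggestion quoted above (read the synonyms from a database once, at initialisation, and keep them all in memory) could be sketched in plain Java roughly as follows. This is only an illustration under several assumptions: DbSynonymTable, the synonyms table and its synonym_set column are made-up names, the class does not extend or call Solr's SynonymFilterFactory, and it handles only comma-separated equivalence sets (one set per row), not "a => b" mappings.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper, not part of Solr: holds all synonym sets in memory. */
class DbSynonymTable {
    // term -> every term in the same synonym set (including the term itself)
    private final Map<String, Set<String>> synonyms = new TreeMap<>();

    /** Adds one comma-separated set, e.g. "data center, datacenter, server farm". */
    void addSet(String line) {
        Set<String> terms = new LinkedHashSet<>();
        for (String raw : line.split(",")) {
            String term = raw.trim().toLowerCase(Locale.ROOT);
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        // Each term in the set points at the whole set.
        for (String term : terms) {
            synonyms.computeIfAbsent(term, k -> new LinkedHashSet<>()).addAll(terms);
        }
    }

    /** Loads one set per row; the table and column names are assumptions. */
    static DbSynonymTable load(Connection conn) throws SQLException {
        DbSynonymTable table = new DbSynonymTable();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT synonym_set FROM synonyms")) {
            while (rs.next()) {
                String row = rs.getString(1);
                if (row != null) {
                    table.addSet(row);
                }
            }
        }
        return table;
    }

    /** Returns the terms equivalent to the given term, or only the term itself. */
    List<String> lookup(String term) {
        String key = term.trim().toLowerCase(Locale.ROOT);
        Set<String> found = synonyms.get(key);
        if (found == null) {
            return Collections.singletonList(key);
        }
        return new ArrayList<>(found);
    }

    /** Number of distinct terms held in memory. */
    int size() {
        return synonyms.size();
    }
}
```

Because everything is held in one map, size() gives a rough idea of how much has to fit in memory, which is the concern raised in the thread. In a real setup the connection details would presumably arrive as extra parameters to the factory's init() method, as suggested above.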