Re: Synonym processing at index time

2010-11-26 Thread Lance Norskog
I gave up trying to utterly totally master the analyzer classes. solr/admin/analysis.jsp allows you to see exactly how your analysis stack processes text, including what it does with synonyms both at index and query times. This is the easiest way to start and maintain this kind of feature; you mig

Re: DIH delta, deltaQuery

2010-11-26 Thread Alexey Serba
Are you sure that it's deltaQuery that's taking a minute? It only retrieves the ids of updated records; deltaImportQuery is then executed once for each returned id (N times in total). You might want to try the following technique - http://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport On Wed, Nov 24,
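The cost pattern described above can be modelled outside Solr. The following is a minimal sketch, not DIH itself: it uses an in-memory SQLite table (the `item` table, its columns, and the timestamps are all hypothetical) to contrast one id query plus N per-id lookups with a single query that filters on the modification time, which is, as far as I understand it, the idea behind the linked wiki technique.

```python
import sqlite3

# Hypothetical table standing in for the source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, last_modified TEXT)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?)",
    [
        (1, "a", "2010-11-20 00:00:00"),
        (2, "b", "2010-11-25 00:00:00"),
        (3, "c", "2010-11-26 00:00:00"),
    ],
)
last_index_time = "2010-11-24 00:00:00"  # assumed time of the previous import


def delta_style(conn, since):
    """One query for the changed ids, then one lookup per id (1 + N queries)."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM item WHERE last_modified > ? ORDER BY id", (since,)
        )
    ]
    rows = [
        conn.execute("SELECT id, name FROM item WHERE id = ?", (i,)).fetchone()
        for i in ids
    ]
    return rows, 1 + len(ids)


def full_import_style(conn, since):
    """A single query that fetches the changed rows directly."""
    rows = conn.execute(
        "SELECT id, name FROM item WHERE last_modified > ? ORDER BY id", (since,)
    ).fetchall()
    return rows, 1


delta_rows, delta_queries = delta_style(conn, last_index_time)
full_rows, full_queries = full_import_style(conn, last_index_time)
```

Both functions return the same rows; only the number of round trips differs, which is where the time tends to go when N is large.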

Re: Basic Solr Configurations and best practice

2010-11-26 Thread Alexey Serba
> 1- How to combine data from DIH and content extracted from file system > document into one document in the index? http://wiki.apache.org/solr/TikaEntityProcessor You can have one sql entity that retrieves metadata from database and another nested entity that parses binary file into additiona

Re: using DIH with mets/alto file sets

2010-11-26 Thread Alexey Serba
> The idea is to create a full text index of the alto content, accompanied by > the author/title info from the mets file for purposes of results display. - Then you need to list only alto files in your landscapes entity (fileName="^ID.{3}-ALTO\d{3}.xml$" or something like that), because you don't
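To see what the fileName pattern quoted above would and would not select, here is a small sketch using Python's `re` module as a stand-in for the Java regex engine that DIH uses (an assumption that holds for this simple pattern). The file names are invented for illustration. Note that the `.` before `xml` is unescaped in the quoted pattern, so it matches any character; the stricter variant is my guess at the intent.

```python
import re

# Pattern as quoted in the message; the '.' before "xml" matches any character.
QUOTED = re.compile(r"^ID.{3}-ALTO\d{3}.xml$")
# Variant with the dot escaped (assumed intent, not from the original message).
STRICT = re.compile(r"^ID.{3}-ALTO\d{3}\.xml$")

# Hypothetical file names, invented for illustration only.
names = ["ID001-ALTO001.xml", "ID001-METS.xml", "ID001-ALTO001Zxml"]


def matching(pattern, candidates):
    """Return the candidates that the pattern accepts, in input order."""
    return [name for name in candidates if pattern.search(name)]


quoted_hits = matching(QUOTED, names)
strict_hits = matching(STRICT, names)
```

Here `quoted_hits` keeps both ALTO-like names, while `strict_hits` keeps only the one that really ends in `.xml`; the mets file is excluded in both cases.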

Re: Logging queries and hit count

2010-11-26 Thread Ahmet Arslan
> Is it possible to create a lean log file for queries and > the number of > hits these queries returned? > > We are running Solr under Tomcat. I believe that many people do it on the client side. But Tomcat already logs that info. If you set Tomcat's log level to INFO you can extract hits, QTime an
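Extracting hits and QTime from the INFO-level log could look roughly like the sketch below. The sample lines are an assumption about what Solr request log entries look like; the exact format depends on the Solr version and the logging configuration, so the regular expression would need adjusting against a real log.

```python
import re

# Assumed shape of Solr request log lines (not copied from a real log).
SAMPLE = """INFO: [] webapp=/solr path=/select params={q=fund+manager&rows=10} hits=42 status=0 QTime=7
INFO: [] webapp=/solr path=/update params={} status=0 QTime=15
INFO: [] webapp=/solr path=/select params={q=guide} hits=0 status=0 QTime=1"""

# Only lines that carry a hits= field (i.e. searches) are of interest.
LINE = re.compile(
    r"params=\{(?P<params>.*?)\} hits=(?P<hits>\d+) status=\d+ QTime=(?P<qtime>\d+)"
)


def lean_log(text):
    """Return (params, hits, qtime) tuples for every query line in the log text."""
    rows = []
    for line in text.splitlines():
        match = LINE.search(line)
        if match:
            rows.append(
                (match.group("params"), int(match.group("hits")), int(match.group("qtime")))
            )
    return rows


rows = lean_log(SAMPLE)
```

The update request is skipped because it has no `hits=` field, so the result is a lean list of queries with their hit counts and query times.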

Logging queries and hit count

2010-11-26 Thread Marian Steinbach
Hi! Is it possible to create a lean log file for queries and the number of hits these queries returned? We are running Solr under Tomcat. Thanks! Marian

RE: Synonym Filtering on String Fields

2010-11-26 Thread Jason Brown
Thanks Erick - multiple terms generated from my string field is exactly what I want, i.e. I want the single term fund manager summary to be turned into 2 terms -> fund manager summary, fund manager report I want the single term guide to be turned into the 2 terms -> guide, product guide I am using te

Re: Synonym Filtering on String Fields

2010-11-26 Thread Erick Erickson
Besides Ahmet's comments, I have to wonder if you want to do this in a single field? The problem is that you're expanding your synonyms into a field. Let's say you expand "memory" into "memory", "recall" and "RAM". Now you have three tokens in your field. What does faceting mean now? Perhaps you wo
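The faceting concern can be made concrete with a toy model. This is only a sketch in plain Python, not Solr: the synonym table and the three documents are hypothetical, and facet counts are approximated as the number of documents containing each indexed term.

```python
from collections import Counter

# Hypothetical synonym table and field values, for illustration only.
SYNONYMS = {"memory": ["memory", "recall", "RAM"]}
docs = ["memory", "memory", "disk"]


def index_terms(value, expand):
    """Terms stored for one field value, with or without index-time expansion."""
    return SYNONYMS.get(value, [value]) if expand else [value]


def facet_counts(values, expand):
    """Count, per indexed term, how many documents contain it."""
    counts = Counter()
    for value in values:
        counts.update(set(index_terms(value, expand)))
    return dict(counts)


plain = facet_counts(docs, expand=False)
expanded = facet_counts(docs, expand=True)
```

Without expansion the facet lists `memory` and `disk`; with expansion every document indexed as "memory" also shows up under `recall` and `RAM`, so the facet counts no longer describe distinct original values.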

Re: Synonym Filtering on String Fields

2010-11-26 Thread Ahmet Arslan
Two things can be done. 1 or 2. 1-) You can use the tokenizerFactory attribute of the synonym filter factory. 2-) You can escape white spaces in synonyms.txt: fund\ manager\ summary, fund\ manager\ report --- On Fri, 11/26/10, Jason Brown wrote: > From: Jason Brown > Subject: Synonym Filteri
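For option 2, the escaped lines can be generated rather than typed by hand. A minimal sketch, assuming the phrases come from some list you already have; the helper names are made up for this example:

```python
def escape_term(term):
    """Backslash-escape each space so the phrase stays a single term in synonyms.txt."""
    return term.strip().replace(" ", "\\ ")


def synonym_line(terms):
    """Join escaped terms into one comma-separated synonyms.txt line."""
    return ", ".join(escape_term(term) for term in terms)


line = synonym_line(["fund manager summary", "fund manager report"])
print(line)
```

This prints the same line shown in the message above, `fund\ manager\ summary, fund\ manager\ report`.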

Re: DIH : Delta import don't process the updated documents

2010-11-26 Thread stockii
Hey. I know this problem. DIH doesn't commit your documents; if they're not committed, there are no changes in the index. In my case, I had a broken delta query, with some stupid mistakes like id = '{$dataimporter.delta.id}' or deltaimporter.delta.id. Check your delta ID and DIH should commit your changes. DIH sa

Synonym Filtering on String Fields

2010-11-26 Thread Jason Brown
I have the following field type set up in my schema. The idea is to fire phrases of text such as 'fund manager summary' (without the quotes) at it, and for the synonym processing to recognise this, and add the rest of the synonyms (index-time synonym processing with expansion) to the index from