Substring in filter query (fq)...

2009-05-26 Thread escher2k
I have a requirement where I have to search a product catalog and filter on a substring. I am not sure if this is supported. Any help is appreciated. I have included the use case below. Example: Root cat -> Digital -> Watch -> "Timex Essentials" Root cat -> Men -> Accessories -> "Timex Essent

Substring in filter query...

2009-05-26 Thread escher2k
I have a requirement where I have to search a product catalog and filter on a substring. I am not sure if this is supported. Any help is appreciated. I have included the use case below. Example: Root cat -> Digital -> Watch -> "Timex Essentials" Root cat -> Men -> Accessories -> "Timex Essentials

Use Windows 1252 encoding...

2007-06-22 Thread escher2k
Is it possible to use Windows 1252 encoding instead of UTF-8 for Solr ? The application runs on Linux/JDK 1.5. We are using PHP for the front end. The problem we are having is that some characters are displayed weirdly owing to the encoding. Thanks. -- View this message in context: http://www.n

Re: OR condition in search...

2007-05-07 Thread escher2k
Hoss, This is not entirely related to the previous question. I know that the Analyzer configuration is causing the interpretation to be the way it is. In certain cases, where the typed in field might have user name such as "kathy_k", we want it to look for the exact expression in addition to wh

OR condition in search...

2007-05-04 Thread escher2k
Is it possible to specify that a term be looked up in alternate ways - e.g. search = "3 D" OR "3D" ? Reason being, by default, a search for "3D" is being split into "3 D". Thanks. -- View this message in context: http://www.nabble.com/OR-condition-in-search...-tf3695012.html#a10332814 Sent

Question about word treatment...

2007-05-04 Thread escher2k
(1) How does one ensure that Solr treats words like .Net and 3D correctly ? Right now, they get translated into Net and 3 d respectively. (2) Is it possible to force Lucene to treat a multiword (e.g. Ruby on Rails) as one word ? I am not sure if there is a mechanism to do this by creating a speci

Specifying no-ops...

2007-04-30 Thread escher2k
I want to capture information about the user who is executing a particular search. Is there a way to specify in Solr that certain fields should just be treated as pass through and not processed ? This way I can use arbitrary params to do better logging. Thanks. -- View this message in context:

Re: Delete from Solr index...

2007-04-30 Thread escher2k
Thanks Ryan. I need to use query since I am deleting a range of documents. From your comment, I wasn't sure if one doesn't need to do an explicit commit when using delete by query. Does delete by query not need an explicit commit ? Thanks. ryan mckinley wrote: > > esch

Faceted count syntax (exclude zeros)...

2007-04-30 Thread escher2k
I am trying to execute a faceted count on a field called "load_id" and want to exclude 0s. The URL below doesn't seem to be excluding zeros. http://localhost:12002/solr/select/?qt=dismax&q=Y&qf=show_all_flag&fl=load_id&facet=true&facet.limit=-1&facet.field=load_id&facet.mincount=1&rows=0 Result

Delete from Solr index...

2007-04-30 Thread escher2k
I am trying to remove documents from my index using "delete by query". However when I did this, the deleted items seem to remain. This is the format of the XML file I am using - load_id:20070424150841 load_id:20070425145301 load_id:20070426145301 load_id:20070427145302 load_id:20070428145301 load
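
The XML tags in the post above were stripped by the archive, so the exact file is not recoverable. As a rough sketch only: a Solr XML delete-by-query message wraps the query in <delete><query>...</query></delete>, and the deletion becomes visible to searchers only after a commit message is posted as well. The helper name and the choice of one delete message per load_id are assumptions made for illustration; the load_id value is taken from the post.

```python
import xml.etree.ElementTree as ET


def delete_by_query_xml(query):
    """Build a Solr XML delete-by-query message (hypothetical helper)."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    return ET.tostring(delete, encoding="unicode")


# A commit message has to follow the delete before searches reflect it.
COMMIT_XML = "<commit/>"

# Messages in the order they would be posted to the update handler.
messages = [delete_by_query_xml("load_id:20070424150841"), COMMIT_XML]

for message in messages:
    print(message)
```

Each message would be sent as its own POST to the update handler; the sketch only builds the payloads and does not contact a server.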

Additive scoring using Dismax...

2007-04-26 Thread escher2k
I am trying to search across multiple fields using the AND operator. Somehow, when the results are returned, the score seems to be retrieving the max value and not really adding them up. In the example given below, the value that is returned (825) is really the max instead of what I was expecting

Re: Filter question...

2007-04-19 Thread escher2k
Thanks Mike. I just tested it on one field and looks like it works fine. Mike Klaas wrote: > > On 4/19/07, escher2k <[EMAIL PROTECTED]> wrote: >> >> Thanks Jennifer. But the issue with the quotes would be that it would >> match >> the string exactly and

Re: Filter question...

2007-04-19 Thread escher2k
Thanks Chris. We are using dismax already :) Chris Hostetter wrote: > > > : not find it, if there were other words in between (e.g. New Capital > Delhi). > > then you should use field:"New Delhi"~3 or (+field:New +field:Delhi) what > you have now is going to match any docs that have "New" in

Re: Filter question...

2007-04-19 Thread escher2k
Thanks Jennifer. But the issue with the quotes would be that it would match the string exactly and not find it, if there were other words in between (e.g. New Capital Delhi). Jennifer Seaman wrote: > > >>Is there a way to only retrieve those records that contain both the >>words "New" and "De

Filter question...

2007-04-19 Thread escher2k
I have a bunch of fields that I am trying to filter on. When I try to filter the data across the multiple fields, the result seems to even retrieve fields where the data is not present. For instance if the filter query contains this - primary_state:New Delhi OR primary_country:New Delhi OR prima

Changing encoding norms and boosting...

2007-03-29 Thread escher2k
This is related to an earlier posting (http://www.nabble.com/Document-boost-not-as-expected...-tf3476653.html). I am trying to determine a ranking for users that is between 1 and 1.5. Because of the way the encoding norm is stored, if index time boosting is done, everyone gets a score of 1, 1.25 o
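
The coarse steps described above (1, 1.25, ...) are what a one-byte norm encoding would produce. The following is only a sketch, written from my understanding of the "3 mantissa bits, zero exponent 15" byte encoding Lucene used for norms at the time; the function names are illustrative and this is not the library's code. It shows how boosts between 1 and 1.5 collapse onto a handful of values.

```python
import struct


def float_to_byte315(f):
    """Squeeze a float into one byte, truncating the mantissa (sketch)."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> 21                 # drop the low 21 bits of the float
    fzero = (63 - 15) << 3             # offset of the smallest encodable exponent
    if small <= fzero:
        return 0 if bits <= 0 else 1   # zero/negative -> 0, underflow -> 1
    if small >= fzero + 0x100:
        return 255                     # overflow -> largest byte value
    return small - fzero


def byte315_to_float(b):
    """Expand the one-byte value back to a float (sketch)."""
    if b == 0:
        return 0.0
    bits = (b & 0xFF) << 21
    bits += (63 - 15) << 24
    return struct.unpack(">f", struct.pack(">i", bits))[0]


for boost in (1.0, 1.1, 1.2, 1.3, 1.4, 1.5):
    print(boost, "->", byte315_to_float(float_to_byte315(boost)))
```

Under these assumptions every boost in [1.0, 1.25) is stored as 1.0 and every boost in [1.25, 1.5) as 1.25, which matches the behaviour reported in the post.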

Re: Document boost not as expected...

2007-03-29 Thread escher2k
31072 121359 Doc 2 123194.06 114688 106189 The difference between the results is because I am ignoring the length norm (changed it from (float)(1.0 / Math.sqrt(numTerms)) to 1.0f). Thanks once again. Mike Klaas wrote: > > On 3/28/07, escher2k <[E

Re: Document boost not as expected...

2007-03-28 Thread escher2k
Mike, I am not doing anything custom for this test. I am assuming that the Default Similarity is used. Surprisingly, if I remove the document level boost (set to 1.0) and just have a field level boost, the result seems to be correct. Mike Klaas wrote: > > On 3/28/07, escher2k &

Re: Document boost not as expected...

2007-03-28 Thread escher2k
Chris, Earlier I was trying to modify the Similarity computation to make it field dependent (we are trying to change tf based on the field). Now, I have reverted the custom computation so that the default Similarity is used. For testing, I boosted a single field in one doc. Y ... This is w

Document boost not as expected...

2007-03-27 Thread escher2k
I am implementing a document boost at indexing time for the documents. I read some posting that seemed to indicate that omitNorm=false is needed to retain the document boosting for retrieval. After I did that, it looks like I am not able to get back the boost I originally put in. Instead, I get 1.

Re: Filter query doesn't always work...

2007-03-27 Thread escher2k
. > > You can find some documentation about it in the example schema.xml: > http://svn.apache.org/viewvc/lucene/solr/trunk/example/solr/conf/schema.xml > > mirko > > > Quoting escher2k <[EMAIL PROTECTED]>: > >> >> I have a strange problem, and I don't seem

Filter query doesn't always work...

2007-03-27 Thread escher2k
I have a strange problem, and I don't seem to see any issue with the data. I am filtering on a field called reviews_positive_6_mos. The field is declared as an integer. If I specify - (a) fq=reviews_positive_6mos%3A[*+TO+*] => 36033 records are retrieved. (b) fq=reviews_positive_6mos%3A[*+TO+100]

RE: Incremental replication...

2007-02-13 Thread escher2k
Graham Stead-2 wrote: > > We have used replication for a few weeks now and it generally works well. > > I believe you'll find that commit operations cause only new segments to be > transferred, whereas optimize operations cause the entire index to be > transferred. Therefore, the amount of data

Incremental replication...

2007-02-13 Thread escher2k
I was wondering if the scripts provided in Solr do incremental replication. Looking at the script for snapshooter, it seems like the whole index directory is copied over. Is that correct ? If so, isn't performance a problem over the long run ? Thanks for the clarification in advance (I hope I am w

Production application server recommendation...

2007-02-08 Thread escher2k
Hi, Thanks to the excellent support from the community and the application, we have made good progress towards building a solution using Solr. We currently use Lucene with Jakarta. I was wondering if anyone has recommendations on using Jetty vs. Jakarta. This will run on Solaris. Thanks. -

OR filtering...

2007-01-29 Thread escher2k
Hi, I have a question about the syntax for doing an OR filter in my URL. How do I specify where ((fq=colA[10 TO 20]) AND (fq=state:USA OR fq=country:USA) ? Basically, I am doing a search for a keyword across certain fields and I want to filter the result set. The user can input city/state/count
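
One way to express the filter asked about above, as a hedged sketch: Solr applies every fq parameter as a separate filter and intersects them, so the range goes in one fq and the OR clause goes inside a second fq. The field names come from the post; the host, port, keyword and helper name are placeholders.

```python
from urllib.parse import urlencode


def build_filter_url(base_url, keyword, filters):
    """Return a select URL with one fq parameter per filter (hypothetical helper)."""
    params = [("q", keyword)] + [("fq", f) for f in filters]
    return base_url + "?" + urlencode(params)


url = build_filter_url(
    "http://localhost:8983/solr/select",   # placeholder host and port
    "keyword",                             # placeholder search term
    [
        "colA:[10 TO 20]",                 # range filter
        "state:USA OR country:USA",        # OR lives inside a single fq
    ],
)
print(url)
```

Keeping the two conditions in separate fq parameters also lets Solr cache each filter independently.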

Bucketing result set (User list posting)...

2007-01-17 Thread escher2k
I have a requirement wherein the documents that are retrieved based on the similarity computation are bucketed and resorted based on user score. An example - Let us say a search returns the following data set - Doc ID Lucene score User score 10001000 125 1000 900

Re: Faceting question...

2007-01-15 Thread escher2k
Thanks Chris. DUMB of me not to have noticed. Chris Hostetter wrote: > > > : : omitNorms="true"/> > : > > your "fhild_catname" isn't using "string" as it's field type -- it's using > "text" (which is most likely using TextField and being tokenized) > > > > > > -Hoss > > > -- View

Faceting question...

2007-01-15 Thread escher2k
I have a document which contains the following facet - Other - Programming When I get back the results, the output seems to show <lst name="facet_fields"> 1 1 Is there any way to prevent "other" and "programming" from being returned as tokens ? The schema.xml defined this as a string

Lucene Solr version question...

2007-01-05 Thread escher2k
Is the latest Lucene nightly build compatible with the Solr ? If not, is there any file in the download, which lists the lucene build that is bundled with Solr ? Thanks. -- View this message in context: http://www.nabble.com/Lucene-Solr-version-question...-tf2927137.html#a8183274 Sent from the

Re: Custom scorer...

2007-01-04 Thread escher2k
Yonik, I have my own Similarity and now need to write my own Scorer for multi-field scoring. Is extending DisMaxQueryScorer the way to go ? Thanks. Yonik Seeley wrote: > > On 1/3/07, escher2k <[EMAIL PROTECTED]> wrote: >>I only saw options to specify the default operat

Custom function...

2007-01-03 Thread escher2k
Hi, I am trying to create a linear function to influence the similarity computation. For example - if tf = 4, f(tf) = 150 * 1 + 150 * 0.3 = 195 The first occurrence is multiplied by 150. The next three occurrences are multiplied by 150 and divided by 10 (3/10). Howev
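
The arithmetic in the example above can be checked with a few lines. This is a sketch that generalises the single worked case (tf = 4) to any tf, which is an assumption since the rest of the message is cut off; the function name and parameter names are made up for illustration.

```python
def linear_tf(tf, weight=150, damping=10):
    """First occurrence counts in full; each further one counts weight / damping."""
    if tf <= 0:
        return 0.0
    return weight * 1 + weight * (tf - 1) / damping


print(linear_tf(4))  # 150 * 1 + 150 * 3 / 10
```

For tf = 4 this gives 150 + 45 = 195, the value quoted in the post.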

Re: Custom scorer...

2007-01-03 Thread escher2k
I haven't written any unfortunately. Right now, I have only implemented custom Similarity - in here, I am trying to alter the scoring substantially depending on the field. However, I might need to alter the overall document score - I am in the process of investigating if I can get away without ha

Multi column search...

2007-01-03 Thread escher2k
What is the syntax to use to specify a search across multiple fields in a URL ? This is in case someone needs to alter the "searchable" fields dynamically as opposed to reading them at server startup from "qf" in solrconfig.xml. Also, it is easier at debug time :) Thanks. -- View this m

Re: Question about similarity manipulation...

2007-01-03 Thread escher2k
Chris Hostetter wrote: > > > : The DisjunctionMaxQuery seems to yield the maximum score only. From my > > NOTE: by setting the "tiebreaker" value of a DisjunctionMaxQuery to "1.0" > it generates the sum of the scores > > : understanding, I would > : need to do the following - > : (1) Create a
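
The note about the tiebreaker can be illustrated with a small model of how a DisjunctionMaxQuery combines per-field scores: the best score, plus the tiebreaker times the sum of the remaining ones. This is a sketch of the formula only, not Lucene code; the per-field scores other than 825 (mentioned in an earlier post) are invented for the example.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores: max + tie * (sum of the other scores)."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    return best + tie * (sum(field_scores) - best)


scores = [825.0, 100.0, 50.0]          # hypothetical per-field scores
print(dismax_score(scores, tie=0.0))   # pure max
print(dismax_score(scores, tie=1.0))   # plain sum of all field scores
```

With a tiebreaker of 0.0 only the best field counts; with 1.0 the result is the additive score the original question was after.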

Re: Custom scorer...

2007-01-03 Thread escher2k
Yonik, I only saw options to specify the default operator (AND|OR) and to specify similarity. There was nothing for scoring - in case I need to modify the scoring. I got the similarity working by creating my own class. Thanks. Yonik Seeley wrote: > > On 1/2/07, escher2k <[EMAIL

Custom scorer...

2007-01-02 Thread escher2k
Is there a way to specify a custom scorer in Solr using the XML configuration file ? I want to use the AND operator (e.g. "Lucene AND Solr") to search multiple fields and then add the similarity score for each field using a custom similarity computation for each. Thanks. -- View this message in

Question about similarity manipulation...

2007-01-02 Thread escher2k
We have a requirement that requires an additive score and I am not sure if it is possible and what the right way to go about it is. Assume, there are three fields - (a) Project name - contains text (b) Project description - contains text (c) Profile score - numeric The basic idea is to implement

Function boosts...

2006-12-28 Thread escher2k
I had a question about the way boosting works - is it a final boost on the score that is returned ? For instance, in the LinearFloatFunction (LinearFloatFunction(ValueSource source, float slope, float intercept)), is the ValueSource the "core" score returned by Lucene that gets boosted ? From

Re: Realtime directory change...

2006-12-28 Thread escher2k
Thanks Chris. So, assuming that we rebuild the index, delete the old data and then execute a commit, will the snap scripts take care of reconciling all the data ? Internally, is there an update timestamp notion used to figure out which unique id records have changed and then synchronize them by ex

Boosting using function query...

2006-12-27 Thread escher2k
Hi, I have three documents in my test data set. - 30ABCDE XYZ GHI - 40abcde XYZ GHI - 50abcde XYZ I used the standard handler to retrieve the documents where "ghi" or "xyz" are present (http://localhost:8983/solr/select/?q=ghi+xyz&debugQuery=1). When the results are return

Re: Multiple indexes...

2006-12-21 Thread escher2k
separate context path. for example, you > could have: > > http://xyz:8765/index1/select/?q=xxx > http://xyz:8765/index2/select/?q=xxx > http://xyz:8765/index3/select/?q=xxx > > > > > On 12/21/06, escher2k <[EMAIL PROTECTED]> wrote: >> >> >&

Re: Multiple indexes...

2006-12-21 Thread escher2k
. Erik Hatcher wrote: > > What is the advantage to running multiple indexes from a single Solr > instance over multiple Solr instances each serving a single index? > > Erik > > > On Dec 21, 2006, at 3:26 PM, escher2k wrote: > >> >> I looked

Re: Realtime directory change...

2006-12-21 Thread escher2k
on the same machine while the application is still running ? Ideally, what we would want is to recreate a new index from scratch and then use the master/slave configuration to copy the indexes to other machines. Yonik Seeley wrote: > > On 12/21/06, escher2k <[EMAIL PROTECTED]>

Multiple indexes...

2006-12-21 Thread escher2k
I looked at the forums and found that it is not possible to have multiple indexes associated with one app server instance ? Is the best way to run multiple app server instances ? It would be a nice enhancement to support parameterization of the index to be used. -- View this message in context:

Realtime directory change...

2006-12-21 Thread escher2k
Hi, We currently use Lucene to index user data every couple of hours - the index is completely rebuilt, the old index is archived and the new one copied over to the directory. Example - /bin/cp ${LOG_FILE} ${CRON_ROOT}/index/help/ /bin/rm -rf ${INDEX_ROOT}/archive/help.${DATE} /bin/cp -R ${C