Re: Spell Check Handler

2007-10-10 Thread scott.tabar
Hoss, I had a feeling someone would be quoting Yonik's Law of Patches! ;-) For now, this is done. I created the changes, created JavaDoc comments on the various settings and their expected output, created a JUnit test for the SpellCheckerRequestHandler which tests various components of the ha

Re: Facets and running out of Heap Space

2007-10-10 Thread Yonik Seeley
On 10/10/07, Mike Klaas <[EMAIL PROTECTED]> wrote: > Have you tried setting multivalued=true without reindexing? I'm not > sure, but I think it will work. Yes, that will work fine. One thing that will change is the response format for stored fields val1 instead of val1 Hopefully in the future we

Re: Syntax for newSearcher query

2007-10-10 Thread BrendanD
Awesome! Thanks! hossman wrote: > > > : looking queries that I'm not quite sure how to specify in my > solrconfig.xml > : file in the newSearcher section. > > : > rows=20&start=0&facet.query=attribute_id:1003278&facet.query=attribute_id:1003928&sort=merchant_count+desc&facet=true&facet.field=

Re: Syntax for newSearcher query

2007-10-10 Thread Chris Hostetter
: looking queries that I'm not quite sure how to specify in my solrconfig.xml : file in the newSearcher section. : rows=20&start=0&facet.query=attribute_id:1003278&facet.query=attribute_id:1003928&sort=merchant_count+desc&facet=true&facet.field=min_price_cad_rounded_to_tens&facet.field=manufactu

Syntax for newSearcher query

2007-10-10 Thread BrendanD
Hi, The examples that I've found in the solrconfig.xml file and on this site are fairly basic for pre-warming specific queries. I have some rather complex looking queries that I'm not quite sure how to specify in my solrconfig.xml file in the newSearcher section. Here's an example of 3 queries t

Re: [ADMIN] - Spam problems?

2007-10-10 Thread Chris Hostetter
: Around Sept. 20 I started getting Japanese spam to this account. This is : a special account I only use for the Solr and Lucene user mailing : lists. Did anybody else get these, starting around 9/20? Note that many mailing list archives leave the sender emails in plain text (which results in

Re: Facets and running out of Heap Space

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 3:46 PM, David Whalen wrote: I'll see what I can do about that. Truthfully, the most important facet we need is the one on media_type, which has only 4 unique values. The second most important one to us is location, which has about 30 unique values. So, it would seem like we

RE: Facets and running out of Heap Space

2007-10-10 Thread David Whalen
I'll see what I can do about that. Truthfully, the most important facet we need is the one on media_type, which has only 4 unique values. The second most important one to us is location, which has about 30 unique values. So, it would seem like we actually need a counter-intuitive solution. That

Re: quick allowDups questions

2007-10-10 Thread Ryan McKinley
the default solrj implementation should do what you need. As for Solrj, you're probably right, but I'm not going to take any chances for the time being. The server.add method has an optional Boolean flag named "overwrite" that defaults to true. Without knowing for sure what it does, I'm not goi

Re: Facets and running out of Heap Space

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 2:40 PM, David Whalen wrote: According to Yonik I can't use minDf because I'm faceting on a string field. I'm thinking of changing it to a tokenized type so that I can utilize this setting, but then I'll have to rebuild my entire index. Unless there's some way around that?

Internal Server Error and waitSearcher="false" for commit/optimize

2007-10-10 Thread Jason Rennie
Hello, We're using solr 1.2 and a nightly build of the solrj client code. We very occasionally see things like this: org.apache.solr.client.solrj.SolrServerException: Error executing query at org.apache.solr.client.solrj.request.QueryRequest.process( QueryRequest.java:86) at org.

Re: start tag not allowed in epilog

2007-10-10 Thread BrendanD
I've re-written the code to generate separate files. One for adds and one for deletes. And this is working well for us now. Thanks. Mike Klaas wrote: > > > This would be very complicated from a standpoint of returning errors > to the client. > > Keep in mind the can never be batched, rega

Re: Different search results for (german) singular/plural searches - looking for a solution

2007-10-10 Thread Daniel Naber
On Wednesday 10 October 2007 12:00, Martin Grotzke wrote: > Basically I see two options: stemming and the usage of synonyms. Are > there others? A large list of German words and their forms is available from a Windows software called Morphy (http://www.wolfganglezius.de/doku.php?id=public:cl:mo

RE: Facets and running out of Heap Space

2007-10-10 Thread David Whalen
According to Yonik I can't use minDf because I'm faceting on a string field. I'm thinking of changing it to a tokenized type so that I can utilize this setting, but then I'll have to rebuild my entire index. Unless there's some way around that? > -Original Message- > From: Mike Kla

RE: quick allowDups questions

2007-10-10 Thread Charlie Jackson
Thanks for the response, Mike. A quick test using the example app confirms your statement. As for Solrj, you're probably right, but I'm not going to take any chances for the time being. The server.add method has an optional Boolean flag named "overwrite" that defaults to true. Without knowing for

Re: start tag not allowed in epilog

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 12:49 PM, BrendanD wrote: We simply process a queue of updates from a database table. Some of the updates are deletes, some are adds. Sometimes you can have many deletes in a row, sometimes many adds in a row, and sometimes a mixture of deletes and adds. We're trying to b

Re: quick allowDups questions

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 1:11 PM, Charlie Jackson wrote: Anyway, I need to update some docs in my index because my client program wasn't accurately putting these docs in (values for one of the fields was missing). I'm hoping I won't have to write additional code to go through and delete each existing

Re: Facets and running out of Heap Space

2007-10-10 Thread Mike Klaas
On 10-Oct-07, at 12:19 PM, David Whalen wrote: It looks now like I can't use facets the way I was hoping to because the memory requirements are impractical. I can't remember if this has been mentioned, but upping the HashDocSet size is one way to reduce memory consumption. Whether this wi

quick allowDups questions

2007-10-10 Thread Charlie Jackson
Normally this is the type of thing I'd just scour through the online docs or the source code for, but I'm under the gun a bit. Anyway, I need to update some docs in my index because my client program wasn't accurately putting these docs in (values for one of the fields was missing). I'm hoping

Re: WebException (ServerProtocolViolation) with SolrSharp

2007-10-10 Thread Jeff Rodenburg
Hi Felipe - The issue you're encountering is a problem with the data format being passed to the solr server. If you follow the stack trace that you posted, you'll notice that the solr field is looking for a value that's a float, but the passed value is "1,234". I'm guessing this is caused by one

Re: start tag not allowed in epilog

2007-10-10 Thread BrendanD
We simply process a queue of updates from a database table. Some of the updates are deletes, some are adds. Sometimes you can have many deletes in a row, sometimes many adds in a row, and sometimes a mixture of deletes and adds. We're trying to batch our updates and were hoping to avoid having to

Re: start tag not allowed in epilog

2007-10-10 Thread Chris Hostetter
: Does anyone know how to correct this? Is it not possible to have multiple : different top-level tags in the same update xml file? It seems to me like it : should work, but perhaps there's something inherently bad about this from : the XMLStreamReader's point of view. it's inherently bad from th
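
As a rough illustration of why a parser rejects such a file: an XML document may have only one root element, so a second top-level tag after the first one closes is an error. The sketch below uses Python's standard library rather than the XMLStreamReader discussed in the thread, and the sample payloads and the helper name is_well_formed are made up for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(payload: str) -> bool:
    """Return True if payload parses as a single XML document."""
    try:
        ET.fromstring(payload)
        return True
    except ET.ParseError:
        return False


# Hypothetical payloads; the ids and field names are illustrative only.
mixed = (
    "<delete><id>1</id></delete>"
    '<add><doc><field name="id">2</field></doc></add>'
)
deletes_only = "<delete><id>1</id></delete>"
adds_only = '<add><doc><field name="id">2</field></doc></add>'

for label, payload in [("mixed", mixed),
                       ("deletes only", deletes_only),
                       ("adds only", adds_only)]:
    # The mixed payload has two root elements and fails to parse.
    print(label, is_well_formed(payload))
```

Sending the deletes and the adds as separate documents, which BrendanD reports doing elsewhere in this thread, avoids the problem.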

start tag not allowed in epilog

2007-10-10 Thread BrendanD
Hi, I've got an xml update document that I'm sending to solr's update handler with deletes and adds in it. For example: 12345678 And I'm getting the following exception in the catalina.out log: Oct 10, 2007 12:58:22 PM org.apache.solr.common.SolrException log SEVERE: javax.xml.stream.

RE: Facets and running out of Heap Space

2007-10-10 Thread David Whalen
It looks now like I can't use facets the way I was hoping to because the memory requirements are impractical. So, as an alternative I was thinking I could get counts by doing rows=0 and using filter queries. Is there a reason to think that this might perform better? Or, am I simply moving the p
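
For readers unfamiliar with the approach, a count-only request of this kind might be built roughly as follows. This is a sketch, not a statement about performance: the host, port, path and the value "video" are assumptions, the helper name count_query_url is hypothetical, and only the media_type field name comes from the thread.

```python
from urllib.parse import urlencode


def count_query_url(base_url: str, main_query: str, filter_query: str) -> str:
    """Build a select URL that asks for zero rows, so only the hit count matters."""
    params = [
        ("q", main_query),     # the main query
        ("fq", filter_query),  # filter query restricting the result set
        ("rows", 0),           # return no documents, just the count
    ]
    return base_url + "?" + urlencode(params)


# Assumed local Solr URL and an illustrative filter value.
url = count_query_url("http://localhost:8983/solr/select", "*:*", "media_type:video")
print(url)
```

One such request would be needed per value being counted, which is practical for a field like media_type with only a handful of values.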

WebException (ServerProtocolViolation) with SolrSharp

2007-10-10 Thread Filipe Correia
Hello, I am trying to run SolrSharp's example application but am getting a WebException with a ServerProtocolViolation status message. After some debugging I found out this is happening with a call to: http://localhost:8080/solr/update/ And using fiddler[1] found out that solr is actually throwi

Re: getting number of stored documents via rest api

2007-10-10 Thread Chris Hostetter
: I think search for "*:*" is the optimal code to do it. I don't think you can : do anything faster. FYI: getting the data from the xml returned by stats.jsp is definitely faster in the case where you really want all docs. if you want the total number from some other query however, don't "count

Re: getting number of stored documents via rest api

2007-10-10 Thread Chris Hostetter
: there a fast & easy way to retrieve this number (instead of searching for : "*:*" and counting the results)? NOTE: you don't have to count the results to know the total number of docs matching any query ... just use the numFound attribute of the <result> block. : I already took a look at the stats.j
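
A minimal sketch of reading that attribute from an XML response, using Python's standard library. The sample response body is hypothetical and trimmed to the relevant parts, and the helper name num_found is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body; a real one carries more header fields.
SAMPLE = """<response>
  <lst name="responseHeader"><int name="status">0</int></lst>
  <result name="response" numFound="1234" start="0"/>
</response>"""


def num_found(xml_text: str) -> int:
    """Return the total hit count reported in the <result> element."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    if result is None:
        raise ValueError("no <result> element in response")
    return int(result.attrib["numFound"])


print(num_found(SAMPLE))
```

Requesting zero rows along with the query keeps the response small, since the count does not depend on how many documents are returned.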

Re: Problems with mySolr Wiki

2007-10-10 Thread Chris Hostetter
i'm not very familiar with that wiki, but note the line in the example ant script... ... : --solr.xml <-- Where i can find this file? according to the wiki page... > First we will setup a basic directory structure (assuming we only want to > change some fields) and copy the attached buil

Re: getting number of stored documents via rest api

2007-10-10 Thread climbingrose
I think search for "*:*" is the optimal code to do it. I don't think you can do anything faster. On 10/11/07, Stefan Rinner <[EMAIL PROTECTED]> wrote: > > Hi > > for some tests I need to know how many documents are stored in the > index - is there a fast & easy way to retrieve this number (instead

getting number of stored documents via rest api

2007-10-10 Thread Stefan Rinner
Hi for some tests I need to know how many documents are stored in the index - is there a fast & easy way to retrieve this number (instead of searching for "*:*" and counting the results)? I already took a look at the stats.jsp code - but there the number of documents is retrieved via an api

Re: Availability Issues

2007-10-10 Thread Otis Gospodnetic
Hi, - Original Message From: David Whalen <[EMAIL PROTECTED]> On that note -- I've read that Jetty isn't the best servlet container to use in these situations, is that your experience? OG: In which situations? Jetty is great, actually! (the pretty high traffic site in my sig runs Jett

Re: problems with arabic search

2007-10-10 Thread Grant Ingersoll
Hmmm, by the looks of your query, it doesn't seem like it is a Solr query, but I admit I don't have all the parameters memorized. What request handler, etc. are you using? Have you tried debugging? And you say you have tried a query with the Solr Admin query page, right? And that works?

RE: problems with arabic search

2007-10-10 Thread Heba Farouk
In firefox, character encoding is set to UTF-8 Yes, I'm sending the query directly to solr using apache httpclient and I set the http request header content type to : Content-Type="text/html; charset=UTF-8" Any suggestions Thanks in advance -Original Message- From: Grant Ingersoll [mailt

Re: problems with arabic search

2007-10-10 Thread Grant Ingersoll
Can you give more detail about what you have done? What character encoding do you have your browser set to? In Firefox, do View -> Character Encoding to see what it is set to when you are on the input page? Internet Explorer and other browsers have other options. Are you sending the que

RE: Solr and KStem

2007-10-10 Thread Wagner,Harry
Hi Piete, Good idea. Thanks. One other change that should probably be made is to change the package statement from org.oclc.solr.analysis to org.apache.solr.analysis. Thanks again. Cheers! harry -Original Message- From: Pieter Berkel [mailto:[EMAIL PROTECTED] Sent: Tuesday, October 09,

showing results per facet-value efficiently

2007-10-10 Thread Britske
First of all, I just wanted to say that I just started working with Solr and really like the results I'm getting from Solr (in terms of performance, flexibility) as well as the good responses I'm getting from this group. Hopefully I will be able to contribute in one way or another to this wonderfu

Re: Different search results for (german) singular/plural searches - looking for a solution

2007-10-10 Thread Thomas Traeger
in short: use stemming Try the SnowballPorterFilterFactory with German2 as language attribute first and use synonyms for combined words i.e. "Herrenhose" => "Herren", "Hose". By using stemming you will maybe have some "interesting" results, but it is much better living with them than having

Different search results for (german) singular/plural searches - looking for a solution

2007-10-10 Thread Martin Grotzke
Hello, with our application we have the issue that we get different results for singular and plural searches (German language). E.g. for "hose" we get 1.000 documents back, but for "hosen" we get 10.000 docs. The same applies to "t-shirt" or "t-shirts", or e.g. "hut" and "hüte" - lots of cases :

Re: Manage multiple indexes with Solr

2007-10-10 Thread Venkatraman S
i would be interested to know in both the cases : Case 1 : * document "1", with uniq ID "ui1" will be indexed in the "indexA" * document "2", with uniq ID "ui2" will be indexed in the "indexB" * document "3", with uniq ID "ui3" will be indexed in the "indexA" Case 2 : * document "1", with uniq ID

Re: Manage multiple indexes with Solr

2007-10-10 Thread ycrux
Sorry, there's a mistake in my previous example. Please read this: * document "1", with uniq ID "ui1" will be indexed in the "indexA" * document "2", with uniq ID "ui2" will be indexed in the "indexB" * document "3", with uniq ID "ui3" will be indexed in the "indexA" Thanks cheers Y. Messa

Manage multiple indexes with Solr

2007-10-10 Thread ycrux
Hi guys ! Is it possible to configure Solr to manage different indexes depending on the added documents ? For example: * document "1", with uniq ID "ui1" will be indexed in the "indexA" * document "2", with uniq ID "ui2" will be indexed in the "indexB" * document "3", with uniq ID "ui1" will be

unlockOnStartup does not work in embedded solr?

2007-10-10 Thread Alexey Shakov
Hi *, I use solr as embedded solution. I have set unlockOnStartup to "true" in my solrconfig.xml But it seems, that this option is ignored by embedded solr. Any ideas? Thanks in advance, Alexey

Problems with mySolr Wiki

2007-10-10 Thread Christian Klinger
Hi Solr-Users, i try to follow the instructions [1] from the solr-wiki to build my custom solr server. First i have created the directory-structure. mySolr --solr --conf --schema.xml --solrconfig.xml --solr.xml <-- Where i can find this file? --build.xml <-- copy & paste from the wiki

RE: problems with arabic search

2007-10-10 Thread Heba Farouk
I'm developing a java application using solr, this application is working with English search Yes, I have tried querying solr directly for Arabic and it's working Any suggestions ?? -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 10, 2007 5:50