Re: GET_SCORES flag in SolrIndexSearcher

2007-10-18 Thread Chris Hostetter
: The scores list in DocIterator is null after a successful query. There's a : flag in SolrIndexSearcher, GET_SCORES, that looks like it should trigger : setting the scores array for the resulting DocList, but I can't figure out how : to set it. Any suggestions? I'm using the svn trunk code. Ca

Re: Which terms in the query match

2007-10-18 Thread Chris Hostetter
: 1. Query for a set of terms against a field - : 2. do a second query on the results of the first query for the terms : that did not match in the first query against another field. i'm a little confused as to what exactly the point of this would be ... mainly because phrases like "query on the

Re: Overall performance: network v.s. SAN file system

2007-10-18 Thread Walter Underwood
The question almost doesn't make sense, because SANs are so configurable. It is like saying "over a network" without specifying whether the network is dial-up or fiber. A few things to note: * The automatic backups are not synchronized with consistent index states, so they are probably useless. *

Re: Overall performance: network v.s. SAN file system

2007-10-18 Thread Christopher Triggs
Hi, I have not done any comparisons with Solr but have done some with another enterprise search engine. Are you looking for performance data or architecture? Some of the things I looked at were: Indexing performance gains, Size of index vs. query performance. Memory usage of large indexe

Re: Overall performance: network v.s. SAN file system

2007-10-18 Thread Otis Gospodnetic
I don't think anyone replied to this, but maybe now, two months since Lance's email, somebody has done some comparisons? I'm curious, too. Otis -- Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message Fr

Re: Solr + Tomcat Undeploy Leaks

2007-10-18 Thread Mike Klaas
On 18-Oct-07, at 1:01 PM, Stu Hood wrote: I'm running SVN r583865 (1.3-dev). Mike: when you say 'process level management', do you mean starting them statically? Or starting them dynamically, but using a different container for each instance? I have a large number of servers, each running

Re: Solr + Tomcat Undeploy Leaks

2007-10-18 Thread Stu Hood
I'm running SVN r583865 (1.3-dev). Mike: when you say 'process level management', do you mean starting them statically? Or starting them dynamically, but using a different container for each instance? A little explanation is probably in order: -- We're using Solr to provide log search capab

Re: Solr + Tomcat Undeploy Leaks

2007-10-18 Thread Tom Hill
I certainly have seen memory problems when I just drop a new war file in place. So now I usually stop tomcat and restart. I used to see problems (pre-1.0) when I just redeployed repeatedly, without even accessing the app, but I've got a little script running in the background that has done that 50

Re: Solr, operating systems and globalization

2007-10-18 Thread Mike Klaas
On 18-Oct-07, at 11:43 AM, Chris Hostetter wrote: : This is easy--I always convert dates to UTC. Doubly important since several : of our servers operate in different timezones. : : Less easy is changing Solr's interpretation of NOW in DateMath to be UTC. : What is the correct way to go abo

Re: Solr + Tomcat Undeploy Leaks

2007-10-18 Thread Mike Klaas
I'm not sure that many people are dynamically taking down/starting up Solr webapps in servlet containers. I certainly prefer process-level management of my (many) Solr instances. -Mike On 18-Oct-07, at 10:40 AM, Stu Hood wrote: Any ideas? Has anyone experienced this problem with othe

Re: Lock obtain timed out

2007-10-18 Thread Chris Hostetter
: We have a very active large index running a solr trunk from a few weeks ago : that has been going down about once a week for this: : : [11:08:17.149] No lockType configured for /home/bwhitman/XXX/XXX/discovered-solr/data/index assuming 'simple' : [11:08:17.150] org.apache.lucene.store.LockOb

Re: Solr, operating systems and globalization

2007-10-18 Thread Chris Hostetter
: This is easy--I always convert dates to UTC. Doubly important since several : of our servers operate in different timezones. : : Less easy is changing Solr's interpretation of NOW in DateMath to be UTC. : What is the correct way to go about this? You lost me there ... "Dates" in java have no c
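The "always convert dates to UTC" advice quoted above can be sketched roughly as follows. This is only an illustration in Python, assuming the client holds timezone-aware datetimes and that the target field takes Solr's usual `YYYY-MM-DDThh:mm:ssZ` form; the helper name `to_solr_utc` is made up for this example and is not part of Solr or SolrSharp.

```python
from datetime import datetime, timedelta, timezone


def to_solr_utc(dt):
    """Format a timezone-aware datetime as a UTC timestamp string.

    Hypothetical helper: it refuses naive datetimes because their
    offset from UTC cannot be known.
    """
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone before converting")
    # Shift to UTC first, then render with a literal trailing 'Z'.
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A server running seven hours behind UTC records a local time.
local = datetime(2007, 10, 18, 11, 43, 0, tzinfo=timezone(timedelta(hours=-7)))
print(to_solr_utc(local))  # 2007-10-18T18:43:00Z
```

Doing the conversion on the client side means servers in different timezones all send the same string for the same instant.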

Re: FunctionQuery, DisMax and Highlighting

2007-10-18 Thread Mike Klaas
On 18-Oct-07, at 8:47 AM, Alf Eaton wrote: I'm currently using the standard request handler for queries, because it provides highlighting (unlike DisMax). I'd also like to be able to use FunctionQuery to boost certain fields. From looking through the lists and JIRA it looks like there has bee

Re: Solr, operating systems and globalization

2007-10-18 Thread Mike Klaas
On 17-Oct-07, at 1:52 PM, Chris Hostetter wrote: : However, SolrSharp culture settings should be reflective and consistent with : the solr server instance's culture. This leads to my question: does Solr : control its culture & language settings through the various language : components th

RE: Solr + Tomcat Undeploy Leaks

2007-10-18 Thread Stu Hood
Any ideas? Has anyone experienced this problem with other containers? I'm not tied to Tomcat if I can find another servlet host with a REST API for deploying apps. Thanks, Stu -Original Message- From: Stu Hood <[EMAIL PROTECTED]> Sent: Wednesday, October 17, 2007 4:46pm To: solr-use

Re: Geographical distance searching

2007-10-18 Thread patrick o'leary
Hi Doug What exactly are you looking for? The code for localsolr is still in dev state, but I've left my work open and available for download at http://www.nsshutdown.com/viewcvs/viewcvs.cgi/localsolr/ Once I'm happy with it, I'll donate it back in the form of patches until / unless it's acce

Re: Lock obtain timed out

2007-10-18 Thread Brian Whitman
Thanks to ryan and matt.. so far so good. true single

RE: [jira] Commented: (SOLR-380) There's no way to convert search results into page-level hits of a "structured document".

2007-10-18 Thread Binkley, Peter
(I'm taking this discussion to solr-user, as Mike Klaas suggested; sorry for using JIRA for it. Previous discussion is at https://issues.apache.org/jira/browse/SOLR-380). I think the requirements I mentioned in a comment (https://issues.apache.org/jira/browse/SOLR-380#action_12535296) justify aban

FunctionQuery, DisMax and Highlighting

2007-10-18 Thread Alf Eaton
I'm currently using the standard request handler for queries, because it provides highlighting (unlike DisMax). I'd also like to be able to use FunctionQuery to boost certain fields. From looking through the lists and JIRA it looks like there has been some work to add highlighting to DisMax queri

Re: Geographical distance searching

2007-10-18 Thread Doug Daniels
Hi Patrick, Are the solr components of that demo in the repository as well? I couldn't find them there. Best, Doug patrick o'leary wrote: > > As far as I'm concerned nothing's going to beat PG's GIS calculations, > but its tsearch was > a lot slower than myisam. > > My goal was a single sol

Re: multilingual list of stopwords

2007-10-18 Thread Maria Mosolova
Thank you very much for the references Gordon! Looks like that is exactly what I need Maria On 10/18/07, Gordon <[EMAIL PROTECTED]> wrote: > Maria, > > It's perfectly reasonable to build a single list, sort it, and scan it for > especially bad cases. See for example, > http://members.unine.ch/jacqu

Re: Lock obtain timed out

2007-10-18 Thread matt davies
I think you do this if you only have one index true Check with cleverer bods first though On 18 Oct 2007, at 16:11, Brian Whitman wrote: false

Re: Lock obtain timed out

2007-10-18 Thread Ryan McKinley
try setting the lock type to 'single' in solrconfig.xml ... <lockType>single</lockType> I have run into trouble a few times since this was added - putting the single type in config has fixed it every time though... ryan Brian Whitman wrote: We have a very active large index running a solr trunk from a fe

Re: multilingual list of stopwords

2007-10-18 Thread Gordon
Maria, It's perfectly reasonable to build a single list, sort it, and scan it for especially bad cases. See for example, http://members.unine.ch/jacques.savoy/clef/index.html for stopwords for several languages or check in some standard programming modules like: http://search.cpan.org/~fabpot/Ling

Re: multilingual list of stopwords

2007-10-18 Thread Maria Mosolova
Thanks a lot Peter! Maria On 10/18/07, Binkley, Peter <[EMAIL PROTECTED]> wrote: > There's code in Nutch to identify the language of a given text: > http://lucene.apache.org/nutch/apidocs/org/apache/nutch/analysis/lang/LanguageIdentifier.html . > > Peter > > -Original Message- > From: M

RE: multilingual list of stopwords

2007-10-18 Thread Binkley, Peter
There's code in Nutch to identify the language of a given text: http://lucene.apache.org/nutch/apidocs/org/apache/nutch/analysis/lang/LanguageIdentifier.html . Peter -Original Message- From: Maria Mosolova [mailto:[EMAIL PROTECTED] Sent: Thursday, October 18, 2007 8:48 AM To: solr-user

Lock obtain timed out

2007-10-18 Thread Brian Whitman
We have a very active large index running a solr trunk from a few weeks ago that has been going down about once a week for this: [11:08:17.149] No lockType configured for /home/bwhitman/XXX/XXX/discovered-solr/data/index assuming 'simple' [11:08:17.150] org.apache.lucene.store.LockObtainFaile

Re: multilingual list of stopwords

2007-10-18 Thread Maria Mosolova
Thanks a lot to everyone who responded. Yes, I agree that eventually we need to use separate stopword lists for different languages. Unfortunately the data we are trying to index at the moment does not contain any direct country/language information and we need to create the first version of the in

Re: multilingual list of stopwords

2007-10-18 Thread Walter Underwood
Also "die" in German and English. --wunder On 10/18/07 4:16 AM, "Andrzej Bialecki" <[EMAIL PROTECTED]> wrote: > One example that I'm familiar with: words "is" and "by" in English and > in Swedish. Both words are stopwords in English, but they are content > words in Swedish (ice and village, respe

Re: Solr, operating systems and globalization

2007-10-18 Thread Jeff Rodenburg
OK, this simplifies things greatly. For C#, the proper culture setting for interaction with Solr should be Invariant. Basically, the primary requirement for Solrsharp is to be "culturally-consistent" with the targeted Solr server to ensure proper data-type formatting. Since Solr is culturally-ag

query handling / multiple languages / multiple cores

2007-10-18 Thread Henrib
We have an application where we index documents that can exist in many (at least 2) languages. We have 1 SolrCore per language using the same field names in their schemas (different stopwords, synonyms & stemmers), the benefits for content maintenance outweighing (at least) complexity. Using EN

Re: multilingual list of stopwords

2007-10-18 Thread Grant Ingersoll
Are you sure they don't just mean they want separate stopword lists for various different indexes in different languages? Otherwise, I agree, it doesn't make much sense for a single mixed language index (unless you had an intelligent filter that could select based on language.) Maria, pe

RE: Solr + autocomplete

2007-10-18 Thread Park, Michael
Thx! I remember coming across extjs a ways back. It was very slick. I'll give it a try. -Original Message- From: Bharani [mailto:[EMAIL PROTECTED] Sent: Thursday, October 18, 2007 5:59 AM To: solr-user@lucene.apache.org Subject: RE: Solr + autocomplete You should take a look at http:\

Re: multilingual list of stopwords

2007-10-18 Thread Andrzej Bialecki
Lukas Vlcek wrote: Hi, I haven't heard of a multilingual stop words list before. What should be the purpose of it? This seems too odd to me :-) That's because a multilingual stopword list doesn't make sense ;) One example that I'm familiar with: the words "is" and "by" in English and in Swedish. Both
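The collision problem raised in this thread ("is" and "by" in English versus Swedish, "die" in German versus English) can be checked mechanically before merging lists. The Python sketch below is only an illustration: the tiny word sets are toy data, and `find_collisions` is a hypothetical helper, not something from Solr, Lucene, or Nutch. It assumes you have, per language, a stopword set and some vocabulary of words you want to keep searchable.

```python
def find_collisions(stopwords, content_words):
    """Map each colliding word to (stopword language, content language) pairs.

    A word collides when it is a stopword in one language but a
    content word in a different language.
    """
    collisions = {}
    for stop_lang, stops in stopwords.items():
        for content_lang, vocab in content_words.items():
            if stop_lang == content_lang:
                continue
            # Words removed as noise in stop_lang but meaningful in content_lang.
            for word in sorted(stops & vocab):
                collisions.setdefault(word, []).append((stop_lang, content_lang))
    return collisions


# Toy data only; real lists would be loaded from the per-language files.
stopwords = {"en": {"is", "by", "the"}, "de": {"die", "und"}}
content_words = {"sv": {"is", "by"}, "en": {"die"}}

for word, pairs in sorted(find_collisions(stopwords, content_words).items()):
    print(word, pairs)
```

Words reported here are the ones to drop from a merged list, or the reason to keep separate lists per language.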

Re: multilingual list of stopwords

2007-10-18 Thread Lukas Vlcek
Hi, I haven't heard of a multilingual stop words list before. What should be the purpose of it? This seems too odd to me :-) Stop words are used to cut down the size of the index. One way you can go about this is to create your own list by indexing your documents (without stop words removed) and then lo
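The preview above is cut off, so what follows is one possible reading of the "create your own list" suggestion, not necessarily what was proposed: rank terms by how many documents they appear in and review the top of the ranking by hand. The Python sketch uses a naive `\w+` tokenizer rather than a Solr analyzer, and `candidate_stopwords` is a hypothetical name.

```python
import re
from collections import Counter


def candidate_stopwords(documents, top_n=10):
    """Return the top_n terms ranked by document frequency.

    Ties are broken alphabetically so the result is deterministic.
    """
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document, ignoring case.
        doc_freq.update(set(re.findall(r"\w+", text.lower())))
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


docs = ["The cat sat", "the dog ran", "The cat ran away"]
print(candidate_stopwords(docs, top_n=3))  # ['the', 'cat', 'ran']
```

The output is only a candidate list; frequent terms still need a manual pass, since a term can be common in the corpus and still be worth searching for.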

RE: Solr + autocomplete

2007-10-18 Thread Bharani
You should take a look at http://www.extjs.com. The combo box has autocomplete functionality. In fact it even has paging built into it. I just did a demo using Solr for autocomplete and I got a very responsive GUI. I have got about 100,000 documents with 26 fields each and get a response

Re: multilingual list of stopwords

2007-10-18 Thread Joseph Doehr
Hi Maria, this is a "me too". ;) At the moment I'm taking the approach of merging the various language stopword files I need into one and using that. But the main problem in this case is collisions with words that are stopwords in one language but not in another. Cheers, Joe