Re: order of word in the request

2009-02-27 Thread sunnyfr
Thanks Yonik, Yonik Seeley-2 wrote: > > On Thu, Feb 26, 2009 at 11:25 AM, sunnyfr wrote: >> How can I tell it to put a lot more weight on the book which has >> exactly >> the same title. > > A sloppy phrase query should work. > See the "pf" param in the dismax query parser. > > -Yonik >
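A minimal sketch of the kind of request Yonik's hint points at, assuming a dismax handler selected with defType=dismax. The field names (title, description), the boost values, and the sample query are hypothetical and not taken from the thread; only the pf, ps, and qf parameter names come from the dismax query parser.

```python
from urllib.parse import urlencode, parse_qs


def build_dismax_params(user_query, qf="title^2 description", pf="title^10", ps=0):
    """Return a URL-encoded parameter string for a dismax request.

    qf lists the fields searched term by term; pf lists the fields in which
    the whole query, treated as a phrase, earns an extra boost; ps is the
    slop allowed for that phrase match. All field names and boosts here are
    illustrative assumptions.
    """
    params = {
        "defType": "dismax",
        "q": user_query,
        "qf": qf,
        "pf": pf,
        "ps": ps,
    }
    return urlencode(params)


query_string = build_dismax_params("the art of war")
print(query_string)
```

With ps=0 only documents containing the words as an exact phrase in the title receive the pf boost; raising ps loosens that to a sloppy phrase match.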

Re: SolrCoreAware analyzer

2009-02-27 Thread Bojan Šmid
Thanks for your suggestions. I do need SolrCore, but I could probably live with just SolrResourceLoader, while also creating my own FieldType (which can be ResourceLoaderAware). Bojan On Thu, Feb 26, 2009 at 11:48 PM, Chris Hostetter wrote: > > : I am writing a custom analyzer for my field type

Search in two core of solr with a single search query

2009-02-27 Thread Sagar Khetkade
Hi, I have an issue here, as I want to run a single search query. That query would be fired at two Solr cores having different indexes, and the result sets would be merged. This was possible in Lucene using MultiSearcher and then merging the results. Please suggest how I can do it in Solr. Thanks, Sa

Re: Direct control over document position in search results

2009-02-27 Thread Erik Hatcher
On Feb 23, 2009, at 7:46 PM, Ercan, Tolga wrote: I was wondering if there was any facility to directly manipulate search results based on business criteria to place documents at a fixed position in those results. For example, when I issue a query, the first four results would be based on na

Re: Search schema using q Query

2009-02-27 Thread Erik Hatcher
One first step is to use debugQuery=true as an additional parameter to your search request. That'll return debug info in the response, which includes a couple of views of the parsed query. Erik On Feb 26, 2009, at 2:05 AM, dabboo wrote: Hi, I am trying to search the schema with
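As a rough illustration of Erik's suggestion, the sketch below appends debugQuery=true to an existing set of request parameters. The host, port, handler path, and sample query are assumptions for the example, not details from the thread.

```python
from urllib.parse import urlencode


def with_debug(base_url, params):
    """Return a request URL with debugQuery=true added to the given params."""
    debug_params = dict(params)          # copy so the caller's dict is untouched
    debug_params["debugQuery"] = "true"  # asks Solr to include debug info in the response
    return base_url + "?" + urlencode(debug_params)


# Hypothetical local Solr instance and query.
url = with_debug("http://localhost:8983/solr/select", {"q": "title:solr"})
print(url)
```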

Re: Search in two core of solr with a single search query

2009-02-27 Thread Otis Gospodnetic
Sagar, You can use DistributedSearch (check this on the Wiki) for that. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Sagar Khetkade > To: "solr-user@lucene.apache.org" > Sent: Friday, February 27, 2009 3:29:57 AM > Subject: Search in
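A small sketch of what a distributed search request could look like, assuming two cores named core0 and core1 on one local host (both names and the host are hypothetical). The shards parameter is a comma-separated list of shard addresses written without the http:// prefix; distributed search also expects the cores to share a compatible schema with a unique key.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def distributed_query_url(coordinator, shards, query):
    """Build a select URL that asks one core to fan the query out to all shards."""
    params = {"q": query, "shards": ",".join(shards)}
    return coordinator.rstrip("/") + "/select?" + urlencode(params)


url = distributed_query_url(
    "http://localhost:8983/solr/core0",  # core that receives the request and merges results
    ["localhost:8983/solr/core0", "localhost:8983/solr/core1"],
    "title:lucene",
)
sent = parse_qs(urlparse(url).query)
print(sent["shards"][0])
```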

Re: warming question

2009-02-27 Thread Marc Sturlese
Hey, I am working with a nightly and just had to apply the modifications in the code and add a couple of lines in solrconfig.xml (as shown in the patch). Didn't it work for you? Jonathan Haddad wrote: > > Does anyone have any good documentation that explains how to set up > the warming fea

Re: Use of scanned documents for text extraction and indexing

2009-02-27 Thread Vikram Kumar
Check this: http://code.google.com/p/ocropus/wiki/FrequentlyAskedQuestions > How well does it work? > The character recognition accuracy of OCRopus right now (04/2007) is about > like Tesseract. That's because the only character recognition plug-in in > OCRopus is, in fact, Tesseract. In the futur

Re: uploading binary file or rich document through SOLRJ

2009-02-27 Thread Erik Hatcher
It actually is possible to make that sort of request now... note that he's not actually posting file content in the request, but using a simple HTTP GET parameter, stream.file. Using the SolrJ library, you can simply use the add/set methods on SolrQuery. But yes, it would be great for S
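To make the stream.file idea concrete, here is a hedged sketch of such a request built as a plain URL. The handler path /update/extract, the host, and the file path are assumptions for illustration; the file must be readable by the Solr server itself, remote streaming has to be enabled in solrconfig.xml, and any further parameters the handler may require are omitted.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def stream_file_url(solr_base, handler_path, server_side_file):
    """Build a GET URL that points Solr at a file on its own filesystem.

    No file content travels in the request; stream.file only names the path.
    """
    params = {"stream.file": server_side_file, "commit": "true"}
    return solr_base.rstrip("/") + handler_path + "?" + urlencode(params)


# Hypothetical handler path and document location.
url = stream_file_url("http://localhost:8983/solr", "/update/extract", "/data/docs/report.pdf")
print(url)
```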

passing parameters into the XSLTResponseWriter: particularly hostname

2009-02-27 Thread Fergus McMenemie
Hello all, I was wondering if there was a way of passing parameters into the XSLTResponseWriter writer. I always like the option of formatting my search results as an RSS feed. Users can therefore configure their phone, browser etc to automatically redo a search every so often and have new item

Re: custom reranking

2009-02-27 Thread Grant Ingersoll
On Feb 26, 2009, at 11:16 PM, CIF Search wrote: I believe the query component will generate the query in such a way that I get the results that I want, but not process the returned results, is that correct? Is there a way in which I can group the returned results, and rank each group separ

ApacheCon Lucene Meetup

2009-02-27 Thread Grant Ingersoll
If you're in or around Amsterdam during the week of ApacheCon (Mar 23-27), check out the Lucene Meetup we are organizing: http://wiki.apache.org/lucene-java/LuceneMeetupMarch2009 -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/So

RE: Use of scanned documents for text extraction and indexing

2009-02-27 Thread Sudarsan, Sithu D.
Thanks to all who have responded (Hanners, Shashi, Vikram, Bastian, Renaud and the rest). Using OCRopus might provide the flexibility to use multi-column documents and formatted ones. Regarding literature on OCR, a few follow-ups to the paper link provided by Renaud do exist, but could not locate an

Re: Custom Sorting

2009-02-27 Thread psyron
I was successful with your hint and just need to solve another problem: I have implemented custom sorting by following your advice to code a QParserPlugin and to create a custom comparator as described in your book, and it really works. But now I also would like to ret

Re: Lucene sync bottleneck?

2009-02-27 Thread Yonik Seeley
I'm using trunk, but I set a breakpoint on SegmentReader.isDeleted() on an index with deletions, and I couldn't get it to be called. numDocs : 26 maxDoc : 130 reader:SolrIndexReader{this=1935e6f,r=readonlymultisegmentrea...@1935e6f,segments=5} -Yonik http://www.lucidimagination.com On Thu, Feb

Re: Lucene sync bottleneck?

2009-02-27 Thread Matthew Runo
We're using: Solr Specification Version: 1.3.0.2009.01.23.10.46.02 Solr Implementation Version: 1.4-dev 737141M - root - 2009-01-23 10:46:02 Lucene Specification Version: 2.9-dev Lucene Implementation Version: 2.9-dev 724059 - 2008-12-06 20:08:54 We'll see about getting up to trunk and firing

Re: Lucene sync bottleneck?

2009-02-27 Thread Chris Hostetter
: Solr Implementation Version: 1.4-dev 737141M - root - 2009-01-23 10:46:02 That M indicates there were local modifications (relative to svn revision #737141) at the time of compilation. Do you have some local patches? Anything that would have affected the way IndexReaders get opened? -Hoss
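The check Hoss describes can be sketched as a small parser for the implementation version string. This is only an illustration that assumes the layout seen in this thread (version, svn revision with an optional trailing M, then " - "); the function name and the returned keys are hypothetical.

```python
import re


def parse_impl_version(text):
    """Split an implementation version string into version, revision, and modified flag.

    A trailing "M" on the revision marks a working copy that had local
    modifications when the build was compiled.
    """
    match = re.search(r"(\S+)\s+(\d+)(M?)\s+-", text)
    if match is None:
        return None
    return {
        "version": match.group(1),
        "revision": int(match.group(2)),
        "modified": match.group(3) == "M",
    }


info = parse_impl_version("1.4-dev 737141M - root - 2009-01-23 10:46:02")
print(info)
```

Applied to the Lucene string quoted earlier in the thread ("2.9-dev 724059 - 2008-12-06 20:08:54"), the same function reports no local modifications.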

Re: Lucene sync bottleneck?

2009-02-27 Thread Matthew Runo
We're just using an SVN up, with no local modifications. It's probably a formatting difference from having opened solr in an IDE. We're building from lucene and solr trunk right now, and I'll let you all know how that goes. We'll test it as best we can with JMeter. The build we had up there

Re: Lucene sync bottleneck?

2009-02-27 Thread Matthew Runo
OK. Call me chicken little. We must have had bad class files or something hanging out in our build that had the issues. Having built from trunk, we're seeing perfectly fine response times even at 500 requests a second. Thank you for your help, and sorry to bring it up without testing trunk.

Trunk Replication Page Issue

2009-02-27 Thread Jeff Newburn
In trying trunk to fix the Lucene sync issue, we have now encountered a severe Java exception making the replication page non-functional. Am I missing something or doing something wrong? Info: Slave server on the replication page. Just a code dump as follows. Feb 27, 2009 8:44:37 AM org.apache.

Re: warming question

2009-02-27 Thread Jonathan Haddad
I'm using the latest stable - I'm brand new to solr and I don't know where to find all the docs yet. I'm guessing I should be looking at this page: http://wiki.apache.org/solr/SolrCaching#head-34647c63c38782b2fc93c919bb34f8c795a1ee65 I have an index of 1.5 million documents. It's updated every

Redhat vs FreeBSD vs other unix flavors

2009-02-27 Thread wojtekpia
Is there a recommended unix flavor for deploying Solr on? I've benchmarked my deployment on Red Hat. Our operations team asked if we can use FreeBSD instead. Assuming that my benchmark numbers are consistent on FreeBSD, is there anything else I should watch out for? Thanks. Wojtek -- View this

Re: Redhat vs FreeBSD vs other unix flavors

2009-02-27 Thread Otis Gospodnetic
You should be fine on either Linux or FreeBSD (or any other UNIX flavour). Running on Solaris would probably give you access to goodness like dtrace, but you can live without it. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: wojtekpia

Re: Redhat vs FreeBSD vs other unix flavors

2009-02-27 Thread wojtekpia
Thanks Otis. Do you know what the most common deployment OS is? I couldn't find much on the mailing list or http://wiki.apache.org/solr/PublicServers Otis Gospodnetic wrote: > > > You should be fine on either Linux or FreeBSD (or any other UNIX flavour). > Running on Solaris would probably gi

Re: Redhat vs FreeBSD vs other unix flavors

2009-02-27 Thread Matthew Runo
I'm willing to bet it'd be some flavor of Linux. We run on Gentoo. When it comes down to it, I'd think your application server (Tomcat, Resin, etc) would have more impact on Solr performance than the OS. On that front, I'd bet that Tomcat 5 or 6 is the most commonly deployed. Thanks for your

Re: warming question

2009-02-27 Thread Otis Gospodnetic
That, plus: http://wiki.apache.org/solr/SolrCaching#head-7d0ea6f02cb1d068bf6469201e013ce8e23e175b Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Jonathan Haddad > To: solr-user@lucene.apache.org > Sent: Friday, February 27, 2009 12:42:09

Re: Redhat vs FreeBSD vs other unix flavors

2009-02-27 Thread Andrzej Bialecki
Otis Gospodnetic wrote: You should be fine on either Linux or FreeBSD (or any other UNIX flavour). Running on Solaris would probably give you access to goodness like dtrace, but you can live without it. There's dtrace on FreeBSD, too. -- Best regards, Andrzej Bialecki <>< ___. ___ ___ _

Re: Redhat vs FreeBSD vs other unix flavors

2009-02-27 Thread Yonik Seeley
On Fri, Feb 27, 2009 at 1:08 PM, wojtekpia wrote: > Thanks Otis. Do you know what the most common deployment OS is? I couldn't > find much on the mailing list or http://wiki.apache.org/solr/PublicServers I would guess RHEL (red hat enterprise linux, or CentOS for the free version). Ubuntu looks l

Re: Redhat vs FreeBSD vs other unix flavors

2009-02-27 Thread Otis Gospodnetic
Same observations here. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Yonik Seeley > To: solr-user@lucene.apache.org > Sent: Friday, February 27, 2009 1:20:53 PM > Subject: Re: Redhat vs FreeBSD vs other unix flavors > > On Fri, Feb 27,

Integrating Solr and Nutch

2009-02-27 Thread ahammad
Hello, I'm wondering if it's possible to make Solr use a Nutch index. I used Nutch to crawl some pages and I now have an index with about 2000 documents. I want to explore the features of Solr, and since both Nutch and Solr are based off Lucene, I assume that there is some way to integrate them w

Re: Integrating Solr and Nutch

2009-02-27 Thread Tony Wang
I heard Nutch 1.0 will have an easy way to integrate with Solr, but I haven't found any documentation on that yet. anyone? On Fri, Feb 27, 2009 at 12:14 PM, ahammad wrote: > > Hello, > > I'm wondering if it's possible to make Solr use a Nutch index. I used Nutch > to crawl some pages and I now h

Re: Integrating Solr and Nutch

2009-02-27 Thread Andrzej Bialecki
Tony Wang wrote: I heard Nutch 1.0 will have an easy way to integrate with Solr, but I haven't found any documentation on that yet. anyone? Indeed, this integration is already supported in Nutch trunk (soon to be released). Please download a nightly package and test it. You will need to rein

Re: Integrating Solr and Nutch

2009-02-27 Thread Tony Wang
Hi Andrzej: Could you please tell us how to do the Nutch 1.0/Solr integration in a little more detail? I'm very interested in implementing it. Thanks, Tony On Fri, Feb 27, 2009 at 1:27 PM, Andrzej Bialecki wrote: > Tony Wang wrote: > >> I heard Nutch 1.0 will have an easy way to integrate with So

indexing while optimizing

2009-02-27 Thread Laimonas Simutis
Hey, my SOLR setup looks like the following: server running apache-tomcat with solr1.2, index size is about 1G (a bit more than 4 million documents). I have another machine that basically every minute or so sends some documents to be indexed. I have autocommit turned on with maxDocs: 5000, maxTi

Re: solr 1.3 - did something with deleting documents change?

2009-02-27 Thread Chris Hostetter
: image.1image.2 etc... (one : delete node for each image we wanted to delete) : : And that worked in 1.2. That is really surprising ... it's not a legal XML doc (multiple root nodes) so it should have been an error. Support was added in Solr 1.3 to support multiple elements in a single elem
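One way to avoid the multiple-root-node problem is to build a single delete message that carries one id element per document. The sketch below assumes the documents are deleted by unique key and reuses the image.1 and image.2 values from the quoted message; it is an illustration, not the original poster's request.

```python
import xml.etree.ElementTree as ET


def build_delete_xml(ids):
    """Return one well-formed delete message with an <id> child per document."""
    root = ET.Element("delete")            # single root element keeps the XML legal
    for doc_id in ids:
        ET.SubElement(root, "id").text = doc_id
    return ET.tostring(root, encoding="unicode")


payload = build_delete_xml(["image.1", "image.2"])
print(payload)
```

Because the payload has exactly one root element, a strict XML parser accepts it, unlike a sequence of sibling delete elements.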

Re: warming question

2009-02-27 Thread Jonathan Haddad
I think this is exactly what I was looking for. Thanks! On Fri, Feb 27, 2009 at 10:06 AM, Otis Gospodnetic wrote: > > That, plus: > http://wiki.apache.org/solr/SolrCaching#head-7d0ea6f02cb1d068bf6469201e013ce8e23e175b > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > >

Tomcat causing problem

2009-02-27 Thread Tony Wang
This appears to be a new problem for me. Whenever I try to stop Tomcat, I always get this error: Using CATALINA_BASE: /opt/tomcat6 Using CATALINA_HOME: /opt/tomcat6 Using CATALINA_TMPDIR: /opt/tomcat6/temp Using JRE_HOME: /usr/lib/jvm/java-6-sun Exception in thread "main" java.lang.NoCla

Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-27 Thread Ryuuichi KUMAI
Hello Marc, I faced a similar problem, and I found a workaround. If the performance degradation in your application is caused by GC, this information might help you: https://issues.apache.org/jira/browse/SOLR-1042 Regards, Ryuuichi Kumai. 2009/2/21 Marc Sturlese : > > I am working with 3 inde

Re: Tomcat causing problem

2009-02-27 Thread Otis Gospodnetic
It's a classpath problem it seems, but I can't tell exactly what's wrong. It looks like a pure Tomcat problem, so you may get more help on a Tomcat list. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Tony Wang > To: solr-user@lucene.a