Otis,
We are not running a master-slave configuration. We get very few
searches (admin only) in a day, so we didn't see the need for
replication/snapshots. This problem is with one Solr instance managing
4 cores (each core 200 million records). Both indexing and searching
are performed by the same Solr instance.
To me it sounds like it's not finding solr home. I have Windows Vista and
JDK 1.6.0_11 and when I run java -jar start.jar, I too get a ton of the INFO
messages and one of them should read something like: INFO: solr home
defaulted to 'solr/' (could not find system property or JNDI)
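If the default isn't being picked up, solr home can be set explicitly, either with the system property (java -Dsolr.solr.home=/path/to/solr -jar start.jar) or via JNDI. A sketch of the JNDI route, assuming the stock web.xml layout of that era; the path is a placeholder:

```xml
<!-- in the solr webapp's web.xml; /path/to/solr is a placeholder -->
<env-entry>
  <env-entry-name>solr/home</env-entry-name>
  <env-entry-value>/path/to/solr</env-entry-value>
  <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
```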
May 13, 2009 10:45
There's a related issue open.
https://issues.apache.org/jira/browse/SOLR-712
On Thu, May 14, 2009 at 7:50 AM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
>
> Bryan, maybe it's time to stick this in JIRA?
> http://wiki.apache.org/solr/HowToContribute
>
> Thanks,
> Otis
> --
> Sematext -
Ideally, we don't do that.
You can just keep the master host behind a VIP, so if you wish to
change the master, make the VIP point to the new host.
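A sketch of what that looks like on a slave, assuming Solr 1.4's Java-based replication handler (the VIP hostname here is hypothetical):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- solr-master-vip.example.com is a hypothetical VIP name;
         repoint the VIP to promote a new master, no slave changes needed -->
    <str name="masterUrl">http://solr-master-vip.example.com:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```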
On Wed, May 13, 2009 at 10:52 PM, nk 11 wrote:
> This is more interesting. Such a procedure would involve taking down and
> reconfiguring the slave?
>
>
Hi Grant,
That's not a bad idea... I could try that. I was also looking at cactus:
http://jakarta.apache.org/cactus/integration/ant/index.html
It has an ant task to merge XML. Could this be a contrib-crawl add-on?
Alternatively, do you know of any XSLT templates built for this? I could
write one,
On May 13, 2009, at 6:53 PM, vivek sar wrote:
Disabling first/new searchers did help with the initial load time, but
after 10-15 min the heap memory started climbing again and reached
max within 20 min. Now the GC is kicking in all the time, which is
slowing down the commit and search cycles.
T
Andrey,
I urge you to use JIRA for this. That's exactly what it's for and how it gets
used.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Andrey Klochkov
> To: solr-user@lucene.apache.org
> Sent: Thursday, May 7, 2009 5:14:26 AM
> Sub
Wojtek,
I believe
http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/search/spans/SpanFirstQuery.html
would help, though there is no support for Span queries in Solr. But there is
support for custom query parsers, and there is
http://lucene.apache.org/java/2_4_1/api/contrib-snowb
Bryan, maybe it's time to stick this in JIRA?
http://wiki.apache.org/solr/HowToContribute
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Bryan Talbot
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 10:11:21 PM
>
I think the patch I included earlier covers solr core, but it looks
like at least some other extensions (DIH) create and use their own XML
parser. So, if this functionality is to extend to all XML files,
those will need similar patches.
Here's one for DIH:
--- src/main/java/org/apache/sol
Coincidentally, from
http://www.cloudera.com/blog/2009/05/07/what%E2%80%99s-new-in-hadoop-core-020/ :
"Hadoop configuration files now support XInclude elements for including
portions of another configuration file (HADOOP-4944). This mechanism allows you
to make configuration files more modular
There is constant mixing of indexing concepts and searching concepts in this
thread. Are you having problems on the master (indexing) or on the slave
(searching)?
That .tii is only 20K and you said this is a large index? That doesn't smell
right...
Otis
--
Sematext -- http://sematext.com/
Yeah, I'm not sure why this would help. There should be nothing in FieldCaches
unless you sort or use facets.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: vivek sar
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 5:5
Even a simple command like this will help:
jmap -histo:live <pid> | head -30
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: vivek sar
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 6:53:29 PM
> Subject: Re: Solr memory re
I'm having difficulty getting Solr running on Vista. I've got the 1.6
JDK installed, and I've successfully compiled and run other Java
programs.
When I run java -jar start.jar in the Apache Solr example directory, I
get a large number of INFO messages, including:
INFO: JNDI not configur
I created Ruby class SolrCellRequest and saved it to
/path/to/resume/vendor/plugins/acts_as_solr/lib directory.
Here is the original code from the tutorial:
module ActsAsSolr
  class SolrCellRequest < Solr::Request::Select
    def initialize(doc, file_name)
    .
    .
    def handler
    '
Warning: I'm way out of my competency range when I comment
on SOLR, but I've seen the statement that string fields are NOT
tokenized while text fields are, and I notice that almost all of your fields
are string type.
Would someone more knowledgeable than me care to comment on whether
this is at
I think maxBufferedDocs has been deprecated in Solr 1.4 - it's
recommended to use ramBufferSizeMB instead. My ramBufferSizeMB=64.
This shouldn't be a problem I think.
There has to be something else that Solr is holding up in memory. Anyone else?
Thanks,
-vivek
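For reference, a sketch of where that setting lives in solrconfig.xml (value from above; Solr 1.4-era layout assumed):

```xml
<indexDefaults>
  <!-- replaces the deprecated maxBufferedDocs: flush the in-memory
       buffer to disk once it reaches roughly 64 MB -->
  <ramBufferSizeMB>64</ramBufferSizeMB>
</indexDefaults>
```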
On Wed, May 13, 2009 at 4:01 PM, Ja
Hi Erik et al.,
I am following this tutorial link
http://www.lucidimagination.com/blog/tag/acts_as_solr/
to play with acts_as_solr and see if we can invoke solr cell right
from our Rails app.
Following the tutorial I created class SolrCellRequest but don't
know where to save the solr_cell_re
Have you checked the maxBufferedDocs? I had to drop mine down to 1000 with
3 million docs.
Jack
On Wed, May 13, 2009 at 6:53 PM, vivek sar wrote:
> Disabling first/new searchers did help with the initial load time, but
> after 10-15 min the heap memory started climbing again and reached
> max w
Disabling first/new searchers did help with the initial load time, but
after 10-15 min the heap memory started climbing again and reached
max within 20 min. Now the GC is kicking in all the time, which is
slowing down the commit and search cycles.
It is still puzzling what Solr holds in the
Just an update on the memory issue - might be useful for others. I
read the following,
http://wiki.apache.org/solr/SolrCaching?highlight=(SolrCaching)
and it looks like the first and new searcher listeners would populate the
FieldCache. Commenting out these two listener entries seems to do the
tric
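For anyone following along, the listener entries in question look roughly like this in a stock solrconfig.xml (the query contents below are the shipped examples, not anything specific to this index); warming queries that sort or facet are what populate the FieldCache:

```xml
<!-- commenting these out skips searcher warming on startup and after commits -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">solr</str><str name="start">0</str><str name="rows">10</str></lst>
  </arr>
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">fast_warm</str><str name="start">0</str><str name="rows">10</str></lst>
  </arr>
</listener>
```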
Have you done any profiling to see where the hotspots are? I realize
that may be difficult on an index of that size, but maybe you can
approximate on a smaller version. Also, do you have warming queries?
You might also look into setting the termIndexInterval at the Lucene
level. This is
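On the termIndexInterval suggestion: a sketch of the knob, assuming a solrconfig.xml that exposes it (128 is the Lucene default; a larger value keeps fewer terms from the .tii in memory at the cost of slower term lookups):

```xml
<indexDefaults>
  <!-- default 128; doubling it roughly halves the in-memory term index -->
  <termIndexInterval>256</termIndexInterval>
</indexDefaults>
```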
With Solr 1.3 I'm having a problem boosting new documents to the top. I
used the recommended BoostFunction "recip(rord(created_at),1,1000,1000)"
but older documents, sometimes 5 years old, make it to the top 3 documents.
I've started using "ord(created_at)^0.0005" and get better results, but I
d
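For context, a sketch of where such a boost function would sit in a dismax handler's defaults (the handler name and layout here are illustrative):

```xml
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- newer created_at => smaller rord => larger recip value => bigger boost -->
    <str name="bf">recip(rord(created_at),1,1000,1000)</str>
  </lst>
</requestHandler>
```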
Otis,
In that case, I'm not sure why Solr is taking up so much memory as
soon as we start it up. I checked for .tii file and there is only one,
-rw-r--r-- 1 search staff 20306 May 11 21:47 ./20090510_1/data/index/_3au.tii
I have all the cache disabled - so that shouldn't be a problem too. My
Hi,
Sorting is triggered by the sort parameter in the URL, not a characteristic of
a field. :)
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: vivek sar
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2009 4:42:16 PM
> Subje
Thanks Otis.
Our use case doesn't require any sorting or faceting. I'm wondering if
I've configured anything wrong.
I've got a total of 25 fields (15 are indexed and stored, the other 10 are just
stored). All my fields are basic data types - which I thought are not
sorted. My id field is unique key.
Is th
Indeed - that looks nice - having some kind of conditional includes
would make many things easier.
-Peter
On Wed, May 13, 2009 at 4:22 PM, Otis Gospodnetic
wrote:
>
> This looks nice and simple. I don't know enough about this stuff to see any
> issues. If there are no issues...?
>
> Otis
>
This looks nice and simple. I don't know enough about this stuff to see any
issues. If there are no issues...?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Bryan Talbot
> To: solr-user@lucene.apache.org
> Sent: Wednesday, May 13, 2
Hi,
Some answers:
1) The .tii files in the Lucene index. When you sort, all distinct values for the
field(s) used for sorting are loaded into memory. Similarly for facet fields. Plus Solr's caches.
2) ramBufferSizeMB dictates, more or less, how much Lucene/Solr will consume
during indexing. There is no need to commit every 5
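On the commit frequency point, an autoCommit block can stand in for frequent explicit commits; a sketch (thresholds are illustrative, not recommendations):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit after this many buffered docs, or after 5 minutes,
         whichever comes first -->
    <maxDocs>50000</maxDocs>
    <maxTime>300000</maxTime>
  </autoCommit>
</updateHandler>
```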
Hi,
I'm pretty sure this has been asked before, but I couldn't find a
complete answer in the forum archive. Here are my questions,
1) When Solr starts up, what does it load up in memory? Let's say
I've 4 cores with each core 50G in size. When Solr comes up how much
of it would be loaded in
Hi Terence,
Terence Gannon wrote:
Yes, the ownerUid will likely be assigned once and never changed. But
you still need it, in order to keep track of who has contributed which
document.
Yes, of course!
I've been going over some of the simpler query scenarios, and Solr is
capable of handlin
>Hi
>
>Is it possible, through dataimport handler to remove an existing
>document from the Solr index?
>
>I import/update from my database where the active field is true.
>However, if the client then sets active to false, the document stays
>in the Solr index and doesn't get removed.
>
>Regards
>A
Try a search for *:* and see if you get results for that. If so, you
have your documents indexed, but you need to dig into things like
query parser configuration and analysis to see why things aren't
matching. Perhaps you're not querying the field you think you are?
Erik
On May 1
Hi,
This problem is still haunting us. I've reduced the merge factor to
50, but as my index gets fat (anything over 20G), the commit starts
taking much longer. Some info,
1) Less than 20 G index size, 5000 records commit takes around 15sec
2) Over 20G the commit starts taking 50-70sec for 5K rec
This is more interesting. Such a procedure would involve taking down and
reconfiguring the slave?
On Wed, May 13, 2009 at 7:55 PM, Bryan Talbot wrote:
> Or ...
>
> 1. Promote existing slave to new master
> 2. Add new slave to cluster
>
>
>
>
> -Bryan
>
>
>
>
>
> On May 13, 2009, at 9:48 AM
I forgot to say that when I do
curl http://localhost:8983/solr/update -H "Content-Type: text/xml"
--data-binary '<commit/>'
the response shows status 0 (QTime 453),
and a search for the added keywords gives 0 results. Does status 0 mean that the addition
was successful?
Thanks.
Alex.
-Original Message-
From: Erik Hatcher
T
On Wed, May 13, 2009 at 12:29 PM, Geoffrey Young
wrote:
>> However since the indexed term is simply "leann", a
>> WordDelimiterFilter configured to split won't match (a search for
>> "LeAnn" will be translated into a search for "le" "ann".
>
> but the concatparts and/or concatall should handle spl
Or ...
1. Promote existing slave to new master
2. Add new slave to cluster
-Bryan
On May 13, 2009, at 9:48 AM, Jay Hill wrote:
- Migrate configuration files from old master (or backup) to new
master.
- Replicate from a slave to the new master.
- Resume indexing to new master.
- Migrate configuration files from old master (or backup) to new master.
- Replicate from a slave to the new master.
- Resume indexing to new master.
-Jay
On Wed, May 13, 2009 at 4:26 AM, nk 11 wrote:
> Nice.
> What if the master fails permanently (like a disk crash...) and the new
> master is
Our company has a large search deployment serving > 50M search hits per
day.
We've been leveraging Lucene for several years and have recently deployed
Solr for the distributed search feature. We were hitting scaling limits
with Lucene due to our index size.
I did an evaluation of Sphinx and f
On May 13, 2009, at 11:55 AM, wojtekpia wrote:
I came across this article praising Sphinx:
http://www.theregister.co.uk/2009/05/08/dziuba_sphinx/. The article
specifically mentions Solr as an 'aging' technology,
Solr is the same age as Sphinx (2006), so if Solr is aging, then so is
Sphinx.
On Wed, May 13, 2009 at 6:23 AM, Yonik Seeley
wrote:
> On Tue, May 12, 2009 at 7:19 PM, Geoffrey Young
> wrote:
>> hi all :)
>>
>> I'm having trouble with camel-cased query strings and the dismax handler.
>>
>> a user query
>>
>> LeAnn Rimes
>>
>> isn't matching the indexed term
>>
>> Leann Rim
It's probably the case that every search engine out there is faster
than Solr at one thing or another, and that Solr is faster or better
at some other things.
I prefer to spend my time improving Solr rather than engage in
benchmarking wars... and Solr 1.4 will have a ton of speed
improvements over
I see that Nobel's final comment in SOLR-1154 is that config files
need to be able to include snippets from external files. In my
limited testing, a simple patch to enable XInclude support seems to
work.
--- src/java/org/apache/solr/core/Config.java (revision 774137)
+++ src/java/org/a
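With a patch like that applied (the parser has to be created with setXIncludeAware(true), which is what the patch enables), usage would presumably look like this; the included file name is hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- shared-settings.xml is a hypothetical fragment shared across cores -->
  <xi:include href="shared-settings.xml"/>
  <!-- ... rest of solrconfig.xml ... -->
</config>
```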
I came across this article praising Sphinx:
http://www.theregister.co.uk/2009/05/08/dziuba_sphinx/. The article
specifically mentions Solr as an 'aging' technology, and states that
performance on Sphinx is 2x-4x faster than Solr. Has anyone compared Sphinx
to Solr? Or used Sphinx in the past? I re
Yes, the ownerUid will likely be assigned once and never changed. But
you still need it, in order to keep track of who has contributed which
document.
I've been going over some of the simpler query scenarios, and Solr is
capable of handling them without having to resort to an external
RDBMS. In
Hi
Is it possible, through dataimport handler to remove an existing
document from the Solr index?
I import/update from my database where the active field is true.
However, if the client then sets active to false, the document stays
in the Solr index and doesn't get removed.
Regards
Andrew
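One way later DIH versions handle this is the deletedPkQuery attribute used during delta imports (SOLR-768); a sketch, with a hypothetical table and columns:

```xml
<entity name="item" pk="id"
        query="SELECT id, title FROM item WHERE active = 1"
        deltaQuery="SELECT id FROM item
                    WHERE active = 1
                      AND last_modified > '${dataimporter.last_index_time}'"
        deletedPkQuery="SELECT id FROM item
                        WHERE active = 0
                          AND last_modified > '${dataimporter.last_index_time}'"/>
```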
On Tue, May 12, 2009 at 7:19 PM, Geoffrey Young
wrote:
> hi all :)
>
> I'm having trouble with camel-cased query strings and the dismax handler.
>
> a user query
>
> LeAnn Rimes
>
> isn't matching the indexed term
>
> Leann Rimes
This is the camel-case case that can't currently be handled by a
Hmmm, maybe we need to think about some way to hook this into the build
process or make it easier to just drop it into the conf or lib dirs.
I'm no web.xml expert, but I'm sure you're not the first one to want
to do this kind of thing.
The easiest way _might_ be to patch build.xml to take a
Terence Gannon wrote:
Paul -- thanks for the reply, I appreciate it. That's a very
practical approach, and is worth taking a closer look at. Actually,
taking your idea one step further, perhaps three fields: 1) ownerUid
(uid of the document's owner) 2) grantedUid (uid of users who have
been g
Nice.
What if the master fails permanently (like a disk crash...) and the new
master is a clean machine?
2009/5/13 Noble Paul നോബിള് नोब्ळ्
> On Wed, May 13, 2009 at 12:10 PM, nk 11 wrote:
> > Hello
> >
> > I'm kind of new to Solr and I've read about replication, and the fact
> that a
> > node
On Wed, May 13, 2009 at 12:10 PM, nk 11 wrote:
> Hello
>
> I'm kind of new to Solr and I've read about replication, and the fact that a
> node can act as both master and slave.
> If a replica fails and then comes back online, I suppose that it will resync
> with the master.
right
>
> But what happ
Hello Shalin,
Thanks for your help. Yes, it answers my question.
Much appreciated.
Shalin Shekhar Mangar wrote:
>
> On Tue, May 12, 2009 at 9:48 PM, Wayne Pope
> wrote:
>
>>
>> I have this request:
>>
>>
>> http://localhost:8983/solr/select?start=0&rows=20&qt=dismax&q=copy&hl=true&hl.snipp
That's probably JIRA SOLR-1063. We have only seen it in the spellcheck
results and only in PHPS and not in PHP ResponseWriter.
https://issues.apache.org/jira/browse/SOLR-1063
-
Markus Jelsma Buyways B.V. Tel. 050-3118123
Technisch Architect    Friesestraatweg 215c
In addition to my earlier mail, I have a particular scenario. For that I have to
explain my application level logging in detail.
I am using solr as embedded server. I am using solr with Solr-560-slf4j patch.
I need logging information for solr. Right now my application is using log4j
for logging
We are using a nightly from 13/04. I've found one issue with the PHP
ResponseWriter but apart from that it has been pretty solid.
I'm using the bundled Jetty server to run it for the moment but hope
to move to Tomcat once released and stable (and I have learned
Tomcat!).
Andrew
2009/5/12 Walte