Re: solr 3.5 taking long to index

2012-04-14 Thread Lance Norskog
You're doing more commits than you need. You may want to turn off autocommit since you are running commit yourself. Every commit causes segment activity, so if you want to minimize that, you don't need autocommit. About memory sizing: you should drop the memory assigned to Solr until it slows down

RE: solr 3.5 taking long to index

2012-04-14 Thread Rohit
Hey Shawn, Solr is working better, though we're not out of the woods yet. I freed up some memory in the system and also increased the mergeFactor to 20. I have another question: we had autocommit ON all this while in our solrconfig.xml, but since the upgrade we have been noticing that keeping autocommit on is inc

Re: Monitoring SolrCloud health

2012-04-14 Thread Jamie Johnson
Otis, Is SPM a hosted service? My application is not publicly available, so I need a solution that I can install and monitor the system with. Zabbix seems very interesting. I will give this a whirl. On Sat, Apr 14, 2012 at 7:20 PM, Lance Norskog wrote: > SPM looks cool! If yo

Re: Solr Scoring

2012-04-14 Thread Lance Norskog
This was a common one when I was matching movie and song names. If that is your project, also try boosting if it's the first word or on shorter titles. Also try bigrams of stopwords: "Call of the Wild" becomes "call", "of-the", "wild". The bigrams trick is also good if you have people block-copyin
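A minimal sketch of the stopword-bigram trick described above, in Python. The stopword list and the function name stopword_bigrams are assumptions made for illustration; the message only gives the "Call of the Wild" example, so the rule used here (join two adjacent stopwords with a hyphen, leave a lone stopword as its own token) is one plausible reading, not necessarily the author's exact analyzer.

```python
# Hypothetical, deliberately small stopword list; a real analyzer would use its own.
STOPWORDS = frozenset({"a", "an", "and", "in", "of", "on", "the", "to"})


def stopword_bigrams(title):
    """Tokenize a title, joining two adjacent stopwords into one hyphenated token."""
    words = title.lower().split()
    tokens = []
    i = 0
    while i < len(words):
        next_is_stopword = i + 1 < len(words) and words[i + 1] in STOPWORDS
        if words[i] in STOPWORDS and next_is_stopword:
            # Two stopwords in a row become a single bigram token.
            tokens.append(words[i] + "-" + words[i + 1])
            i += 2
        else:
            tokens.append(words[i])
            i += 1
    return tokens


print(stopword_bigrams("Call of the Wild"))
```

The same function would have to run at both index time and query time so that "of-the" in a query matches "of-the" in the index.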

Re: Monitoring SolrCloud health

2012-04-14 Thread Lance Norskog
SPM looks cool! If you can get analytics out for the numbers analysts like: search quality (precision, recall, MRR etc.), effectiveness of recommendations, drop-down keystrokes etc., you've got a winner. On Sat, Apr 14, 2012 at 2:20 PM, Otis Gospodnetic wrote: > Jamie, > > We have Performance Mon
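For readers unfamiliar with the search-quality numbers mentioned above, here is a small Python sketch of precision, recall and MRR (mean reciprocal rank). The function names and the input shape (a ranked list of document ids plus a set of relevant ids per query) are assumptions for illustration; this is not how SPM or any particular tool computes them.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: list of document ids returned by the search
    relevant:  set of document ids judged relevant
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def mean_reciprocal_rank(runs):
    """Average of 1/rank of the first relevant hit over all queries.

    runs: list of (ranked_ids, relevant_set) pairs; a query with no
    relevant hit contributes 0.
    """
    total = 0.0
    for ranked_ids, relevant in runs:
        for rank, doc_id in enumerate(ranked_ids, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(runs) if runs else 0.0
```

Both functions need relevance judgments per query, which is usually the hard part to collect.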

Re: Monitoring SolrCloud health

2012-04-14 Thread Lance Norskog
Other cool options: Zabbix collects from many sources, including the Solr JVM's JMX beans, which is what the solr/admin/stats.jsp page fetches. Zabbix fetches, archives, graphs and alerts. We could not find another monitor that did all of these well. NewRelic is a hosted service for Solr and a lot of other

Re: Options for automagically Scaling Solr (without needing distributed index/replication) in a Hadoop environment

2012-04-14 Thread Lance Norskog
It sounds like you really want the final map/reduce phase to put Solr index files into HDFS. Solr has a feature to do this called 'Embedded Solr'. This packages Solr as a library instead of an HTTP servlet. The Solr committers mostly hate it and want it to go away, but it is useful for exactly this

Re: Options for automagically Scaling Solr (without needing distributed index/replication) in a Hadoop environment

2012-04-14 Thread Otis Gospodnetic
Hello, Unfortunately I don't know exactly when the SolrCloud release will be ready, but we've used trunk versions in the past and didn't have major issues. Otis Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html

Re: Monitoring SolrCloud health

2012-04-14 Thread Otis Gospodnetic
Jamie, We have Performance Monitoring for Solr (+HBase, ElasticSearch, and some other things...). I don't think we've tested it with SolrCloud yet, but that is pretty much the next big item for SPM for Solr. The easiest way to find out when SPM for SolrCloud is available is by following @sematext

Re: Monitoring SolrCloud health

2012-04-14 Thread Jamie Johnson
Ah, one last piece: if you're not, somehow alert an admin about it, perhaps through email or something. Maybe this question is more about application availability monitoring in general? Any opinions would be appreciated. On Sat, Apr 14, 2012 at 1:57 PM, Jamie Johnson wrote: > Right now my biggest

Re: Monitoring SolrCloud health

2012-04-14 Thread Jamie Johnson
Right now my biggest concern is whether the systems are all up and running. It would be nice to be able to see the stats that are provided on the current solr admin page across the cluster, but again that's not my biggest issue now. I just want to know: are you up and running? On Sat, Apr 14, 20
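A rough Python sketch of the "are you up and running" check asked about above. It assumes each node exposes Solr's ping handler at /admin/ping under its base URL and that the handler answers with a JSON body whose "status" field is "OK" when wt=json is requested; both depend on how solrconfig.xml is set up, so treat them as assumptions. The function names and the node URLs in the usage note are hypothetical.

```python
import json
from urllib.request import urlopen


def ping_url(base_url):
    """Build the ping URL for one node (assumes the /admin/ping handler is enabled)."""
    return base_url.rstrip("/") + "/admin/ping?wt=json"


def fetch_status(url, timeout=5.0):
    """Fetch the ping response and return its "status" field ("" if absent)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp).get("status", "")


def down_nodes(base_urls, fetch=fetch_status):
    """Return the base URLs whose ping failed or did not report "OK"."""
    down = []
    for base in base_urls:
        try:
            ok = fetch(ping_url(base)) == "OK"
        except (OSError, ValueError):
            # Connection errors, timeouts and unparsable bodies all count as down.
            ok = False
        if not ok:
            down.append(base)
    return down
```

Run from cron with something like down_nodes(["http://node1:8983/solr", "http://node2:8983/solr"]), a non-empty result could trigger an email to an admin. The fetch parameter exists only so the check can be exercised without a live cluster.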

Re: Can I discover what part of a score is attributable to a subquery?

2012-04-14 Thread Paul Libbrecht
Benson, If I remember correctly, the big problem is that there are all sorts of recalibrations of the scores based on the query. Therefore having it in one go is really nice. I am not sure the different similarities can be put together well here though... paul On 14 Apr 2012, at 18:58, Benson Marg

Re: Can I discover what part of a score is attributable to a subquery?

2012-04-14 Thread Benson Margulies
On Sat, Apr 14, 2012 at 12:37 PM, Paul Libbrecht wrote: > Benson, > > it was in the Lucene world in May 2010: >         > http://mail-archives.apache.org/mod_mbox/lucene-java-user/201005.mbox/%3c469705.48901...@web29016.mail.ird.yahoo.com%3E > Mark Harwood pointed me to a "FlagQuery" which was exa

Re: Large Index and OutOfMemoryError: Map failed

2012-04-14 Thread Gopal Patwa
I checked, and it was "MMapDirectory.UNMAP_SUPPORTED=true"; my system data is below. Is there any existing test case to reproduce this issue? I am trying to understand how I can reproduce this issue with a unit/integration test. I will try a recent solr trunk build too, if it is some bug in solr or lucene

Re: Can I discover what part of a score is attributable to a subquery?

2012-04-14 Thread Paul Libbrecht
Benson, it was in the Lucene world in May 2010: http://mail-archives.apache.org/mod_mbox/lucene-java-user/201005.mbox/%3c469705.48901...@web29016.mail.ird.yahoo.com%3E Mark Harwood pointed me to a "FlagQuery" which was exactly what I needed. His contribution seems not to have been taken

custom org.apache.lucene.store.Directory

2012-04-14 Thread Radim Kolar
Is a custom org.apache.lucene.store.Directory supported in Solr? I want to try Infinispan.

Re: Monitoring SolrCloud health

2012-04-14 Thread Mark Miller
On Apr 14, 2012, at 12:03 AM, Jamie Johnson wrote: > How do people currently monitor the health of a solr cluster? Are > there any good tools which can show the health across the entire > cluster? Is this something which is planned for the new admin user > interface? Work on it happening here

Re: Can I discover what part of a score is attributable to a subquery?

2012-04-14 Thread Benson Margulies
Yes please. On Apr 14, 2012, at 2:40 AM, Paul Libbrecht wrote: > Benson, > In mid 2009, I had such a question answered with a nifty bitwise manipulation of the score, and a little precision loss. For each result I could pick the > language of a multilingual match. > If interested, I can dig. > Paul
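To illustrate what "bitwise manipulation of the score, with a little precision loss" can look like, here is a Python sketch that hides a small tag (for example a language id) in the lowest mantissa bits of a 32-bit float score. This is only one way to read the description above, not a reconstruction of the code Paul refers to; FLAG_BITS, tag_score and read_tag are hypothetical names, and the sketch assumes finite scores.

```python
import struct

FLAG_BITS = 3  # assumption: at most 8 distinct tags
FLAG_MASK = (1 << FLAG_BITS) - 1


def _float_to_bits(score):
    """Reinterpret a score as the raw bits of a 32-bit float."""
    return struct.unpack("<I", struct.pack("<f", score))[0]


def _bits_to_float(bits):
    """Reinterpret 32 raw bits as a float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def tag_score(score, tag):
    """Overwrite the lowest mantissa bits of the score with the tag."""
    if not 0 <= tag <= FLAG_MASK:
        raise ValueError("tag does not fit in FLAG_BITS")
    bits = (_float_to_bits(score) & ~FLAG_MASK) | tag
    return _bits_to_float(bits)


def read_tag(tagged_score):
    """Recover the tag from a score produced by tag_score."""
    return _float_to_bits(tagged_score) & FLAG_MASK
```

The tagged score differs from the original only in its last few mantissa bits, which is the small precision loss. The tag survives only as long as the score is passed along unchanged; any arithmetic on it afterwards (summing, normalizing) would scramble those bits.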

Re: Monitoring SolrCloud health

2012-04-14 Thread Darren Govoni
Can you be more specific about "health"? On Sat, 2012-04-14 at 00:03 -0400, Jamie Johnson wrote: > How do people currently monitor the health of a solr cluster? Are > there any good tools which can show the health across the entire > cluster? Is this something which is planned for the new admin

Re: Options for automagically Scaling Solr (without needing distributed index/replication) in a Hadoop environment

2012-04-14 Thread Jan Høydahl
Hi, This won't give you the performance you need, unless you have enough RAM on the Solr box to cache the whole index in memory. Have you tested this yourself? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com Solr Training - www.solrtraining.com On 12. apr. 2012, at 15