Re: SolrCloud on Trunk

2012-01-28 Thread Jamie Johnson
The case is actually anytime you need to add another shard. With the current implementation if you need to add a new shard the current hashing approach breaks down. Even with many small shards I think you still have this issue when you're adding/updating/deleting docs. I'm definitely interested
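
A rough illustration of why adding a shard is painful, assuming documents are routed by a plain hash-modulo-shard-count rule. That rule is an assumption made for this sketch, not a description of what trunk does, and `zlib.crc32` stands in for the Murmur hash mentioned elsewhere in the thread. The function names are hypothetical.

```python
import zlib


def shard_by_modulo(doc_id, num_shards):
    """Route a document by hashing its id and taking the remainder."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def fraction_moved(doc_ids, old_n, new_n):
    """Share of documents whose shard changes when the shard count changes."""
    moved = sum(
        1
        for doc_id in doc_ids
        if shard_by_modulo(doc_id, old_n) != shard_by_modulo(doc_id, new_n)
    )
    return moved / len(doc_ids)


if __name__ == "__main__":
    ids = [f"doc-{i}" for i in range(10000)]
    # Going from 4 to 5 shards relocates most documents under this rule.
    print(fraction_moved(ids, 4, 5))
```

Under this rule most documents change shards when one shard is added, so later updates and deletes would be sent to a shard that does not hold the old copy.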

Re: How to Sort By a PageRank-Like Complicated Strategy?

2012-01-28 Thread Bing Li
Dear Shashi, As I understand it, big data such as a Lucene index is not suited to frequent updates. Frequent updating is bound to affect performance and consistency when the Lucene index must be replicated across a large-scale cluster. Such a search engine is expected to work in a write-once & read-ma

Re: Solr Warm-up performance issues

2012-01-28 Thread Lance Norskog
Another trick is to read in the parts of the index file that you search against: term dictionary and maybe a few others. (The Lucene wiki describes the various files.) That is, you copy the new index to the server and then say "cat files > /dev/null". This pre-caches the interesting files into memo
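
The "cat files > /dev/null" trick above can be sketched in Python as follows. This is a minimal sketch that reads every regular file in a directory and discards the bytes so the operating system keeps them in its page cache; the function name and the choice to read all files (rather than only the term dictionary) are assumptions made for illustration.

```python
import os


def warm_page_cache(index_dir, chunk_size=1 << 20):
    """Read every regular file in index_dir and throw the data away.

    Returns the number of bytes read. The reads themselves are what
    pull the files into the OS page cache.
    """
    total = 0
    for name in sorted(os.listdir(index_dir)):
        path = os.path.join(index_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and special files
        with open(path, "rb") as handle:
            while True:
                chunk = handle.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    return total
```

Whether the pages stay cached depends on how much free memory the server has, so this helps only when the index (or the files you choose to read) fits in RAM.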

Re: SolrCloud on Trunk

2012-01-28 Thread Lance Norskog
If this is to do load balancing, the usual solution is to use many small shards, so you can just move one or two without doing any surgery on indexes. On Sat, Jan 28, 2012 at 2:46 PM, Yonik Seeley wrote: > On Sat, Jan 28, 2012 at 3:45 PM, Jamie Johnson wrote: >> Second question, I know there are

Re: Permgen Space - GC

2012-01-28 Thread Lance Norskog
Correct. Each war file instance uses its own classloader, and in this case pulling in Solr and all of the dependent jars uses that much memory. This also occurs when you deploy/undeploy/redeploy the same war file. Doing that over and over fills up PermGen. According to this, you should use both this an

Re: SolrCloud on Trunk

2012-01-28 Thread Yonik Seeley
On Sat, Jan 28, 2012 at 3:45 PM, Jamie Johnson wrote: > Second question, I know there are discussion about storing the shard > assignments in ZK (i.e. shard 1 is responsible for hashed values > between 0 and 10, shard 2 is responsible for hashed values between 11 > and 20, etc), this isn't done ye
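
The range-based assignment described in the quoted question (shard 1 owns hashed values 0 to 10, shard 2 owns 11 to 20) can be sketched like this. It is a toy model only: the table layout, the function names, and the idea of splitting a range in half are assumptions for illustration, and storing the table in ZooKeeper is not shown.

```python
import bisect

# Hypothetical range table mirroring the numbers in the quoted message:
# (lowest hash, highest hash, shard name), sorted by lowest hash.
SHARD_RANGES = [(0, 10, "shard1"), (11, 20, "shard2")]


def shard_for(hash_value, ranges=SHARD_RANGES):
    """Return the shard whose inclusive range contains hash_value."""
    starts = [low for low, _, _ in ranges]
    index = bisect.bisect_right(starts, hash_value) - 1
    if index >= 0:
        low, high, name = ranges[index]
        if low <= hash_value <= high:
            return name
    raise ValueError(f"no shard owns hash value {hash_value}")


def split_range(ranges, name, new_name):
    """Split one shard's range in half; all other ranges stay untouched."""
    result = []
    for low, high, shard in ranges:
        if shard == name:
            mid = (low + high) // 2
            result.append((low, mid, shard))
            result.append((mid + 1, high, new_name))
        else:
            result.append((low, high, shard))
    return result
```

With an explicit table, adding a shard only moves the documents in the range that was split, which is the property the discussion is after.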

Re: SolrCloud on Trunk

2012-01-28 Thread Jamie Johnson
Thanks Yonik! I had not dug deeply into it but had expected to find a class named Murmur, which I did not. Second question, I know there are discussions about storing the shard assignments in ZK (i.e. shard 1 is responsible for hashed values between 0 and 10, shard 2 is responsible for hashed value

Re: Solr on remote server

2012-01-28 Thread Gora Mohanty
On Sat, Jan 28, 2012 at 10:37 PM, remi tassing wrote: > Hi, > > The example works well on the local machine, but how to make that work on a > remote server? Do you have to install jetty or tomcat ...? Have you taken a look at the Solr Wiki, especially http://wiki.apache.org/solr/#Using_Solr and

Solr on remote server

2012-01-28 Thread remi tassing
Hi, The example works well on the local machine, but how to make that work on a remote server? Do you have to install jetty or tomcat ...? Remi

Re: SolrCloud on Trunk

2012-01-28 Thread Yonik Seeley
On Fri, Jan 27, 2012 at 11:46 PM, Jamie Johnson wrote: > I just want to verify some of the features in regards to SolrCloud > that are now on Trunk > > documents added to the cluster are automatically distributed amongst > the available shards (I had seen that Yonik had ported the Murmur > hash, b

Re: DataImportHandler fails silently

2012-01-28 Thread Erik Hatcher
On Jan 28, 2012, at 09:02 , mathieu lacage wrote: > This deserves an entry in > http://wiki.apache.org/solr/DataImportHandlerFaq which I would have > updated but it is immutable. *hint to those who have > edit powers there* You can make yourself a wiki account and then edit the page. An account i

Re: DataImportHandler fails silently

2012-01-28 Thread mathieu lacage
On Sat, Jan 28, 2012 at 10:35 AM, mathieu lacage wrote: > > (I have tried two different sqlite jdbc drivers so, I doubt it could > be a problem there, but, who knows). > I eventually screamed really loud when I read the source code of the sqlite jdbc drivers: they interpret the jdbcDataSource at

Re: Complex query, need filtering after query not before

2012-01-28 Thread Mikhail Khludnev
Hello Jay, You can lose some precision in favour of performance: reducing precision of coordinates (by putting them onto grid) you can increase hit ratio; then try bbox for faster rough filtration http://wiki.apache.org/solr/SpatialSearch#bbox_-_Bounding-box_filter and apply geodist() function in
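
The suggestion above (snap coordinates onto a grid, filter roughly with bbox, then use geodist()) might translate into request parameters along these lines. This is a sketch under assumptions: the field name `store`, the 0.01-degree grid step, and the use of geodist() for sorting are all illustrative choices, since the message is cut off before it says how geodist() should be applied.

```python
from urllib.parse import urlencode


def snap(value, step=0.01):
    """Snap a coordinate onto a grid so nearby requests share one filter."""
    return round(round(value / step) * step, 6)


def spatial_params(lat, lon, radius_km, sfield="store"):
    """Build Solr parameters: bbox as a cheap filter, geodist() for ordering."""
    point = f"{snap(lat)},{snap(lon)}"
    return {
        "q": "*:*",
        # Bounding-box filter: rough, but cacheable when points are snapped.
        "fq": f"{{!bbox sfield={sfield} pt={point} d={radius_km}}}",
        "sfield": sfield,
        "pt": point,
        # Exact distance is only computed for documents that pass the filter.
        "sort": "geodist() asc",
    }


if __name__ == "__main__":
    print(urlencode(spatial_params(45.1534, -93.8512, 5)))
```

Because the snapped point is identical for all requests that fall in the same grid cell, the filter query string repeats and can be served from the filter cache more often.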

Re: DataImportHandler fails silently

2012-01-28 Thread mathieu lacage
On 1/28/12, mathieu lacage wrote: > > Le 28 janv. 2012 à 05:17, Lance Norskog a écrit : > >> Do all of the documents have unique id fields? > > yes. I have debugged this further with http://localhost:8080/solr/admin/dataimport.jsp?handler=/dataimport The returned xml file when I ask for verbose

Re: querying multivalue fields

2012-01-28 Thread Mikhail Khludnev
At the Lucene level, every term in a document field is assigned a position (unless you omit positions), i.e. your document looks like ["red"@0, "redder"@1, "reddest"@2, "yellow"@3, "blue"@4]. Please check the info on PhraseQuery, SpanQueries and positionIncrementGap. I have an experience of obtaining
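
A simplified model of how term positions are laid out across the values of a multivalued field, and what positionIncrementGap changes. This is a sketch only: whitespace splitting stands in for a real analyzer, the function name is hypothetical, and the exact position numbers Lucene produces may differ by version.

```python
def assign_positions(values, position_increment_gap=0):
    """Assign a position to every token of a multivalued field.

    Tokens within one value get consecutive positions. Before each
    value after the first, the counter jumps by position_increment_gap.
    """
    positions = []
    next_position = 0
    for index, value in enumerate(values):
        if index > 0:
            next_position += position_increment_gap
        for token in value.split():
            positions.append((token, next_position))
            next_position += 1
    return positions


if __name__ == "__main__":
    colors = ["red", "redder", "reddest", "yellow", "blue"]
    print(assign_positions(colors))       # gap 0: positions run 0..4
    print(assign_positions(colors, 100))  # gap 100: values are far apart
```

With a gap of 0 the values look like one continuous token stream, so a phrase or span query can match across value boundaries; with a large gap, a query whose slop is smaller than the gap cannot.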