RE: trouble instantiating CloudSolrServer

2012-11-03 Thread Markus Jelsma
Hi, I added the following dependency to Apache Nutch: org="org.apache.solr" name="solr-solrj" rev="4.0.0" -Original message- > From:Lance Norskog > Sent: Sat 03-Nov-2012 04:34 > To: solr-user@lucene.apache.org; markrmil...@gmail.com > Subject: Re: trouble instantiating CloudSolrServer >

RE: SolrCloud indexing blocks if node is recovering

2012-11-03 Thread Markus Jelsma
Hi - yes, I should be able to make sense of them next Monday. I assume you're not too interested in the OOM machine but all surrounding nodes that blocked instead? -Original message- > From:Mark Miller > Sent: Sat 03-Nov-2012 03:14 > To: solr-user@lucene.apache.org > Subject: Re

Solr 3.6 -> 4.0

2012-11-03 Thread Nathan Findley
Hi all, I have one machine running solr 3.6. I would like to move this data to solr 4.0 and set up a solrcloud. I feel like I should replicate the existing data. After that, it isn't clear to me what I need to do. 1) Create a slave (4.0) that replicates from the master (3.6). 2) Somehow t

Re: Solr 3.6 -> 4.0

2012-11-03 Thread Otis Gospodnetic
Hi, Check the archive for a similar Q&A yesterday. Reindexing would be the cleanest. Otis -- Performance Monitoring - http://sematext.com/spm On Nov 3, 2012 8:22 AM, "Nathan Findley" wrote: > Hi all, > > I have one machine running solr 3.6. I would like to move this data to > solr 4.0 and set

Re: SolrCloud indexing blocks if node is recovering

2012-11-03 Thread Mark Miller
The OOM machine and any surrounding nodes if possible (e.g. especially the leader of the shard). Not sure what I'm looking for yet, so the more info the better. - Mark On Nov 3, 2012, at 5:23 AM, Markus Jelsma wrote: > Hi - yes, I should be able to make sense of them next Monday. I assume > you'

Re: Continuous Ping query caused exception: java.util.concurrent.RejectedExecutionException

2012-11-03 Thread Mark Miller
On Nov 1, 2012, at 5:39 AM, Markus Jelsma wrote: > File bug? Please. - Mark

Re: trunk is unable to replicate between nodes ( Unable to download ... completely)

2012-11-03 Thread Mark Miller
Likely some of the trunk work around allowing any Directory impl to replicate. JIRA pls :) - Mark On Oct 30, 2012, at 12:29 PM, Markus Jelsma wrote: > Hi, > > We're testing again with today's trunk and using the new Lucene 4.1 format by > default. When nodes are not restarted things are kind

Re: No lockType configured for NRTCachingDirectory

2012-11-03 Thread Mark Miller
I think I've seen it on 4.X as well yesterday. Let's file a JIRA to track looking into it. - Mark On Oct 31, 2012, at 11:30 AM, Markus Jelsma wrote: > That's 5, the actual trunk/ > > -Original message- >> From:Mark Miller >> Sent: Wed 31-Oct-2012 16:29 >> To: solr-user@lucene.apache.

Re: Possible memory leak in recovery

2012-11-03 Thread Mark Miller
Nothing I know of - file a bug please. Might be related to the EOF issue, so you might add the details to that JIRA. - Mark On Nov 2, 2012, at 10:13 AM, Markus Jelsma wrote: > Hi, > > We wiped clean the data directories for one node. That node is never able to > recover and regularly runs O

SolrCloud failover behavior

2012-11-03 Thread Nick Chase
I think there's a change in the behavior of SolrCloud vs. what's in the wiki, but I was hoping someone could confirm for me. I checked JIRA and there were a couple of issues requesting partial results if one server comes down, but that doesn't seem to be the issue here. I also checked CHANGES

All document keywords must match the query keywords

2012-11-03 Thread SR
Solr 4.0 I need to return documents when all their keywords are matching the query. In other words, all the document keywords should match the query keywords e.g., query: best chinese food restaurant doc1: chinese food doc2: italian food doc3: chinese store Only doc1 should be returned ("chine

customize similarity function

2012-11-03 Thread SR
Solr 4.0 I want to avoid the TF.IDF and use a "binary" model, i.e., if the keyword is in the document, the score is 1, no matter how frequent the keyword is in that document. If the keyword is not in the document, then the score is zero. I also want to avoid the idf. e.g., query: pizza doc:
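
The "binary" model described in this message can be illustrated with a minimal Python sketch. It is a plain in-memory illustration, not Solr or Lucene code: the function name `binary_score` is hypothetical, whitespace tokenization is an assumption, and summing one point per distinct matching query term is an assumed extension to multi-keyword queries that the message does not spell out.

```python
def binary_score(query, document):
    """Score a document with a binary model (illustrative sketch only).

    Each distinct query term found in the document contributes exactly 1,
    no matter how often it occurs (no tf) or how rare it is (no idf).
    """
    # Assumption: terms are separated by whitespace and compared case-insensitively.
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    # Count the distinct query terms that are present in the document.
    return len(query_terms & doc_terms)


# A repeated keyword still scores 1; an absent keyword scores 0.
repeated = binary_score("pizza", "pizza pizza pizza place")
absent = binary_score("pizza", "pasta place")
```

Inside Solr the same effect would come from a custom Similarity, as the reply in this thread suggests; the sketch only shows the intended scoring rule.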

Re: All document keywords must match the query keywords

2012-11-03 Thread Gora Mohanty
On 3 November 2012 22:17, SR wrote: > Solr 4.0 > > I need to return documents when all their keywords are matching the query. > In other words, all the document keywords should match the query keywords > > e.g., query: best chinese food restaurant > > doc1: chinese food > doc2: italian food > doc

Re: All document keywords must match the query keywords

2012-11-03 Thread SR
On 2012-11-03, at 12:55 PM, Gora Mohanty wrote: > On 3 November 2012 22:17, SR wrote: > >> Solr 4.0 >> >> I need to return documents when all their keywords are matching the query. >> In other words, all the document keywords should match the query keywords >> >> e.g., query: best chinese foo

Solr - Disk writes and set up suggestions

2012-11-03 Thread tictacs
Hi, My site has 30,000 widgets and 500,000 widget users. I have created two solr indexes, one for widgets and one for users. The widgets index is 324MB and the users index is 9.3GB. We are optimizing the index every hour and during this time the server is slowing to a crawl, looks like due to

RE: Solr - Disk writes and set up suggestions

2012-11-03 Thread Michael Ryan
I'd recommend not optimizing every hour. Are you seeing a significant performance increase from optimizing this frequently? -Michael

Re: customize similarity function

2012-11-03 Thread Otis Gospodnetic
Hi, Look where the Similarity implementation is specified in solrconfig.xml. Find that class in Lucene and you will see the tf and idf methods you need for your implementation, which you can then specify in solrconfig. Reindexing required. Otis -- Performance Monitoring - http://sematext.com/spm On Nov

Re: Solr - Disk writes and set up suggestions

2012-11-03 Thread Otis Gospodnetic
Hi, This should become a FAQ. Short version: don't optimize. Check ML archives for recent messages and explanations. If you have a monitoring tool, look at disk io during and after optimization, check solr cache hit rates, etc. Otis -- Performance Monitoring - http://sematext.com/spm On Nov 3, 2

Re: All document keywords must match the query keywords

2012-11-03 Thread Jack Krupansky
But neither "best" nor "restaurant" are in any of the documents, so how are any of these documents reasonable matches? You have the semantics of query backwards. The documents are the "data" and the query is the "operation" to be performed on the data. The intent of a query is to specify what

Re: All document keywords must match the query keywords

2012-11-03 Thread SR
Thanks Jack. This is not the ultimate goal of my search system; it's only one of the features I need. I don't need "best" and "restaurant" to match in this feature. Yes, I do have the semantics of the query backwards, and that's what I need in my application. -S On 2012-11-03, at 10:05 PM, Jack K

Re: solr search issue

2012-11-03 Thread Erick Erickson
You really need to spend some time becoming familiar with 1> the results of putting &debugQuery=on in your queries in order to see how your query terms are spread across various fields. 2> the admin/analysis page to understand field tokenization. From your message, it looks like you're confusing
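
As a small illustration of the first suggestion, the sketch below builds a select URL with `debugQuery=on` appended. The host, port, core name (`collection1`), the example query, and the helper name `build_debug_query_url` are all assumptions chosen for illustration; only the `debugQuery=on` parameter comes from the message.

```python
from urllib.parse import urlencode


def build_debug_query_url(base_url, query, extra_params=None):
    """Return a Solr select URL that asks for debug output (illustrative sketch)."""
    # debugQuery=on asks Solr to include parsing and scoring details in the response.
    params = {"q": query, "debugQuery": "on"}
    if extra_params:
        params.update(extra_params)
    return base_url + "?" + urlencode(params)


# Hypothetical local Solr instance and core name.
url = build_debug_query_url(
    "http://localhost:8983/solr/collection1/select", "title:solr"
)
```

Fetching that URL in a browser or HTTP client should show the debug section alongside the normal results, including how the query string was parsed against each field.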

Re: All document keywords must match the query keywords

2012-11-03 Thread Ahmet Arslan
Hi Steve, I would store my documents as queries in your case. You may find these relevant. http://lucene.apache.org/core/4_0_0-BETA/memory/org/apache/lucene/index/memory/MemoryIndex.html http://www.elasticsearch.org/blog/2011/02/08/percolator.html --- On Sun, 11/4/12, SR wrote: > From: SR
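
The requirement discussed in this thread (a document matches only when every one of its keywords appears in the query) can be sketched as a subset test. This is a plain in-memory Python illustration, not the MemoryIndex or percolator API linked above; the function name `documents_covered_by_query` and whitespace tokenization are assumptions, and the sample data is the example from the original question.

```python
def documents_covered_by_query(query, documents):
    """Return ids of documents whose keywords all occur in the query (sketch)."""
    # Assumption: keywords are whitespace-separated and case-insensitive.
    query_terms = set(query.lower().split())
    matches = []
    for doc_id, text in documents.items():
        doc_terms = set(text.lower().split())
        # Skip empty documents; otherwise require doc_terms to be a subset of the query terms.
        if doc_terms and doc_terms <= query_terms:
            matches.append(doc_id)
    return matches


docs = {
    "doc1": "chinese food",
    "doc2": "italian food",
    "doc3": "chinese store",
}
result = documents_covered_by_query("best chinese food restaurant", docs)
```

Treating each stored document as a query, as suggested above, is the scalable version of the same idea: the incoming query text plays the role of the single document being matched.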

Re: All document keywords must match the query keywords

2012-11-03 Thread SR
Thanks Ahmet, that's exactly what I need. Do you know whether this feature exists in Solr? Or do I have to go through Lucene directly? Thanks, -SR On 2012-11-03, at 10:26 PM, Ahmet Arslan wrote: > Hi Steve, > > I would store my documents as queries in your case. You may find these > relevant.

Re: Nested Join Queries

2012-11-03 Thread Erick Erickson
I'm going to go a bit sideways on you, partly because I can't answer the question ... But, every time I see someone doing what looks like substituting "core" for "table" and then trying to use Solr like a DB, I get on my soap-box and preach.. In this case, consider de-normalizing your DB so y

Re: All document keywords must match the query keywords

2012-11-03 Thread Otis Gospodnetic
It doesn't exist in solr. We've built it for clients. Elasticsearch has it built in. Otis -- Performance Monitoring - http://sematext.com/spm On Nov 3, 2012 10:37 PM, "SR" wrote: > Thanks Ahmet, that's exactly what I need. Do you know whether this feature > exists in Solr? Or do I have to go throu

Re: All document keywords must match the query keywords

2012-11-03 Thread SR
Thanks Otis. By "we" you mean "Lucid works"? Is there a chance to get it sometime soon in the open source? Thanks, -S On 2012-11-03, at 10:39 PM, Otis Gospodnetic wrote: > It doesn't exist in solr. We've built it for clients. Elasticsearch has it > built in. > > Otis > -- > Performance Monito

Re: SolrCloud failover behavior

2012-11-03 Thread Erick Erickson
SolrCloud doesn't work unless every shard has at least one server that is up and running. I _think_ you might be killing both nodes that host one of the shards. The admin page has a link showing you the state of your cluster. So when this happens, does that page show both nodes for that shard bein
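
The condition stated in this reply (every shard needs at least one server that is up) can be expressed as a simple check. The sketch below is only an illustration: the dictionary shape, node names, state strings, and the function name `shards_without_live_replica` are hypothetical simplifications and do not reproduce Solr's actual cluster state format.

```python
def shards_without_live_replica(cluster_state):
    """Return the shards that have no replica in the 'active' state (sketch).

    cluster_state is a hypothetical mapping: shard name -> {node name: state}.
    """
    return sorted(
        shard
        for shard, replicas in cluster_state.items()
        if not any(state == "active" for state in replicas.values())
    )


# Hypothetical two-shard cluster where both nodes hosting shard2 are down.
state = {
    "shard1": {"node1": "active", "node2": "down"},
    "shard2": {"node3": "down", "node4": "down"},
}
dead_shards = shards_without_live_replica(state)
```

A non-empty result corresponds to the situation described above, where both nodes hosting one shard have been stopped.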