Solr Replication

2013-03-14 Thread vicky desai
Hi, I am using a Solr 4 setup. For backup purposes, once a day I start one additional Tomcat server with cores having empty data folders, which acts as a slave server. However, it does not replicate data from the master unless there is a commit on the master. Is there a possibility to pull da

New-Question On Search data who does not have "x" field

2013-03-14 Thread anurag.jain
My previous question was: I have uploaded 250 documents to Solr. Some of them have a "category" field and some don't, for example: { "id":"321", "name":"anurag", "category":"30" }, { "id":"3", "name":"john" }. Now I want to search for the docs that do not have that field. what que

Re: Solr Replication

2013-03-14 Thread Ahmet Arslan
Hi Vicky, May be startup ? For backups http://master_host:port/solr/replication?command=backup would be more suitable. or startup --- On Thu, 3/14/13, vicky desai wrote: > From: vicky desai > Subject: Solr Replication > To: solr-user@lucene.apache.org > Date: Thursday, March 14, 2013, 9:
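The backup command quoted in this reply can be triggered from a script rather than a browser. A minimal sketch, assuming a placeholder master host and port; `backup_url` is a hypothetical helper name, and only the `/solr/replication?command=backup` path comes from the message itself.

```python
from urllib.parse import urlencode


def backup_url(host, port):
    """Build the replication backup URL quoted in the message above.

    The host and port are placeholders supplied by the caller.
    """
    query = urlencode({"command": "backup"})
    return f"http://{host}:{port}/solr/replication?{query}"


if __name__ == "__main__":
    # Placeholder values; substitute the real master host and port.
    print(backup_url("master_host", 8983))
```

The sketch only builds the URL; issuing the request (for example from a nightly cron job) is left to the caller.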

Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Chantal Ackermann
Hi all, this is not a question. I just wanted to announce that I've written a blog post on how to set up Maven for packaging and automatic testing of a SOLR index configuration. http://blog.it-agenten.com/2013/03/integration-testing-your-solr-index-with-maven/ Feedback or comments appreciated

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread David Philip
Informative. Useful.Thanks On Thu, Mar 14, 2013 at 1:59 PM, Chantal Ackermann < c.ackerm...@it-agenten.com> wrote: > Hi all, > > > this is not a question. I just wanted to announce that I've written a blog > post on how to set up Maven for packaging and automatic testing of a SOLR > index config

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Paul Libbrecht
Nice, Chantal can you indicate there or here what kind of speed for integration tests you've reached with this, from a bare source to a successfully tested application? (e.g. with 100 documents) thanks in advance Paul On 14 March 2013, at 09:29, Chantal Ackermann wrote: > Hi all, > > > thi

OutOfMemoryError

2013-03-14 Thread Arkadi Colson
Hi I'm getting this error after a few hours of filling solr with documents. Tomcat is running with -Xms1024m -Xmx4096m. Total memory of host is 12GB. Softcommits are done every second and hard commits every minute. Any idea why this is happening and how to avoid this? *top* PID USER P

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Chantal Ackermann
Hi Paul, I'm sorry I cannot provide you with any numbers. I also doubt it would be wise to post any as I think the speed depends highly on what you are doing in your integration tests. Say you have several request handlers that you want to test (on different cores), and some more complex use c

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Paul Libbrecht
Chantal, the goal is different: get a general feeling how practical it is to integrate this in the routine. If you are able, on your contemporary machine which I assume is not a supercomputer of some special sort, to run this whole process somewhat useful for you in about 2 minutes then I'll be

Re: OutOfMemoryError

2013-03-14 Thread Arkadi Colson
When I shutdown tomcat free -m and top keeps telling me the same values. Almost no free memory... Any idea? On 03/14/2013 10:35 AM, Arkadi Colson wrote: Hi I'm getting this error after a few hours of filling solr with documents. Tomcat is running with -Xms1024m -Xmx4096m. Total memory of hos

Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Rafał Radecki
Hi All. I am monitoring two solr 4.1 solr instances in master-slave setup. On both nodes I check url /solr/replication?command=details and parse it to get: - on master: if replication is enabled -> field replicationEnabled - on slave: if replication is enabled -> field replicationEnabled - on slav

Re: New-Question On Search data who does not have "x" field

2013-03-14 Thread Jack Krupansky
Writing "OR -" is simply the same as "-", so the query would match documents containing category 20 and then remove all documents that had any category (including 20) specified, giving you nothing. Try: http://localhost:8983/search?q=*:*&wt=json&start=0&fq=category:"20" OR (*:* -category:[*
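The filter suggested in this reply can be assembled programmatically. A minimal sketch, assuming the truncated range in the message is the usual open-ended form `[* TO *]` (the preview cuts off before showing it in full); the helper names `missing_field_clause`, `value_or_missing` and `select_url` are hypothetical, and the base URL is the one from the message.

```python
from urllib.parse import urlencode


def missing_field_clause(field):
    """Clause matching documents that have no value for `field`.

    A purely negative clause matches nothing inside a boolean OR, so it is
    anchored on *:* (all documents) before subtracting those with a value.
    Assumption: the open-ended range is written [* TO *].
    """
    return f"(*:* -{field}:[* TO *])"


def value_or_missing(field, value):
    """Match documents where `field` equals `value` or is absent."""
    return f'{field}:"{value}" OR {missing_field_clause(field)}'


def select_url(base, fq):
    """Build a select URL with q=*:* and the given filter query."""
    params = [("q", "*:*"), ("wt", "json"), ("start", "0"), ("fq", fq)]
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    fq = value_or_missing("category", "20")
    print(fq)
    print(select_url("http://localhost:8983/search", fq))
```

To find only the documents without the field, pass `missing_field_clause("category")` as the filter on its own.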

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Rafał Radecki
In the output of: /solr/replication?command=details there is indexVersion mentioned many times: 0 3 22.59 KB /usr/share/solr/data/index/ 1363259880360 4 _1.tvx _1_nrm.cfs _1_Lucene41_0.doc _1_Lucene41_0.tim _1_Lucene41_0.tip _1.fnm _1_nrm.cfe _1.fdx _1_Lucene41_0.pos _1.tvf _1.fdt _1_Luce

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

2013-03-14 Thread Luis Cappa Banda
Hello! Thanks a lot, Erick! I've attached some stack traces during a normal 'engine' running. Cheers, - Luis Cappa 2013/3/13 Erick Erickson > Stack traces.. > > First, > jps -l > > that will give you a the process IDs of your running Java processes. Then: > > jstack > > Usually I pipe the o

Re: Poll: Largest SolrCloud out there?

2013-03-14 Thread Christian von Wendt-Jensen
Does it only count if you are using SolrCloud? We are using a traditional Master/Slave setup with Solr 4.1: 1 Master per 14 days: Documents: ~15mio Index size: ~150GB (stored fields) #of masters: +30 Performance: SUCKS big time until caches catches up. Unfortunately that takes quite some time.

Advice: solrCloud + DIH

2013-03-14 Thread roySolr
Hello, I need some advice with my solrcloud cluster and the DIH. I have a cluster with 3 cloud servers. Every server has an solr instance and a zookeeper instance. I start it with the -Dzkhost parameter. It works great, i send updates by an curl(xml) like this: curl http:/ip:SOLRport/solr/update

Re: Poll: Largest SolrCloud out there?

2013-03-14 Thread Otis Gospodnetic
Christian, SSDs will warm up muuuch faster. Your other questions require more info / discussion. Otis Solr & ElasticSearch Support http://sematext.com/ On Mar 14, 2013 8:47 AM, "Christian von Wendt-Jensen" < christian.vonwendt-jen...@infopaq.com> wrote: > Does it only count if you are using S

Re: OutOfMemoryError

2013-03-14 Thread Toke Eskildsen
On Thu, 2013-03-14 at 13:10 +0100, Arkadi Colson wrote: > When I shutdown tomcat free -m and top keeps telling me the same values. > Almost no free memory... > > Any idea? Are you reading top & free right? It is standard behaviour for most modern operating systems to have very little free memory

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Mark Miller
On Mar 14, 2013, at 8:10 AM, Rafał Radecki wrote: > Is this a bug? Yes, 4.1 had some replication issues just as you seem to describe here. It all should be fixed in 4.2 which is available now and is a simple upgrade. - Mark

Re: Advice: solrCloud + DIH

2013-03-14 Thread Mark Miller
On Mar 14, 2013, at 9:22 AM, roySolr wrote: > Hello, > > When i run this it goes with 3 doc/s(Really > slow). When i run solr alone(not solrcloud) it goes 600 docs/sec. > > What's the best way to do a full re-index with solrcloud? Does solrcloud > support DIH? > > Thanks > SolrCloud suppo

Re: OutOfMemoryError

2013-03-14 Thread Arkadi Colson
On 03/14/2013 03:11 PM, Toke Eskildsen wrote: On Thu, 2013-03-14 at 13:10 +0100, Arkadi Colson wrote: When I shutdown tomcat free -m and top keeps telling me the same values. Almost no free memory... Any idea? Are you reading top & free right? It is standard behaviour for most modern operatin

Replication

2013-03-14 Thread Arkadi Colson
Based on what does solr replicate the whole shard again from zero? From time to time after a restart of tomcat solr copies over the whole shard to the replicator instead of doing only the changes. BR, Arkadi

Question about email search

2013-03-14 Thread Jorge Luis Betancourt Gonzalez
I'm using solr 3.6.2 to crawl some data using nutch, in my schema I've one field with all the content extracted from the page, which could possibly include email addresses, this is the configuration of my schema:

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread richardg
I believe this is the same issue as described, I'm running 4.2 and as you can see my slave is a couple versions ahead of the master (all three slaves show the same behavior). This was never the case until I upgraded from 4.0 to 4.2. Master: 1363272681951 93 1,022.31 MB Slave: 1363273274085 95

Re: Question about email search

2013-03-14 Thread Ahmet Arslan
Hi, Since you have word delimiter filter in your analysis chain, I am not sure if e-mail addresses are recognised. You can check that on solr admin UI, analysis page. If e-mail addresses are kept as one token, I would use a leading wildcard query. &q=*@gmail.com There was a similar question recently:

Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
Hi We have been using Solr 4.0 for a while now and wanted to upgrade to 4.2. But our application stopped working. When we tried 4.1 it was working as expected. Here is a description of the situation. We deploy a Solr web application under java 7 on a Glassfish 3.1.2.2 server. We added some clas

Re: Solr 4.1 monitoring with /solr/replication?command=details - indexVersion?

2013-03-14 Thread Mark Miller
What calls are you using to get the versions? Or is it the admin UI? Also can you add any details about your setup - if this is a problem, we need to duplicate it in one of our unit tests. Also, is it affecting proper replication in any way that you can tell. - Mark On Mar 14, 2013, at 11:12 A

Re: Strange error in Solr 4.2

2013-03-14 Thread Mark Miller
Perhaps as a result of https://issues.apache.org/jira/browse/SOLR-4451 ? Just a guess. The root cause looks to be: > Caused by: java.io.IOException: Keystore was tampered with, or password was > incorrect - Mark On Mar 14, 2013, at 11:24 AM, Uwe Klosa wrote: > Hi > > We have been using Sol

need general advice on how others version and manage core deployments over time

2013-03-14 Thread geeky2
hello everyone, i know this is a general topic - but would really appreciate info from others that are doing this now. - how are others managing this so that users are impacted the least - how are others handling the scenario where users don't want to migrate forward. thx mark -- View

Re: Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
Thanks, but nobody has tampered with keystores. I have tested the application on different machines. Always the same exception is thrown. Do we have to set some system property to fix this? /Uwe On 14 March 2013 16:36, Mark Miller wrote: > Perhaps as a result of https://issues.apache.org/ji

Handling a closed IndexWriter in SOLR 4.0

2013-03-14 Thread Danzig, Scott
Hey all, We're using a Solr 4 core to handle our article data. When someone in our CMS publishes an article, we have a listener that indexes it straight to solr. We use the previously instantiated HttpSolrServer, build the solr document, add it with server.add(doc) .. then do a server.commit(

Re: Strange error in Solr 4.2

2013-03-14 Thread Uwe Klosa
I found the answer myself. Thanks for the pointer. Cheers Uwe On 14 March 2013 16:48, Uwe Klosa wrote: > Thanks, but nobody has tampered with keystores. I have tested the > application on different machines. Always the same exception is thrown. > > Do we have to set some system property to fix

Re: Replication

2013-03-14 Thread Timothy Potter
Hi Arkadi, If the update delta between the shard leader and replica is >100 docs, then Solr punts and replicates the entire index. Last I heard, the 100 was hard-coded in 4.0 so is not configurable. This makes sense because the replica shouldn't be out-of-sync with the leader unless it has been offline
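The rule described in this reply can be written down as a small decision function. This is an illustration of the behaviour as the message states it, not Solr's own code; the names `FULL_REPLICATION_THRESHOLD` and `recovery_mode` are hypothetical, and the value 100 is the one quoted in the message.

```python
# Threshold quoted in the message; reportedly hard-coded in Solr 4.0.
FULL_REPLICATION_THRESHOLD = 100


def recovery_mode(missed_updates, threshold=FULL_REPLICATION_THRESHOLD):
    """Return which recovery path the described rule would pick.

    More than `threshold` missed updates: copy the whole index.
    Otherwise: catch up by applying only the missing updates.
    """
    if missed_updates > threshold:
        return "full-replication"
    return "delta-sync"


if __name__ == "__main__":
    for missed in (5, 100, 101, 5000):
        print(missed, recovery_mode(missed))
```

Under this rule, a replica that was offline while the leader kept receiving documents will usually fall into the full-copy branch after a restart.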

Out of Memory doing a query Solr 4.2

2013-03-14 Thread raulgrande83
Hi After doing a query to Solr to get the uniqueIds (string of 20 characters) of 700 documents in a collection, I'm getting an out of memory error using Solr 4.2. I tried to increase the JVM-Memory 1G (from 3G to 4G) however this didn't change anything. This was working on 3.5. I've moved from

ids request to shard with star query are slow

2013-03-14 Thread srinir
ids request to shard with star query are slow I have a distributed solr environment and I am investigating all the request where the shard took significant amount of time. One common pattern i saw was all the ids request with q=*:* and ids= took around 2-3sec. i picked some shard request q=xyz an

Re: Strange error in Solr 4.2

2013-03-14 Thread Stefan Matheis
On Thursday, March 14, 2013 at 4:57 PM, Uwe Klosa wrote: > I found the answer myself. Thanks for the pointer. Would you mind sharing your answer, Uwe?

Re: Out of Memory doing a query Solr 4.2

2013-03-14 Thread Robert Muir
On Thu, Mar 14, 2013 at 12:07 PM, raulgrande83 wrote: > JVM: IBM J9 VM(1.6.0.2.4) I don't recommend using this JVM.

Re: Strange error in Solr 4.2

2013-03-14 Thread Shawn Heisey
On 3/14/2013 9:24 AM, Uwe Klosa wrote: This exception occurs in this part new ConcurrentUpdateSolrServer("http://solr.diva-portal.org:8080/search", 5, 50) Side comment, unrelated to your question: If you're already aware that ConcurrentUpdateSolrServer has no built-in error handling and you

Re: Strange error in Solr 4.2

2013-03-14 Thread Mark Miller
On Mar 14, 2013, at 1:27 PM, Shawn Heisey wrote: > I have been told that it is possible to override the handleError method to > fix this I'd say mitigate more than fix. I think the real fix requires some dev work. - Mark

Re: OutOfMemoryError

2013-03-14 Thread Shawn Heisey
On 3/14/2013 3:35 AM, Arkadi Colson wrote: Hi I'm getting this error after a few hours of filling solr with documents. Tomcat is running with -Xms1024m -Xmx4096m. Total memory of host is 12GB. Softcommits are done every second and hard commits every minute. Any idea why this is happening and how

Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Michael Della Bitta
Hi everyone, Is there an official definition of the "Current" flag under Core > Home > Statistics? What would it mean if a shard leader is not "Current"? Thanks, Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-627

Solr 4.2 mechanism proxy request error

2013-03-14 Thread yriveiro
Hi, I think that in solr 4.2 the new feature to proxy a request if the collection is not in the requested node has a bug. If I do a query with the parameter rows=0 and the node doesn't have the collection. If the parameter is rows=4 or superior then the search works as expected the curl return

Re: Solr 4.2 mechanism proxy request error

2013-03-14 Thread Mark Miller
I'll add a test with rows = 0 and see how easy it is to replicate. Looks to me like you should file a JIRA issue in any case. - Mark On Mar 14, 2013, at 2:04 PM, yriveiro wrote: > Hi, > > I think that in solr 4.2 the new feature to proxy a request if the > collection is not in the requested

Re: Solr 4.2 mechanism proxy request error

2013-03-14 Thread yriveiro
The log of the UI null:org.apache.solr.common.SolrException: Error trying to proxy request for url: http://192.168.20.47:8983/solr/ST-3A856BBCA3_12/select I will open the issue in Jira. Thanks - Best regards -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-2-me

Re: Version conflict during data import from another Solr instance into clean Solr

2013-03-14 Thread Chris Hostetter
: It looks strange to me that if there is no document yet (foundVersion < 0) : then the only case when document will be imported is when input version is : negative. Guess I need to test specific cases using SolrJ or smth. to be sure. you're assuming that if foundVersion < 0 that means no documen

Re: Question about email search

2013-03-14 Thread Jorge Luis Betancourt Gonzalez
Sorry for the duplicated mail :-(, any advice on a configuration for searching emails in a field that does not have only email addresses, so the email addresses are contained in larger textual messages? - Original message - From: "Ahmet Arslan" To: solr-user@lucene.apache.org Sent:

Searching across multiple collections (cores)

2013-03-14 Thread kfdroid
I've been looking all over for a clear answer to this question and can't seem to find one. It seems like a very basic concept to me though so maybe I'm using the wrong terminology. I want to be able to search across multiple collections (as it is now called in SolrCloud world, previously called Co

Re: Searching across multiple collections (cores)

2013-03-14 Thread Mark Miller
Yes, with SolrCloud, it's just the collection param (as long as the schemas are compatible for this): http://wiki.apache.org/solr/SolrCloud#Distributed_Requests - Mark On Mar 14, 2013, at 2:55 PM, kfdroid wrote: > I've been looking all over for a clear answer to this question and can't seem >

Re: Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Stefan Matheis
Hey Michael I was a bit confused because you mentioned SolrCloud in the subject. We're talking about http://host:port/solr/#/collection1 (f.e.) right? And there, the left-upper Box "Statistics" ? If so, the Output comes from /solr/collection1/admin/luke ( http://svn.apache.org/viewvc/lucene/de

Re: Searching across multiple collections (cores)

2013-03-14 Thread kfdroid
I'm assuming from that link I would use the following: /Query all shards of multiple compatible collections, explicitly specified:/ http://localhost:8983/solr/collection1/select?collection=collection1_NY,collection1_NJ,collection1_CT where collection1_NY, NJ and CT could be books, movies, music i
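The multi-collection request quoted in this message can be built from a list of collection names. A minimal sketch using the URL and collection names from the message; `multi_collection_url` is a hypothetical helper name, and the `q=*:*` default is an assumption added so that the request carries a query.

```python
from urllib.parse import urlencode


def multi_collection_url(base, collections, q="*:*"):
    """Build a select URL that searches several compatible collections.

    The collection names are joined with commas in the `collection`
    parameter, as in the URL quoted in the message above.
    """
    params = {"q": q, "collection": ",".join(collections)}
    # Keep commas literal so the URL stays readable.
    return f"{base}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    print(multi_collection_url(
        "http://localhost:8983/solr/collection1/select",
        ["collection1_NY", "collection1_NJ", "collection1_CT"],
    ))
```

As the earlier reply in this thread notes, this only works as long as the schemas of the collections are compatible.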

Re: Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Michael Della Bitta
Stefan, Thanks a lot! Makes sense. So I don't have to worry about my leader thinking it's out of date, then. Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game On

Re: Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Stefan Matheis
Perhaps the wording of "Current" is a bit too generic in that context? I'd like to change that description if that clarifies things .. but not sure which one is a better fit? On Thursday, March 14, 2013 at 8:26 PM, Michael Della Bitta wrote: > Stefan, > > Thanks a lot! Makes sense. So I don

Re: Meaning of "Current" in Solr Cloud Statistics

2013-03-14 Thread Mark Miller
Something like 'Reader is Current' might be better. Personally, I don't even know if it's worth showing. - Mark On Mar 14, 2013, at 3:40 PM, Stefan Matheis wrote: > Perhaps the wording of "Current" is a bit too generic in that context? I'd > like to change that description if that clarifies t

Solr indexing binary files

2013-03-14 Thread Luis
Hi, I am new with Solr and I am extracting metadata from binary files through URLs stored in my database. I would like to know what fields are available for indexing from PDFs (the ones that would be initiated as in column=””). For example how would I extract something like file size, format or f

Re: Solr indexing binary files

2013-03-14 Thread Jack Krupansky
Take a look at Solr Cell: http://wiki.apache.org/solr/ExtractingRequestHandler Include a dynamicField with a "*" pattern and you will see the wide variety of metadata that is available for PDF and other rich document formats. -- Jack Krupansky -Original Message- From: Luis Sent: Th

Re: Question about email search

2013-03-14 Thread Alexandre Rafalovitch
Sure. copyField it into a new indexed non-stored field with the following type definition: Content of filter_email.txt is (including <> signs): You will have the emails only left as tokens. Can't display them easily, but can certainly search. Regards,

Re: Handling a closed IndexWriter in SOLR 4.0

2013-03-14 Thread Otis Gospodnetic
Hi Scott, Not sure why IW would be closed, but: * consider not (hard) committing after each doc, but just periodically, every N minutes * soft committing instead * using 4.2 Otis -- Solr & ElasticSearch Support http://sematext.com/ On Thu, Mar 14, 2013 at 11:55 AM, Danzig, Scott wrote: > He
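The first suggestion in this reply, committing periodically instead of after every document, can be sketched on the client side as a small timer. The class name `PeriodicCommitter` and its methods are hypothetical, and the actual commit call is left to the caller, since the message does not show the client code in full.

```python
class PeriodicCommitter:
    """Decide when a hard commit is due, based on elapsed time.

    Timestamps are passed in explicitly (seconds), so the logic can be
    exercised without waiting on a real clock.
    """

    def __init__(self, interval_seconds, now):
        self.interval = interval_seconds
        self.last_commit = now

    def should_commit(self, now):
        """Return True at most once per interval and reset the timer."""
        if now - self.last_commit >= self.interval:
            self.last_commit = now
            return True
        return False


if __name__ == "__main__":
    committer = PeriodicCommitter(interval_seconds=600, now=0)
    for t in (10, 300, 600, 601, 1200):
        print(t, committer.should_commit(t))
```

A listener that indexes each published article would add the document as before and call the commit only when `should_commit` returns True.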

Re: Blog Post: Integration Testing SOLR Index with Maven

2013-03-14 Thread Lance Norskog
Wow! That's great. And it's a lot of work, especially getting it all keyboard-complete. Thank you. On 03/14/2013 01:29 AM, Chantal Ackermann wrote: Hi all, this is not a question. I just wanted to announce that I've written a blog post on how to set up Maven for packaging and automatic testi

Re: Advice: solrCloud + DIH

2013-03-14 Thread rulinma
3 docs/s is low. In my test with 4 nodes I get more than 1000 docs/s at 4k/doc with SolrCloud. Every leader has a replica. I am tuning to improve to 3000 docs/s. 3 docs/s is too slow. 3x! -- View this message in context: http://lucene.472066.n3.nabble.com/Advice-solrCloud-DIH-tp4047339p4047559.html Sent fr

Re: Embedded Solr

2013-03-14 Thread rulinma
Here is something for you to test embedded Solr: import java.io.File; import java.io.IOException; import java.net.MalformedURLException; import java.util.ArrayList; import java.util.Collection; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.core.SimpleAnalyzer; import org.apache.lucene

Re: discovery-based core enumeration with embedded solr

2013-03-14 Thread Erick Erickson
H, could you raise a JIRA and assign it to me? Please be sure and emphasize that it's embedded because I'm pretty sure this is fine for the regular case. But I have to admit that the embedded case completely slipped under the radar. Even better if you could make a test case, but that might no

Re: Can we manipulate termfreq to count as 1 for multiple matches?

2013-03-14 Thread Felipe Lahti
Hi! Take a look at http://wiki.apache.org/solr/SchemaXml#Common_field_options parameter "omitTermFreqAndPositions", or you can use a custom similarity class that overrides the term freq and returns one for only that field. http://wiki.apache.org/solr/SchemaXml#Similarity B

SOLR Num Docs vs NumFound

2013-03-14 Thread Nathan Findley
On my solr 4 setup a query returns a higher "NumFound" value during a *:* query than the "Num Docs" value reported on the statistics page of collection1. Why is that? My data is split across 3 data import handlers where each handler has the same type of data but the ids are guaranteed to be dif

Re: Solr Replication

2013-03-14 Thread vicky desai
Hi, I have a multi-core setup and there are continuous updates going on in each core. Hence I don't prefer a backup, as it would either cause downtime or, if there is write activity during the backup, my backup will be corrupted. Can you please suggest if there is a cleaner way to handle this -- V

Re: Solr 4.1/4.2 - SolrException: Error opening new searcher - With JUnit test class

2013-03-14 Thread mark12345
I wrote a simple test to reproduce a very similar stack trace to the above issue, with only some line-number differences. Any ideas as to why the following happens? Any help would be much appreciated. * The test case: > @Test > public void documentCommitAndRollbackTest() throws Exc