Re: SolrCloud constantly crashes after upgrading to Solr 4.7

2014-03-07 Thread Martin de Vries
The memory leak seems to be in: org.apache.solr.handler.component.ShardFieldSortedHitQueue I think our issue might be related to this one, because this change has been introduced in 4.7 and has changes to ShardFieldSortedHitQueue: https://issues.apache.org/jira/browse/SOLR-5354 Is the memor

Re: Date Range Query taking more time.

2014-03-07 Thread Erick Erickson
OK, something is not right here. What are your autocommit settings? What you pasted above looks like you're looking at a searcher that has _just_ opened, which would mean either 1> you just had a hard commit with openSearcher=false happen or 2> you just had a soft commit happen In either case, the

Re: Indexing huge data

2014-03-07 Thread Erick Erickson
Kranti and Susheel's approaches are certainly reasonable assuming I bet right :). Another strategy is to rack together N indexing programs that simultaneously feed Solr. In any of these scenarios, the end goal is to get Solr using up all the CPU cycles it can, _assuming_ that Solr isn't the bottle

Re: Partial Counts in SOLR

2014-03-07 Thread Chris Hostetter
: Reason: In an index with millions of documents I don't want to know that a : certain query matched 1 million docs (of course it will take time to : calculate that). Why don't just stop looking for more results lets say : after it finds 100 docs? Possible?? but if you care about sorting, ie: you

Re: Solr Cores going down in Solrcloud 4.3.1

2014-03-07 Thread Veera Raghavan
I did more deep diving and found out the following exception while it tries to replicate. 135531514-ERROR - 2014-03-07 23:08:35.454; org.apache.solr.common.SolrException; SnapPull failed :org.apache.lucene.store.AlreadyClosedException: Already closed 135531665- at org.apache.solr.core.CachingDirec

Re: SolrCloud setup guidance

2014-03-07 Thread Furkan KAMACI
Hi; What's your performance expectation for qps (queries per second)? Thanks; Furkan KAMACI On 7 Mar 2014 at 08:50, "Priti Solanki" wrote: > Thanks Susheel, > > But this index will keep on growing that my worry So I always have to > increase the RAM . > > Can you suggest how many nodes one can

Slow Indexing - High JVM Old Gen utilization causing low throughput

2014-03-07 Thread Veera Raghavan
Hi While multiple indexing jobs are running [all are map reduce jobs hard committing at the end of every mapper], the Old gen utilization spikes a lot and CMS aggressively tries to keep the mem usage < 70% of the heap [I set this to trigger CMS when heap is 70%]. CPU utilization is hardly aroun

Re: Solr 4.7.0 - cursorMark question

2014-03-07 Thread Chris Hostetter
: Thank-you, that all sounds great. My assumption about documents being : missed was something like this: ... : In that situation D would always be missed, whether the cursorMark 'C or : greater' or 'greater than B' (I'm not sure which it is in practice), simply : because the cursorMark is

SOLR JOINS not working and not returning any data for simple query

2014-03-07 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi All, I am seeing strange behavior with the Solr server. All my joins suddenly stopped working after a restart. Individual collections return responses, but when I join the collection, I get zero documents. Let me know if anyone has had the same type of issue.

Re: Solr Cores going down in Solrcloud 4.3.1

2014-03-07 Thread Veera Raghavan
Forgot to attach the log during the recovery failed solr.log.129:1625677:ERROR - 2014-03-06 13:29:31.909; org.apache.solr.common.SolrException; Error while trying to recover:org.apache.solr.common.SolrException: Replication for recovery failed. solr.log.129-1625849- at org.apache.solr.cloud.Recove

Unable to get offsets using AtomicReader.termPositionsEnum(Term)

2014-03-07 Thread Jefferson French
We have an API on top of Lucene 4.6 that I'm trying to adapt to running under Solr 4.6. The problem is although I'm getting the correct offsets when the index is created by Lucene, the same method calls always return -1 when the index is created by Solr. In the latter case I can see the character o

Re: Solrj Backward Compatibility After 4.5.1

2014-03-07 Thread Shawn Heisey
On 3/7/2014 11:58 AM, Furkan KAMACI wrote: > Hi; > > I have a cluster as SolrCloud of 4.5.1 When I use a Solrj version greater > than 4.5.1 I get an error when deleting a document via CloudSolrServer of > Solrj. When I change the version to 4.5.1 as it works as usual. > > I know that I should us

Solr Cores going down in Solrcloud 4.3.1

2014-03-07 Thread Veera Raghavan
Hi there, I have a 6-node SolrCloud cluster with 50 collections. All collections are sharded across all 6 nodes. I am seeing weird behavior where both replicas for a shard go down or go to a "recovering" state and never come back (no specific correlation to writes or reads). I manu

Re: Configurable collectors for custom ranking

2014-03-07 Thread Peter Keegan
Hi Joel, Although I solved this issue with a custom CollectorFactory, I also have a solution that uses a PostFilter and an optional ValueSource. Could you take a look at SOLR-5831 and see if I've got this right? Thanks, Peter On Mon, Dec 23, 2013 at 6:37 PM, Joel Bernstein wrote: > Peter, >

Re: Solr Production Installation

2014-03-07 Thread Shawn Heisey
On 3/7/2014 11:54 AM, leevduhl wrote: > In prep for a production setup I have a few questions that I would like to > get some feedback on: > 1) By default everything is running from under the "example" folder which > does not make much sense for a production environment. So is it possible to > re-

Solrj Backward Compatibility After 4.5.1

2014-03-07 Thread Furkan KAMACI
Hi; I have a SolrCloud cluster running 4.5.1. When I use a Solrj version greater than 4.5.1, I get an error when deleting a document via Solrj's CloudSolrServer. When I change the version to 4.5.1, it works as usual. I know that I should use the same versions to avoid compatibility issues. However I

Solr Production Installation

2014-03-07 Thread leevduhl
We are currently running Solr 4.6.1 in a dev/testing environment running on top of CentOS 6.x 64bit w/12 gigs ram. Not being real familiar with Linux and Solr we basically just copied the Solr-4.6.1 folder/file structure right into the "var" folder of the server and launched the Solr engine with t

Re: Polygon search returning "InvalidShapeException: incompatible dimension (2)... error.

2014-03-07 Thread leevduhl
Problem resolved. Once we got JTS properly installed all was well. Lee

Re:Re: Re: Re: how to use "LukeRequestHandler" count top term more than once in each document? (v 4.6.1, 4.7.0)

2014-03-07 Thread cqlangyi
hi Ahmet, thank you for the reply. i'd give it a try with some sample docs tomorrow. thank you! Cq At 2014-03-08 00:52:01,"Ahmet Arslan" wrote: > >Hi, > >Looks like totaltermfreq (ttf) is equals to collection frequency. >Please see other relevancy functions : >http://wiki.apache.

Re: Re: Re: how to use "LukeRequestHandler" count top term more than once in each document? (v 4.6.1, 4.7.0)

2014-03-07 Thread Ahmet Arslan
Hi, Looks like totaltermfreq (ttf) is equal to collection frequency. Please see the other relevancy functions: http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions Ahmet On Friday, March 7, 2014 6:38 PM, cqlangyi wrote: hi Ahmet, thank you, quite clear!!! so now i could get 'df' vi

Re: SolrCloud constantly crashes after upgrading to Solr 4.7

2014-03-07 Thread Martin de Vries
> IndexSchema is using 62% of the memory That seems odd. Can you see what objects are taking all the RAM in the IndexSchema? We investigated this and found out that a dictionary was loaded for each core, taking loads of memory. We set the config to shareSchema=true. The memory usage decreas

Re:Re: Re: how to use "LukeRequestHandler" count top term more than once in each document? (v 4.6.1, 4.7.0)

2014-03-07 Thread cqlangyi
hi Ahmet, another followed question is: is there any methods to get 'cf' for each document? thanks a lot. Cq At 2014-03-08 00:06:50,"Ahmet Arslan" wrote: >Hi, > >You already gave examples, using your example documents : > >>1. "fox jump over the gray dog, fox gone" >>2. "fox is a kin

Re:Re: Re: how to use "LukeRequestHandler" count top term more than once in each document? (v 4.6.1, 4.7.0)

2014-03-07 Thread cqlangyi
hi Ahmet, thank you, quite clear!!! so now i could get 'df' via 'LukeRequestHandler', but how about 'cf', could i get it somehow? thanks! Cq At 2014-03-08 00:06:50,"Ahmet Arslan" wrote: >Hi, > >You already gave examples, using your example documents : > >>1. "fox jump over the

Re: Facets, termvectors, relevancy and Multi word tokenizing

2014-03-07 Thread Ahmet Arslan
Hi, Please optimize your index (you can do it in the core admin GUI) and see if the problem goes away. Ahmet On Friday, March 7, 2014 1:18 PM, epnRui wrote: Hi guys! I solved my problem on the client side but at least I solved it... Anyway, now I have another problem, which is related to the followi

Re: Re: how to use "LukeRequestHandler" count top term more than once in each document? (v 4.6.1, 4.7.0)

2014-03-07 Thread Ahmet Arslan
Hi, You already gave examples, using your example documents : >1. "fox jump over the gray dog, fox gone" >2. "fox is a kind of animal, dog also is" >3. "i like red fox" collection frequency : cf('fox') = 4 = tf('fox',d1) + tf('fox',d2) + tf('fox',d3) = 2 + 1 + 1 = 4 document frequency   : df('f
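As a rough illustration of the arithmetic above (a minimal sketch, not how Solr or Lucene computes these statistics internally), the following Python counts term frequency, collection frequency and document frequency for the three example documents. The tokenizer is an assumption: a plain lowercase word split, which is not the same as Solr's text_general analysis chain, and the names tokenize and term_stats are hypothetical.

```python
import re
from collections import Counter

# The three example documents from the thread.
docs = [
    "fox jump over the gray dog, fox gone",
    "fox is a kind of animal, dog also is",
    "i like red fox",
]


def tokenize(text):
    """Assumed tokenizer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def term_stats(documents, term):
    """Return (per-document tf list, collection frequency, document frequency)."""
    tfs = [Counter(tokenize(doc))[term] for doc in documents]
    cf = sum(tfs)                         # total occurrences across all documents
    df = sum(1 for tf in tfs if tf > 0)   # number of documents containing the term
    return tfs, cf, df


tfs, cf, df = term_stats(docs, "fox")
print(tfs, cf, df)  # [2, 1, 1] 4 3
```

For 'fox' this gives cf = 2 + 1 + 1 = 4, matching the sum in the message, while df counts each document only once.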

Re: SolrCloud constantly crashes after upgrading to Solr 4.7

2014-03-07 Thread Martin de Vries
We parsed the "Unreachable Objects" of the memory dump. The memory leak seems to be in: org.apache.solr.handler.component.ShardFieldSortedHitQueue https://www.dropbox.com/s/hdv49xlb4g4wi03/Screenshot%202014-03-07%2016.51.56.png Martin

Re:Re: how to use "LukeRequestHandler" count top term more than once in each document? (v 4.6.1, 4.7.0)

2014-03-07 Thread cqlangyi
hi Ahmet, thank you very much for the reply. i'm just a little bit confused about "collection frequency" & "document frequency", would you mind help me out with these 2 phrases? thank you! Cq At 2014-03-07 22:43:34,"Ahmet Arslan" wrote: >Hi, > >Thats collection frequency (cf). Ter

Re: hung threads and CLOSE_WAIT sockets

2014-03-07 Thread Mark Miller
On Mar 7, 2014, at 3:11 AM, Avishai Ish-Shalom wrote: > SOLR-5216 Yes, that is the one. - Mark http://about.me/markrmiller

Re: Implementing a customised tokenizer

2014-03-07 Thread Ahmet Arslan
Hi, After you delete your document, did you commit with expungeDeletes=true?  Also please see : https://people.apache.org/~hossman/#xyproblem Ahmet On Friday, March 7, 2014 1:16 PM, epnRui wrote: Hi iorixxx! Thanks for replying. I managed to get around well enough not to need a tokenizer cu

Re: how to use "LukeRequestHandler" count top term more than once in each document? (v 4.6.1, 4.7.0)

2014-03-07 Thread Ahmet Arslan
Hi, That's collection frequency (cf). TermsComponent could be modified to report cf instead of document frequency (df). Ahmet On Friday, March 7, 2014 10:49 AM, cqlangyi wrote: hi there, i have a question with following example. say i have only 3 documents indexed, 1. "fox jump over the g

Re: Partial Counts in SOLR

2014-03-07 Thread Salman Akram
I know about numFound. That's where the issue is. On a complex query that takes mins I think there would be a major chunk of that spent in calculating "numFound" whereas I don't need it. Let's say I just need first 100 docs and then want SOLR to STOP looking further to populate the "numFound". Le

Re: Facets, termvectors, relevancy and Multi word tokenizing

2014-03-07 Thread epnRui
Hi guys! I solved my problem on the client side but at least I solved it... Anyway, now I have another problem, which is related to the following: - I had previously used replace chars and replace patterns, charfilters and filters, at index time to replace "EP" by "European Parliament". At that

Re: Partial Counts in SOLR

2014-03-07 Thread Dmitry Kan
You limit the number of results by using the rows parameter. Your query, however, may hit more documents (reported in numFound of the response) than will be returned to you, as rows prescribes. On Fri, Mar 7, 2014 at 11:48 AM, Salman Akram < salman.ak...@northbaysolutions.net> wrote: > All,
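To make the distinction between rows and numFound concrete, here is a minimal sketch in Python. It only builds a select URL and inspects an already-parsed JSON response; the host, port, core name (collection1), query and the sample response dict are all assumptions for illustration, and the helper names build_select_url and summarize are hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, rows):
    """Build a /select URL that asks Solr for at most `rows` documents."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


def summarize(response):
    """Return (total hits reported by Solr, number of docs actually returned)."""
    body = response["response"]
    return body["numFound"], len(body["docs"])


# Hypothetical core URL and query.
url = build_select_url("http://localhost:8983/solr/collection1", "title:solr", 100)

# Made-up response in the shape of Solr's JSON output: the query matched
# one million documents, but only 100 come back because rows=100.
sample = {
    "response": {
        "numFound": 1000000,
        "start": 0,
        "docs": [{"id": str(i)} for i in range(100)],
    }
}
total_hits, returned = summarize(sample)
```

Note that rows only caps how many documents are returned; as the thread discusses, numFound still reports the full match count.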

Re: Implementing a customised tokenizer

2014-03-07 Thread epnRui
Hi iorixxx! Thanks for replying. I managed to get around well enough not to need a tokenizer customized implementation. That would be a pain in ... Anyway, now I have another problem, which is related to the following: - I had previously used replace chars and replace patterns, charfilters and

Re: Replication Problem from solr-3.6 to solr-4.0

2014-03-07 Thread yuegary
Hi, i am running into the exact same problem: 27534 [qtp989080272-12] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/replication params={command=details&_=1394164320017&wt=json} status=0 QTime=12 28906 [qtp989080272-12] INFO org.apache.solr.core.SolrCore – [collection1

Re: Partial Counts in SOLR

2014-03-07 Thread Gora Mohanty
On 7 March 2014 15:18, Salman Akram wrote: > All, > > Is it possible to get partial counts in SOLR? The idea is to get the count > but if its above a certain limit than just return that limit. > > Reason: In an index with millions of documents I don't want to know that a > certain query matched 1

Partial Counts in SOLR

2014-03-07 Thread Salman Akram
All, Is it possible to get partial counts in SOLR? The idea is to get the count, but if it's above a certain limit then just return that limit. Reason: In an index with millions of documents I don't want to know that a certain query matched 1 million docs (of course it will take time to calculate t

howto count total word amount of all documents in solr index?

2014-03-07 Thread cqlangyi
hi there, i have following questions, please help me out, very appreciate. say i have a field configured as "text_general" type, and indexed 3 pieces content as documents. 1. "today is a good day" 2. "call your family every day" 3. "come with me" how could i count the total (even roughly) wor

how to use "LukeRequestHandler" count top term more than once in each document? (v 4.6.1, 4.7.0)

2014-03-07 Thread cqlangyi
hi there, i have a question with the following example. say i have only 3 documents indexed, 1. "fox jump over the gray dog, fox gone" 2. "fox is a kind of animal, dog also is" 3. "i like red fox" with query "http://localhost/solr/admin/luke?fl=myfield&numTerms=5" solr give back the "top terms" a

howto: count total word amount of all documents in solr index

2014-03-07 Thread cqlangyi
hi there, i have following questions, please help me out, very appreciate. say i have a field configured as "text_general" type, and indexed 3 pieces content as documents. 1. "today is a good day" 2. "call your family every day" 3. "come with me" how could i count the total (even roughly) wor

Re: What is mean by Index Searcher?

2014-03-07 Thread Alexandre Rafalovitch
Some events close and reopen the searcher. Commit is the main one during lifetime of Solr server. So, you can read this "until commit". Of course, you have soft and hard commits with settings to reopen or not reopen the searcher, so you may want to read up on that if you are trying to understand th

Re: What is mean by Index Searcher?

2014-03-07 Thread search engn dev
Thanks Alex, But what is meant by "...lifetime of that searcher"? Is it the lifetime of a particular query, or something else? Sorry, but I am not able to understand this. :( -- View this message in context: http://lucene.472066.n3.nabble.com/What-is-mean-by-Index-Searcher-tp4121898p4121912.html

Re: hung threads and CLOSE_WAIT sockets

2014-03-07 Thread Avishai Ish-Shalom
SOLR-5216 ? On Fri, Mar 7, 2014 at 12:13 AM, Mark Miller wrote: > It sounds like the distributed update deadlock issue. > > It's fixed in 4.6.1 and 4.7. > > - Mark > > http://about.me/markrmiller > > On Mar 6, 2014, at 3:10 PM, Avishai Ish-Shalom > wrote: > > > Hi, > > > > We've had a strange