Problem in facet.contains

2015-07-02 Thread Pritam Kute
Hello All, I am a new user of Solr and am using a Solr 5.2.0 setup. I am trying to create multiple types of facets on the same field. I am filtering the facets by using "facet.contains". The following is the data in the field. roles : { "0/Student Name/", "1/Student Name/1000/", "0/Center Na

Re: Distributed queries hang in a non-SolrCloud environment, Solr 4.10.4

2015-07-02 Thread Shalin Shekhar Mangar
On Fri, Jul 3, 2015 at 1:06 AM, Ronald Wood wrote: > > > I had initially suspect that distributed searches combined with faceting > might be part of the issue, since I had seen some long-running threads that > seemed to spend a long time in the FastLRUCache when getting facets for a > single fi

Re: optimize status

2015-07-02 Thread Summer Shire
Upayavira: I am using solr 4.7 and yes I am using TieredMergePolicy Erick: All my boxes have SSD’s and there isn’t a big disparity between qTime and response time. The performance hit on my end is because of the fragmented index files causing more disk seeks as you mentioned. And I tried reques

Re: heatmaps

2015-07-02 Thread Joseph Obernberger
Hi - perhaps you do not have enough geospatial data in your index to generate a larger image? Try setting the facet.heatmap.gridLevel to something higher like 4. I've run queries like: q=insert whatever here&wt=json&indent=true&facet=true&facet.heatmap=geo&facet.heatmap.gridLevel=4&facet.heat
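
For readers who want to reproduce the kind of request described above, the following is a minimal sketch that assembles the query string from the parameters visible in the message (q, wt, indent, facet, facet.heatmap, facet.heatmap.gridLevel). The function name heatmap_params and the example query "*:*" are assumptions made for illustration; the field name "geo" and grid level 4 come from the message. The rest of the original query is cut off in this listing and is not reconstructed here.

```python
from urllib.parse import urlencode


def heatmap_params(query, field="geo", grid_level=4):
    """Build a URL-encoded parameter string for a heatmap facet request.

    Only parameters that appear in the quoted message are included.
    """
    params = {
        "q": query,
        "wt": "json",
        "indent": "true",
        "facet": "true",
        "facet.heatmap": field,                 # spatial field to facet on
        "facet.heatmap.gridLevel": grid_level,  # higher value, finer grid
    }
    return urlencode(params)


if __name__ == "__main__":
    # "*:*" is a placeholder query; substitute your own.
    print(heatmap_params("*:*"))
```

The resulting string would be appended to the select handler URL of whichever core or collection holds the geospatial field.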

Re: Distributed queries hang in a non-SolrCloud environment, Solr 4.10.4

2015-07-02 Thread Matthew Dickinson
unsubscribe On 2 July 2015 at 20:36, Ronald Wood wrote: > > We are running into an issue when doing distributed queries on Solr > 4.10.4. We do not use SolrCloud but instead keep track of shards that need > to be searched based on date ranges. > > We have been running distributed queries without

Re: Distributed queries hang in a non-SolrCloud environment, Solr 4.10.4

2015-07-02 Thread Chris Hostetter
: Thanks I’ll try that. Is the Thread Dump view in the Solr Admin panel not reliable for diagnosing thread hangs? If the JVM is totally hung, you might not be able to connect to solr to even ask it to generate the thread dump itself -- but jstack may still be able to. -Hoss http://www.lucidw

Re: Distributed queries hang in a non-SolrCloud environment, Solr 4.10.4

2015-07-02 Thread Ronald Wood
Thanks I’ll try that. Is the Thread Dump view in the Solr Admin panel not reliable for diagnosing thread hangs? On a different note, I am considering introducing a dedicated aggregator to avoid using a shard both for search and aggregation, in case there is an issue there. Ronald S. Wood | Sen

RE: Distributed queries hang in a non-SolrCloud environment, Solr 4.10.4

2015-07-02 Thread Ryan, Michael F. (LNG-DAY)
Try running jstack on the aggregator - that will show you where the threads are hanging. -Michael -Original Message- From: Ronald Wood [mailto:rw...@smarsh.com] Sent: Thursday, July 02, 2015 3:37 PM To: solr-user@lucene.apache.org Subject: Distributed queries hang in a non-SolrCloud env

Distributed queries hang in a non-SolrCloud environment, Solr 4.10.4

2015-07-02 Thread Ronald Wood
We are running into an issue when doing distributed queries on Solr 4.10.4. We do not use SolrCloud but instead keep track of shards that need to be searched based on date ranges. We have been running distributed queries without incident for several years now, but we only recently upgraded to

Re: Problem XY - X = SolrCloud 4.8 replicas down, Y = SolrCloud upgrade to a new version

2015-07-02 Thread Erick Erickson
1.5M docs in an hour isn't near the rates I saw that trigger the LIR problem, so I strongly doubt that's the issue, never mind ;) On Thu, Jul 2, 2015 at 1:47 PM, Vincenzo D'Amore wrote: > We are trying to send documents as fast as we can, we wrote a multi-thread > Solrj application that read from

Re: Problem XY - X = SolrCloud 4.8 replicas down, Y = SolrCloud upgrade to a new version

2015-07-02 Thread Vincenzo D'Amore
We are trying to send documents as fast as we can, we wrote a multi-thread Solrj application that read from file, solr, or rdbms and update a collection. But if we have too much threads during the day servers become unresponsive. Now, in the night, with a low number of search, we reindex the entire

Re: accent insensitive field-type

2015-07-02 Thread Steve Rowe
See https://issues.apache.org/jira/browse/SOLR-7749 > On Jul 2, 2015, at 8:31 AM, Steve Rowe wrote: > > Hi Søren, > > “charFilter” should be “charFilters”, and “filter” should be “filters”; and > both their values should be arrays - try this: > > { > "add-field-type": { >"name":"myTxtFie

Re: Problem XY - X = SolrCloud 4.8 replicas down, Y = SolrCloud upgrade to a new version

2015-07-02 Thread Erick Erickson
bq: and we do a full update of all documents during the night. How fast are you sending documents? Prior to Solr 5.2 the replicas would do twice the amount of work for indexing that the leader did (odd, but...) See: http://lucidworks.com/blog/indexing-performance-solr-5-2-now-twice-fast/ Still

Re: Suggester duplicating values

2015-07-02 Thread Rafael
Absolutely! Thanks man. []'s Rafael On Thu, Jul 2, 2015 at 12:42 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote: > That is what I was saying :) > Hope it helps > > 2015-07-02 16:32 GMT+01:00 Rafael : > > > Just double checking: > > > > In my ruby backend I ask for (using the given

Re: Problem XY - X = SolrCloud 4.8 replicas down, Y = SolrCloud upgrade to a new version

2015-07-02 Thread Vincenzo D'Amore
Hi Erick, thanks for your answer. We use java 8 and allocate a 16GB heap size -Xms2g -Xmx16g There are 1.5M docs and about 16 GB index size on disk. Let me also say, during the day we have a lot of little update, from 1k to 50k docs every time, and we do a full update of all documents during

Re: DocValues: Which format is better Default or Memory?

2015-07-02 Thread Aman Tandon
So should I use Memory format? With Regards Aman Tandon On Thu, Jul 2, 2015 at 9:20 PM, Toke Eskildsen wrote: > Alessandro Benedetti wrote: > > DocValues is a strategy to store on the disk ( or in memory) the > > Un-inverted index for the field of interests. > > True. > > > This has been done

Re: AND for multiple faceted queries

2015-07-02 Thread Erick Erickson
What have you done to try to track this down? What proof do you have that the intersection of all those sets is indeed empty? Have you tried the fq clauses one at a time? If my guess is correct, you'll see the last two returning all documents. This certainly isn't the way fq's work, and if it were

Re: Location of config files in Zoo Keeper

2015-07-02 Thread Erick Erickson
_Why_ do you want to access the raw file? Because in the "normal" case you shouldn't have to care a whit about where ZK keeps the files. The normal pattern for changing these files is to use the upconfig/downconfig zkcli commands to replace the configs wholesale. For production situations, it's o

Re: DocValues: Which format is better Default or Memory?

2015-07-02 Thread Toke Eskildsen
Alessandro Benedetti wrote: > DocValues is a strategy to store on the disk ( or in memory) the > Un-inverted index for the field of interests. True. > This has been done to SPEED UP the faceting calculus using the "fc" > algorithm, and improve the memory usage. Part of the reason was to speed u

Re: How to do a Data sharding for data in a database table

2015-07-02 Thread Erick Erickson
bq: Does Solr automatically loads search index into memory after the index is built? No. That's what the autowarm counts on your queryResultCache and filterCache are intended to facilitate. Also after every commit, a newSearcher event is fired and any warmup queries you have configured in the n

AND for multiple faceted queries

2015-07-02 Thread Aki Balogh
I'm trying to specify multiple fq and get the intersection: (lines separated for readability) query? q=webCrawlId:36& fq=(body:"crib bedding" OR title:"crib bedding")& fq={!frange l=0 u=0}termfreq(body,"crib bedding")& fq={!frange l=0 u=0}termfreq(title,"crib bedding")& rows=25000& tv=false& start

Re: Suggester duplicating values

2015-07-02 Thread Alessandro Benedetti
That is what I was saying :) Hope it helps 2015-07-02 16:32 GMT+01:00 Rafael : > Just double checking: > > In my ruby backend I ask for (using the given example) all suggested terms > that starts with "J." , then I (probably) add all the terms to a Set, and > then return the Set to the view. Righ

Re: DocValues: Which format is better Default or Memory?

2015-07-02 Thread Erick Erickson
How are you testing? I'd do a couple of things: 1> turn off your queryResultCache (set its size to 0). 2> run multiple queries through something like jmeter 3> ensure you've run enough warmup queries to load all your fields into memory. Basically, if this were always the case, I'd expect a _lo

Re: DocValues: Which format is better Default or Memory?

2015-07-02 Thread Aman Tandon
Anything wrong? With Regards Aman Tandon On Thu, Jul 2, 2015 at 4:19 PM, Aman Tandon wrote: > Hi, > > I tried to query the without and with docValues, the query with docValues > was taking more time. Does it may be due to IO got involved as some data > will be in some file. > > Are you sure any

Re: Problem XY - X = SolrCloud 4.8 replicas down, Y = SolrCloud upgrade to a new version

2015-07-02 Thread Erick Erickson
Vincenzo: First and foremost, figure out why you're having 20 second GC pauses. For indexes like you're describing, this is unusual. How big is the heap you allocate to the JVM? Check your Zookeeper timeout. In earlier versions of SolrCloud it defaulted to 15 seconds. Going into leader election w

Re: Suggester duplicating values

2015-07-02 Thread Rafael
Just double checking: In my ruby backend I ask for (using the given example) all suggested terms that starts with "J." , then I (probably) add all the terms to a Set, and then return the Set to the view. Right ? []'s Rafael On Thu, Jul 2, 2015 at 12:12 PM, Alessandro Benedetti < benedetti.ale...

Re: accent insensitive field-type

2015-07-02 Thread Steve Rowe
Hi Søren, “charFilter” should be “charFilters”, and “filter” should be “filters”; and both their values should be arrays - try this: { "add-field-type": { "name":"myTxtField", "class":"solr.TextField", "positionIncrementGap":"100", "analyzer": { "charFilters": [ {"class":

Re: Suggester duplicating values

2015-07-02 Thread Alessandro Benedetti
No, I was referring to the fact that a Suggester as a unit of information manages simple terms which are identified simply by themselves. What you need to do is to use some Ruby data structure that prevents duplicates from being inserted, and then offer the suggestions from there. Cheers 2015-07-0
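
The client-side de-duplication suggested here can be sketched as follows. This is an illustration in Python rather than the Ruby backend discussed in the thread, and the function name dedupe_suggestions and the sample terms are hypothetical. It assumes the suggester has already returned a plain list of term strings and that the original ranking order should be kept.

```python
def dedupe_suggestions(terms):
    """Return the suggested terms with duplicates removed.

    The first occurrence of each term is kept, so the ranking
    order returned by the suggester is preserved.
    """
    seen = set()
    unique = []
    for term in terms:
        if term not in seen:
            seen.add(term)
            unique.append(term)
    return unique


if __name__ == "__main__":
    # Hypothetical suggester output containing a repeated term.
    print(dedupe_suggestions(["J. Smith", "J. Doe", "J. Smith"]))
```

A plain set would also remove duplicates, but it would not preserve the order in which the suggestions were returned, which usually matters for display.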

Re: accent insensitive field-type

2015-07-02 Thread Shawn Heisey
On 7/2/2015 8:53 AM, Søren wrote: > I'm trying the add the ICUFoldingFilterFactory in the analyzer. > I suspect that my problem is that the filter class doesn't load. > The managed-schema file is the same info as when looking in the schema > browser in the web gui. The ICU analysis components are

Re: Location of config files in Zoo Keeper

2015-07-02 Thread Shawn Heisey
On 7/2/2015 2:31 AM, dinesh naik wrote: > For solr version 5.1.0, Where does Zoo keeper keep all the config files > ?How do we access them ? > > From Admin console , Cloud-->Tree-->config , we are able to see them but > where does Zoo Keeper store them(location)? The information you can see in Cl

Re: accent insensitive field-type

2015-07-02 Thread Søren
Thanks Ahmet I'm trying the add the ICUFoldingFilterFactory in the analyzer. I suspect that my problem is that the filter class doesn't load. The managed-schema file is the same info as when looking in the schema browser in the web gui. Cheers On 02-07-2015 10:47, Ahmet Arslan wrote: Hi Sore

Re: Suggester duplicating values

2015-07-02 Thread Rafael
Thanks, Alessandro! Well, I'm using Ruby and the r-solr as a client library. I didn't get what you said about term id. Do I have to create this field ? Or is it a "hidden field" utilized by solr under the hood ? []'s Rafael On Thu, Jul 2, 2015 at 6:41 AM, Alessandro Benedetti < benedetti.ale...@

Re: how to

2015-07-02 Thread Jack Krupansky
Use a fractional boost for the test term, and make test optional: +iphone test^0.5 -- Jack Krupansky On Wed, Jul 1, 2015 at 9:51 PM, rulinma wrote: > search "iphone" > > > but I don't want "iphone test" content is the first record, I want minus > "test" weights , how to do this. > > thanks. > >
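
As a rough illustration of the query shape suggested above, the sketch below builds the string "+iphone test^0.5" and URL-encodes it as a q parameter. The helper names build_boosted_query and build_select_params, and the rows parameter value, are assumptions for this example; only the query syntax itself comes from the reply.

```python
from urllib.parse import urlencode


def build_boosted_query(required_term, optional_term, boost=0.5):
    """Require one term and attach a fractional boost to an optional one."""
    # '+' marks the term as required; '^' attaches the boost factor.
    return f"+{required_term} {optional_term}^{boost}"


def build_select_params(q, rows=10):
    """URL-encode the query so that '+' and '^' survive transport."""
    return urlencode({"q": q, "rows": rows})


if __name__ == "__main__":
    q = build_boosted_query("iphone", "test")
    print(q)
    print(build_select_params(q))
```

Encoding matters here because a literal "+" in a URL query string is decoded as a space, which would silently drop the required-term marker.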

Re: How to do a Data sharding for data in a database table

2015-07-02 Thread wwang525
Hi, I worked with other search solutions before, and cache management is important in boosting performance. Apart from the cache generated due to user's requests, loading the search index into memory is the very initial step after the index is built. This is to ensure search results to be retrieve

Re: Suggester configuration queries.

2015-07-02 Thread ssharma7...@gmail.com
Erick, We actually have a working version of Solr 4.6 Spellchecker, the configuration details are as mentioned below: *Solr 4.6 - schema.xml*

Re: DocValues: Which format is better Default or Memory?

2015-07-02 Thread Aman Tandon
Hi, I tried to query the without and with docValues, the query with docValues was taking more time. Does it may be due to IO got involved as some data will be in some file. Are you sure anything else could affect your times ? Yes I am sure. We re-indexed the whole index of 40 Million records, t

RE: language identification during solrj indexing

2015-07-02 Thread Markus Jelsma
https://wiki.apache.org/solr/LanguageDetection -Original message- > From:Alessandro Benedetti > Sent: Thursday 2nd July 2015 11:06 > To: solr-user@lucene.apache.org > Subject: Re: language identification during solrj indexing > > SolrJ is simply a java client to access Solr REST API.

Problem XY - X = SolrCloud 4.8 replicas down, Y = SolrCloud upgrade to a new version

2015-07-02 Thread Vincenzo D'Amore
Hi All, In the latest months my SolrCloud clusters, sometimes (one/two times a week), have few replicas down. Usually all the replicas goes down on the same node. I'm unable to understand why a 3 nodes cluster with 8 core/32 GB and high performance disks have this problem. The main index is small,

Re: DocValues: Which format is better Default or Memory?

2015-07-02 Thread Alessandro Benedetti
So first of all, DocValues is a strategy to store on the disk ( or in memory) the Un-inverted index for the field of interests. This has been done to SPEED UP the faceting calculus using the "fc" algorithm, and improve the memory usage. It is really weird that this is the cause of a degrading of pe

Re: Suggester duplicating values

2015-07-02 Thread Alessandro Benedetti
Hi Rafael, Your problem is clear and it has actually been explored few times in the past. I agree with you in a first instance. A Suggester basic unit of information is a term. Not a document. This means that actually it does not make a lot of sense to return duplicates terms ( because they are co

Re: Bug: replies mixed up with concurrent requests from the same host

2015-07-02 Thread Kevin Perros
Thanks for the answers, I also found that blog post about such issues: http://techbytes.anuragkapur.com/2014/08/potential-jetty-concurrency-bug-seen-in.html On 01/07/15 20:26, Chris Hostetter wrote: : Hmm, interesting. That particular bug was fixed by upgrading to Jetty : 4.1.7 in https://issue

Re: how to

2015-07-02 Thread Alessandro Benedetti
Your request is very cryptic, actually I really discourage this kind of requests … Give always at least basic information for : 1) Environment ( Solr you are using ? Architecture ?) 2) Domain ( Problem you are trying to solve ? Data model ? ) 3) Specific problem with a generic and detailed descript

Re: language identification during solrj indexing

2015-07-02 Thread Alessandro Benedetti
SolrJ is simply a java client to access Solr REST API. This means that " indexing through SolrJ" doesn't exist. You simply need to add the proper chain to the update request handler you are using. Taking a look to the code , by Default SolrJ UpdateRequest refers to the "/update" endpoint. Have you

Re: accent insensitive field-type

2015-07-02 Thread Ahmet Arslan
Hi Soren, I am not familiar with managed schema part, but there are built-in filters for this task. ASCIIFoldingFilter and ICUFoldingFilter are two examples. Also solr provides two files: mapping-FoldToASCII.txt and mapping-ISOLatin1Accent.txt to be used with MappingCharFilter as you did. Yo

Location of config files in Zoo Keeper

2015-07-02 Thread dinesh naik
Hi all, For Solr version 5.1.0, where does ZooKeeper keep all the config files? How do we access them? From the Admin console, Cloud-->Tree-->config, we are able to see them, but where does ZooKeeper store them (location)? -- Best Regards, Dinesh Naik

accent insensitive field-type

2015-07-02 Thread Søren
Hi Solr users I'm new to Solr and I need to be able to search in structured data in a case and accent insensitive manner. E.g. find "Crème brûlée", both when querying with "Crème brûlée" and "creme brulee". It seems that none of the built-in text types support this, or am I wrong? So I try to

language identification during solrj indexing

2015-07-02 Thread vineet yadav
Hi, I want to identify language identification during solrj indexing. I have made configuration changes required for language identification on the basis of solr wiki( https://cwiki.apache.org/confluence/display/solr/Detecting+Languages+During+Indexing ). language detection update chain is workin