Replication Issue with Repeater Please help

2014-08-12 Thread waqas sarwar
Hi, I'm using Solr and need a little assistance. I am a bit stuck with Solr replication; before discussing the issue, let me give a brief description. Scenario: I want to set up Solr in a distributed architecture, starting with the smallest number of nodes (say 3). How can I replic

Re: matching "starts with" only

2014-08-12 Thread zameer
On Solr 3.6, searching with the query "black\ cat*" (as you mentioned in your post) returns no results. If I query "black*\ cat*" instead, it returns results such as: black forest cat, black cat, black color cat. But I need only results of this type, i.e. black cat, black cat is beaut

Re: explaination of query processing in SOLR

2014-08-12 Thread abhi Abhishek
Thanks Alex and Jack for the direction. Actually, what I was trying to understand was how the various files affect the search. Thanks, Abhishek On Fri, Aug 8, 2014 at 6:35 PM, Alexandre Rafalovitch wrote: > Abhishek, > > Your first part of the question is interesting, but your specific >

Re: Can I use multiple cores

2014-08-12 Thread Ramprasad Padmanabhan
And how many machines running the SOLR ? On 12 August 2014 22:12, Noble Paul wrote: > The machines were 32GB ram boxes. You must do the RAM requirement > And how many machines running the SOLR ? I expect that I will have to add more servers. What I am looking for is how do I calculate how m

Re: what's the difference between solr and elasticsearch in hdfs case?

2014-08-12 Thread Jianyi
Thanks Erick. I will try. -- View this message in context: http://lucene.472066.n3.nabble.com/what-s-the-difference-between-solr-and-elasticsearch-in-hdfs-case-tp4152413p4152626.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: ICUTokenizer acting very strangely with oriental characters

2014-08-12 Thread Steve Rowe
In the table below, the "IsSameS?" (is same script) and "SBreak?" (script break = not IsSameS) decisions are based on what I mentioned in my previous message, and the "WBreak?" (word break) decision is based on UAX#29 word break rules: Char   Code Point   Script   IsSameS?   SBreak?   WBreak?

Re: ICUTokenizer acting very strangely with oriental characters

2014-08-12 Thread Shawn Heisey
On 8/12/2014 6:29 PM, Steve Rowe wrote: > Shawn, > > ICUTokenizer is operating as designed here. > > The key to understanding this is > o.a.l.analysis.icu.segmentation.ScriptIterator.isSameScript(), called from > ScriptIterator.next() with the scripts of two consecutive characters; these > m

Re: ICUTokenizer acting very strangely with oriental characters

2014-08-12 Thread Steve Rowe
Shawn, ICUTokenizer is operating as designed here. The key to understanding this is o.a.l.analysis.icu.segmentation.ScriptIterator.isSameScript(), called from ScriptIterator.next() with the scripts of two consecutive characters; these methods together find script boundaries. Here’s ScriptIt

Re: ICUTokenizer acting very strangely with oriental characters

2014-08-12 Thread Rik Tamm-Daniels
mmn jnbbbjb)n9nooon Sent from my HTC - Reply message - From: "Shawn Heisey" To: "solr-user@lucene.apache.org" Subject: ICUTokenizer acting very strangely with oriental characters Date: Tue, Aug 12, 2014 19:00 See the original message on this thread for full details. Some addi

Solr query involving Street Addresses

2014-08-12 Thread Guph
I'm very new to Solr, and could use a point in the right direction on a task I've been assigned. I have a database containing customer information (phone number, email address, credit card, billing address, shipping address, etc.). I need to be able to take user-entered data, and use it to search

Re: ICUTokenizer acting very strangely with oriental characters

2014-08-12 Thread Shawn Heisey
See the original message on this thread for full details. Some additional information: This happens on version 4.6.1, 4.7.2, and 4.9.0. Here is a screenshot showing the analysis problem in more detail. The first line you can see is the ICUTokenizer. https://www.dropbox.com/s/9wbi7lz77ivya9j/IC

Re: Access request

2014-08-12 Thread Shawn Heisey
On 8/12/2014 3:29 PM, Vitaliy Zhovtiuk wrote: Please provide me access. User id vzhovtyuk My email vzhovt...@gmail.com Wiki user 'Vitaliy Zhovtyuk' Wiki username added to the Solr wiki contributors group. You didn't indicate exactly what kind of access you wanted, but that's the only kind of a

Re: SolrCloud OOM Problem

2014-08-12 Thread Shawn Heisey
On 8/12/2014 3:12 PM, tuxedomoon wrote: I have modified my instances to m2.4xlarge 64-bit with 68.4G memory. Hate to ask this but can you recommend Java memory and GC settings for 90G data and the above memory? Currently I have CATALINA_OPTS="${CATALINA_OPTS} -XX:NewSize=1536m -XX:MaxNewSize=1

ICUTokenizer acting very strangely with oriental characters

2014-08-12 Thread Shawn Heisey
The field value is this: 20世紀の100人;ポートレートアーカイブス;政治家・軍人;政治家・指導 者・軍人;[政 治],100peopeof20century,pploftwentycentury,pploftwentycentury The problem: We can't match this field with a search for 100peopeof20century. The analysis shows that there are three terms indexed at the critical point by ICUT

Access request

2014-08-12 Thread Vitaliy Zhovtiuk
Hello, Please provide me access. User id vzhovtyuk My email vzhovt...@gmail.com Wiki user 'Vitaliy Zhovtyuk'

Re: SolrCloud OOM Problem

2014-08-12 Thread tuxedomoon
I have modified my instances to m2.4xlarge 64-bit with 68.4G memory. Hate to ask this but can you recommend Java memory and GC settings for 90G data and the above memory? Currently I have CATALINA_OPTS="${CATALINA_OPTS} -XX:NewSize=1536m -XX:MaxNewSize=1536m -Xms5120m -Xmx5120m -XX:+UseParNewGC

RE: Updates to index not available immediately as index scales, even with autoSoftCommit at 1 second

2014-08-12 Thread Matt Kuiper (Springblox)
Based on your solrconfig.xml settings for the filter and queryResult caches, I believe Chris's initial guess is correct. After a commit, a lot of time is likely spent warming these caches because of the very high autowarm counts. I suggest you try setting the autowarmCount v

Re: what's the difference between solr and elasticsearch in hdfs case?

2014-08-12 Thread Erick Erickson
I just pinged "someone who really knows this stuff" and the reply is that he's copied the index from HDFS to a local file system in order to inspect it with Luke, which means the bits on disk are identical and may freely be copied back and forth. So I'd say go for it. Erick On Tue, Aug 12, 2014

Re: Updates to index not available immediately as index scales, even with autoSoftCommit at 1 second

2014-08-12 Thread cwhit
Immediately after triggering the update, this is what is in the logs: /2014-08-12 12:58:48,774 | [71] | 153414367 [qtp2038499066-4772] INFO org.apache.solr.update.processor.LogUpdateProcessor – [collection1] webapp=/solr path=/update params={wt=json} {add=[52627624 (1476251068652322816)]} 0 34

Re: Can I use multiple cores

2014-08-12 Thread Noble Paul
The machines were 32GB RAM boxes. You must do the RAM requirement calculation for your indexes. The number of indexes alone won't be enough to arrive at the RAM requirement. On Tue, Aug 12, 2014 at 6:59 PM, Ramprasad Padmanabhan < ramprasad...@gmail.com> wrote: > On 12 August 2014 18:18, Noble

Re: Updates to index not available immediately as index scales, even with autoSoftCommit at 1 second

2014-08-12 Thread Chris Hostetter
: I'm not seeing any messages in the log with respect to cache warming at the : time, but I will investigate that possibility. Thank you. In case it is what logs *do* you see at the time you send the doc? w/o details, we can't help you. : helpful, I pasted the entire solrconfig.xml at http:/

Re: Updates to index not available immediately as index scales, even with autoSoftCommit at 1 second

2014-08-12 Thread cwhit
I'm not seeing any messages in the log with respect to cache warming at the time, but I will investigate that possibility. Thank you. In case it is helpful, I pasted the entire solrconfig.xml at http://pastebin.com/C0iQ7E9a

Re: Updates to index not available immediately as index scales, even with autoSoftCommit at 1 second

2014-08-12 Thread Chris Hostetter
You haven't given us a lot of information to go on (i.e. full solrconfig.xml, log messages around the time of your update, etc.), but my best guess would be that you are seeing a delay between the time the new searcher is opened and the time the newSearcher is made available to requests due to

Re: Modifying date format when using TrieDateField.

2014-08-12 Thread Erick Erickson
The response will always be the full specification, so you'll have yyyy-MM-dd'T'HH:mm:ss format. If you want the user to just see the yyyy-MM-dd you could use a DocTransformer to change it on the way out. You cannot change the way the dates are stored internally. The DateTransformer is just there
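As a client-side alternative to a DocTransformer, the date can also be shortened after the response arrives. A minimal sketch in Python, assuming the field comes back as an ISO 8601 UTC string such as 2014-08-12T00:00:00Z; the function name to_day_only is hypothetical and not part of Solr:

```python
from datetime import datetime

def to_day_only(solr_date: str) -> str:
    """Trim a full date string from a Solr response to its year-month-day part."""
    # Assumed input shape: 2014-08-12T00:00:00Z, optionally with fractional seconds.
    core = solr_date.rstrip("Z").split(".")[0]
    parsed = datetime.strptime(core, "%Y-%m-%dT%H:%M:%S")
    return parsed.strftime("%Y-%m-%d")

print(to_day_only("2014-08-12T00:00:00Z"))  # 2014-08-12
```

This only changes what the user sees; the stored value and the format Solr returns stay the same.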

RE: Can I use multiple cores

2014-08-12 Thread Toke Eskildsen
Ramprasad Padmanabhan [ramprasad...@gmail.com] wrote: > I have a single machine 16GB Ram with 16 cpu cores Ah! I thought you had more machines, each with 16 Solr cores. This changes a lot. 400 Solr cores of ~200MB ~= 80GB of data. You're aiming for 7 times that, so about 500GB of data. Running t
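The sizing estimate in this message can be reproduced as a quick calculation. A sketch in Python that uses only the figures quoted in the thread (400 cores, roughly 200MB each, growth of about 7 times); the variable names are illustrative:

```python
cores = 400
mb_per_core = 200      # approximate disk size per Solr core, as stated in the thread
growth_factor = 7      # factor quoted in the message

current_gb = cores * mb_per_core / 1000
target_gb = current_gb * growth_factor

print(current_gb, target_gb)  # 80.0 560.0
```

The result of 560GB is in line with the "about 500GB" rounded figure in the message.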

Re: Help Required

2014-08-12 Thread Shawn Heisey
On 8/12/2014 3:57 AM, Dmitry Kan wrote: > Hi, > > is http://wiki.apache.org/solr/Support page immutable? All pages on that wiki are changeable by end users. You just need to create an account on the wiki and then ask on this list to have your wiki username added to the Contributor group. Thanks,

Updates to index not available immediately as index scales, even with autoSoftCommit at 1 second

2014-08-12 Thread cwhit
I've been trying to debug through this but I'm stumped. I have a Solr index with ~40 million documents indexed currently sitting idle. I update an existing document through the web interface (collection1 -> Documents -> /update) and the web request returns successfully. At this point, I expect t

Re: Can I use multiple cores

2014-08-12 Thread Ramprasad Padmanabhan
On 12 August 2014 18:18, Noble Paul wrote: > Hi Ramprasad, > > > I have used it in a cluster with millions of users (1 user per core) in > legacy cloud mode. We used the on demand core loading feature where each > Solr had 30,000 cores and at a time only 2000 cores were in memory. You are > just

RE: When I use minimum match and maxCollationTries parameters together in edismax, Solr gets stuck

2014-08-12 Thread Dyer, James
Harun, What do you mean by the "terminal console"? Do you mean to say the admin gui freezes but you can still issue queries to solr directly through your browser? James Dyer Ingram Content Group (615) 213-4311 -Original Message- From: Harun Reşit Zafer [mailto:harun.za...@tubitak.gov.

Re: Can I use multiple cores

2014-08-12 Thread Aurélien MAZOYER
Hi Paul and Ramprasad, I follow your discussion with interest as I will have more or less the same requirement. When you say that you use on demand core loading, are you talking about LotsOfCore stuff? Erick told me that it does not work very well in a distributed environment. How do you han

Re: Can I use multiple cores

2014-08-12 Thread Noble Paul
Hi Ramprasad, I have used it in a cluster with millions of users (1 user per core) in legacy cloud mode. We used the on-demand core loading feature, where each Solr had 30,000 cores and only 2000 cores were in memory at a time. You are just hitting 400 and I don't see much of a problem. What is y
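The idea of keeping only a bounded number of cores in memory can be illustrated with a small least-recently-used cache. This is a conceptual sketch in Python, not Solr's implementation; the class name CoreCache, the loader callback and the eviction policy are all assumptions made for illustration:

```python
from collections import OrderedDict

class CoreCache:
    """Keeps at most max_loaded cores in memory, evicting the least recently used."""

    def __init__(self, max_loaded, loader):
        self.max_loaded = max_loaded
        self.loader = loader          # hypothetical callable: core name -> loaded core
        self.loaded = OrderedDict()   # core name -> loaded core, oldest first

    def get(self, name):
        if name in self.loaded:
            self.loaded.move_to_end(name)    # mark as most recently used
            return self.loaded[name]
        core = self.loader(name)             # load on demand
        self.loaded[name] = core
        if len(self.loaded) > self.max_loaded:
            self.loaded.popitem(last=False)  # unload the least recently used core
        return core

cache = CoreCache(max_loaded=2, loader=lambda name: f"core:{name}")
for user in ["u1", "u2", "u1", "u3"]:
    cache.get(user)
print(list(cache.loaded))  # ['u1', 'u3']
```

With a limit of 2000 and 30,000 known cores, the same pattern would keep only the most recently queried users' cores loaded, at the cost of a load delay on the first request for an evicted core.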

Re: Modifying date format when using TrieDateField.

2014-08-12 Thread Modassar Ather
Hi Jack, Thanks for your suggestion. I think the way I am using the ParseDateFieldUpdateProcessorFactory is not right, hence the date is not getting transformed to the desired format. I added the following in solrconfig.xml and see no effect in the search result. The date is still in "yyyy-MM-dd'T'HH:mm:ss

Re: Can I use multiple cores

2014-08-12 Thread Toke Eskildsen
On Tue, 2014-08-12 at 14:14 +0200, Ramprasad Padmanabhan wrote: > Sorry for missing information. My solr-cores take less than 200MB of > disk So ~3GB/server. If you do not have special heavy queries, high query rate or heavy requirements for index availability, that really sounds like you could p

Re: Can I use multiple cores

2014-08-12 Thread Ramprasad Padmanabhan
Sorry for the missing information. My Solr cores take less than 200MB of disk. What I am worried about is that if I run too many cores from a single Solr machine, there will be a limit to the number of concurrent searches it can support. I am still benchmarking for this. Also another major bottleneck I fin

Re: Can I use multiple cores

2014-08-12 Thread Toke Eskildsen
On Tue, 2014-08-12 at 11:50 +0200, Ramprasad Padmanabhan wrote: > Are there documented benchmarks with number of cores > As of now I just have a test bed. > > > We have 150 million records ( will go up to 1000 M ) , distributed in 400 > cores. > A single machine 16GB RAM + 16 cores search is w

Re: Modifying date format when using TrieDateField.

2014-08-12 Thread Jack Krupansky
Use the parse date update request processor: http://lucene.apache.org/solr/4_9_0/solr-core/org/apache/solr/update/processor/ParseDateFieldUpdateProcessorFactory.html Additional examples are in my e-book: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook

Modifying date format when using TrieDateField.

2014-08-12 Thread Modassar Ather
Hi, I have a TrieDateField where I want to store a date in "yyyy-MM-dd" format, as my source contains the date in the same format. As I understand it, TrieDateField stores dates in "yyyy-MM-dd'T'HH:mm:ss" format, hence the date is getting formatted to the same. Kindly let me know: How can I change the

Re: Help Required

2014-08-12 Thread Dmitry Kan
Hi, is http://wiki.apache.org/solr/Support page immutable? Dmitry On Fri, Aug 8, 2014 at 4:24 PM, Jack Krupansky wrote: > And the Solr Support list is where people register their available > consulting services: > http://wiki.apache.org/solr/Support > > -- Jack Krupansky > > -Original Mes

Re: Can I use multiple cores

2014-08-12 Thread Ramprasad Padmanabhan
Are there documented benchmarks for the number of cores? As of now I just have a test bed. We have 150 million records (will go up to 1000 M), distributed in 400 cores. A single machine with 16GB RAM + 16 cores search is working "fine". But I am still not sure whether this will work fine in production. Obvio

Re: SolrCloud OOM Problem

2014-08-12 Thread Toke Eskildsen
On Tue, 2014-08-12 at 01:27 +0200, dancoleman wrote: > My SolrCloud of 3 shard / 3 replicas is having a lot of OOM errors. Here are > some specs on my setup: > > hosts: all are EC2 m1.large with 250G data volumes Is that 3 (each running a primary and a replica shard) or 6 instances? > documents

Re: When I use minimum match and maxCollationTries parameters together in edismax, Solr gets stuck

2014-08-12 Thread Harun Reşit Zafer
I tried again to make sure. The server starts and I can see the web admin GUI, but I can't navigate between tabs. It just says "loading". But on the terminal console everything seems normal. Harun Reşit Zafer TÜBİTAK BİLGEM BTE Bulut Bilişim ve Büyük Veri Analiz Sistemleri Bölümü T +90 262 675 3268 W http://w

Re: what's the difference between solr and elasticsearch in hdfs case?

2014-08-12 Thread Jianyi
Hi Alex, Thanks for your reply. I'm comparing Solr vs. ElasticSearch. Does Solr store the index on HDFS in raw Lucene format? I mean, if so, we can get the index files from HDFS and put them directly into an application based on Lucene. It seems that ElasticSearch does not store the raw l

Re: Can I use multiple cores

2014-08-12 Thread Harshvardhan Ojha
I think this question is aimed more at the design and performance of a large number of cores. Solr is designed to handle multiple cores effectively; however, it would be interesting to know if you have observed any performance problem with a growing number of cores, with number of nodes and solr versio

Re: Can I use multiple cores

2014-08-12 Thread Toke Eskildsen
On Tue, 2014-08-12 at 08:40 +0200, Ramprasad Padmanabhan wrote: > I need to store in SOLR all data of my clients mailing activitiy > > The data contains meta data like From;To:Date;Time:Subject etc > > I would easily have 1000 Million records every 2 months. If standard searches are always insid

Re: Can I use multiple cores

2014-08-12 Thread Anshum Gupta
Hi Ramprasad, You can certainly have a system with hundreds of cores. I know of more than a few people who have done that successfully in their setups. At the same time, I'd also recommend that you have a look at SolrCloud. SolrCloud takes away the operational pains like replication/recovery etc