Re: Searchquery on field that contains space

2014-01-09 Thread Ahmet Arslan
Hi Peter, Here are two different ways to do it. 1) Use phrase query q=yourField:"new y" with the following type. 2) Use prefix query q={!prefix f=yourField}new y with the following type: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-PrefixQueryParser
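
For illustration only, here is a minimal Python sketch of how the two query forms mentioned in this message could be URL-encoded for a request. The endpoint URL and core name are assumptions, yourField is the placeholder name from the message, and the helper function names are made up for this example. The code only builds URLs; it does not contact Solr.

```python
from urllib.parse import urlencode

# Hypothetical select endpoint; host and core name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def phrase_query_url(field, text):
    """Build a select URL for a phrase query such as yourField:"new y"."""
    # Escape backslashes and double quotes so the phrase stays intact.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return SOLR_SELECT + "?" + urlencode({"q": f'{field}:"{escaped}"'})


def prefix_query_url(field, prefix):
    """Build a select URL using the prefix parser: {!prefix f=field}prefix."""
    return SOLR_SELECT + "?" + urlencode({"q": f"{{!prefix f={field}}}{prefix}"})


if __name__ == "__main__":
    print(phrase_query_url("yourField", "new y"))
    print(prefix_query_url("yourField", "new y"))
```

urlencode takes care of the space, braces and quotes, so the raw query text can be written exactly as it appears in the message.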

Copying Index

2014-01-09 Thread anand chandak
Hi, I am testing the replication feature of Solr 4.x with a large index. Unfortunately, the index that we had was in 3.x format, so I copied the index file and ran the upgrade index utility to convert it to 4.x format. The utility did what it is supposed to do and I got a 4.x index (verified it with

Re: Searchquery on field that contains space

2014-01-09 Thread Alexandre Rafalovitch
On Thu, Jan 9, 2014 at 11:34 PM, PeterKerk wrote: > Basically a user starts typing the first letters of a city and I want to > return citynames that start with those letters, case-insensitive and not > splitting the cityname on separate words (whether the separator is a > whitespace or a "-"). >

Re: Index size - to determine storage

2014-01-09 Thread Alexandre Rafalovitch
Try running the PDF through standalone Tika and see what comes back. That's the size of the input. It will usually be quite a small proportion of the PDF size. Possibly down to metadata only and no text, if your PDF does not include a text layer. Then, it depends on your storing and indexing options, your tokeni

Re: Solr Cloud Query Scaling

2014-01-09 Thread Joel Bernstein
You do need to load balance the initial query request across the SolrCloud nodes. SolrJ's CloudSolrServer and LBHttpSolrServer can perform the load balancing for you in the client. Or you can use a hardware load balancer. Joel Bernstein Search Engineer at Heliosearch On Thu, Jan 9, 2014 at 5:58
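
As a rough illustration of spreading the initial request across nodes on the client side, here is a small Python sketch of plain round-robin selection. It is not how SolrJ's CloudSolrServer or LBHttpSolrServer are implemented; the class name, node URLs and handler path are assumptions for this example, and it does no health checking or failover.

```python
from itertools import cycle


class RoundRobinNodes:
    """Hand out node base URLs in rotation (illustrative only)."""

    def __init__(self, base_urls):
        urls = list(base_urls)
        if not urls:
            raise ValueError("at least one node URL is required")
        self._nodes = cycle(urls)

    def next_url(self, handler="/select"):
        # Each call moves on to the next node, wrapping around at the end.
        return next(self._nodes) + handler


if __name__ == "__main__":
    # Hypothetical node addresses.
    nodes = RoundRobinNodes([
        "http://m1:8983/solr/collection1",
        "http://m2:8983/solr/collection1",
        "http://m3:8983/solr/collection1",
    ])
    for _ in range(4):
        print(nodes.next_url())
```

A real setup would also need to skip nodes that are down, which is one reason to prefer an existing client or a dedicated load balancer over hand-rolled rotation.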

Re: need help on OpenNLP with Solr

2014-01-09 Thread Lance Norskog
There is no way to do these things with LUCENE-2899. On Mon, Jan 6, 2014 at 8:07 AM, rashi gandhi wrote: > Hi, > > > > I have applied OpenNLP (LUCENE 2899.patch) patch to SOLR-4.5.1 for nlp > searching and it is working fine. > > Also I have designed an analyzer for this: > > positionIncrementG

Re: Solr Cloud Query Scaling

2014-01-09 Thread Shawn Heisey
On 1/9/2014 4:09 PM, Garth Grimm wrote: As a follow-up question on this One would want to use some kind of load balancing 'above' the SolrCloud installation for search queries, correct? To ensure that the initial requests would get distributed evenly to all nodes? If you don't have that,

RE: Solr Cloud Query Scaling

2014-01-09 Thread Garth Grimm
As a follow-up question on this One would want to use some kind of load balancing 'above' the SolrCloud installation for search queries, correct? To ensure that the initial requests would get distributed evenly to all nodes? If you don't have that, and send all requests to M2S2 (IRT OP), i

Re: Invalid version (expected 2, but 60) or the data in not in 'javabin' format exception while deleting 30k records

2014-01-09 Thread gpssolr2020
Thanks. We will try with more heap. We also noticed that ZooKeeper (OpenJDK) and Solr (Sun JDK) are using different JVMs. Will this really cause this OOM issue? -- View this message in context: http://lucene.472066.n3.nabble.com/Invalid-version-expected-2-but-60-or-the-data-in-not-in-javabin-for

Re: Range queries with Grouping is slow?

2014-01-09 Thread Kranti Parisa
Thank you, will take a look at it. Thanks, Kranti K. Parisa http://www.linkedin.com/in/krantiparisa On Thu, Jan 9, 2014 at 10:25 AM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > Hello, > > Here is workaround for caching separate clauses in OR filters. > http://blog.griddynamics.com/

Re: Index size - to determine storage

2014-01-09 Thread Michael Della Bitta
Hi Amit, It really boils down to how much of that 100kb is actually text, and how you analyze and store the text. Meaning, it's really hard for us to say. You're probably going to need to experiment to figure out what the storage needs for your use case are. Michael Della Bitta Applications Deve

Index size - to determine storage

2014-01-09 Thread Amit Jha
Hi, I would like to know: if I index a file, i.e. a PDF of 100 KB, what would be the size of the index? What factors should be considered to determine the disk size? Rgds AJ

Return only distinct combinations of 2 field values

2014-01-09 Thread PeterKerk
I'm searching on cities and returning city and province, some cities exist in different provinces, which is ok. However, I have some duplicates, meaning 2 cities occur in the same province. In that case I only want to return 1 result. I therefore need to have a distinct and unique city+province com

RE: Solr Cloud Query Scaling

2014-01-09 Thread Tim Potter
Absolutely adding replicas helps you scale query load. Queries do not need to be routed to leaders; they can be handled by any replica in a shard. Leaders are only needed for handling update requests. In general, a distributed query has two phases, driven by a controller node (what you called c

Solr Cloud Query Scaling

2014-01-09 Thread Sir Gilligan
Question: Does adding replicas help with query load? Scenario: 3 Physical Machines. 3 Shards Query any machine, get results. Standard Solr Cloud stuff. Update Scenario: 6 Physical Machines. 3 Shards. M = Machine, S = Shard, -L = Leader M1S1-L M2S2 M3S3 M4S1 M5S2-L M6S3-L Incoming Query to M2S2.

Re: Searchquery on field that contains space

2014-01-09 Thread PeterKerk
Hi Ahmet, Thanks. Also for that link, although it's too advanced for my use case. I see that by using KeywordTokenizerFactory it almost works now, but when I search on: "new y", no results are found, but when I search on "new", I do get "New York". So the space in the search query is still caus

Re: solr increase number of digits that tint fields can store

2014-01-09 Thread Hakim Benoudjit
Thanks, that's the response I was searching for. And, I have confirmed that I need to reindex my data because tlong isn't compatible with tint. 2014/1/9 Chris Hostetter > > A TrieIntField field can never contain a value greater than Java's > Integer.MAX_VALUE -- it doesn't matter what settings yo

Re: Searchquery on field that contains space

2014-01-09 Thread Ahmet Arslan
Hi Peter, Use KeywordTokenizerFactory instead of the Whitespace tokenizer. Also you might be interested in this: http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ Ahmet On Thursday, January 9, 2014 6:35 PM, PeterKerk wrote: Basically a user starts typing the first letters

Re: Solr 4.6.0: DocValues (distributed search)

2014-01-09 Thread ku3ia
Today I set up a simple SolrCloud with two shards. Seems the same. When I'm debugging a distributed search I can't hit a breakpoint in the Lucene codec file, but when I'm using faceted search everything looks fine - the debugger stops. Can anyone help me with my question? Thanks. -- View this message

Re: Zookeeper as Service

2014-01-09 Thread Peter Keegan
There's also: http://www.tanukisoftware.com/ On Thu, Jan 9, 2014 at 11:18 AM, Nazik Huq wrote: > > > From your email I gather your main concern is starting zookeeper on server > startups. > > You may want to look at these non-native service oriented options too: > Create a script( cmd or bat)

Re: solr increase number of digits that tint fields can store

2014-01-09 Thread Chris Hostetter
A TrieIntField field can never contain a value greater than Java's Integer.MAX_VALUE -- it doesn't matter what settings you use. If you want to store larger values, you need to use a TrieLongField and re-index. https://lucene.apache.org/solr/4_6_0/solr-core/org/apache/solr/schema/TrieIntField.
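
To make the limit concrete: Java's Integer.MAX_VALUE is 2^31 - 1 = 2147483647. The Python sketch below checks which values fall outside the signed 32-bit range before indexing; the function names and sample prices are made up for this example, and it only mirrors the range rule, not anything inside Solr.

```python
# Bounds of a Java int (signed 32-bit).
INT_MAX = 2**31 - 1   # Integer.MAX_VALUE = 2147483647
INT_MIN = -(2**31)    # Integer.MIN_VALUE = -2147483648


def fits_in_int(value):
    """Return True if value is representable as a Java int."""
    return INT_MIN <= value <= INT_MAX


def values_needing_long(values):
    """Return the values that would require a long-typed field."""
    return [v for v in values if not fits_in_int(v)]


if __name__ == "__main__":
    prices = [1999, 2147483647, 2147483648, 5000000000]  # sample data
    print(values_needing_long(prices))
```

If the returned list is non-empty, the field needs a long-based type and the data has to be re-indexed, as the message says.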

Re: Searchquery on field that contains space

2014-01-09 Thread PeterKerk
Basically a user starts typing the first letters of a city and I want to return citynames that start with those letters, case-insensitive and not splitting the cityname on separate words (whether the separator is a whitespace or a "-"). But although the search of a user is case-insensitive, I want

Re: Checking for similar text (duplicates)

2014-01-09 Thread Mikhail Khludnev
On Thu, Jan 9, 2014 at 5:39 PM, Cristian Bichis wrote: > Hi Mikhail, > > I seen deduplication part as well but I have some concerns: > > 1. Is deduplication supposed to work as well into a check-only (not try to > actually add new record to index) request ? So if I just check to see if > "could b

Re: Zookeeper as Service

2014-01-09 Thread Nazik Huq
From your email I gather your main concern is starting zookeeper on server startups. You may want to look at these non-native service oriented options too: Create a script( cmd or bat) to start ZK on server bootup. This method may not restart Zk if Zk crashes(not the server). Create C# commad

Re: How to boost documents ?

2014-01-09 Thread Anca Kopetz
Hi, I tested the BoostQueryParser and it works on the simplified example. But we need to keep the edismax Query parser, so I tried the following query and it seems to work (I defined a local bf='' for qq). &q=beautiful Christmas tree &mm=2 &qf=title^12 description^2 &defType=edismax &bf=map(que

Re: Searchquery on field that contains space

2014-01-09 Thread PeterKerk
@Ahmet: Thanks, but I also need to be able to search via wildcard and just found that a "-" might be resulting in unwanted results. E.g. when using this query: http://localhost:8983/solr/tt-cities/select/?indent=off&facet=false&fl=id,title,provincetitle_nl&q=title_search:nij*&defType=lucene&start

Re: Range queries with Grouping is slow?

2014-01-09 Thread Mikhail Khludnev
Hello, Here is a workaround for caching separate clauses in OR filters. http://blog.griddynamics.com/2014/01/segmented-filter-cache-in-solr.html No coding is required, just try to experiment with request parameters. On Wed, Jan 8, 2014 at 9:11 PM, Erick Erickson wrote: > Well, actually you can us

Re: Range queries with Grouping is slow?

2014-01-09 Thread Smiley, David W.
It won't hit the filter cache if you set the {! cache=false} local-param. On 1/8/14, 12:18 PM, "Kranti Parisa" wrote: >yes thats the key, these time ranges change frequently and hitting >filtercache then is a problem. I will try few more samples and probably >debug thru it. thanks. > > >Thanks, >Kra
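
A small Python sketch of what such a request could look like: each filter query is prefixed with the cache=false local-param before the parameters are URL-encoded. The field name timestamp, the range expression and the function names are assumptions for illustration; only the cache=false local-param itself comes from the message.

```python
from urllib.parse import urlencode


def uncached_filter(fq):
    """Prefix a filter query with the cache=false local-param."""
    return "{!cache=false}" + fq


def select_params(q, filters):
    """URL-encode q plus one uncached fq parameter per filter."""
    pairs = [("q", q)] + [("fq", uncached_filter(f)) for f in filters]
    return urlencode(pairs)


if __name__ == "__main__":
    # Hypothetical frequently changing time range on a made-up field.
    print(select_params("*:*", ["timestamp:[NOW-1HOUR TO NOW]"]))
```

This fits filters whose values change on almost every request, such as the time ranges described in the quoted message, since caching them would mostly fill the filter cache with entries that are never reused.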

Re: Zookeeper as Service

2014-01-09 Thread Charlie Hull
On 09/01/2014 09:44, Karthikeyan.Kannappan wrote: I am hosting in windows OS -- View this message in context: http://lucene.472066.n3.nabble.com/Zookeeper-as-Service-tp4110396p4110413.html Sent from the Solr - User mailing list archive at Nabble.com. There are various ways to 'servicify' (

solr increase number of digits that tint fields can store

2014-01-09 Thread Hakim Benoudjit
Hi, I have a price field of type tint, from which I will generate a range facet. And I have now some items in my index that exceed tint type limit (max integer). How do I increase tint max integer value? Here is tint definition in schema.xml: Do I have to increase precisionStep=8? because an e

Re: Checking for similar text (duplicates)

2014-01-09 Thread Cristian Bichis
Hi Mikhail, I have seen the deduplication part as well, but I have some concerns: 1. Is deduplication supposed to work as well in a check-only request (one that does not try to actually add a new record to the index)? So if I just check to see if there "could be" some duplicates of some text? 2. As far as I have seen, the dedupl

Re: solr text analysis showing a red bar error

2014-01-09 Thread Aruna Kumar Pamulapati
See if this helps: https://groups.google.com/forum/#!topic/lily-discuss/IaQLpNVJRi8 On Thu, Jan 9, 2014 at 8:33 AM, Umapathy S wrote: > I checked that before. I am using solr-4.6.0. maxFieldLength is not > applicable. > > > On 9 January 2014 13:23, Aruna Kumar Pamulapati >wrote: > > > If yo

Re: solr text analysis showing a red bar error

2014-01-09 Thread Umapathy S
I checked that before. I am using solr-4.6.0. maxFieldLength is not applicable. On 9 January 2014 13:23, Aruna Kumar Pamulapati wrote: > If you are using a Solr version before 4.0 you should look into. > > solrconfig.xml: > > 1 > > > What is your solr version? > > > > On Thu, Jan 9,

Re: Checking for similar text (duplicates)

2014-01-09 Thread Mikhail Khludnev
Hello Cristian, Have you seen http://wiki.apache.org/solr/Deduplication ? On Thu, Jan 9, 2014 at 5:01 PM, Cristian Bichis wrote: > Hi, > > I have one app where the search part is based currently on something else > than Solr. However, as the scale/demand and complexity grows I am looking > at

Re: solr text analysis showing a red bar error

2014-01-09 Thread Aruna Kumar Pamulapati
If you are using a Solr version before 4.0 you should look into. solrconfig.xml: 1 What is your solr version? On Thu, Jan 9, 2014 at 8:16 AM, Aruna Kumar Pamulapati < apamulap...@gmail.com> wrote: > Thanks, can you paste the text that you were trying to analyze? > > > On Thu, Jan

Re: solr text analysis showing a red bar error

2014-01-09 Thread Aruna Kumar Pamulapati
Thanks, can you paste the text that you were trying to analyze? On Thu, Jan 9, 2014 at 8:10 AM, Umapathy S wrote: > Thanks. > > Actually there is no error thrown. Just a red bar appears on top. > I have pasted it on http://snag.gy/U9IiJ.jpg > > > On 9 January 2014 12:56, Aruna Kumar Pamulapati

Re: solr text analysis showing a red bar error

2014-01-09 Thread Umapathy S
Thanks. Actually there is no error thrown. Just a red bar appears on top. I have pasted it on http://snag.gy/U9IiJ.jpg On 9 January 2014 12:56, Aruna Kumar Pamulapati wrote: > Can you copy paste the error, for some reason I can not see the image of > the screenshot you posted. > > > On Thu, Ja

Checking for similar text (duplicates)

2014-01-09 Thread Cristian Bichis
Hi, I have one app where the search part is based currently on something else than Solr. However, as the scale/demand and complexity grows I am looking at Solr for a potential better fit, including for some features currently implemented into scripting layer (so which are not on search curren

Re: solr text analysis showing a red bar error

2014-01-09 Thread Aruna Kumar Pamulapati
Can you copy paste the error, for some reason I can not see the image of the screenshot you posted. On Thu, Jan 9, 2014 at 7:52 AM, Umapathy S wrote: > Hi, > > I am a new to solr/lucene. > I am trying to do a text analysis on my index. The below error > (screenshot) is shown when I increase th

solr text analysis showing a red bar error

2014-01-09 Thread Umapathy S
Hi, I am new to Solr/Lucene. I am trying to do a text analysis on my index. The below error (screenshot) is shown when I increase the field value length. I have tried searching in vain for any length specific restrictions in solr.TextField. There is no error text/exception thrown. [image: Inl

Re: PeerSync Recovery fails, starting Replication Recovery

2014-01-09 Thread Anca Kopetz
Hi, We tried to understand why we get a "Connection reset" exception on the leader when it tries to forward the documents to one of its replicas. We analyzed the GC logs and we did not see any long GC pauses around the time the exception was thrown. For 24 hours of GC logs, the max full GC pause

Re: Shard splitting error: cannot uncache file="_1.nvm"

2014-01-09 Thread rafal janik
Greg Preston wrote > [qtp243983770-60] ERROR org.apache.solr.core.SolrCore – > java.io.IOException: cannot uncache file="_1.nvm": it was separately > also created in the delegate directory > at > org.apache.lucene.store.NRTCachingDirectory.unCache(NRTCachingDirectory.java:297) > a

Re: Zookeeper as Service

2014-01-09 Thread Karthikeyan.Kannappan
I am hosting in windows OS -- View this message in context: http://lucene.472066.n3.nabble.com/Zookeeper-as-Service-tp4110396p4110413.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: solr OOM Crash

2014-01-09 Thread Sébastien Michel
Hi Sandra, Excuse me for the late reply. We use the LotsOfCores (http://wiki.apache.org/solr/LotsOfCores) Solr feature, with around 100 simultaneously loaded cores. But the issue is reproducible with fewer cores. We also have a high rate of indexing, and also reindexing (atomic update). We are indexing m