No live SolrServers available to handle this request

2015-11-05 Thread wilanjar .
Hi All, I'm very new to handling SolrCloud. I changed schema.xml by adding a field to index, but after reloading the collection we got this error in the log: "No live SolrServers available to handle this request". I have checked SolrCloud from localhost on each node and all are running well. I'm using solr ver

Exception in grouping with docValues enable field.

2015-11-05 Thread Modassar Ather
Hi, I have the following docValues-enabled field. *Field : * *Type: * When I group on this field I get the following exception. Kindly let me know if I am missing something or if it is an issue. org.apache.solr.common.SolrException; java.lang.NullPointerException at org.apache.

Re: MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-05 Thread Jack Krupansky
I vaguely recall some discussion concerning removal of the field cache in Lucene. -- Jack Krupansky On Thu, Nov 5, 2015 at 10:38 PM, wei wrote: > We are running our search on solr4.7 and I am evaluating whether to upgrade > to solr5.3.1. I found MatchAllDocsQuery is much slower in solr5.3.1. An

Re: Solr Cloud and Multiple Indexes

2015-11-05 Thread Modassar Ather
Thanks for your response. I have already gone through those documents before. My point was that if I am using Solr Cloud the only way to distribute my indexes is by adding shards? and I don't have to do anything manually (because all the distributed search is handled by Solr Cloud). Yes as per my

MatchAllDocsQuery is much slower in solr5.3.1 compare to solr4.7

2015-11-05 Thread wei
We are running our search on solr4.7 and I am evaluating whether to upgrade to solr5.3.1. I found MatchAllDocsQuery is much slower in solr5.3.1. Anyone know why? We have a lot of queries without any query keyword, but we apply filters on the query. Load testing shows those queries are much slower

Trying to apply patch for SOLR-7036

2015-11-05 Thread r b
I just wanted to double check that my steps were not too off base. I am trying to apply the patch from 8/May/15 and it seems to be slightly off. Inside the working revision is 1658487 so I checked that out from svn. This is what I did. svn checkout http://svn.apache.org/repos/asf/lucene/dev/t

SolrSpatial conversion error

2015-11-05 Thread Gangl, Michael E (398H)
I’m processing some satellite coverage data and storing it in solr to search by geographical regions. I can create the correct WKT and pass ‘invalid’ tests when it's created, but when I output to WKT and then ingest it in solr, it looks like some string to digit conversion errors are happening: 2

Re: [Newbie question] in SOLR 5, would I have a "master-to-slave" relationship for two servers?

2015-11-05 Thread Erick Erickson
To pile on to Chris' comment. In the M/S situation you describe, all the query traffic goes to the slave. True, this relieves the slave from doing the work of indexing, but it _also_ prevents the master from answering queries. So going to SolrCloud trades off indexing on _both_ machines to also qu

Re: tikaparser docx file fails with exception

2015-11-05 Thread Alexandre Rafalovitch
It is quite clear actually that the problem is this: Caused by: java.io.CharConversionException: Characters larger than 4 bytes are not supported: byte 0xb7 implies a length of more than 4 bytes at org.apache.xmlbeans.impl.piccolo.xml.UTF8XMLDecoder.decode(UTF8XMLDecoder.java:162) at

RE: tikaparser docx file fails with exception

2015-11-05 Thread Aswath Srinivasan (TMS)
Thank you for attempting to answer. I will try out with solrj and standalone java with tika parser. I completely understand that a bad document could cause this, however, when I opened up the document I couldn't find anything suspicious except for some binary images/pictures embedded into the do

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
thanks Yonik... I bet with solr 3.5 we do not have JSON Facet API support yet ... -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-efficiently-get-sum-of-an-int-field-tp4238464p4238522.html Sent from the Solr - User mailing list archive at Nabble.com.
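
For readers following this thread, here is a minimal sketch of the two request forms discussed in it: the StatsComponent form with the `{!sum=true}` local param and the JSON Facet form `json.facet={x:"sum(myfield)"}`. It only builds a URL-encoded parameter string, so the bang never passes through a shell unescaped. The helper name `sum_request_params` and the field names are illustrative assumptions, and, as the thread itself suggests, neither form is likely to be supported on Solr 3.5.

```python
from urllib.parse import urlencode


def sum_request_params(field, use_json_facet):
    """Build a URL-encoded parameter string asking Solr for the sum of one field.

    Hypothetical helper for illustration; it does not contact a Solr server.
    """
    params = {"q": "*:*", "rows": "0"}
    if use_json_facet:
        # JSON Facet API form quoted in this thread: json.facet={x:"sum(myfield)"}
        params["json.facet"] = '{x:"sum(%s)"}' % field
    else:
        # StatsComponent form with a local param, as attempted in this thread
        params["stats"] = "true"
        params["stats.field"] = "{!sum=true}%s" % field
    # urlencode percent-encodes '!', '{' and '}', so nothing is left for the shell to expand
    return urlencode(params)


query_string = sum_request_params("myfieldname", use_json_facet=False)
print(query_string)
```

The resulting string can be sent as a POST body or appended to a select URL; because it is built in code rather than typed into a shell, history expansion of `!` is no longer a concern.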

Re: how to efficiently get sum of an int field

2015-11-05 Thread Chris Hostetter
: On Thu, Nov 5, 2015 at 4:55 PM, Renee Sun wrote: : > Also Yonik, out of curiosity... when I run stats on a large msg set (such as : > 200 million msgs), it tends to use a lot of memory, this should be expected : > correct? : : With the stats component, yeah. the amount of RAM needed by the st

Re: how to efficiently get sum of an int field

2015-11-05 Thread Yonik Seeley
On Thu, Nov 5, 2015 at 4:55 PM, Renee Sun wrote: > Also Yonik, out of curiosity... when I run stats on a large msg set (such as > 200 million msgs), it tends to use a lot of memory, this should be expected > correct? With the stats component, yeah. > if I were able to use !sum=true to only get s

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
Also Yonik, out of curiosity... when I run stats on a large msg set (such as 200 million msgs), it tends to use a lot of memory, this should be expected correct? if I were able to use !sum=true to only get sum, a clever algorithm should be able to tell if sum is only required, it will avoid memory

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
now I think with solr 3.5 (that we are using), !sum=true (overwrite default ) probably is not supported yet :-( thanks Renee

Re: collection API timeout

2015-11-05 Thread lboutros
Hi Julien, just one additional thing, if you developed some plugins/filters, you will have to adapt and compile them for the Solr 5 API. Ludovic. - Jouve France.

Re: [Newbie question] in SOLR 5, would I have a "master-to-slave" relationship for two servers?

2015-11-05 Thread Chris Hostetter
: The database of server 2 is considered the "master" and it is replicated : regularly to server 1, the "slave". : : The advantage is the responsiveness of server 1 is not impacted when server : 2 gets busy with lots of indexing. : : QUESTION: When deploying a SOLR 5 setup, do I set things up th

Re: [Newbie question] what is a "core" and are they different from 3.x to 5.x ?

2015-11-05 Thread Chris Hostetter
: I can see there is something called a "core" ... it appears there can be : many cores for a single SOLR server. : : Can someone "explain like I'm five" -- what is a core? https://cwiki.apache.org/confluence/display/solr/Solr+Cores+and+solr.xml "In Solr, the term core is used to refer to a sin

[Newbie question] what is a "core" and are they different from 3.x to 5.x ?

2015-11-05 Thread Robert Hume
Trying to learn about SOLR. I can see there is something called a "core" ... it appears there can be many cores for a single SOLR server. Can someone "explain like I'm five" -- what is a core? And how do "cores" differ from 3.x to 5.x. Any pointers in the right direction are helpful! Thanks! R

[Newbie question] in SOLR 5, would I have a "master-to-slave" relationship for two servers?

2015-11-05 Thread Robert Hume
Hi, In my SOLR 3 deployment (inherited it), I have (1) one SOLR server that is used by my web application, and (2) a second SOLR server that is used to index documents via a customer datasource. The database of server 2 is considered the "master" and it is replicated regularly to server 1, the "s

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
I did try single quotes with a backslash before the bang. I also tried disabling history chars... did not work for me. Unfortunately, we are using solr 3.5, which probably does not support the JSON format?

Re: how to efficiently get sum of an int field

2015-11-05 Thread Yonik Seeley
You can also try the new JSON Facet API if you are on a recent version of Solr. json.facet={x:"sum(myfield)"} http://yonik.com/solr-facet-functions/ -Yonik On Thu, Nov 5, 2015 at 1:14 PM, Renee Sun wrote: > Hi - > I have been using stats to get the sum of a field data (int) like: > > &stats=t

Re: how to efficiently get sum of an int field

2015-11-05 Thread Alexandre Rafalovitch
Ah, Unix. Isn't it wonderful (it is, but): http://unix.stackexchange.com/questions/3051/how-to-echo-a-bang Try single quotes and backslash before the bang. Or disable history characters. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-st

Solr Search: Access Control / Role based security

2015-11-05 Thread Susheel Kumar
Hi, I have seen a couple of use cases / needs where we want to restrict the results of a search based on the role of a user. E.g. - if user role is admin, any document from the search result will be returned - if user role is manager, only documents intended for managers will be returned - if user role is

Re: how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
thanks! but it is silly that I can't seem to escape the {!sum=true} properly to make it work in my curl :-( time curl -d 'q=*:*&rows=0&shards=solrhostname:8080/solr/413-1,anothersolrhost:8080/solr/413-2&stats=true&stats.field={!sum=true}myfieldname' http://localhost:8080/solr/413-1/select/? | xmll

Re: highlighting on child document

2015-11-05 Thread Yangrui Guo
So if child document highlighting doesn't work, how can I let solr tell which child document and which of its fields matched? On Wednesday, November 4, 2015, Mikhail Khludnev wrote: > Hello, > > Highlighter for block join hasn't been implemented. So far, you can call > highlighter with children query also

Re: how to efficiently get sum of an int field

2015-11-05 Thread Chris Hostetter
: &stats=true&stats.field=my_field_name&rows=0 ... : I noticed the 'stats' give out more information than I needed (just sum), I : suspect the min/max/mean etc are the ones that caused the time. : : Is there a simple way I can just get the sum without other things, and run : it on a faste

how to efficiently get sum of an int field

2015-11-05 Thread Renee Sun
Hi - I have been using stats to get the sum of a field data (int) like: &stats=true&stats.field=my_field_name&rows=0 It works fine but when the index has hundreds of millions of messages on sharded indices, it takes a long time. I noticed the 'stats' give out more information than I needed (just sum), I

Re: Child document and parent document with same key

2015-11-05 Thread Mikhail Khludnev
On Fri, Oct 16, 2015 at 10:41 PM, Jamie Johnson wrote: > Is this expected to work? I think it is. I'm still not sure I understand the question. But let me bring some details from SOLR-3076: - Solr relies on Lucene's "deleteTerm", which is supplied to indexWriter.updateDocument(); - when pare

Re: Solr Features

2015-11-05 Thread Alexandre Rafalovitch
On 5 November 2015 at 11:22, Shawn Heisey wrote: > As far as I know, there are no currently available books covering > version 5, but I believe there is at least one on the horizon. Rafal's book is "compatible" with Solr 5: http://solr.pl/solr-cookbook-third-edition/ . But the number of features

Re: Solr Features

2015-11-05 Thread Shawn Heisey
On 11/5/2015 8:38 AM, Jack Krupansky wrote: > It's unfortunate, but the official Solr reference guide does not have a > table of contents: > http://mirror.olnevhost.net/pub/apache/lucene/solr/ref-guide/apache-solr-ref-guide-5.3.pdf > https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Refe

Re: Invalid parsing with solr edismax operators

2015-11-05 Thread Jack Krupansky
Great. Now, we'll have to see if any enterprising committers will step up and take a look. -- Jack Krupansky On Thu, Nov 5, 2015 at 4:46 AM, Mahmoud Almokadem wrote: > Thanks Jack. I have reported it as a bug on JIRA > > https://issues.apache.org/jira/browse/SOLR-8237 < > https://issues.apache.

Re: Securing field level access permission by filtering the query itself

2015-11-05 Thread Alessandro Benedetti
Be careful with the suggester as well. You don't want to show suggestions coming from sensitive fields. Cheers On 5 November 2015 at 15:28, Scott Stults wrote: > Good to hear! Depending on how far you want to take it, you can then scan > the initial request coming in from the client (and the fina

Re: Solr Features

2015-11-05 Thread Jack Krupansky
It's unfortunate, but the official Solr reference guide does not have a table of contents: http://mirror.olnevhost.net/pub/apache/lucene/solr/ref-guide/apache-solr-ref-guide-5.3.pdf https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide My Solr 4.4 Deep Dive is now a little o

Re: Boosting a document score when advertised! Please help!

2015-11-05 Thread Walter Underwood
The elevation component will be a ton of manual work. Instead, use edismax and the boost parameter. Add a field that is true for paid documents, then boost for paid:true. It might be easier to use a boost query (bq) to do this. The extra boost will be a tiebreaker for documents that would have

Re: Securing field level access permission by filtering the query itself

2015-11-05 Thread Scott Stults
Good to hear! Depending on how far you want to take it, you can then scan the initial request coming in from the client (and the final response) for raw Solr fields -- that shouldn't happen. I've used mod_security as a general-purpose application firewall and would recommend it. k/r, Scott On Wed

Re: Solr Features

2015-11-05 Thread Erick Erickson
I agree with Alexandre, the question is far too broad. Better to pick something you want to _do_ and ask how to accomplish that. Define a use case that's useful for your user base (actual or future) and see if Solr can do that. See the "books" section here for a number of resources that people ha

Re: collection API timeout

2015-11-05 Thread Erick Erickson
You should be able to go straight to 5.3.1. Solr/Lucene tries to guarantee that the indexes are readable one major version back, i.e. 5.x Solr/Lucene code should be able to read any 4.x index. New segments are written in the most modern format, so over time all the 4.x remnants should disappear. Yo

Re: collection API timeout

2015-11-05 Thread Julien David
Seems I'll need to upgrade to 5.3.1. Is it possible to upgrade from 4.9 to 5.3, or do I need to deploy all intermediate versions? Thanks

Re: Solr Features

2015-11-05 Thread Alexandre Rafalovitch
Glad you liked it. The problem with your request is that it is not clear what you already know and in which direction you are trying to go. Cloud is a big topic all on its own. Relevancy - another one. Crafting schema to best represent your data - a third. Loading data with DIH vs. SolrJ vs. 3rd p

Re: Boosting a document score when advertised! Please help!

2015-11-05 Thread Doug Turnbull
Funny I'm editing a chapter about boosting for a book :) http://manning.com/turnbull Anyway, I've been told by others that this blog post I wrote was really useful in teaching them how to carefully boost documents. Maybe it would help you? http://opensourceconnections.com/blog/2013/07/21/improve-s

Re: Boosting a document score when advertised! Please help!

2015-11-05 Thread Paul Libbrecht
Alessandro, none of them seem to match what I'd expect to be done: given an extra param that indicates the author, for each query, add an extra boosting. Christian, I used to do that with a query component (in java) but I think that nowadays you can do that with the bq parameter of edismax. paul
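
The `bq` suggestion above can be sketched as a set of request parameters. This is only an illustration under stated assumptions: the boolean field `paid` follows the `paid:true` idea mentioned elsewhere in this thread, while the helper name `boosted_query_params` and the weight of 10 are made up for the example and would need tuning against real data.

```python
from urllib.parse import urlencode


def boosted_query_params(user_query, boost_field="paid", boost_value="true", weight=10):
    """Build edismax parameters that add a boost query (bq) for promoted documents.

    Hypothetical helper: the field name and weight are assumptions, not part
    of any schema described in this thread.
    """
    params = {
        "defType": "edismax",  # select the extended dismax query parser
        "q": user_query,       # the user's search terms, unchanged
        # bq adds to the score of matching documents without filtering the rest
        "bq": "%s:%s^%s" % (boost_field, boost_value, weight),
    }
    return urlencode(params)


print(boosted_query_params("chocolate cake with hazelnuts"))
```

Because `bq` is additive, documents that do not match the boost clause are still returned; they simply rank below otherwise comparable paid ones.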

Re: Solr Features

2015-11-05 Thread Salman Ansari
Thanks Alex for your response. Much appreciated effort! For sure, I will need to look for all those details and information to fully understand Solr but I don't have that much time on my hands. That's why I was thinking that, instead of reading everything from the beginning, I would start with a feature list

Re: Solr Cloud and Multiple Indexes

2015-11-05 Thread Salman Ansari
Thanks for your response. I have already gone through those documents before. My point was that if I am using Solr Cloud the only way to distribute my indexes is by adding shards? and I don't have to do anything manually (because all the distributed search is handled by Solr Cloud). What is the Xm

Re: Solr Features

2015-11-05 Thread Alexandre Rafalovitch
Well, I've started to answer, but it hit a nerve and turned into a guide. Which is now a blog post with 6 steps (not mentioning step 0 - Admitting you have a problem). I hope this is helpful: http://blog.outerthoughts.com/2015/11/learning-solr-comprehensively/ Regards, Alex. Solr Analyzer

Re: Child document and parent document with same key

2015-11-05 Thread Jamie Johnson
The field is "key" and this is the value of unique key in schema.xml On Oct 17, 2015 3:23 AM, "Mikhail Khludnev" wrote: > Hello, > > What are the field names for parent and child docs exactly? > Whats' in schema.xml? > What you've got if you actually try to do this? > > On Fri, Oct 16, 2015 at 1

Re: Boosting a document score when advertised! Please help!

2015-11-05 Thread Alessandro Benedetti
Hi Christian, there are several ways : 1) Elevation query component - it should be your winner : https://cwiki.apache.org/confluence/display/solr/The+Query+Elevation+Component 2) Play with boosting according to your requirements Cheers On 5 November 2015 at 10:52, wrote: > Hi everyone,I'm bui

[SolrJ Clients] RequestWriter VS BinaryRequestWriter

2015-11-05 Thread Alessandro Benedetti
Hi guys, I was taking a look at the implementation details to understand how Solr requests are written by SolrJ APIs. The interesting classes are : *org.apache.solr.client.solrj.request.RequestWriter* *org.apache.solr.client.solrj.impl.BinaryRequestWriter* ( wrong package ? ) I discovered that :

Boosting a document score when advertised! Please help!

2015-11-05 Thread liviuchristian
Hi everyone, I'm building a food recipe search engine based on solr. I need to boost the document score for recipes whose authors paid for promotion, in order to have them returned first when somebody searches for "chocolate cake with hazelnuts". So those recipes that match the query terms and their

Re: Solr Cloud and Multiple Indexes

2015-11-05 Thread Modassar Ather
SolrCloud makes distributed search easier. You can find details about it at the following link. https://cwiki.apache.org/confluence/display/solr/How+SolrCloud+Works You can also refer to the following link: https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud >Fro

Re: Invalid parsing with solr edismax operators

2015-11-05 Thread Mahmoud Almokadem
Thanks Jack. I have reported it as a bug on JIRA https://issues.apache.org/jira/browse/SOLR-8237 Mahmoud > On Nov 4, 2015, at 5:30 PM, Jack Krupansky wrote: > > I think you should go ahead and file a Jira ticket for this as a bug since > eit

Re: Solr Cloud and Multiple Indexes

2015-11-05 Thread Salman Ansari
Here is the current info How much memory is used? Physical memory consumption: 5.48 GB out of 14 GB. Swap space consumption: 5.83 GB out of 15.94 GB. JVM-Memory consumption: 1.58 GB out of 3.83 GB. What is your index size? I have around 70M documents distributed on 2 shards (so each shard has 35M

Re: OpenNLP plugin or similar NER software for Solr ??? !!!

2015-11-05 Thread Alessandro Benedetti
Apparently this mail thread is duplicated, anyway I will copy and paste my previous comment as well : Hi Christian. This was quite easy to have, since 2011. But you can complicate this as much as you want. Or customise it as much as you want. Take a look : https://cwiki.apache.org/confluence/dis

Re: Solr Cloud and Multiple Indexes

2015-11-05 Thread Modassar Ather
What is your index size? How much memory is used? What type of queries are slow? Are there GC pauses as they can be a cause of slowness? Are document updates/additions happening in parallel? The queries are very slow to run so I was thinking to distribute the indexes into multiple indexes and cons

Solr Cloud and Multiple Indexes

2015-11-05 Thread Salman Ansari
Hi, I am using Solr cloud and I have created a single index that hosts around 70M documents distributed into 2 shards (each having 35M documents) and 2 replicas. The queries are very slow to run so I was thinking to distribute the indexes into multiple indexes and consequently distributed search. C