Re: serious JSON Facet bug

2015-07-23 Thread Nagasharath
I don't have this issue. I have tried various JSON facet queries, and my filter cache always comes down to the 'minsize' (it never exceeds the configured size) with Solr version 5.2.1, and all my queries are nested JSON facets. > On 23-Jul-2015, at 7:43 pm, Yonik Seeley wrote: > >> On Thu, Jul 23, 20

Re: Unexpected docvalues type error using result grouping - Use UninvertingReader or index with docvalues

2015-07-23 Thread William Bell
You could try stopping SOLR, going into the data directory and rm -rf * and starting SOLR again. Did you use the schema REST api? Residual ? On Thu, Jul 23, 2015 at 6:57 PM, Shamik Bandopadhyay wrote: > Hi, > > I'm facing this weird error while running result grouping queries. This > started w

Unexpected docvalues type error using result grouping - Use UninvertingReader or index with docvalues

2015-07-23 Thread Shamik Bandopadhyay
Hi, I'm facing this weird error while running result grouping queries. This started when I turned on "docvalues" for an existing facet field and indexed the documents. Looking at the exception, I reverted back the change and re-indexed the documents again. But I'm still getting the exception, her

Re: serious JSON Facet bug

2015-07-23 Thread Yonik Seeley
On Thu, Jul 23, 2015 at 5:00 PM, Harry Yoo wrote: > Is there a way to patch? I am using 5.2.1 and using json facet in production. First you should see if your queries tickle the bug... check the size of the filter cache from the admin screen (under plugins, filterCache) and see if its current si

Re: Running SOLR 5.2.1 on Embedded Jetty

2015-07-23 Thread Shawn Heisey
On 7/23/2015 3:14 PM, Darin Amos wrote: > I have been trying to run the SOLR war with embedded Jetty and can’t seem to > get the config quite right. Is there any known documentation on this or is > someone else doing this? I seem to just be setting up a document server at my > solr.home director

How to connect Solr with Impala?

2015-07-23 Thread Rex X
Given following Impala query: SELECT date, SUM(CAST(price AS DOUBLE)) AS price FROM table WHERE date='2014-01-01' AND store_id IN(1,2,3) GROUP BY date; To work with Solr 1. Will it be more efficient to directly use equivalent Solr query? Any curl command equivalent to the Impala

Re: Basic Auth (again)

2015-07-23 Thread Peter Sturge
Hi Steve, We've not yet moved to Solr 5, but we do use Jetty 9. In any case, Basic Auth is a Jetty thing, not a Solr thing. We do use this mechanism to great effect to secure things like index writers and such, and it does work well once it's setup. Jetty, as with all containers, is a bit fussy ab

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Upayavira
Yay! On Thu, Jul 23, 2015, at 10:13 PM, Aaron Gibbons wrote: > Ah, now we're on to something! Solr 4.10.0 is also using the same > zookeepers, and both are using Oracle Java 8 JRE. > > Did some research and uploaded a new config to zookeeper using chroot to > isolate them. Changed the init scrip

RE: cache implemetation?

2015-07-23 Thread cbuxbaum
Hi Shawn, Thanks for your help. I settled on the following solution, that I am in the process of testing out:

Re: caceh implemetation?

2015-07-23 Thread Shawn Heisey
On 7/23/2015 10:55 AM, cbuxbaum wrote: > Say we have 100 party records. Then the child SQL will be run 100 > times (once for each party record). Isn't there a way to just run the child > SQL on all of the party records at once with a join, using a GROUP BY and > ORDER BY on the PARTY_ID?
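The question quoted above (one child query per party record versus a single joined query ordered by PARTY_ID) can be illustrated outside of Solr. The following is a minimal sketch using Python's built-in sqlite3 module; the `party` and `party_address` tables, their columns, and the sample rows are hypothetical stand-ins, not the poster's schema, and the rows are grouped on the client side after an ORDER BY rather than with a SQL GROUP BY.

```python
import sqlite3
from itertools import groupby


def make_demo_db():
    """Create an in-memory database with hypothetical parent and child tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE party (party_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE party_address (party_id INTEGER, address TEXT);
        INSERT INTO party VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
        INSERT INTO party_address VALUES
            (1, '1 Main St'), (1, '9 Side Rd'), (2, '5 High St');
        """
    )
    return conn


def load_children_per_party(conn):
    """One child query per parent row: N parents means N + 1 queries."""
    party_ids = [row[0] for row in conn.execute(
        "SELECT party_id FROM party ORDER BY party_id")]
    result = {}
    for party_id in party_ids:
        rows = conn.execute(
            "SELECT address FROM party_address WHERE party_id = ? ORDER BY address",
            (party_id,))
        result[party_id] = [row[0] for row in rows]
    return result


def load_children_joined(conn):
    """A single joined query, ordered by party_id so rows can be grouped in one pass."""
    rows = conn.execute(
        """
        SELECT p.party_id, a.address
        FROM party p
        LEFT JOIN party_address a ON a.party_id = p.party_id
        ORDER BY p.party_id, a.address
        """)
    result = {}
    for party_id, group in groupby(rows, key=lambda row: row[0]):
        # LEFT JOIN yields a NULL address for parties without children; drop it.
        result[party_id] = [addr for _, addr in group if addr is not None]
    return result
```

Both functions return the same mapping of party ID to addresses; the second issues one statement regardless of how many party records exist, which is the effect the question is asking about.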

Re: Issues sending mail to the list

2015-07-23 Thread Erick Erickson
sometimes when echoing back the whole thread it looks like spam On Thu, Jul 23, 2015 at 1:42 PM, Steven White wrote: > Three emails to the existing subject of "Basic auth" didn't make it. As > you may have seen, I started a new email thread on this subject under > "Basic Auth (again)" and no

Re: Using payloads and user provided data in score

2015-07-23 Thread Jamie Johnson
Well you've at least confirmed what I was thinking :). I am using payloads now for this and I think I have something very basic working. The results don't get dropped out when the scores are 0 so I had to also write a custom collector that could be plugged into the AnalyticQueryAPI (maybe there i

Re: Inconsistent Solr Search Results

2015-07-23 Thread Erick Erickson
Same issue really as long as there's more than one replica/shard. Tied scores are broken by internal ID, specifying a secondary sort should regularize things. Best, Erick On Thu, Jul 23, 2015 at 10:01 AM, Tarala, Magesh wrote: > Erick, > The 3 node cluster is setup to use 3 shards each with 1 r

Re: serious JSON Facet bug

2015-07-23 Thread Harry Yoo
Is there a way to patch? I am using 5.2.1 and using json facet in production. > On Jul 16, 2015, at 1:43 PM, Yonik Seeley wrote: > > To anyone using the JSON Facet API in released Solr versions: > I discovered a serious memory leak while doing performance benchmarks > (see http://yonik.com/facet

Running SOLR 5.2.1 on Embedded Jetty

2015-07-23 Thread Darin Amos
Hello, I have been trying to run the SOLR war with embedded Jetty and can’t seem to get the config quite right. Is there any known documentation on this or is someone else doing this? I seem to just be setting up a document server at my solr.home directory. The code snippet below seems incomple

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Aaron Gibbons
Ah, now we're on to something! Solr 4.10.0 is also using the same zookeepers, and both are using Oracle Java 8 JRE. Did some research and uploaded a new config to zookeeper using chroot to isolate them. Changed the init script to have ZK_Host="zk1,zk2,zk3/DevConfigs". I did see that you should b

Re: Issues sending mail to the list

2015-07-23 Thread Steven White
Three emails to the existing subject of "Basic auth" didn't make it. As you may have seen, I started a new email thread on this subject under "Basic Auth (again)" and now they are making it to the list. I don't know what to make of this. Steve On Thu, Jul 23, 2015 at 4:31 PM, Upayavira wrote:

Re: Basic Auth (again)

2015-07-23 Thread Steven White
Hi Peter, I'm on Solr 5.2.1 which comes with Jetty 9.2. I'm setting this up on Windows 2012 but will need to do the same on Linux too. I followed the steps at this link: https://wiki.apache.org/solr/SolrSecurity#Jetty_realm_example very much by the book. Here are the changes I made: File: C:\

Re: Issues sending mail to the list

2015-07-23 Thread Upayavira
Be sure to be sending plain text emails, not HTML, and watch out for things that could be considered spam. Apache mail servers do receive a LOT of spam, so need to have relatively aggressive spam filters in place. Upayavira On Thu, Jul 23, 2015, at 07:29 PM, Steven White wrote: > Hi Everyone, >

Re: Different scores for the same search

2015-07-23 Thread Upayavira
There's a few odd things here, looking at your explains output: First search, first result: 6.357613 = (MATCH) weight(description:jackshaft in 3339) [DefaultSimilarity], result of: 6.357613 = fieldWeight in 3339, product of: 1.0 = tf(freq=1.0), with freq of: 1.0 = termFreq=1.0 10.1

How to:- Extending Tika within Solr

2015-07-23 Thread Aditya Dhulipala
Hi, I have implemented a new file-type parser for Tika. It parses a custom filetype (*.mx). I would like my Solr instance to use my version of Tika with the mx parser. I found this through a Google search: https://lucidworks.com/blog/extending-apache-tika-capabilities/ But it seems to be over 5 yea

RE: Different scores for the same search

2015-07-23 Thread Tarala, Magesh
I added the explicit sort: http://server1.domain.com:8983/solr/serviceorder_shard1_replica2/select?q=description%3Ajackshaft&fl=service_order&wt=json&indent=true&debugQuery=true&sort=score%20desc,id%20asc Still seeing the same behavior - inconsistent results: First time I run: "rawquerystrin

Re: Different scores for the same search

2015-07-23 Thread Upayavira
What it looks like is kinda as Erick suggested - the scores are the same for some docs, so it probably depends upon which order they come back from the shards as to which will be shown first. If the issue is that the score is the same for some docs, try adding a deliberate sort: &sort=score desc
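The tie-breaking advice in this thread can be sketched in a few lines. This is only an illustration of the idea, not Solr's merge code: the document IDs and the dictionary layout are hypothetical, and the function simply mimics the effect of a sort such as `score desc,id asc` on results that arrive from shards in varying order.

```python
def merge_shard_results(docs):
    """Order docs by score descending, then by id ascending as a tie-breaker."""
    return sorted(docs, key=lambda doc: (-doc["score"], doc["id"]))


# Hypothetical results: two documents share the same score.
arrival_order_a = [
    {"id": "SO-2", "score": 6.357613},
    {"id": "SO-1", "score": 6.357613},
    {"id": "SO-3", "score": 5.1},
]
# The same documents, arriving from the shards in a different order.
arrival_order_b = list(reversed(arrival_order_a))

ids_a = [doc["id"] for doc in merge_shard_results(arrival_order_a)]
ids_b = [doc["id"] for doc in merge_shard_results(arrival_order_b)]
print(ids_a == ids_b)
```

Without the secondary key, the relative order of the two tied documents would depend on arrival order; with it, both runs produce the same ranking.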

Re: Basic Auth (again)

2015-07-23 Thread Peter Sturge
Hi Steve, What version of Jetty are you using? Have you got a webdefault.xml in your etc folder? If so, does it have an entry like this: BASIC Realm Name as specified in jetty.xml It's been a few years since I set this up, but I believe you also need an auth-constraint in webdefau

Basic Auth (again)

2015-07-23 Thread Steven White
(re-posting as new email thread to see if this will make it to the list) That didn't help. I still get the same result and virtually no log to help me figure out where / what things are going wrong. Here is all that I see in C:\Solr\solr-5.2.1\server\logs\solr.log: INFO - 2015-07-23 05:29:1

Issues sending mail to the list

2015-07-23 Thread Steven White
Hi Everyone, I'm seeing that some of my emails are not making it to the mailing list and I confirmed that I'm subscribed: Hi! This is the ezmlm program. I'm managing the solr-user@lucene.apache.org mailing list. I'm working for my owner, who can be reached at solr-user-ow...@luce

Different scores for the same search

2015-07-23 Thread Tarala, Magesh
I'm executing a very simple search in a 3 node cluster - 3 shards with 1 replica each. Solr version 4.10.2: http://server1.domain.com:8983/solr/serviceorder_shard1_replica2/select?q=description%3Ajackshaft&fl=service_order&wt=json&indent=true&debugQuery=true I'm getting different scores when I ru

Re: Performance of facet contain search in 5.2.1

2015-07-23 Thread Erick Erickson
bq: I do not understand why anyone would ever use facet.prefix or facet.contains for any use other than a development... Gotta disagree a bit here. AFAIK, it depends on the number of unique terms in the field. How would either one be worse than facet.field? And you can freely use facet.field on a

Re: Using payloads and user provided data in score

2015-07-23 Thread Erick Erickson
bq: Your "ugly problem" is my situation I think ;) No, your problem is much worse ;( The _contents_ of fields are restricted, which is horrible. OK, here's another idea out of waaay left field: Payloads. It hinges on there being an OK number of possible combinations which seems to be the ca

Re: caceh implemetation?

2015-07-23 Thread Mikhail Khludnev
Hello, I briefly described the similar problems at http://blog.griddynamics.com/2015/07/how-to-import-structured-data-into-solr.html Let me know if you have further questions On Thu, Jul 23, 2015 at 7:55 PM, cbuxbaum wrote: > That's OK, I have determined that caching is not relevant to our us

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Upayavira
Hmmm, what other Solr nodes do you have connected to Zookeeper? Are any of them running a different Java or Solr version? It looks like you have another node connected to your Zookeeper that has taken the role of "overseer" and it is sending back serialized java objects that your own node cannot h

RE: Inconsistent Solr Search Results

2015-07-23 Thread Tarala, Magesh
Erick, The 3 node cluster is setup to use 3 shards each with 1 replica. So, the index is split on 3 servers. Another piece of info - I think the issue happens only when I use pagination. Verifying if that's the case.. Here's a query from the solr log on the server I'm pointing the query to:

Re: caceh implemetation?

2015-07-23 Thread cbuxbaum
That's OK, I have determined that caching is not relevant to our use case. However, I have a question about the full import queries that we are using: Here is the SQL from the top level entity: query="SELECT DISTINCT 'LEAP_PARTY' AS DOCUMENT_TYPE, VPARTY.OWNER AS PARTY_OW

Re: caceh implemetation?

2015-07-23 Thread Erick Erickson
Your version of the config didn't come through, the mail program is pretty aggressive about stripping attachments and things. Best, Erick On Thu, Jul 23, 2015 at 8:20 AM, cbuxbaum wrote: > Hi, We are trying to improve the performance of our data import. We tried > using the CachedSqlEntityProce

Re: Inconsistent Solr Search Results

2015-07-23 Thread Erick Erickson
The query you're running would help. But here's a guess: You say you have a "3 node Solr cluster". By that I'm guessing you mean a single shard with 1 leader and 2 replicas. When the primary sort criterion (score by default) is tied between two documents, the internal Lucene doc ID is used as a tie

Re: solr blocking and client timeout issue

2015-07-23 Thread Erick Erickson
Thanks for letting us know how it turned out. Too often I'm never sure what actually _worked_ Erick On Thu, Jul 23, 2015 at 8:56 AM, Jeremy Ashcraft wrote: > A quick follow up, after finding and eliminating some code that was > generating multiple update requests per second, applying the CMS

Re: Per-document and per-query analysis

2015-07-23 Thread Alessandro Benedetti
Markus, the first idea that comes to my mind is this: 1) you configure your schema, creating your field types and, if necessary, the associated fields 2) you build an UpdateRequestProcessor that does a conditional check per document and creates the proper fields starting from one input field. In this w

Per-document and per-query analysis

2015-07-23 Thread Markus Jelsma
Hello - the title says it all. When indexing a document, we need to run one or more additional filters depending on the value of a specific field. Likewise, we need to run that same filter over the already analyzed tokens when querying. This is not going to work if I extend TextField, at all. An

Re: solr blocking and client timeout issue

2015-07-23 Thread Jeremy Ashcraft
A quick follow up, after finding and eliminating some code that was generating multiple update requests per second, applying the CMS GC tuning options, and upgrading to Java 8, we've not experienced a single long term GC pause. The java 8 upgrade got rid of the final couple of pauses during th

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Aaron Gibbons
I originally started using Ansible playbooks which did install the JDK (with the same error), but have been doing manual installs to take Ansible completely out of the equation. Safari wasn't showing the XML response so I ran this in Chrome.. http://XX.X.XX.XXX:8983/solr/admin/collections?a

Inconsistent Solr Search Results

2015-07-23 Thread Tarala, Magesh
I have about 15K documents in a 3 node solr cluster. When I execute a simple search, I get the results in different order every time I search. But the number of records is the same. Here's the definition for the field. Any ideas, suggestions would be greatly appreciated.

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Upayavira
Have you tried it with a JDK? I tend to use JDK rather than JRE, but don't recall whether this is a specific requirement for Solr. Can you show the URL you use for the API, and the JSON/XML response you get? I wouldn't expect to see mention of solrj in the API because it isn't used. Just for the r

caceh implemetation?

2015-07-23 Thread cbuxbaum
Hi, We are trying to improve the performance of our data import. We tried using the CachedSqlEntityProcessor implementation, but that is apparently broken. I am looking at the workaround described below: https://issues.apache.org/jira/browse/SOLR-3857

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Aaron Gibbons
*When you run bin/solr you are doing that on the instance itself? * Yes *You show a collections API URL below. Does that fail the same way?* Error from API: 50042java.io.InvalidClassException: org.apache.solr.client.solrj.SolrResponse; local class incompatible: stream classdesc serialVersionUID =

Re: Solr Clustering Issue

2015-07-23 Thread Shawn Heisey
On 7/23/2015 7:51 AM, Joseph Obernberger wrote: > Hi Upayavira - the URL was: > > http://server1:9100/solr/MYCOL1/clustering?q=Collection:(COLLECT1008+OR+COLLECT2587)+AND+(amazon+AND+soap)&wt=json&indent=true&clustering=true&rows=1&df=FULL_DOCUMENT&debugQuery=true > > > Here is the relevant part

Re: Solr Clustering Issue

2015-07-23 Thread Upayavira
I've seen something like this on another system - where the OR is consumed as a query term rather than an operator. Remember that Edismax will use the Lucene query parser (which supports OR, etc) unless there is an exception, and defer to dismax if there is a syntax error. What I'd suggest here i

Re: Using payloads and user provided data in score

2015-07-23 Thread Jamie Johnson
Sorry for being vague, I'll try to explain more. In my use case a particular field does not have a security control, it's the data in the field. So for instance if I had a schema with a field called name, there could be data that should be secured at A, B, A&B, A|B, etc within that field. So aga

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Upayavira
I'd still like to just confirm that you're using the same Java for running Solr and for running bin/solr. When you run bin/solr you are doing that on the instance itself? You show a collections API URL below. Does that fail the same way? Basically, the exception you showed was a SolrJ error. Usi

Re: Solr Clustering Issue

2015-07-23 Thread Joseph Obernberger
Hi Upayavira - the URL was: http://server1:9100/solr/MYCOL1/clustering?q=Collection:(COLLECT1008+OR+COLLECT2587)+AND+(amazon+AND+soap)&wt=json&indent=true&clustering=true&rows=1&df=FULL_DOCUMENT&debugQuery=true Here is the relevant part of the response - notice that the default field (FULL_DOCU

Re: SolrCloud 5.2.1 - collection creation error

2015-07-23 Thread Aaron Gibbons
I've mainly used Oracle Java 8, but tested 7 also. Typically I'll wipe the machines and start from scratch before installing a different version. The latest attempt followed these steps exactly on each machine: - sudo apt-get install python-software-properties - sudo add-apt-repository ppa

Re: XSLT with maps

2015-07-23 Thread Upayavira
0 1 Note, if you find XSLT hard (which, unfortunately, is common), you might find the Velocity response writer an easier, more conventional way to do it. You will be generating XML, but you'll be treating it much more as just text output. Upayavira On Thu, Jul 23, 2015, at 01:32

Re: XSLT with maps

2015-07-23 Thread Sreekant Sreedharan
That worked for most of my attributes. I have only one issue to fix. How would I convert boolean values to integers? For example: ... false to Is that possible as well? On that note, what version of XSLT should I assume SOLR supports? XSLT 1.0 or 2.0? Also, is it fair t

Re: Performance of facet contain search in 5.2.1

2015-07-23 Thread Alessandro Benedetti
Hi Dave and Markus, I would definitely suggest using the *Suggester Component*. In particular, for your use case I suggest the AnalyzingInfixLookup strategy. As usual I suggest: Erick's post - http://lucidworks.com/blog/solr-suggester/ My post - http://alexbenedetti.blogspot.co.uk/2015/07/sol

Re: XSLT with maps

2015-07-23 Thread Upayavira
you are correct, I should have said: On Thu, Jul 23, 2015, at 10:15 AM, Sreekant Sreedharan wrote: > Well, if you had a result say: > ... > > 589587B2B1CA4C4683FC106967E7C326 > EE3YYK > 31034 > > ... > > applying the template: > > > > > > > would result in the following XML:

Re: XSLT with maps

2015-07-23 Thread Sreekant Sreedharan
Well, if you had a result say: ... 589587B2B1CA4C4683FC106967E7C326 EE3YYK 31034 ... applying the template: would result in the following XML: This is because 'id' is not an attribute of 'doc'. It is not even the attribute of doc/str. It is a value of doc/str/@name. But the

RE: Performance of facet contain search in 5.2.1

2015-07-23 Thread Markus Jelsma
Hello - You should index your terms as n-grams indeed, especially for autocompletion. I do not understand why anyone would ever use facet.prefix or facet.contains for any use other than a development tool. It won't perform on any index larger than small. Jan Høydahl has put up a thorough exampl
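To make the n-gram suggestion above concrete, here is a minimal sketch of the idea in plain Python. It is not Solr's analysis chain: the function names, the gram sizes, and the sample terms are assumptions chosen for illustration. The point it shows is that "contains" matching becomes a direct lookup once every term has been expanded into its n-grams at index time.

```python
from collections import defaultdict


def ngrams(term, min_size=2, max_size=3):
    """Return all substrings of term whose length is between min_size and max_size."""
    grams = []
    for start in range(len(term)):
        for size in range(min_size, max_size + 1):
            end = start + size
            if end > len(term):
                break
            grams.append(term[start:end])
    return grams


def build_ngram_index(terms, min_size=2, max_size=3):
    """Map each n-gram to the set of terms that contain it."""
    index = defaultdict(set)
    for term in terms:
        for gram in ngrams(term, min_size, max_size):
            index[gram].add(term)
    return index


# Hypothetical facet terms.
index = build_ngram_index(["jackshaft", "crankshaft", "jacket"])
print(sorted(index["sha"]))
print(sorted(index["jac"]))
```

The trade-off is a larger index in exchange for lookups that do not scan every unique term at query time. A query fragment longer than `max_size` would need either larger grams or a combination of several gram lookups.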