Re: SOLR 4.3.1?

2013-06-16 Thread Anshum Gupta
It's already cut and the vote has passed. It should be out any time now. On Mon, Jun 17, 2013 at 11:26 AM, William Bell wrote: > When is 4.3.1 coming out? > > -- > Bill Bell > billnb...@gmail.com > cell 720-256-8076 > -- Anshum Gupta http://www.anshumgupta.net

SOLR 4.3.1?

2013-06-16 Thread William Bell
When is 4.3.1 coming out? -- Bill Bell billnb...@gmail.com cell 720-256-8076

sort=geodist() asc

2013-06-16 Thread William Bell
This simple feature of "sort=geodist() asc" is very powerful since it enables us to move from SOLR 3 to SOLR 4 without rewriting all our queries. We also use boost=geodist() in some cases, and some bf/bq. bf=recip(geodist(),2,200,20)&sort=score desc
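For illustration, the two request styles mentioned above could be built roughly as follows. This is only a sketch: the field name "store", the point 45.15,-93.85, and the use of edismax for the bf variant are assumptions, not taken from the message. recip(x,m,a,b) is Solr's reciprocal function, a/(m*x+b), so the boost shrinks as the distance grows.

```python
from urllib.parse import urlencode


def recip(x, m, a, b):
    """Same arithmetic as Solr's recip(x,m,a,b) function: a / (m*x + b)."""
    return a / (m * x + b)


def geo_sort_params(query, field, lat, lon):
    """Request parameters that sort results by distance, nearest first."""
    return {
        "q": query,
        "sfield": field,          # spatial field; the name is an assumption
        "pt": f"{lat},{lon}",     # reference point; the value is an assumption
        "sort": "geodist() asc",
    }


def geo_boost_params(query, field, lat, lon):
    """Request parameters that boost by distance and sort by score."""
    return {
        "defType": "edismax",     # assumption: bf is read by (e)dismax
        "q": query,
        "sfield": field,
        "pt": f"{lat},{lon}",
        "bf": "recip(geodist(),2,200,20)",
        "sort": "score desc",
    }


sort_query = urlencode(geo_sort_params("*:*", "store", 45.15, -93.85))
boost_query = urlencode(geo_boost_params("*:*", "store", 45.15, -93.85))

# Boost contributed by the bf function at distance 0 and at distance 90
boost_at_zero = recip(0, 2, 200, 20)
boost_at_ninety = recip(90, 2, 200, 20)
```

With these constants the additive boost is 10 at distance 0 and falls to 1 at distance 90, which shows how quickly nearby documents dominate.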

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Jack Krupansky
I won't assert "total" mastery as a requirement. Degrees of mastery are sufficient. But even then, even "partial" mastery of some rather basic areas of Solr can be quite daunting. It is enlightening to consider just how many nooks and crannies of Solr there are to master, and how many reasona

Re: Best way to match umlauts

2013-06-16 Thread Lance Norskog
One small thing: German u-umlaut is often "flattened" as 'ue' instead of 'u'. And the same with o-umlaut, it can be 'oe' or 'o'. I don't know if Lucene has a good solution for this problem. On 06/16/2013 06:44 AM, adityab wrote: Thanks for the explanation Steve. I now see it clearly. In my cas
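To make the two flattening conventions concrete, here is a small sketch in plain Python (not a Lucene filter) that produces both spellings of a word. The function and table names are made up for the example; an index would need to match whichever variant users actually type.

```python
# Digraph convention: the umlaut becomes base vowel + "e"
UMLAUT_DIGRAPHS = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}
# Bare convention: the umlaut is simply dropped
UMLAUT_BASE = {"ä": "a", "ö": "o", "ü": "u", "Ä": "A", "Ö": "O", "Ü": "U"}


def fold_variants(word):
    """Return the original word plus both flattened spellings."""
    digraph = "".join(UMLAUT_DIGRAPHS.get(ch, ch) for ch in word)
    base = "".join(UMLAUT_BASE.get(ch, ch) for ch in word)
    return {word, digraph, base}


variants = fold_variants("Müller")
```

Here "Müller" yields "Müller", "Mueller" and "Muller"; a word without umlauts comes back unchanged.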

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Lance Norskog
No, they just learned a few features and then stopped because it was "good enough", and they had a thousand other things to code. As to REST: yes, it is worth having a coherent API. Solr is behind the curve here. Look at the HATEOAS paradigm. It's ornate (and a really goofy name) but it provide

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Otis Gospodnetic
Serious thread hiJacking here Hey, why was I singled out? ;) I don't have time to get deep into this (there are non-experts I need to help! kidding...) , but I'll say this: * Do you know any non-trivial piece of software in which an average developer is a master? I've managed to master t

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Alexandre Rafalovitch
On Sun, Jun 16, 2013 at 7:27 PM, Walter Underwood wrote: > 2. Someone who expects partial update in a search engine, or transactions, > has a deep misunderstanding of the tradeoffs you make for what search can > do. That isn't mastery of arcane details, that is search 101. Yes, they might (hav

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Walter Underwood
1. Total mastery of a product is a strange requirement. That would be a huge trivia contest that would include all the vestigial bad bits. For example, I feel no need to master the Porter stemmer. I have no idea how to do geo search in Solr, though I'm sure I could learn it pretty quickly

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Yago Riveiro
I share Yonik's opinion that a pure REST application is, in some cases, a pain in the ass. But as Jack said, there are cases where REST is more expressive and makes it easy to understand what you are doing. At this point, I think it is more important to make the actual API more stable

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Jack Krupansky
Exactly. For the case in point that is the real, underlying subject of this thread, the desire is to partially update an existing Solr document using the output of SolrCell/Tika. With true/pure REST, that should be the HTTP "PUT" verb. And the path would indicate the collection and key value.

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Yonik Seeley
On Sun, Jun 16, 2013 at 6:05 PM, Jack Krupansky wrote: > Except, that Solr's divergence from a true, pure REST API is certainly one > of the elements of its "badness". Most complex systems seem to feel the need to diverge from pure REST for the sake of being practical. From that perspective "pur

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Jack Krupansky
Jan, you made no mention of "mastering" Solr - which was the crux of my comments. I think everyone agrees that anyone can download and "use" Solr, in a basic sense, with minimal effort. The issue is how far the average application developer can get beyond "start" towards "mastery" without a de

RE: filter query from external list of Solr unique IDs

2013-06-16 Thread samabhiK
Does anything already exist in Solr 4.3 to meet this use case? -- View this message in context: http://lucene.472066.n3.nabble.com/filter-query-from-external-list-of-Solr-unique-IDs-tp1709060p4070874.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Adding pdf/word file using JSON/XML

2013-06-16 Thread Jan Høydahl
Hi, I've never heard the complaint that Solr is hard to use. To the contrary, most people I come across have downloaded Solr themselves, walked through the tutorial and praise the simplicity with which they can start indexing and searching content. When they come to us asking for consultancy o

Re: in Solr 3.5, optimization increase the index size to double

2013-06-16 Thread Jason Hellman
And let's not forget the interesting bug in MMapDirectory: http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/all/org/apache/lucene/store/MMapDirectory.html "NOTE: memory mapping uses up a portion of the virtual memory address space in your process equal to the size of the file

Re: Solr using a ridiculous amount of memory

2013-06-16 Thread Jack Krupansky
Yeah, this is yet another "anti-pattern" we need to be discouraging - large multivalued fields. They indicate that the data model is not well balanced and aligned with the strengths of Solr and Lucene. -- Jack Krupansky -Original Message- From: adityab Sent: Sunday, June 16, 2013 9:3

Re: Best way to match umlauts

2013-06-16 Thread adityab
Thanks for the explanation Steve. I now see it clearly. In my case it should work. -- View this message in context: http://lucene.472066.n3.nabble.com/Best-way-to-match-umlauts-tp4070256p4070805.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr using a ridiculous amount of memory

2013-06-16 Thread adityab
It was interesting to read this post. I had a similar issue on Solr v4.2.1. Our documents have huge multiValued fields, and we were able to knock out our server in about 30 minutes. We then found a bug, "LUCENE-4995", which was causing all the problems. Applying the patch has helped a

Re: Solr large boolean filter

2013-06-16 Thread Jack Krupansky
Whenever I see one of these "big" query filters, my first thought is that there is something wrong with the application data model. Where does the long list of IDs come from? Somebody must be generating and/or storing them, right? Why not store them in Solr, right in the data model? Maybe store

Re: New operator.

2013-06-16 Thread Jack Krupansky
It all depends on what you mean by an "operator". Start by describing in more detail what problem you are trying to solve. And how do you expect your users or applications to use this "operator"? Give some examples. Solr and Lucene do not have "operators" per se, except in query parser synt

Re: Solr cloud: zkHost in solr.xml gets wiped out

2013-06-16 Thread Erick Erickson
Al: As it happens, I hope sometime today to put up a patch for SOLR-4910 that should harden up many things in persisting solr.xml; I'll be sure to include this. It's kind of a pain to create an automated test for this, so I'll give it a whirl manually. As you say, most of this is going away in 5.

Re: Replicas and soft commit

2013-06-16 Thread Erick Erickson
You're mixing things up pretty thoroughly. SolrCloud with leaders and replicas is orthogonal to Master/Slave setups; generally people use one or the other. Master/Slave setups don't get NRT updates at all. I'm a little surprised that your setup works, it sounds like you have replication set up

Re: Solr 3.5 Optimization takes index file size almost double

2013-06-16 Thread Erick Erickson
Unix or Windows? And are the files still there after restarting Solr? Best Erick On Fri, Jun 14, 2013 at 10:54 AM, Pravin Bhutada wrote: > One thing that you can try is optimize incrementally. Instead of optimizing > to 1 segment, optimize to 100, then 50, 25, 10, 5, 2, 1 > After each step, the
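The incremental approach quoted above could be scripted along these lines. This is a sketch that only builds the update URLs, one per step, using the optimize and maxSegments parameters of Solr's update handler. The base URL is an assumption, and the quoted message is cut off before it says what to do after each step, so that part is left out.

```python
from urllib.parse import urlencode

# Segment counts from the quoted suggestion, largest first
SEGMENT_STEPS = [100, 50, 25, 10, 5, 2, 1]


def optimize_urls(base_url, steps=SEGMENT_STEPS):
    """Build one optimize request URL per target segment count."""
    return [
        f"{base_url}/update?" + urlencode({"optimize": "true", "maxSegments": n})
        for n in steps
    ]


# Hypothetical core location; adjust to the actual deployment
urls = optimize_urls("http://localhost:8983/solr")
```

Each URL would be requested in order, waiting for the previous optimize to finish before issuing the next.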

Re: Solr using a ridiculous amount of memory

2013-06-16 Thread Erick Erickson
John: If you'd like to add your experience to the Wiki, create an ID and let us know what it is and we'll add you to the contributors list. Unfortunately we had problems with spam pages so we added this step. Make sure you include your logon in the request. Thanks, Erick On Fri, Jun 14, 2013 at

Re: in Solr 3.5, optimization increase the index size to double

2013-06-16 Thread Erick Erickson
Optimizing will _temporarily_ double the index size, but it shouldn't be permanent. Is it possible that you have inadvertently told Solr to keep an extra snapshot? I think it's "numberToKeep" in your replication handler, but I'm going from memory here. Best Erick On Fri, Jun 14, 2013 at 2:15 AM,

Re: Solr large boolean filter

2013-06-16 Thread Mikhail Khludnev
Right. FieldCacheTermsFilter is an option. You need to create your own QParserPlugin which yields a FieldCacheTermsFilter and hook it in as ..&fq={!idsqp cache=false}&.. Mind disabling caching! Mind term encoding due to field type! I also suggest checking how much time it spends on tokenization. Once a day I've got
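On the client side, a request using such a parser might be assembled like this. This is a sketch only: "idsqp" is the custom parser name from the message and does not ship with Solr, and passing the IDs as a comma-separated list is an assumption, since the custom QParserPlugin decides how its input is read.

```python
from urllib.parse import urlencode, parse_qs


def ids_filter(ids, parser="idsqp"):
    """Build an fq value with local params that disable filter caching."""
    # Comma-separated IDs are an assumed format for the hypothetical parser
    return "{!%s cache=false}%s" % (parser, ",".join(ids))


fq = ids_filter(["doc1", "doc2"])
query_string = urlencode({"q": "*:*", "fq": fq})

# Decoding the query string gives back the original fq value
decoded_fq = parse_qs(query_string)["fq"][0]
```

The cache=false local param keeps a one-off list of IDs from filling the filter cache, which is the point of the "Mind disabling caching!" remark.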

Re: New operator.

2013-06-16 Thread Mikhail Khludnev
Hello Yanis, Two options. 1. Create your own SearchComponent, which adds a filter query to the request, and add it to the SearchHandler. http://wiki.apache.org/solr/SearchComponent 2. Create a QParserPlugin and call it via the request param ...&fq={!yanisqp}applyvector&... http://wiki.apache.org/solr/SolrPlugins#Q