Re: schemaless vs schema based core

2016-01-22 Thread Walter Underwood
Yo. That is the truth. You can get stuff indexed with an automatic schema, but if you want to make your customers happy, tune it. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Jan 22, 2016, at 6:22 PM, Erick Erickson wrote: > > And, more genera

Re: schemaless vs schema based core

2016-01-22 Thread Erick Erickson
And, more generally, schemaless makes a series of assumptions, any of which may be wrong. You _must_ hand-tweak your schema to squeeze all the performance out of Solr that you can. If your collection isn't big enough that you need to squeeze, don't bother. FWIW, Erick On Fri, Jan 22, 2016 at

Re: Taking Solr to production

2016-01-22 Thread Erick Erickson
It boils down to whether the response rate when you query a single shard is "acceptable", plus some overhead for sharding. So, if you need 100QPS and all you can get after tuning on a single shard (which you can test with &distrib=false) is 10QPS, you need 10 replicas. But if a single shard can o
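The sizing rule in the message above can be sketched as simple arithmetic. This is only a rough sketch: it assumes throughput scales linearly with the number of replicas and ignores the sharding overhead the message mentions. The function name `replicas_needed` is hypothetical, not part of Solr.

```python
import math


def replicas_needed(target_qps: float, single_replica_qps: float) -> int:
    """Estimate how many replicas are needed to reach target_qps.

    Assumes (hypothetically) that query throughput scales linearly
    with replica count; measure single_replica_qps after tuning,
    e.g. by querying one shard with &distrib=false.
    """
    if single_replica_qps <= 0:
        raise ValueError("single_replica_qps must be positive")
    # Round up: a fractional replica still means one more machine.
    return math.ceil(target_qps / single_replica_qps)


# The example from the thread: 100 QPS needed, 10 QPS measured per replica.
print(replicas_needed(100, 10))
```

In practice the measured per-replica rate should come from a load test against real data, so the result is a starting estimate rather than a guarantee.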

Re: SearchComponent does not handle negative fq ???

2016-01-22 Thread Chris Hostetter
Concrete details are crucial -- what exactly are you trying, what results are you getting, how do those results differ from what you expect? https://wiki.apache.org/solr/UsingMailingLists Normally, even when someone only gives a small subset of the crucial details needed to answer thei

RE: Taking Solr to production

2016-01-22 Thread Aswath Srinivasan (TMS)
Thanks guys for all the responses. True. What I wanted to convey is 2 shards with 4 replicas. >> use more shards if the query latency is too high. Shouldn't we go for more replicas if query latency is too high? You can go for more shard if you have number of indexing documents and at a much fr

Re: Taking Solr to production

2016-01-22 Thread Walter Underwood
I agree, sharding may hurt more than it helps. And estimate the text size after the documents are processed. We all love Solr Cloud, but this could be a good application for traditional master/slave Solr. That means no Zookeeper nodes and it is really easy to add a new query slave, just clone t

Re: Taking Solr to production

2016-01-22 Thread Jack Krupansky
"1 Leader & 3 Replicas" SolrCloud does not distinguish leaders from replicas - that's old master-slave terminology. The leader is just one of the replicas. So, are you really talking about 2 shards with 4 replicas each or 2 shards with 2 replicas each? Putting multiple replica instances on each

SearchComponent does not handle negative fq ???

2016-01-22 Thread vitaly bulgakov
From my experiments, it looks like SearchComponent does not handle negative fq correctly. Does anybody have such experience? -- View this message in context: http://lucene.472066.n3.nabble.com/SearchComponent-does-not-handle-negative-fq-tp4252688.html Sent from the Solr - User mailing list a

Re: Taking Solr to production

2016-01-22 Thread Toke Eskildsen
Aswath Srinivasan (TMS) wrote: > * Totally about 2.5 million documents to be indexed > * Documents average size is 512 KB - pdfs and htmls > This being said I was thinking I would take the Solr to production with, > * 2 shards, 1 Leader & 3 Replicas > Do you all think th

Taking Solr to production

2016-01-22 Thread Aswath Srinivasan (TMS)
If below is the situation, * 4 Virtual machines with 64 GB RAM - 64bit machines, 512 GB storage for each VM * Totally about 2.5 million documents to be indexed * Documents average size is 512 KB - pdfs and htmls * Expected QPS is 150 * Incremental ind

Re: Scaling DataImportHandler

2016-01-22 Thread Mikhail Khludnev
Mileage can vary: http://blog.griddynamics.com/2015/07/how-to-import-structured-data-into-solr.html On Fri, Jan 22, 2016 at 8:29 PM, Brian Narsi wrote: > What are the various ways DataImportHandler can be scaled? > > Thanks > -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dy

RE: How get around solr's spellcheck maxEdit limit of 2?

2016-01-22 Thread Dyer, James
See the old docs at https://wiki.apache.org/solr/SpellCheckComponent#Configuration In particular, you need this line in solrconfig.xml: ./spellchecker James Dyer Ingram Content Group -Original Message- From: Nitin Solanki [mailto:nitinml...@gmail.com] Sent: Friday, January 22, 2

Corrupted Index with custom directory

2016-01-22 Thread Geoff Cooney
Hi, We have a requirement to pre-encrypt an index we are building before it hits disk. We are doing this by using a wrapper around MMapDirectory that wraps the input/output streams (I know the general recommendation is to encrypt the filesystem instead but this option was explicitly rejected by ou

Re: Scaling DataImportHandler

2016-01-22 Thread Shawn Heisey
On 1/22/2016 10:29 AM, Brian Narsi wrote: What are the various ways DataImportHandler can be scaled? I'm not very familiar with how DIH interacts with SolrCloud. I know you can use it with SolrCloud, but nothing else. Assuming you're not running SolrCloud, the following information will app

Re: schemaless vs schema based core

2016-01-22 Thread Steve Rowe
Yes, and also underflow in the case of double/float. -- Steve www.lucidworks.com > On Jan 22, 2016, at 12:25 PM, Shyam R wrote: > > I think, schema-less mode might allocate double instead of float, long > instead of int to guard against overflow, which increases index size. Is my > assumption v

SolrCloud replicas out of sync

2016-01-22 Thread David Smith
I have a SolrCloud v5.4 collection with 3 replicas that appear to have fallen permanently out of sync. Users started to complain that the same search, executed twice, sometimes returned different result counts. Sure enough, our replicas are not identical: >> shard1_replica1: 89867 documents

Scaling DataImportHandler

2016-01-22 Thread Brian Narsi
What are the various ways DataImportHandler can be scaled? Thanks

Re: schemaless vs schema based core

2016-01-22 Thread Shyam R
I think, schema-less mode might allocate double instead of float, long instead of int to guard against overflow, which increases index size. Is my assumption valid? Thanks On Thu, Jan 21, 2016 at 10:48 PM, Erick Erickson wrote: > I guess it's all about whether schemaless really supports > 1>

Re: How get around solr's spellcheck maxEdit limit of 2?

2016-01-22 Thread Nitin Solanki
OK, but IndexBasedSpellChecker needs a directory where all indexes are stored to do spell checking. I don't know anything about IndexBasedSpellChecker. If you could send me a sample configuration for it, that would help me. Thanks On Fri, Jan 22, 2016 at 1:45 AM Dyer, James wrote: > But if you really need mor

Re: Mix Solr 4 and 5?

2016-01-22 Thread Jack Krupansky
To be clear, having separate Solr servers on different versions should definitely not be a problem. The only potential difficulty here is the SolrJ vs. server back-compat issue. -- Jack Krupansky On Fri, Jan 22, 2016 at 10:57 AM, wrote: > Shawn wrote: > > > > If you are NOT running SolrCloud, t

Re: Mix Solr 4 and 5?

2016-01-22 Thread Shawn Heisey
On 1/22/2016 8:57 AM, jimi.hulleg...@svensktnaringsliv.se wrote: > When you talk about not mixing 4.x and 5.x when using SolrCloud, you mean > between the client and the server that talk to each other, right? Or would it > be a problem keeping our existing non cloud solr 4.x server, upgrading the

RE: Mix Solr 4 and 5?

2016-01-22 Thread jimi.hullegard
Oh, one more thing. Would this setup still be possible if we would want to have the new 5.x solr server be the solr cloud version? I'm not saying that SolrCloud is a requirement for us (it might even not be suitable, since our index is not that large), but still would be good to know. /Jimi --

RE: Mix Solr 4 and 5?

2016-01-22 Thread jimi.hullegard
OK, so just to be clear. As far as you know, and from your point of view, you would consider it a better solution to stick with the 4.6 solrj client jar for both the 4.6 and 5.x communication, rather than switching the 4.6 solrj client jar to the 5.x version and hoping that the CMS solr-specific

Re: Mix Solr 4 and 5?

2016-01-22 Thread Shawn Heisey
On 1/22/2016 8:37 AM, Jack Krupansky wrote: > The doc is silent on this issue of SolrJ vs. server version compatibility > in general (e.g., 4 vs. 5.) That's not an absolute assurance, but at least > it's a possibility. And as far as I know, if you had a SolrJ 4 app and > upgraded the server (with

RE: Mix Solr 4 and 5?

2016-01-22 Thread jimi.hullegard
Shawn wrote: > > If you are NOT running SolrCloud, then that should work with no problem. > The HTTP API is fairly static and has not seen any major upheaval recently. > If you're NOT running SolrCloud, you may even be able to replace the > SolrJ jar in your existing system with the 5.4.1 versio

Re: Mix Solr 4 and 5?

2016-01-22 Thread Jack Krupansky
Personally, I think the Solr project should endeavor to commit to guaranteeing that a SolrJ x.y client will be compatible with a Solr x+1.y2 server. AFAICT there currently isn't such a formal compat commitment or promise, but also AFAIK there is no known non-compat issue between SolrJ 4.y and

RE: Mix Solr 4 and 5?

2016-01-22 Thread jimi.hullegard
Yeah, sort of. Solr isn't bundled in the CMS, it is in a separate Tomcat instance. But our code is running on the same Tomcat as the CMS, and the CMS uses solrj 4.x to talk with its solr. And now we want to be able to talk with our own separate solr, running solr 5.x, and would prefer to use sol

Re: Mix Solr 4 and 5?

2016-01-22 Thread Jack Krupansky
The doc is silent on this issue of SolrJ vs. server version compatibility in general (e.g., 4 vs. 5.) That's not an absolute assurance, but at least it's a possibility. And as far as I know, if you had a SolrJ 4 app and upgraded the server (with no change in the index or data model), the app shoul

Re: POST request on slave server & error (Urgent )

2016-01-22 Thread Shawn Heisey
On 1/22/2016 1:14 AM, Midas A wrote: > Please anybody tell me what these request are doing . Is it application > generated error or part of solr master -slave? > > > > b) > 10.20.73.169 - - [22/Jan/2016:08:07:38 +] "POST > /solr/shopclue_prod/select HTTP/1.1" 200 7002 This appears to be the

Re: Mix Solr 4 and 5?

2016-01-22 Thread Shawn Heisey
On 1/21/2016 11:57 PM, jimi.hulleg...@svensktnaringsliv.se wrote: > Long story short, we use a CMS that is integrated with Solr 4.6, with the > solrj jar file in the global/common Tomcat classpath. We currently use a > Google Search Appliance machine for our own freetext search needs, but plan >

More problems (now jetty errors) with SolrCloud

2016-01-22 Thread John Smith
Hi, This morning one of the 2 nodes of our SolrCloud went down. I've tried many ways to recover it but to no avail. I've tried to unload all cores on the failed node and reload it after emptying the data directory, hoping it would sync from scratch. The core is still marked as down and no data is

Re: Mix Solr 4 and 5?

2016-01-22 Thread Jack Krupansky
Just to be clear, are you talking about a single app that does SolrJ calls to both your CMS and your free text search index? So, one Java app that is simultaneously sending requests to two Solr instances (one 4, one 5)? -- Jack Krupansky On Fri, Jan 22, 2016 at 1:57 AM, wrote: > Hi, > > Long s

Transaction Log rotation /retention setup

2016-01-22 Thread KNitin
Hi, I was wondering if txn logs obey any log rotation setup rules. Sometimes indexing can get pretty large and txn logs grow up to tens of gigabytes (occupying disk which eventually needs to be cleaned up) or as indexing is progressing and a commit had been made, I want to delete old txn log to save

Re: collection aliasing

2016-01-22 Thread Susheel Kumar
Hi Vidya, if I understood your question correctly, you can simply use the original collection name(s) to point to individual collections. Isn't that the case? Thanks, Susheel On Fri, Jan 22, 2016 at 8:10 AM, vidya wrote: > Hi > > I wanted to maintain two sets of indexes or collections for maint

collection aliasing

2016-01-22 Thread vidya
Hi, I wanted to maintain two sets of indexes or collections for maintaining my large input data for indexing, for which I found collection aliasing is helpful. I have created an alias for 2 collections, but my problem is: how can I point my alias to 2 different collections at 2 different times? Tha
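One way to approach the question above is the Collections API `CREATEALIAS` action; as far as I know, issuing it again with the same alias name replaces the alias, which re-points it. The sketch below only builds the request URL and sends nothing. The host, alias name, and collection names are hypothetical placeholders, and the helper `create_alias_url` is not part of any Solr client library.

```python
from typing import List
from urllib.parse import urlencode


def create_alias_url(base_url: str, alias: str, collections: List[str]) -> str:
    """Build a Collections API CREATEALIAS URL (nothing is sent).

    base_url, alias and collections are caller-supplied; the values
    used below are hypothetical examples only.
    """
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        # Solr expects a comma-separated list of collection names.
        "collections": ",".join(collections),
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Point the alias at one collection now ...
print(create_alias_url("http://localhost:8983/solr", "current", ["collection_a"]))
# ... and later re-point the same alias at the other one.
print(create_alias_url("http://localhost:8983/solr", "current", ["collection_b"]))
```

Clients that always query the alias would then not need to change when the underlying collection is swapped.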

RE: Couple of question about Virtualization and Load Balancer

2016-01-22 Thread Gian Maria Ricci - aka Alkampfer
Yes, this is a common error I've seen in the past even with MongoDB, keeping all the replicas on the same box and on the same storage device. Even with virtualization I always suggest having at least disks on different and distinct SAN. VM usually runs on vSphere or Hyper-V with SCVMM so they can to

Re: How to achieve exact string match query which includes spaces and quotes

2016-01-22 Thread Alok Bhandari
Hello Binoy, I found that if I use a StringField and index it using Java code/solr-admin, it adds a \ before ", i.e. let's say I have the string ==> test ", then it gets indexed as test \". For all other special chars it does not do anything, so the trick which worked for me is while searchin
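The escaping described above can be sketched on the query side as follows. This is a minimal sketch, assuming the value is wrapped in a quoted phrase query where a backslash escapes an embedded double quote; the helper name `exact_match_query` and the field name `title_s` are hypothetical.

```python
def exact_match_query(field: str, value: str) -> str:
    """Build a quoted query for an exact string match.

    Backslashes are escaped first so that the backslashes added
    for double quotes are not escaped a second time.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# The string from the message: test "
print(exact_match_query("title_s", 'test "'))
```

Spaces need no extra handling here because the whole value sits inside the quoted phrase.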

Re: Couple of question about Virtualization and Load Balancer

2016-01-22 Thread Emir Arnautovic
There is another reason to avoid virtualization - fault tolerance. It is common to use virtualization on a huge box and keep replicas on the same box. Such a setup will survive VM failure but not HW failure. Regards, Emir On 22.01.2016 11:05, Gian Maria Ricci - aka Alkampfer wrote: Thanks, my actua

Re: Close and open vendor sorting

2016-01-22 Thread davidphilip cherian
Hi Irshad, So, assuming that each vendor's information is one Solr document, you will have information regarding the vendor's open/close hours, correct? You should be indexing this content in one of the fields, shouldn't you? If yes, then you should try something as explained: When the user searches, *capt

Re: Close and open vendor sorting

2016-01-22 Thread irshad siddiqui
Hi, Thanks Prateek for your reply. My question is: I have multiple opening and closing hours within the same day. How do I manage the index and search query to get all open vendors first, then closed ones? I don't think the URL below will solve my problem. https://wiki.apache.org/solr/SpatialForTimeDurations Please sug

RE: Couple of question about Virtualization and Load Balancer

2016-01-22 Thread Gian Maria Ricci - aka Alkampfer
Thanks, It is clear that a test is strongly dependent on your data / hardware etc. My question was a little bit more general because I've read in some articles on the internet and in the book "Apache Solr Enterprise Search Server" that virtualization should be avoided. Since this was a general sugg

RE: Couple of question about Virtualization and Load Balancer

2016-01-22 Thread Gian Maria Ricci - aka Alkampfer
Thanks, my actual strategy is using SolrMeter to test with real virtualized hardware and a real result set to gather some numbers. The customer definitely wants virtualization, and probably we will not test on a bare metal installation. As I stated in my previous mail, the question arises because in some

RE: Close and open vendor sorting

2016-01-22 Thread Prateek Jain J
I think this is what you are looking for: https://wiki.apache.org/solr/QueryElevationComponent Regards, Prateek Jain Team: Totoro Mobile: +353 894 391716 -Original Message- From: irshad siddiqui [mailto:irshad.s...@gmail.com] Sent: 22 January 2016 07:32 AM To: solr-user@lucene

jetty error

2016-01-22 Thread Midas A
I am continuously getting the following error on one of my Solr slaves: a) null:org.eclipse.jetty.io.EofException

POST request on slave server & error (Urgent )

2016-01-22 Thread Midas A
Can anybody please tell me what these requests are doing? Is it an application-generated error or part of Solr master-slave? b) 10.20.73.169 - - [22/Jan/2016:08:07:38 +] "POST /solr/shopclue_prod/select HTTP/1.1" 200 7002 10.20.73.164 - - [22/Jan/2016:08:07:38 +] "POST /solr/shopclue_prod