Yo. That is the truth. You can get stuff indexed with an automatic schema, but
if you want to make your customers happy, tune it.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Jan 22, 2016, at 6:22 PM, Erick Erickson wrote:
>
> And, more genera
And, more generally, schemaless makes a series of assumptions, any
of which may be wrong.
You _must_ hand-tweak your schema to squeeze all the performance out of Solr
that you can. If your collection isn't big enough that you need to squeeze,
don't bother.
FWIW,
Erick
On Fri, Jan 22, 2016 at
It boils down to whether the response rate when you query a single
shard is "acceptable", plus some overhead for sharding.
So, if you need 100QPS and all you can get after tuning on a single
shard (which you can test with &distrib=false)
is 10QPS, you need 10 replicas.
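
The arithmetic above can be written down as a small helper. This is only a sketch: the function name `replicas_needed` and the optional `overhead` fraction are illustrative assumptions, not part of Solr, and the single-replica QPS figure has to come from your own measurements (for example with &distrib=false).

```python
import math

def replicas_needed(target_qps, single_replica_qps, overhead=0.0):
    """Estimate how many replicas are needed to reach target_qps.

    single_replica_qps is the measured throughput of one tuned replica.
    overhead is a hypothetical fraction (0 <= overhead < 1) of that
    throughput assumed to be lost to distributed-search overhead.
    """
    if single_replica_qps <= 0:
        raise ValueError("single_replica_qps must be positive")
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    effective_qps = single_replica_qps * (1.0 - overhead)
    return math.ceil(target_qps / effective_qps)

# The example from the message: 100 QPS needed, 10 QPS per replica.
print(replicas_needed(100, 10))
```

Rounding up matters here: a fractional result still means one more whole replica.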
But if a single shard can o
Concrete details are crucial -- what exactly are you trying, what results
are you getting, how do those results differ from what you expect?
https://wiki.apache.org/solr/UsingMailingLists
Normally, even when someone only gives a small subset of the crucial
details needed to answer thei
Thanks guys for all the responses.
True. What I wanted to convey is 2 shards with 4 replicas.
>> use more shards if the query latency is too high.
Shouldn't we go for more replicas if query latency is too high? You can go for
more shards if you have a large number of documents to index and at a much fr
I agree, sharding may hurt more than it helps. And estimate the text size after
the documents are processed.
We all love Solr Cloud, but this could be a good application for traditional
master/slave Solr. That means no Zookeeper nodes and it is really easy to add a
new query slave, just clone t
"1 Leader & 3 Replicas"
SolrCloud does not distinguish leaders from replicas - that's old
master-slave terminology. The leader is just one of the replicas.
So, are you really talking about 2 shards with 4 replicas each or 2 shards
with 2 replicas each?
Putting multiple replica instances on each
From my experiments, it looks like SearchComponent does not handle negative fq
correctly.
Does anybody have such experience?
--
View this message in context:
http://lucene.472066.n3.nabble.com/SearchComponent-does-not-handle-negative-fq-tp4252688.html
Sent from the Solr - User mailing list a
Aswath Srinivasan (TMS) wrote:
> * About 2.5 million documents in total to be indexed
> * Average document size is 512 KB - PDFs and HTML files
> This being said I was thinking I would take the Solr to production with,
> * 2 shards, 1 Leader & 3 Replicas
> Do you all think th
If below is the situation,
* 4 Virtual machines with 64 GB RAM - 64bit machines, 512 GB storage
for each VM
* About 2.5 million documents in total to be indexed
* Average document size is 512 KB - PDFs and HTML files
* Expected QPS is 150
* Incremental ind
mileage can vary
http://blog.griddynamics.com/2015/07/how-to-import-structured-data-into-solr.html
On Fri, Jan 22, 2016 at 8:29 PM, Brian Narsi wrote:
> What are the various ways DataImportHandler can be scaled?
>
> Thanks
>
--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dy
See the old docs at
https://wiki.apache.org/solr/SpellCheckComponent#Configuration
In particular, you need this line in solrconfig.xml:
<str name="spellcheckIndexDir">./spellchecker</str>
James Dyer
Ingram Content Group
-----Original Message-----
From: Nitin Solanki [mailto:nitinml...@gmail.com]
Sent: Friday, January 22, 2
Hi,
We have a requirement to pre-encrypt an index we are building before it
hits disk. We are doing this by using a wrapper around MMapDirectory that
wraps the input/output streams (I know the general recommendation is to
encrypt the filesystem instead but this option was explicitly rejected by
ou
On 1/22/2016 10:29 AM, Brian Narsi wrote:
What are the various ways DataImportHandler can be scaled?
I'm not very familiar with how DIH interacts with SolrCloud. I know you
can use it with SolrCloud, but nothing else. Assuming you're not
running SolrCloud, the following information will app
Yes, and also underflow in the case of double/float.
--
Steve
www.lucidworks.com
> On Jan 22, 2016, at 12:25 PM, Shyam R wrote:
>
> I think, schema-less mode might allocate double instead of float, long
> instead of int to guard against overflow, which increases index size. Is my
> assumption v
I have a SolrCloud v5.4 collection with 3 replicas that appear to have fallen
permanently out of sync. Users started to complain that the same search,
executed twice, sometimes returned different result counts. Sure enough, our
replicas are not identical:
>> shard1_replica1: 89867 documents
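
One way to gather per-replica counts like the one above is to query each core directly with distrib=false, so the request is answered by that core alone. The following is a minimal sketch that only builds the URLs; the host names and core names are placeholders, and `replica_count_url` is an illustrative helper, not a Solr API.

```python
from urllib.parse import urlencode

def replica_count_url(base_url, core):
    """Build a select URL that counts documents on one core only.

    distrib=false keeps Solr from fanning the query out to other
    replicas; rows=0 returns just the numFound count.
    """
    params = urlencode({
        "q": "*:*",
        "rows": 0,
        "distrib": "false",
        "wt": "json",
    })
    return f"{base_url.rstrip('/')}/{core}/select?{params}"

# Hypothetical hosts and core names, for illustration only.
replicas = [
    ("http://host1:8983/solr", "mycollection_shard1_replica1"),
    ("http://host2:8983/solr", "mycollection_shard1_replica2"),
    ("http://host3:8983/solr", "mycollection_shard1_replica3"),
]
for base, core in replicas:
    print(replica_count_url(base, core))
```

Comparing the numFound values returned by these URLs shows which replica has diverged.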
What are the various ways DataImportHandler can be scaled?
Thanks
I think, schema-less mode might allocate double instead of float, long
instead of int to guard against overflow, which increases index size. Is my
assumption valid?
Thanks
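
The size trade-off in the question can be illustrated outside Solr with Python's `struct` module. This sketch only shows raw value widths and what overflow and underflow look like for 32-bit types; it says nothing about how Solr actually encodes fields on disk, and the helper names are illustrative.

```python
import struct

def fits_int32(n):
    """True if n is representable as a signed 32-bit integer."""
    return -2**31 <= n <= 2**31 - 1

def as_float32(x):
    """Round-trip x through a 32-bit float to show precision loss."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Raw widths in bytes: the 64-bit types are twice as wide.
print(struct.calcsize("<i"), struct.calcsize("<q"))  # int vs. long
print(struct.calcsize("<f"), struct.calcsize("<d"))  # float vs. double

# Overflow: this value needs a 64-bit long.
print(fits_int32(2**31))

# Underflow: a tiny double collapses to 0.0 as a 32-bit float.
print(as_float32(1e-50))
```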
On Thu, Jan 21, 2016 at 10:48 PM, Erick Erickson
wrote:
> I guess it's all about whether schemaless really supports
> 1>
OK, but IndexBasedSpellChecker needs a directory where all the indexes are
stored to do the spell check. I don't know much about
IndexBasedSpellChecker. If you could send me a snapshot of its configuration,
that would help me. Thanks
On Fri, Jan 22, 2016 at 1:45 AM Dyer, James
wrote:
> But if you really need mor
To be clear, having separate Solr servers on different versions should
definitely not be a problem. The only potential difficulty here is the
SolrJ vs. server back-compat issue.
-- Jack Krupansky
On Fri, Jan 22, 2016 at 10:57 AM,
wrote:
> Shawn wrote:
> >
> > If you are NOT running SolrCloud, t
On 1/22/2016 8:57 AM, jimi.hulleg...@svensktnaringsliv.se wrote:
> When you talk about not mixing 4.x and 5.x when using SolrCloud, you mean
> between the client and the server that talk to each other, right? Or would it
> be a problem keeping our existing non cloud solr 4.x server, upgrading the
Oh, one more thing. Would this setup still be possible if we would want to have
the new 5.x solr server be the solr cloud version? I'm not saying that
SolrCloud is a requirement for us (it might even not be suitable, since our
index is not that large), but still would be good to know.
/Jimi
--
OK, so just to be clear. As far as you know, and from your point of view, you
would consider it a better solution to stick with the 4.6 solrj client jar for
both the 4.6 and 5.x communication, rather than switching the 4.6 solrj client
jar to the 5.x version and hoping that the CMS solr-specific
On 1/22/2016 8:37 AM, Jack Krupansky wrote:
> The doc is silent on this issue of SolrJ vs. server version compatibility
> in general (e.g., 4 vs. 5.) That's not an absolute assurance, but at least
> it's a possibility. And as far as I know, if you had a SolrJ 4 app and
> upgraded the server (with
Shawn wrote:
>
> If you are NOT running SolrCloud, then that should work with no problem.
> The HTTP API is fairly static and has not seen any major upheaval recently.
> If you're NOT running SolrCloud, you may even be able to replace the
> SolrJ jar in your existing system with the 5.4.1 versio
Personally, I think the Solr project should endeavor to commit to
guaranteeing that a SolrJ x.y client will be compatible with a Solr x+1.y2
server. AFAICT there currently isn't such a formal compat commitment
or promise, but also AFAIK there is no known non-compat issue between SolrJ
4.y and
Yeah, sort of. Solr isn't bundled in the CMS, it is in a separate Tomcat
instance. But our code is running on the same Tomcat as the CMS, and the CMS
uses solrj 4.x to talk with its solr. And now we want to be able to talk with
our own separate solr, running solr 5.x, and would prefer to use sol
The doc is silent on this issue of SolrJ vs. server version compatibility
in general (e.g., 4 vs. 5.) That's not an absolute assurance, but at least
it's a possibility. And as far as I know, if you had a SolrJ 4 app and
upgraded the server (with no change in the index or data model), the app
shoul
On 1/22/2016 1:14 AM, Midas A wrote:
> Can anybody please tell me what these requests are doing? Is it an
> application-generated error or part of the Solr master-slave setup?
>
>
>
> b)
> 10.20.73.169 - - [22/Jan/2016:08:07:38 +] "POST
> /solr/shopclue_prod/select HTTP/1.1" 200 7002
This appears to be the
On 1/21/2016 11:57 PM, jimi.hulleg...@svensktnaringsliv.se wrote:
> Long story short, we use a CMS that is integrated with Solr 4.6, with the
> solrj jar file in the global/common Tomcat classpath. We currently use a
> Google Search Appliance machine for our own freetext search needs, but plan
>
Hi,
This morning one of the 2 nodes of our SolrCloud went down. I've tried
many ways to recover it but to no avail. I've tried to unload all cores
on the failed node and reload it after emptying the data directory,
hoping it would sync from scratch. The core is still marked as down and
no data is
Just to be clear, are you talking about a single app that does SolrJ calls
to both your CMS and your free text search index? So, one Java app that is
simultaneously sending requests to two Solr instances (one 4, one 5)?
-- Jack Krupansky
On Fri, Jan 22, 2016 at 1:57 AM,
wrote:
> Hi,
>
> Long s
Hi,
I was wondering if txn logs obey any log rotation setup rules. Sometimes
indexing can get pretty large and txn logs grow up to tens of
gigabytes (occupying disk which eventually needs to be cleaned up) or as
indexing is progressing and a commit had been made, I want to delete old
txn log to save
Hi Vidya, if I understood your question correctly you can simply use the
original collection name(s) to point to individual collections. Isn't that
the case?
Thanks,
Susheel
On Fri, Jan 22, 2016 at 8:10 AM, vidya wrote:
> Hi
>
> I wanted to maintain two sets of indexes or collections for maint
Hi
I wanted to maintain two sets of indexes or collections for maintaining my
large input data for indexing, for which I found collection aliasing is
helpful. I have created an alias for 2 collections, but my problem is: how can
I point my alias to 2 different collections at 2 different times?
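
For context, the Collections API CREATEALIAS action both creates an alias and, when called again with the same name, replaces what it points to. The sketch below only builds the request URLs; the base URL, alias name, and collection names are placeholders, and `create_alias_url` is an illustrative helper rather than anything provided by Solr.

```python
from urllib.parse import urlencode

def create_alias_url(solr_base, alias, collections):
    """Build a Collections API CREATEALIAS request URL.

    Calling CREATEALIAS again with the same alias name re-points
    the alias at the newly listed collections.
    """
    params = urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    })
    return f"{solr_base.rstrip('/')}/admin/collections?{params}"

# Hypothetical names: point the alias at one collection now,
# then at another one later by issuing the second request.
base = "http://localhost:8983/solr"
print(create_alias_url(base, "current_data", ["data_set_a"]))
print(create_alias_url(base, "current_data", ["data_set_b"]))
```

Clients keep querying the alias name, so switching the target requires no change on their side.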
Tha
Yes, this is a common error I've seen in the past even with MongoDB: keeping
all the replicas on the same box and on the same storage device. Even with
virtualization I always suggest having at least the disks on different,
distinct SANs. VMs usually run on vSphere or Hyper-V with SCVMM, so they can
to
Hello Binoy ,
I found that if I am using a StringField and index it using Java
code/solr-admin, it adds a \ before " ,
i.e. let's say I have string ==> test " , then it gets indexed as test \".
For all other special chars it does not do anything , so the trick which
worked for me is
while searchin
There is another reason to avoid virtualization - fault tolerance. It is
common to use virtualization on a huge box and keep the replicas on the same
box. Such a setup will survive a VM failure but not a HW failure.
Regards,
Emir
On 22.01.2016 11:05, Gian Maria Ricci - aka Alkampfer wrote:
Thanks, my actua
Hi Irshad,
So, assuming that each vendor's information is one Solr document, you will
have information regarding the vendor's open-close hours, correct? You should
be indexing this content in one of the fields, right? If yes, then you
should try something as explained:
When the user searches, *capt
Hi ,
Thanks, Prateek, for your reply.
My question is: I have multiple opening and closing hours within the same day.
How do I manage the index and search query to get all open vendors first, then
closed ones? I don't think the URL below will solve my problem.
https://wiki.apache.org/solr/SpatialForTimeDurations
please sug
Thanks,
It is clear that a test is strongly dependent on your data, hardware, etc. My
question was a little more general because I've read in some articles on the
internet and in the book "Apache Solr Enterprise Search Server" that virtualization
should be avoided. Since this was a general sugg
Thanks, my current strategy is to use SolrMeter to test with the real virtualized
hardware and a real result set to get some numbers. The customer definitely
wants virtualization, and we will probably not test on a bare-metal
installation.
As I stated in my previous mail, the question arises because in some
I think this is what you are looking for:
https://wiki.apache.org/solr/QueryElevationComponent
Regards,
Prateek Jain
Team: Totoro
Mobile: +353 894 391716
-----Original Message-----
From: irshad siddiqui [mailto:irshad.s...@gmail.com]
Sent: 22 January 2016 07:32 AM
To: solr-user@lucene
I am continuously getting the following error on one of my Solr slaves:
a) null:org.eclipse.jetty.io.EofException
Can anybody please tell me what these requests are doing? Is it an
application-generated error or part of the Solr master-slave setup?
b)
10.20.73.169 - - [22/Jan/2016:08:07:38 +] "POST
/solr/shopclue_prod/select HTTP/1.1" 200 7002
10.20.73.164 - - [22/Jan/2016:08:07:38 +] "POST
/solr/shopclue_prod