Thanks for your explanation.
Off the top of your head, are there any other options which prevent
getting a cursorMark?
Yes, setting up a separate request handler for harvesting, without
timeAllowed, was also my idea.
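A sketch of such a handler in solrconfig.xml (the name and defaults are
illustrative, not from a real setup):

  <requestHandler name="/harvest" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
    </lst>
    <!-- deliberately no timeAllowed invariant, so cursorMark keeps working -->
  </requestHandler>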
As Shawn suggested, a short note about this should go into the documentation.
R
Hi,
I am very new to Solr, and would appreciate some guidance if anyone has the
time to offer it.
We have very recently upgraded from Solr 4.1 to 5.2.1, and at the same time
increased the physical RAM from 24GB to 96GB. We run multiple cores on this one
server, approximately 20 in total, but
We need to work out why your performance is bad without an optimise. What
version of Solr are you using? Can you confirm that your config is using
the TieredMergePolicy?
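For reference, the merge policy is set inside the <indexConfig> section of
solrconfig.xml; a typical TieredMergePolicy setup looks roughly like this
(the values shown are the defaults):

  <indexConfig>
    <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
      <int name="maxMergeAtOnce">10</int>
      <int name="segmentsPerTier">10</int>
    </mergePolicy>
  </indexConfig>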
Upayavira
On Jun 30, 2015, at 04:48 AM, Summer Shire wrote:
> Hi Upayavira and Erick,
>
> There are two things we are talking a
Hi Upayavira and Erick,
There are two things we are talking about here.
First: Why am I optimizing? If I don't, our SEARCH (NOT INDEXING) performance is
100% worse.
The problem lies in the number of total segments. We have to have max segments
1 or 2.
I have done intensive performance related
I see what you mean. Many thanks for the details.
-Original Message-
From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
Sent: Monday, June 29, 2015 6:36 PM
To: solr-user@lucene.apache.org
Subject: Re: optimize status
Reitzel, Charles wrote:
> Question, Toke: in your "immutable"
The regex replace processor can be used to do this:
https://lucene.apache.org/solr/5_2_0/solr-core/org/apache/solr/update/processor/RegexReplaceProcessorFactory.html
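A minimal chain as a sketch (the chain name, field name, and pattern are just
placeholders for the abbreviation case discussed here):

  <updateRequestProcessorChain name="expand-abbreviations">
    <processor class="solr.RegexReplaceProcessorFactory">
      <str name="fieldName">content</str>
      <str name="pattern">\bcst\.</str>
      <str name="replacement">customer</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>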
-- Jack Krupansky
On Mon, Jun 29, 2015 at 6:20 PM, Walter Underwood
wrote:
> Yes, do this in an update request processor before
Reitzel, Charles wrote:
> Question, Toke: in your "immutable" cases, don't the benefits of
> optimizing come mostly from eliminating deleted records?
Not for us. We have about 1 deleted document for every 1,000 or 10,000 standard
documents.
> Is there any material difference in heap, CPU, etc. b
You can also use the TermsComponent, that'll read the values from the indexed
fields. That gets the raw terms, but they aren't grouped.
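For example, assuming the implicit /terms handler and placeholder core and
field names:

  http://localhost:8983/solr/mycore/terms?terms=true&terms.fl=name&terms.limit=50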
But you don't get the document. Reconstructing the doc from the postings
lists is actually quite tedious. The Luke program (not the request handler)
has a function that
do
Try not putting it in double quotes?
Best,
Erick
On Mon, Jun 29, 2015 at 12:22 PM, Thomas Michael Engelke
wrote:
>
>
> A friend and I are trying to develop some software using Solr in the
> background, and with that comes a lot of changes. We're used to older
> versions (4.3 and below). We espec
Yes, do this in an update request processor before it gets to the analyzer
chain.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
On Jun 29, 2015, at 3:19 PM, Erick Erickson wrote:
> Hmmm, very hard to do currently. The _point_ of stored fields is that
Hmmm, very hard to do currently. The _point_ of stored fields is that
an exact, verbatim
copy of the input is returned in fl lists and this is violating that
promise. I suppose some
kind of custom update processor could work, but it's really "roll your
own" functionality
I think.
Best,
Erick
On M
Question, Toke: in your "immutable" cases, don't the benefits of optimizing
come mostly from eliminating deleted records? Is there any material
difference in heap, CPU, etc. between 1, 5 or 10 segments? I.e. at how many
segments/shard do you see a noticeable performance hit?
Also, I'm curious
Use the schema browser on the admin UI, and click the "load term info"
button. It'll show you the terms in your index.
You can also use the analysis tab which will show you how it would
tokenise stuff for a specific field.
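The same output is also available over HTTP through the field analysis
handler, e.g. (core, field, and value are placeholders):

  http://localhost:8983/solr/mycore/analysis/field?analysis.fieldname=name&analysis.fieldvalue=red-apple&wt=json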
Upayavira
On Mon, Jun 29, 2015, at 06:53 PM, Dinesh Naik wrote:
> Hi Eric
Hi Garth,
Yes, I'm straying from OP's question (I think Steve is all set). But his
question, quite naturally, comes up often and a similar discussion ensues each
time.
I take your point about shards and segments being different things. I
understand that the hash ranges per segment are not k
For the sake of history, somewhere around Solr/Lucene 3.2 a new
"MergePolicy" was introduced. The old one merged simply based upon age,
or "index generation", meaning the older the segment, the less likely it
would get merged, hence needing optimize to clear out deletes from your
older segments.
T
Reitzel, Charles wrote:
> Is there really a good reason to consolidate down to a single segment?
In the scenario spawning this thread it does not seem to be the best choice.
Speaking more broadly, there are Solr setups out there that deal with immutable
data, often tied to a point in time, e.g
" Is there really a good reason to consolidate down to a single segment?"
Archiving (as one example). Come July 1, the collection for log
entries/transactions in June will never be changed, so optimizing is
actually a good thing to do.
Kind of getting away from OP's question on this, but I don't
Solr: 4.9.x, with simple SolrCloud on Jetty
JDK 1.7
Number of replicas: 4, one replica for each shard
Number of shards: 1
Hi All,
I have been facing the issues below with the Solr suggester introduced in 4.7.x. Does
anyone have a good working solution or
The buildOnCommit=true property is suggested not to be used
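A minimal suggester setup with buildOnCommit disabled might look like this in
solrconfig.xml (component, dictionary, and field names are placeholders):

  <searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
      <str name="name">mySuggester</str>
      <str name="lookupImpl">FuzzyLookupFactory</str>
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">title</str>
      <str name="suggestAnalyzerFieldType">text_general</str>
      <!-- build manually with suggest.build=true instead of on every commit -->
      <str name="buildOnCommit">false</str>
    </lst>
  </searchComponent>

The dictionary can then be built on demand with a request like
/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.build=true,
assuming a /suggest handler wired to this component.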
Thank you guys, this was very helpful. I was always under the impression
that the index needed to be optimized periodically to reclaim disk space,
otherwise the index would just keep on growing and growing (was that the
case in Lucene 2.x and prior days?).
I agree with Walter, renaming "optimize" to s
Hi Eric,
By compressed value I meant the value of a field after removing special
characters. In my example it's "-". The compressed form of red-apple is redapple.
I wanted to know if we can see the analyzed version of fields.
For example, if I use ngram on a field, how do I see the analyzed values in
Hi Shawn - Thank you for the quick and detailed response!!
Good to hear that the Jetty 8 installation shipped with Solr does not need to be
modified for typical uses.
I believe what we have is a "typical" use case. We will be installing Solr on 3
nodes in our Hadoop cluster, and will use Hadoop's ZooKeeper.
: > I have found nothing in the ref guides, docs, wiki, or examples about
: > these mutually exclusive parameters.
: >
: > Is this a bug or a feature, and if it is a feature, what is the sense of
: > this?
The problem is that if a timeAllowed exceeded situation pops up, you won't
get a nextCursorMark to
On 6/29/2015 8:44 AM, Tarala, Magesh wrote:
> We are planning to go to production with Solr 4.10.4. The documentation
> recommends using the full Jetty package that includes JettyPlus. I'm not able to
> find the instructions to do this. Can someone point me in the right direction?
I found the official
Is there really a good reason to consolidate down to a single segment?
Any incremental query performance benefit is tiny compared to the loss of
manageability.
I.e. shouldn't segments _always_ be kept small enough to facilitate
re-balancing data across shards? Even in non-cloud instances th
On 6/29/2015 9:12 AM, Bernd Fehling wrote:
> while just trying cursorMark I got the following search response:
>
> "error": {
> "msg": "Can not search using both cursorMark and timeAllowed",
> "code": 400
> }
>
>
> Yes, I'm using timeAllowed which is set in my requestHandler as
> invariant
A friend and I are trying to develop some software using Solr in the
background, and with that comes a lot of changes. We're used to older
versions (4.3 and below). We especially have problems with the
autosuggest feature.
This is the field definition (schema.xml) for our autosuggest field:
Hi list,
while just trying cursorMark I got the following search response:
"error": {
"msg": "Can not search using both cursorMark and timeAllowed",
"code": 400
}
Yes, I'm using timeAllowed which is set in my requestHandler as
invariant to 6 (60 seconds) as a limit to "killer search
It was because of the issues
Rgds
AJ
> On Jun 29, 2015, at 6:52 PM, Shalin Shekhar Mangar
> wrote:
>
>> On Mon, Jun 29, 2015 at 4:37 PM, Amit Jha wrote:
>> Hi,
>>
>> I set up a SolrCloud with 2 shards, each having 2 replicas, with a
>> 3-node ZooKeeper ensemble.
>>
>> We add and update documents f
We are planning to go to production with Solr 4.10.4. The documentation recommends
using the full Jetty package that includes JettyPlus. I'm not able to find the
instructions to do this. Can someone point me in the right direction?
Thanks,
Magesh
“Optimize” is a manual full merge.
Solr automatically merges segments as needed. This also expunges deleted
documents.
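For example, a manual force merge down to two segments can be triggered with
something like this (core name is a placeholder):

  curl 'http://localhost:8983/solr/mycore/update?optimize=true&maxSegments=2'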
We really need to rename “optimize” to “force merge”. Is there a Jira for that?
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
On Jun
Please bear with me here, I'm pretty new to Solr with most of my DB
experience being of the relational variety. I'm planning a new project,
which I believe Solr (and Nutch) will solve well. Although I've
installed Solr 5.2 and Nutch 1.10 (on Centos) and tinkered about a bit,
I'd be grateful for
On Mon, Jun 29, 2015 at 4:37 PM, Amit Jha wrote:
> Hi,
>
> I set up a SolrCloud with 2 shards, each having 2 replicas, with a
> 3-node ZooKeeper ensemble.
>
> We add and update documents from the web app. While updating, we delete the
> document and add the same document with updated values and the same unique id.
Not quite sure what you mean by "compressed values". admin/luke
doesn't show the results of the compression of the stored values; there's
no way I know of to do that.
Best,
Erick
On Mon, Jun 29, 2015 at 8:20 AM, dinesh naik wrote:
> Hi all,
>
> Is there a way to read the indexed data for field o
Steven:
Yes, but
First, here's Mike McCandless' excellent blog on segment merging:
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
I think the third animation is the TieredMergePolicy. In short, yes an
optimize will reclaim disk space. But as you update, this is
Hi Markus
Thanks for the reply. I'm already using the Synonyms filter and it is
working fine (i.e., when I search for "customer", it also returns documents
containing "cst.").
What the synonyms filter does not do is to actually replace the word "cst."
with "customer" in the document.
Just to be c
Hello - why not just use synonyms or StemmerOverrideFilter?
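A sketch of the stemmer-override route (the field type and dictionary file
are hypothetical; note that most tokenizers will already have stripped the
trailing dot from "cst." by this point, so the mapping is keyed on "cst"):

  <fieldType name="text_fixed" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- stemdict.txt holds tab-separated lines, e.g.: cst<TAB>customer -->
      <filter class="solr.StemmerOverrideFilterFactory"
              dictionary="stemdict.txt" ignoreCase="true"/>
    </analyzer>
  </fieldType>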
Markus
-Original message-
> From:hossmaa
> Sent: Monday 29th June 2015 14:08
> To: solr-user@lucene.apache.org
> Subject: Correcting text at index time
>
> Hi everyone
>
> I'm wondering if it's possible in Solr to correct t
Hi all,
Is there a way to read the indexed data for a field on which the
analysis/processing has been done?
I know using the admin GUI we can see field-wise analysis, but how can I get
hold of the complete document using admin/luke, or any other way?
For example, if I have 2 fields called name and co
Hi All: I need pagination with a facet offset.
There are two or more fields in facet.pivot, but only one value for
facet.offset, e.g. facet.offset=10&facet.pivot=field_1,field_2. In this
case, the offset of 10 applies to field_2 and then to field_1 as well. But what I
want is field_2
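If your version supports per-field overrides for pivot facets (worth
verifying before relying on it), a request along these lines might give each
level its own offset:

  q=*:*&facet=true&facet.pivot=field_1,field_2&f.field_1.facet.offset=0&f.field_2.facet.offset=10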
Hi Upayavira,
This is news to me that we should not optimize an index.
What about disk space savings? Isn't optimization needed to reclaim disk space,
or does Solr somehow do that on its own? Where can I read more about this?
I'm on Solr 5.1.0 (may switch to 5.2.1)
Thanks
Steve
On Mon, Jun 29, 2015 at 4:16 AM,
Hi everyone
I'm wondering if it's possible in Solr to correct text at indexing time,
based on a synonyms-like list. This would be great for expanding undesirable
abbreviations (for example, "cst." instead of "customer").
I've been searching the Solr docs and the web quite thoroughly I believe,
but
Hi,
I set up a SolrCloud with 2 shards, each having 2 replicas, with a 3-node
ZooKeeper ensemble.
We add and update documents from the web app. While updating, we delete the
document and add the same document with updated values and the same unique id.
I am facing a very strange issue that sometimes 2 documents ha
Hi Erick,
The Contents field contains one sentence only and no "watch" exists in it.
Plus we use quite large snippet size to surely cover the field.
Dmitry
On Sat, Jun 27, 2015 at 6:16 PM, Erick Erickson
wrote:
> Does watch exist in the Contents field somewhere outside the snippet
> size you'v
http://wiki.apache.org/solr/HierarchicalFaceting
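One common approach described there is a path-based field type that emits one
token per hierarchy level at index time, roughly like this (the type name and
delimiter are illustrative):

  <fieldType name="descendent_path" class="solr.TextField">
    <analyzer type="index">
      <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
    </analyzer>
  </fieldType>

Faceting on such a field (e.g. with facet.prefix) then gives counts per
category level.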
On Mon, Jun 29, 2015 at 11:27 AM, Darniz wrote:
> hello
>
> any advice please
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/need-advice-on-parent-child-mulitple-category-tp4214140p4214602.html
> Sent from the Solr
Hi everyone!
I want to import data from OrientDB into Solr 5.1.0.
Here are my configurations:
*data-config.xml*
  <dataConfig>
    <dataSource driver="com.orientechnologies.orient.jdbc.OrientJdbcDriver"
                url="jdbc:orient:remote:localhost/emallates_combine"
                user="root" password="root" batchSize="-1"/>
    ...
hello
any advice please
--
View this message in context:
http://lucene.472066.n3.nabble.com/need-advice-on-parent-child-mulitple-category-tp4214140p4214602.html
Sent from the Solr - User mailing list archive at Nabble.com.
I'm afraid I don't understand. You're saying that optimising is causing
performance issues?
Simple solution: DO NOT OPTIMIZE!
Optimisation is very badly named. What it does is squashes all segments
in your index into one segment, removing all deleted documents. It is
good to get rid of deletes -
Have to, because of performance issues.
Just want to know if there is a way to tap into the status.
> On Jun 28, 2015, at 11:37 PM, Upayavira wrote:
>
> Bigger question, why are you optimizing? Since 3.6 or so, it generally
> hasn't been required, and can even be a bad thing.
>
> Upayavira
>
>> On Su