Great. Thanks for the work on this patch!
Jim
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Cloud-A-B-Deployment-Issue-tp4302810p4303357.html
Sent from the Solr - User mailing list archive at Nabble.com.
It appears this has all been resolved by the following ticket:
https://issues.apache.org/jira/browse/SOLR-9446
My scenario fails in 6.2.1, but works in 6.3 and Master where this bug has
been fixed.
In the meantime, we can use our workaround to issue a simple delete command
that deletes a non-existent document.
Also, if we issue a delete by query where the query is "_version_:0", it also
creates a transaction log and then has no trouble transferring leadership
between old and new nodes.
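Both workarounds boil down to posting a small update request that forces a transaction log entry. A sketch of the two payloads (the collection name and endpoint here are assumptions, not from the thread):

```python
import json

# Sketch of the two tlog-creating workarounds described above.
# The collection name "dogs" and the host are hypothetical.
base = "http://localhost:8983/solr/dogs/update"

# 1. Delete a document id that does not exist.
delete_by_id = json.dumps({"delete": {"id": "does-not-exist"}})

# 2. Delete by a query that matches nothing real.
delete_by_query = json.dumps({"delete": {"query": "_version_:0"}})

# Either payload would be POSTed to `base` with
# Content-Type: application/json.
```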
Still, it seems like when we ADDREPLICA, some sort of transaction log should
be started.
Jim
Interestingly, If I simply add one document to the full cluster after all 6
nodes are active, this entire problem goes away. This appears to be because
a transaction log entry is created which in turn prevents the new nodes from
going into full replication recovery upon leader change.
Adding a doc
Perhaps you need to wrap your inner "" and "" tags in the CDATA
structure?
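For example (the element names below are placeholders for whichever highlighting tags were lost by the archive), wrapping markup in CDATA looks like:

```xml
<!-- Hypothetical solrconfig.xml snippet; the point is the CDATA wrapper,
     which keeps the inner markup from being parsed as XML. -->
<str name="hl.simple.pre"><![CDATA[<em>]]></str>
<str name="hl.simple.post"><![CDATA[</em>]]></str>
```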
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-6-0-Highlighting-Not-Working-tp4302787p4302835.html
We are running into a timing issue when trying to do a scripted deployment of
our Solr Cloud cluster.
Scenario to reproduce (sometimes):
1. launch 3 clean solr nodes connected to zookeeper.
2. create a 1 shard collection with replicas on each node.
3. load data (more will make the problem worse)
It seems like all the parameters in the PingHandler get processed by the
remote server. So, things like shards=localhost or distrib=false take effect
too late.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Cloud-prevent-Ping-Request-From-Forwarding-Request-tp429752
Here's the scenario:
Boxes 1,2, and 3 have replicas of collections dogs and cats. Box 4 has only
a replica of dogs.
All of these boxes have a healthcheck file on them that works with the
PingRequestHandler to say whether the box is up or not.
If I hit Box4/cats/admin/ping, Solr forwards the ping
Sadly, that didn't work.
Without a core to hit, the /[COLLECTION]/config returns a 404 error.
The best bet at this point may be one of the following:
1. Programmatically modify configoverlay.json file to add the runtime libs
when I upload the config.
or
2. Patch solr so that sc
I've run into an orchestration problem while creating collections and loading
plugins via the ConfigAPI in Solr Cloud.
Here's the scenario:
1. I create a configSet that references a custom class in schema.xml.
2. I upload the jar to the BlobStore and issue add-runtimelib using the
Config API. Thi
Thanks Shawn. I'm leaning towards a retry as well.
So, there's no mechanism that currently exists within Solr that would allow
me to automatically retry the zookeeper connection on launch?
My options then would be:
1. Externally monitor the status of Solr (eg
/solr/admin/collections?action=CLUST
When I try to launch Solr 6.0 in cloud mode and connect it to a specific
chroot in zookeeper that doesn't exist, I get an error in my solr.log.
That's expected, but the solr process continues to launch and succeeds.
Why wouldn't we want the start process simply to fail and exit?
There's no mechan
So, the problem I found that's driving this is that I have several phrase
synonyms set up. For example, "ipod mini" into "ipad mini". This synonym is
only applied if you submit it as a phrase in quotes.
So, the pf param doesn't help because it's not the right phrase in the first
place.
I can fix
Never got a response on this ... Just looking for the best way to handle it?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Using-a-RequestHandler-to-expand-query-parameter-tp4155596p4157613.html
I would like to send only one query to my custom request handler and have the
request handler expand that query into a more complicated query.
Example:
*/myHandler?q=kids+books*
... would turn into a more complicated EDismax query of:
*"kids books" kids books*
Is this achievable via a Request
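A sketch of the kind of server-side expansion being asked about (the quoting scheme below is an assumption for illustration, not an established Solr API):

```python
def expand_query(q: str) -> str:
    """Expand a raw user query into 'exact phrase plus loose terms' form,
    e.g. 'kids books' -> '"kids books" kids books'."""
    terms = q.split()
    if len(terms) < 2:
        return q  # single terms need no phrase variant
    return '"%s" %s' % (q, " ".join(terms))

expand_query("kids books")  # -> '"kids books" kids books'
```

In Solr this expansion would typically live in a custom SearchComponent or query parser that rewrites q before the EDismax parser sees it.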
Hi there,
We're trying to evaluate whether to use the CloudSolrServer in SolrJ or to
use the HttpSolrServer that is pointed at a software or hardware load
balancer such as haproxy or f5. This would be in production.
Can anyone provide any experiential pros or cons on these? In addition to
perform
Hi,
I found a few threads out there dealing with this problem, but there didn't
really seem to be much detail to the solution.
I have large xml files (500M to 2+ G) with a complex nested structure. It's
impossible for me to import the exact structure into a solr representation,
and, honestly, I d
> When you set your cache (solrconfig.xml) to size=0, you are not using a
> cache, so you can debug more easily.
>
> roman
>
>
> On Thu, Aug 1, 2013 at 1:12 PM, jimtronic <[hidden email]> wrote:
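For reference, the size=0 suggestion corresponds to a solrconfig.xml entry along these lines (a sketch; only the size attribute matters for the debugging advice):

```xml
<!-- Disabling the query result cache so every query hits the index
     (sketch; initialSize/autowarmCount values are incidental). -->
<queryResultCache class="solr.LRUCache"
                  size="0"
                  initialSize="0"
                  autowarmCount="0"/>
```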
M, Mikhail Khludnev [via Lucene] <
ml-node+s472066n4082044...@n3.nabble.com> wrote:
> Hello Jim,
>
> Does q={!cache=false}lorem ipsum work for you?
>
>
> On Thu, Aug 1, 2013 at 9:12 PM, jimtronic <[hidden email]>
I have a query that runs slow occasionally. I'm having trouble debugging it
because once it's cached, it runs fast -- under 10 ms. But throughout the
day it occasionally takes up to 3 secs. It seems like it could be one of the
following:
1. My autoCommit (30 and openSearcher=false) and softAut
> exc = e;
> }
>
> if ("active".equals(replicaState)) {
>     log.info(String.format("%s at %s for %s in the %s collection is active.",
>         role, nodeName, shardName, collectionName));
> } else {
>     // fail the ping by raising an exception
>
I've encountered an OOM that seems to come after the server has been up for a
few weeks.
While I would love for someone to just tell me "you did X wrong", I'm more
interested in trying to debug this. So, given the error below, where would I
look next? The only odd thing that sticks out to me is t
hing
> hacky but was out of time I could devote to such unproductive
> endeavors ;-)
>
> On Mon, Jul 22, 2013 at 10:49 AM, jimtronic <[hidden email]> wrote:
>
> > I'm not sure why it went down exac
I'm not sure why it went down exactly -- I restarted the process and lost the
logs. (d'oh!)
An OOM seems likely, however. Is there a setting for killing the processes
when solr encounters an OOM?
Thanks!
Jim
--
View this message in context:
http://lucene.472066.n3.nabble.com/Node-down-but-n
I've run into a problem recently that's difficult to debug and search for:
I have three nodes in a cluster and this weekend one of the nodes went
partially down. It no longer responds to distributed updates and it is
marked as GONE in the Cloud view of the admin screen. That's not ideal, but
there
Krupansky
>
> -----Original Message-----
> From: jimtronic
> Sent: Thursday, June 13, 2013 11:31 AM
> To: [hidden email]
> Subject: Best way to match umlauts
>
> I'm trying to make Brüno come up in
I'm trying to make Brüno come up in my results when the user types in
"Bruno".
What's the best way to accomplish this?
Using Solr 4.2
--
View this message in context:
http://lucene.472066.n3.nabble.com/Best-way-to-match-umlauts-tp4070256.html
Is this a bug? I can create the ticket in Jira if it is, but it's not clear
to me what should be happening.
I noticed that it is using the value set in the home directory, but that
value does not get updated, so my imports get slower and slower.
I guess I could create a cron job to update tha
My data-config files use the dataimporter.last_index_time variable, but it
seems to have stopped working when I upgraded to 4.2.
In previous 4.x versions, I saw that it was being written to zookeeper, but
now there's nothing there.
Did anything change? Or should I be doing something differently?
I'm doing fairly frequent changes to my data-config.xml files on some of my
cores in a solr cloud setup. Is there any way to get these files active
and up to Zookeeper without restarting the instance?
I've noticed that if I just launch another instance of solr with the
bootstrap_conf flag set to
Created:
https://issues.apache.org/jira/browse/SOLR-4639
Thanks!
On Fri, Mar 22, 2013 at 5:01 PM, Mark Miller-3 [via Lucene] <
ml-node+s472066n405060...@n3.nabble.com> wrote:
>
> On Mar 22, 2013, at 5:54 PM, jimtronic <[hidden email]>
Ok, this is very bizarre.
If I insert more than one document at a time using the update handler like
so:
[{"id":"1","foo_ap":"bar|50"},{"id":"2","foo_ap":"bar|75"}]
It actually stores the same payload value "50" for both docs.
That seems like a bug, no?
There was a core change in 4.1 to how p
Something has definitely changed at 4.1. I've installed 4.0, 4.1, and 4.2
side by side and conducted the same tests on each one. Only 4.0 is returning
the expected results.
Apologies for cross posting this here and in the Lucene forum, but I really
can't tell if this is a Solr or a Lucene issue.
Ok, Yes, I have now recompiled against the 4.2.0 libraries. I needed to
change a few things, but the problem still exists using the new libraries.
I think the problem may actually be on the indexing side of things. Here's
why:
1. I had an old index created under 4.0, running 4.0. Works as expecte
Actually, this is more like the code I've got in place:
http://sujitpal.blogspot.com/2011/01/payloads-with-solr.html
Jim
--
View this message in context:
http://lucene.472066.n3.nabble.com/Did-something-change-with-Payloads-tp4049561p4049566.html
I've been using Payloads through several versions of Solr including 4.0, but
now they are no longer working correctly on 4.2
I had originally followed Grant's article here:
http://searchhub.org/2009/08/05/getting-started-with-payloads/
I have a custom query plugin {!payload} that will return the
What are the likely ramifications of having a stored field with millions of
"words"?
For example, If I had an article and wanted to store the user id of every
user who has read it and stuck it into a simple white space delimited field.
What would go wrong and when?
My tests lead me to believe thi
I understand this may be a better question for the zookeeper list, but I'm
asking here because I'm not completely clear how much load zookeeper takes
on in a solr cloud setup.
I'm trying to determine what specs my zookeeper boxes should be. I'm on EC2,
so what I'm curious about is whether zookeepe
I'm curious how people are using DIH with SolrCloud.
I have cron jobs set up to trigger the dataimports which come from both xml
files and a sql database. Some are frequent small delta imports while others
are larger daily xml imports.
Here's what I've tried:
1. Set up a micro box that sends the
The load test was fairly heavy (ie lots of users) and designed to mimic a
fully operational system with lots of users doing normal things.
There were two things I gleaned from the logs:
PERFORMANCE WARNING: Overlapping onDeckSearchers=2 appeared for several of
my more active cores
and
The non-l
I was doing some rolling updates of my cluster ( 12 cores, 4 servers ) and I
ended up in a situation where one node was elected leader by all the cores.
This seemed very taxing to that one node. It was also still trying to serve
query requests so it slowed everything down. I'm trying to do a lot of
The notes for maxWarmingSearchers in solrconfig.xml state:
"Recommend values of 1-2 for read-only slaves, higher for masters w/o cache
warming."
Since solr cloud nodes could be both a leader and non-leader depending on
the current state of the cloud, what would be the optimal setting here?
Thank
It seems like I could accomplish this by following the
JoinQParserPlugin logic. I can actually get pretty close using the join
query, but I need to do some extra math in the middle.
The difference in my case is that I need to access the id and the score. I
*think* the logic would go somethin
I've written a custom query parser that we'll call {!doFoo } which takes two
parameters: a field name and a space delimited list of values. The parser
does some calculations between the list of values and the field in question.
In some cases, the list is quite long and as it turns out, the core al
Ok, I'm a little confused.
I had originally bootstrapped zookeeper using a solr.xml file which
specified the following cores:
cats
dogs
birds
In my /solr/#/cloud?view=tree view I see that I have
/collections
  /cats
  /dogs
  /birds
/configs
  /cats
  /dogs
  /birds
When I launch a new server and co
Hi,
I have a solrcloud cluster running several cores and pointing at one
zookeeper.
For performance reasons, I'd like to move one of the cores on to its own
dedicated cluster of servers. Can I use the same zookeeper to keep track of
both clusters?
Thanks!
Jim
--
View this message in context
>
>
> Currently, a leader does an update locally before sending in parallel to
> all replicas. If we can't send an update to a replica, because it crashed,
> or because of some other reason, we ask that replica to recover if we can.
> In that case, it's either gone and will come back and recover, o
't work on the other nodes.
Jim
On Wed, Feb 27, 2013 at 1:06 PM, Mark Miller-3 [via Lucene] <
ml-node+s472066n4043462...@n3.nabble.com> wrote:
> You are working off trunk?
>
> Do you have any interesting info in the logs?
>
> - Mark
>
> On Feb 27, 2013, at 12:55 PM, jim
solrspec: 5.0.0.2012.12.03.13.10.02
--
View this message in context:
http://lucene.472066.n3.nabble.com/Nodes-out-of-sync-deletes-fail-tp4043433p4043437.html
I'm not sure how it happened, but one of my nodes has different data than the
others.
When I try to delete the offending document by posting json to the /update
url, it hangs and after a minute it just fails with no reply.
I disconnected the offending node from the cloud and was able to delete t
Yes, these are good points. I'm using solr to leverage user preference data
and I need that data available real time. SQL just can't do the kind of
things I'm able to do in solr, so I have to wait until the write (a user
action, a user preference, etc) gets to solr from the db anyway.
I'm kind of
Now that I've been running Solr Cloud for a couple months and gotten
comfortable with it, I think it's time to revisit this subject.
When I search for the topic of using Solr as a primary db online, I get lots
of discussions from 2-3 years ago and usually they point out a lot of
hurdles that have
I'm confused about the behavior of clean=true using the DataImportHandler.
When I use clean=true on just one instance, it doesn't blow all the data out
until the import succeeds. In a cluster, however, it appears to blow all the
data out of the other nodes first, then starts adding new docs.
Am I
I have a simple cluster of three servers and a dedicated zookeeper server
running separately. If I make a change to my solrconfig.xml file on one of
the servers and restart the server with the bootstrap_conf=true option, will
that change be sent to the other nodes?
Or, will I have to log into eac
I'm using solr function queries to generate my own custom score. I achieve
this using something along these lines:
q=_val_:"my_custom_function()"
This populates the score field as expected, but it also includes documents
that score 0. I need a way to filter the results so that scores below zero
are excluded.
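One common way to handle this kind of cutoff (a sketch, not something confirmed in this thread) is a {!frange} filter query over the same function, which can be built like:

```python
def frange_filter(func: str, lower: float = 0) -> str:
    """Build a Solr fq parameter that keeps only docs where `func`
    evaluates to at least `lower`. The function name passed in below
    is hypothetical."""
    return "{!frange l=%s}%s" % (lower, func)

fq = frange_filter("my_custom_function()")
# -> {!frange l=0}my_custom_function()
```

The fq would be sent alongside the existing q=_val_:"my_custom_function()" parameter.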
For multi-valued fields, you can use "add" to add a value to the list. If the
value already exists, it will be there twice.
"set" will replace the entire list with the one value that you specify.
There's currently no method to remove a value, although the issue has been
logged: https://issues.apa
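As JSON update payloads, the add-versus-set difference looks roughly like this (the field name "tags" is hypothetical):

```python
import json

# Sketch of atomic-update payloads for a multi-valued field.
# "add" appends a value to the list (duplicates are kept);
# "set" replaces the entire list with the supplied value(s).
add_update = {"id": "doc1", "tags": {"add": "new-tag"}}
set_update = {"id": "doc1", "tags": {"set": ["only-tag"]}}

payload = json.dumps([add_update, set_update])
```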
I'm not sure if this will be relevant for you, but this is roughly what I do.
Apologies if it's too basic.
I have a complex view that normalizes all the data that I need to be
together -- from over a dozen different tables. For one to many and many to
many relationships, I have sql turn the data
I'm thinking about catastrophic failure and recovery. If, for some reason,
the cluster should go down or become unusable and I simply want to bring it
back up as quickly as possible, what's the best way to accomplish that?
Maybe I'm thinking about this incorrectly? Is this not a concern?
--
Just added this today.
https://issues.apache.org/jira/browse/SOLR-3862
--
View this message in context:
http://lucene.472066.n3.nabble.com/deleting-a-single-value-from-multivalued-field-tp4009092p4009292.html
I'm trying to determine my options for backing up data from a SolrCloud
cluster.
For me, bringing up my cluster from scratch can take several hours. It's way
faster to take snapshots of the index periodically and then use one of these
when booting a new instance. Since I use static xml files and d
I've got a setup like yours -- lots of cores and replicas, but no need for
shards -- and here's what I've found so far:
1. Zookeeper is tiny. I would think network I/O is going to be the biggest
concern.
2. I think this is more about high availability than performance. I've been
experimenting with
Hi,
I've got a set up as follows:
- 13 cores
- 2 servers
- running Solr 4.0 Beta with numShards=1 and an embedded zookeeper.
I'm trying to figure out why some complex queries are running so slowly in
this setup versus quickly in a standalone mode.
Given a query like: /select?q=(some complex qu
Actually, the correct method appears to be this:
an atomic update in JSON:
{
"id" : "book1",
"author" : {"set":"Neal Stephenson"}
}
the same in XML:
<add>
  <doc>
    <field name="id">book1</field>
    <field name="author" update="set">Neal Stephenson</field>
  </doc>
</add>
Jim
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to
Figured it out.
in JSON:
{"id" : "book1",
"author" : {"set":"Neal Stephenson"}
}
in XML:
book1
This seems to work.
Jim
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-post-atomic-updates-using-xml-tp4007323p4007325.html
There's a good intro to atomic updates here:
http://yonik.com/solr/atomic-updates/ but it does not describe how to
structure the updates using xml.
Anyone have any idea on how these would look?
Thanks! Jim
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-post-atomic-
Hi,
I'm using payloads to tie a value to an attribute for a document -- eg a
user's rating for a document. I do not store this data, but I index it and
access the value through function queries.
I was really excited about atomic updates, but they do not work for me
because they are blowing out al
I was able to use solr 3.1 functions to accomplish this logic:
/solr/select?q=_val_:sum(query("{!dismax qf=text v='solr
rocks'}"),product(map(query("{!dismax qf=text v='solr
rocks'}",-1),0,100,0,1), product(this_field,that_field)))
--
View this message in context:
http://lucene.472066.n3.nab
Hi,
For the solr function query(subquery, default) I'd like to be able to
specify the value of another field or even a function as the default.
For example, I might have:
/solr/select?q=_val_:query("{!dismax qf=text v='solr rocks'}",
product(this_field, that_field))
Is this possible?
I see tha
I solved this problem using the flatten="true" attribute.
Given this schema
Joe
Smith
attr_names is a multiValued field in my schema.xml. The flatten attribute
tells solr to take all the text from the specified node and below.
--
View this message in context:
htt