The QTimes are from the updates.
We don't have the resources right now to switch to SolrJ, but I would
assume only sending updates to the leaders would take some redirects out
of the process; I can regularly query the collection status to know
who's who.
I'm now more interested in the ca
Thanks Binoy for replying.
I am giving you a few use cases:
a) "shoes in nike" or "nike shoes"
Here "nike" is the brand; in this case my query entity is shoe and the
entity type is brand,
and my result should only pick Nike shoes.
b) "32 inch LCD TV sony"
32 inch is the size, LCD is the entity type a
Could you describe your problem in more detail, with examples of your use
cases?
On Wed, 6 Apr 2016, 11:03 Midas A, wrote:
> I have to do entity and entity-type mapping with the help of the search
> query while building the Solr query.
>
> How should I design this with Solr for search?
>
> Please guide me.
I have to do entity and entity-type mapping with the help of the search
query while building the Solr query.
How should I design this with Solr for search?
Please guide me.
Thanks man. I'd love to learn more about the Talend OpenStudio project
you're working on. Is it based on Lucene/Solr or a different project?
On Tuesday, April 5, 2016, Davis, Daniel (NIH/NLM) [C]
wrote:
> Yangrui,
>
> Let me clarify - to have multiple data imports run concurrently, my
> impression is that you must have different requestHandlers declared in
> your solrconfig.xml
bq: Apart from the obvious delay, I'm also seeing QTimes of 1,000 to 5,000
QTimes for what? The update? Queries? If for queries, autowarming may help,
especially as your soft commit is throwing away all the top-level
caches (i.e. the
ones configured in solrconfig.xml) every minute. It shouldn
Hi,
I'm trying to use the new MLT query parser in SolrCloud mode. As per
the documentation, here's the syntax:
{!mlt qf=name}1
where "1" is the id.
What I'm trying to understand is whether "id" is a mandatory field in
making this work? Right now, I'm getting mlt documents based on a "keyword
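For reference, a minimal request with that parser (host and collection
names below are placeholders) might look like:

  curl 'http://localhost:8983/solr/mycollection/select' \
       --data-urlencode 'q={!mlt qf=name}1'

where the trailing "1" is the value of the collection's uniqueKey field
for the source document.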
On 4/5/2016 1:16 PM, Yangrui Guo wrote:
> So if I implement multiple dataimporthandler and do a full import, does
> Solr perform import of all handlers at once or can just specify which
> handler to import? Thank you
Each handler has a name, which starts with a forward slash. Normally
it's named /dataimport.
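Assuming that default name, an import can be triggered and monitored
like this (host and core names are placeholders):

  curl 'http://localhost:8983/solr/mycore/dataimport?command=full-import'
  curl 'http://localhost:8983/solr/mycore/dataimport?command=status'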
A few thoughts...
From a black-box testing perspective, you might try changing that
softCommit time frame to something longer and see if it makes a difference.
The size of your documents will make a difference too - so the comparison
to 300 - 500 on other cloud setups may or may not be comparable
Yangrui,
Let me clarify - to have multiple data imports run concurrently, my impression
is that you must have different requestHandlers declared in your solrconfig.xml
By default, Data Import Handler is not multi-threaded; having multiple
requestHandlers for it is a workaround to this, not a fix.
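As a sketch (the handler names here are hypothetical), two such handlers
declared in solrconfig.xml can then be kicked off in parallel from the shell:

  curl 'http://localhost:8983/solr/mycore/dataimport-db1?command=full-import' &
  curl 'http://localhost:8983/solr/mycore/dataimport-db2?command=full-import' &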
This is all good stuff. Thank you all for your insight.
Steve
On Mon, Apr 4, 2016 at 6:15 PM, Yonik Seeley wrote:
> On Mon, Apr 4, 2016 at 6:06 PM, Chris Hostetter
> wrote:
> > :
> > : Not sure I understand... _version_ is time based and hence will give
> > : roughly the same accuracy as some
Yangrui,
Solr will just do one data import. You can have a script invoke more than
one, and they will run concurrently. There are some risks with that,
depending on what you are doing. If it's just pulling from a database, I
think you are all right. I've even had 4 run concurrently to
In terms of #2, this might be of use...
https://wiki.apache.org/solr/HowToReindex
On Tue, Apr 5, 2016 at 3:08 PM, Anuj Lal wrote:
> I am new to solr. Need some advice from more experienced solr team
> members
>
> I am upgrading 4.4 solr cluster to 5.5
>
>
> One of the step I am doing for upgr
I am new to Solr. Need some advice from more experienced Solr team members.
I am upgrading a 4.4 Solr cluster to 5.5.
One of the steps I am doing for the upgrade is to bootstrap from the
existing 4.4 Solr home (after upgrading the Solr installation to 5.5).
All of the nodes come up correctly and I can query
Hi,
I'm currently posting updates via cURL, in batches of 1,000 docs in JSON
files.
My setup consists of 2 shards, 1 replica each, 50m docs in total.
These updates are hitting a node at random, from a server across the
Internet.
Apart from the obvious delay, I'm also seeing QTimes of 1,000 to 5,000.
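The requests in question look roughly like this (host, collection and file
names are placeholders):

  curl 'http://solrhost:8983/solr/mycollection/update' \
       -H 'Content-Type: application/json' \
       --data-binary @batch_0001.json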
Hi Vijay,
Can you provide more information about what you were trying to do and why
you think this isn't working? The more details you can provide, the
better.
* What's your SolrCloud setup?
* How did you enable security?
* What do you expect?
* What do you see?
On Tue, Apr 5, 2016 at 1:02 PM
Hi All,
We recently started leveraging Solr 5.5 in Cloud mode, and we enabled
security in SolrCloud. It's not working; looking for your advice to debug
the issue.
cat security.json
{"authentication":{
  "class":"solr.BasicAuthPlugin",
  "blockUnknown": true,
  "credentials":{
Hi Daniel,
So if I implement multiple DataImportHandlers and do a full import, does
Solr perform the import of all handlers at once, or can I just specify
which handler to import? Thank you
Yangrui
On Tuesday, April 5, 2016, Davis, Daniel (NIH/NLM) [C]
wrote:
> If Shawn is correct, and you are using D
From some docs I'm working on - this command (against one solr box) got me
the entire cluster's state...
Don't know if it'll work for you, but just in case... There may be an api
command that is similar - not sure. I'm mostly operating on the command
line right now.
(statdx is the name of my c
My own choices were driven mostly by the usage of the data - from a more
architectural perspective.
I have "appDocuments" and "appImages" for one of the applications I'm
supporting. Because they are so closely connected (an appDocument can
have N appImages and an appImage can belong to m
Some more input, before I call it a day. Just for the heck of it, I tried
changing minClauseSize to 0 using the Eclipse debugger, so that it didn't
return null at line 1203, but instead returned the TermQuery on line 1205. Then
everything worked exactly as it should. The matching document got boosted
I now used the Eclipse debugger to try and see if I can understand what is
happening, and it seems like the ExtendedDismaxQParser simply ignores my pf
parameter, since it doesn't interpret it as a phrase query.
https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.6.0/solr/core/src/ja
Thank you both for the clarification and proposals!
This solrcloud_manager looks very promising. I'll try it out, the shared
filesystem requirement is no issue for me.
OK. Interesting. But... I added a solr.TrimFilterFactory at the end of my
analyzer definition. Shouldn't that take care of the added space at the end?
The admin analysis page indicates that it works as it should, but I still can't
get edismax to boost.
-Original Message-
From: Jack Krup
You have choices:
- Use a separate collection for each data import
- Use the same collection for each data import, differentiating them using a
field you can query
The choice depends on the objects and how they will be used, and I trust
others on this list to have better advice on how to choose
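If you go the single-collection route, a sketch (the field name "source"
is made up) would be to stamp each import with its origin and filter on it
at query time:

  curl 'http://localhost:8983/solr/mycollection/select?q=*:*&fq=source:db1'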
Hi,
Is there any contribution (open-source contrib module) that routes
documents to shards based on a document-similarity technique? Or any
suggestions for integrating Mahout with Solr for this use case?
From what I know, currently there are two document routing strategies, as
explained here:
https://luc
Hi, thanks for the answer. Yes, I will be using DIH to import data from
different database connections. Do I have to create a collection for each
connection?
On Tuesday, April 5, 2016, Shawn Heisey wrote:
> On 4/5/2016 8:12 AM, Yangrui Guo wrote:
> > I'm using Solr Cloud to index a number of datab
There is some automation around this process in the backup commands here:
https://github.com/whitepages/solrcloud_manager
It’s been tested with 5.4, and will restore arbitrary replication factors.
Ever assuming the shared filesystem for backups, of course.
On 4/5/16, 3:18 AM, "Reth RM" wrote:
I recall I had some luck fixing a leaderless shard (after a ZK quorum
failure) by forcibly removing the records for the down-state replicas from
the leader election list, and then forcing an election.
The ZK path looks like /collections/<collection>/leader_elect/shardX/election.
Usually you’ll find the dow
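As a rough sketch of the manual approach (the zkhost, collection, and
election-node names are all made up - check the real node names with the
list command first):

  server/scripts/cloud-scripts/zkcli.sh -zkhost zk1:2181 -cmd list
  server/scripts/cloud-scripts/zkcli.sh -zkhost zk1:2181 \
      -cmd clear /collections/mycollection/leader_elect/shard1/election/96234-core_node2-n_0000000003

Newer releases (5.4+) also have a FORCELEADER collections API action that
automates part of this.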
If Shawn is correct, and you are using DIH, then I have done this by
implementing multiple requestHandlers each of them using Data Import Handler,
and have each specify a different XML file for the data config. Instead of
using data-config.xml, I've used a large number of files such as:
On 4/5/2016 8:12 AM, Yangrui Guo wrote:
> I'm using Solr Cloud to index a number of databases. The problem is there
> is unknown number of databases and each database has its own configuration.
> If I create a single collection for every database the query would
> eventually become insanely long. I
It looks like the code constructing the boost phrase for pf will always add
a trailing blank. That's never a problem when a normal tokenizer that
removes whitespace is used, but the keyword tokenizer will preserve that
extra space, which prevents an exact match.
See line 531:
https://github.com
Hello - I would certainly go for edismax's boost parameter, as it multiplies
scores. You can always do a regular boost query via {!boost ..} but edismax
makes it much easier.
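For illustration (field name and boost values are made up), the two styles
look like:

  q=shoes&defType=edismax&bq=brand:nike^10
  q=shoes&defType=edismax&boost=if(termfreq(brand,'nike'),10,1)

The first (bq) adds to the score; the second (boost) multiplies it.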
Markus
-Original message-
> From:John Blythe
> Sent: Tuesday 5th April 2016 15:36
> To: solr-user
> Subject:
You could use the zkcli.sh script to directly query your zookeeper ensemble
and get the cluster status.
See if that works for you.
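For example (zkhost string assumed; on newer collections the state may be
under /collections/<name>/state.json instead):

  server/scripts/cloud-scripts/zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181 \
      -cmd get /clusterstate.json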
On Tue, 5 Apr 2016, 17:28 preeti kumari, wrote:
> Hi Reth,
>
> I had already checked this but issue is it gives me info about shards/cores
> hosted on one server whe
Hello
I'm using Solr Cloud to index a number of databases. The problem is there
is an unknown number of databases, and each database has its own
configuration. If I create a single collection for every database the query
would eventually become insanely long. Is it possible to upload different config
hi all,
i'm trying to do something similar to a simple fq=x on my query. i'm using
the regular ol' select handler. instead of blocking out all items not
related to x via the filter query i'd like to simply give them priority.
is there a way to do this? it seems like the dismax bq function may be
Thanks for the reply.
Yes, I did build the "master" branch. I will try branch_6_0.
However, the same code worked in Solr-4.10 and the SolrInputDocument created
out of a Lucene Document had the same stuff (stored/indexed/tokenized). It did
not complain or break.
I will first try this in Solr-6.0 and
On 4/5/2016 6:07 AM, Rohana Rajapakse wrote:
> I am trying to update the value of one field in an existing document, and it
> throws me the exception given below.
> For the update, I am using my own update handler which created a
> SolrInputDocument from and Existing Solr Document.
>
> I am using
On 4/5/2016 4:09 AM, abhi Abhishek wrote:
> Thanks MIkhail.
> is there a way to have a push Replication. any Contributions or
> Anything what could in this case?
The master-slave replication in ALL versions of Solr (including 5.x) is
pull, as already mentioned. This cannot be changed.
Solr
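That said, the pull can be triggered on demand from the slave side -
something like (host and core names are placeholders):

  curl 'http://slave-host:8983/solr/mycore/replication?command=fetchindex'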
Hi,
I am trying to update the value of one field in an existing document, and it
throws the exception given below.
For the update, I am using my own update handler which creates a
SolrInputDocument from an existing Solr document.
I am using Solr 6.x built from the source code obtained from g
Hi Reth,
I had already checked this, but the issue is that it gives me info about
shards/cores hosted on the one server where I am hitting the query, not the
whole cluster info hosted on different servers.
What I need is the full info about all shards/cores hosted on the different
servers forming my collection.
Thank
Hi,
I'm trying to boost documents using phrase field boosting (i.e. the pf
parameter for edismax), but I can't get it to work (i.e. boosting documents
where the pf field matches the query as a phrase).
As far as I can tell, solr, or more specifically the edismax handler, does
*something* when I ad
Yes. It should be backing up each shard leader of the collection. For each
collection, for each shard, find the leader and request a backup command on
it. Then restore this on the new collection, in its respective shard, and
then go on adding new replicas, which will duly pull it from the newly added
sh
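Roughly, per shard (all names and paths below are placeholders):

  curl 'http://leader-host:8983/solr/mycoll_shard1_replica1/replication?command=backup&location=/backups&name=mycoll_shard1'
  curl 'http://any-host:8983/solr/admin/collections?action=ADDREPLICA&collection=mycoll&shard=shard1'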
Hi all, I have an 8 node SolrCloud 5.5 cluster with 11 collections,
most of them in a 1 shard x 8 replicas configuration. We have 5 ZK
nodes.
During the night, we attempted to reindex one of the larger
collections. We reindex by pushing json docs to the update handler
from a number of processes. I
Have you already looked at cluster status api?
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api18
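For example (host is a placeholder):

  curl 'http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json'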
On Tue, Apr 5, 2016 at 10:09 AM, preeti kumari
wrote:
> Hi,
>
> I am using solr 5.2.1 . We need to configure F5 load balancer with
> zookeepers.
> For that we nee
Thanks Mikhail.
Is there a way to have push replication? Any contributions or anything
that could help in this case?
Thanks,
Abhishek
On Tue, Apr 5, 2016 at 1:29 AM, Mikhail Khludnev wrote:
> It's pull, but you can trigger pulling.
>
> On Mon, Apr 4, 2016 at 9:19 PM, abhi Abhishek wrote:
>