I use a router.field so docs that I join from/to are always in the same
shard. See
https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud#ShardsandIndexingDatainSolrCloud-DocumentRouting
There is an open ticket SOLR-8297
https://issues.apache.org/jira/browse/SOLR-8
Hi,
There doesn't seem to be any Tokenizer / Analyzer for Vietnamese built in
to Lucene at the moment. Does anyone know if something like this exists
today or is planned for? We found this
https://github.com/CaoManhDat/VNAnalyzer made by Cao Manh Dat, but not sure
if it's up to date. Any info high
I tried many settings with "Rule-based Replica Placement" on Solr 6.5.1
and came to the conclusion that it is not working at all.
My test setup is 6 nodes on 3 servers (port 8983 and 7574 on each server).
The call to create a new collection is
"http://localhost:8983/solr/admin/collections?action=
On 2017-05-22 02:25 AM, Derek Poh wrote:
Hi
Due to the source data structure, I need to concatenate the values of
2 fields ('supplier_id' and 'product_id') to form the unique 'id' of
each document.
However there are cases where some documents only have 'supplier_id'
field.
This will result in
Dear ,
Can you please guide us on how to configure multiple Solr master servers?
If someone has configured this, are we able to check that configuration in
the Solr admin dashboard?
If we complete a multiple-server Solr configuration, how does it sync with
the index?
thanks,
Arvind
I'm struggling with a situation that I think may be a bug.
The LukeRequestHandler is not returning all the fields that exist in a
collection with 12 shards on 12 nodes (1 shard on each node).
Running the request "http://localhost:8983/solr/collection/admin/luke" on
each node, the list of fields is t
Hi,
Before applying the tokenizer, you can replace your special symbols with
some phrase to preserve them, and after tokenizing you can replace them
back.
For example:
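A minimal sketch of this idea as a Solr field type, assuming the real
PatternReplaceCharFilterFactory and PatternReplaceFilterFactory classes;
the field-type name and the placeholder token "iPLUSd" are made up for
illustration:

```xml
<fieldType name="text_preserve" class="solr.TextField">
  <analyzer>
    <!-- before tokenizing: protect "i+d" by mapping it to a placeholder -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="i\+d" replacement="iPLUSd"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- after tokenizing: map the placeholder back -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="iPLUSd" replacement="i+d"/>
  </analyzer>
</fieldType>
```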
Thanks,
Zahid iqbal
On Mon, May 22, 2017 at 12:57 AM, Fundera Developer <
funderadevelo...@outlook.com> wrote:
> Hi all,
>
> I am a b
Hi all,
Setup: Standalone Solr with a "demo" collection name
In HttpSolrCall.getAuthCtx - it is creating a CollectionRequest where
collectionRequest.collectionName == null when I call:
curl http://localhost:8983/solr/demo/get?id=xyz
Is there a reason why it doesn't try to extract the collection
Rick Leir-2 wrote
> Yes! And the join queries get complicated. Yonik has some good blogs on
> this.
>
> On May 19, 2017 11:05:52 AM EDT, biplobbiswas <
> revolutionisme+solr@
> > wrote:
>>Wait, if I understand correctly, the documents would be indexed like
>>that but
>>we can get back the docum
Hi Colm, what do you mean when you refer to "Standalone Solr". Did you
setup Solr Cloud or just a single Solr instance?
Thnx
On Mon, May 22, 2017 at 8:50 AM, Colm O hEigeartaigh
wrote:
> Hi all,
>
> Setup: Standalone Solr with a "demo" collection name
>
> In HttpSolrCall.getAuthCtx - it is cr
Just a single Solr instance.
Colm.
On Mon, May 22, 2017 at 2:47 PM, Susheel Kumar
wrote:
> Hi Colm, what do you mean when you refer to "Standalone Solr". Did you
> setup Solr Cloud or just a single Solr instance?
>
> Thnx
>
> On Mon, May 22, 2017 at 8:50 AM, Colm O hEigeartaigh
> wrote:
>
>>
Hello!
Since you are talking about Banana, you might be interested in faceting.
You can probably have child docs in results and facet on them, but this
gives child-level counts. If you need parent-level counts by child fields,
there are two ways to do so: see
http://blog-archive.griddynamics.co
You can also use any of the other tokenizers. WhitespaceTokenizer for
instance. There are a couple that use regular expressions. Etc. See:
https://cwiki.apache.org/confluence/display/solr/Tokenizers
Each one has its considerations. WhitespaceTokenizer won't, for
instance, separate out punctuation
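As a hedged illustration, a field type using WhitespaceTokenizerFactory
might look like the sketch below (the field-type name is made up); because
it splits on whitespace only, punctuation stays attached to tokens:

```xml
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- splits on whitespace only; "foo," is kept as one token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```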
Luke really doesn't operate at a level that knows about collections
and the like, see: https://issues.apache.org/jira/browse/SOLR-8127.
So far there hasn't been much interest in extending it to the
collection level particularly because it's intended to get you
low-level index characteristics.
Not
Eirik:
That code is 4 years old and for Lucene 4. I doubt it applies cleanly
to the current code base, but feel free to give it a try; it's just not
guaranteed.
I know of no other Vietnamese analyzers available.
Dat is active in the community; I don't know whether he has plans to
update/commit that
this will likely be "interesting" from a performance perspective. You
might try Streaming, especially StreamingExpressions and ParallelSQL
depending on what you need this for.
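For instance, a cross-collection join can be expressed as a streaming
expression roughly like the sketch below; the collection names and the
join key are hypothetical, and /export handlers with matching sorts are
assumed:

```
innerJoin(
  search(suppliers, q="*:*", fl="supplier_id,name", sort="supplier_id asc", qt="/export"),
  search(products, q="*:*", fl="supplier_id,product_id", sort="supplier_id asc", qt="/export"),
  on="supplier_id"
)
```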
Best,
Erick
On Mon, May 22, 2017 at 12:05 AM, Damien Kamerman wrote:
> I use a router.field so docs that I join from/to
Ok ... then I have no way to know the full list of fields in my collection
without doing a LukeRequest to all of the shards and merging at the end,
is that right?
Streaming expressions don't allow the * wildcard, and the LukeRequest
doesn't return all fields .. no way to pull all data from a collection in
: I've been using cursorMark for quite a while, but I noticed that sometimes
: the value is huge (more than 8K). It results in Request-URI Too Long
FWIW: cursorMark values are simple "string safe" encoded forms of sort
fields -- so my guess is you are sorting on some really long string
values?
I have an English book whose contents I have indexed successfully into a
field called 'content', with the following properties:
so if I need to return the count of a specific term regex, e.g. '*olomo*',
then my document should contain 2 and give me 'Solomon' with a term
frequency of 2.
I've t
Thank you Zahid and Erik,
I was going to try the CharFilter suggestion, but then I had a doubt. I see
the indexing process, and how the appearance of 'i+d' would be handled, but
what happens at query time? If I use the same filter, I could remove '+'
chars that are added by the user to identify co
Fundera,
You need a regex which matches a '+' with non-blank chars before and after.
It should not replace a '+' preceded by whitespace; that is important in
Solr.
This is not a perfect solution, but might improve matters for you.
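One hedged sketch of such a regex, applied as a Solr char filter; the
"PLUS" placeholder token is made up, and the capture groups ensure a '+'
surrounded by whitespace is left alone:

```xml
<!-- protect a '+' only when it has non-blank chars on both sides -->
<charFilter class="solr.PatternReplaceCharFilterFactory"
            pattern="(\S)\+(\S)" replacement="$1PLUS$2"/>
```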
Cheers -- Rick
On May 22, 2017 1:58:21 PM EDT, Fundera Develope
I have a solrcloud collection with 2 shards and 4 replicas. The replicas
for shard 1 have different numbers of records, so different queries will
return different numbers of records.
I am not certain how this occurred, it happened in a collection that was a
cdcr target.
Is there a way to limit a
You can ping individual replicas by addressing to a specific replica
and setting distrib=false, something like
http://SOLR_NODE:port/solr/collection1_shard1_replica1/query?distrib=false&q=..
But one thing to check first is that you've committed. I'd:
1> turn off indexing on the source c
If you want all the replicas for shard1 on the same port then I think the
rule is: 'shard:shard1,replica:port:8983'
On 22 May 2017 at 18:47, Bernd Fehling
wrote:
> I tried many settings with "Rule-based Replica Placement" on Solr 6.5.1
> and came to the conclusion that it is not working at all.
Hi Rick
My apologies, I did not make myself clear on the values of the fields. They
are numbers.
I used 'ts1', 'sup1' and 'pdt1' for simplicity and for ease of
understanding instead of the actual numbers.
You mentioned this design has the potential for (in error cases)
concatenating id's incorre
Thanks for bringing up the performance perspective. Is there any benchmark
on join performance when the number of shards is more than 10 and documents
are indexed based on router.field?
Are you suggesting, instead of router.field, going for streaming
expressions, or using join with router.field and then going for
Hi, sorry that this reply is not an answer to your post, but I want to know
whether the graph is working for you as expected. Is the traversal working
fine in the graph?
I posted a question over here,
http://lucene.472066.n3.nabble.com/Graph-traversel-td4331207.html#a4331799
but no response.
S
Hello, my name is weixiaofeng. I'm from China, and I'm a Java developer.
Recently we used Solr to implement search over big data. We ran into
trouble with the CDCR (Cross Data Center Replication) module of SolrCloud.
The goal of the project is to replicate data to multiple data centers to
support
Okay. Thanks Shawn.
I am using Chef for deploying SolrCloud as a service. The chef-client runs
every 30 minutes and hence the script "install_solr_service" runs every 30
minutes. I changed that.
On Fri, May 19, 2017 at 5:20 PM, Shawn Heisey wrote:
> On 5/19/2017 5:05 PM, Chetas Joshi wrote:
> >
No, that is way off, because:
1. you have no "tag" defined.
shard and replica can be omitted and they will default to wildcard,
but a "tag" must be defined.
2. replica must be an integer or a wildcard.
Regards
Bernd
Am 23.05.2017 um 01:17 schrieb Damien Kamerman:
> If you want all the repli