sort="id asc"
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Feb 10, 2020, at 9:50 PM, Tim Casey wrote:
>
> Walter,
>
> When you do the query, what is the sort of the results?
>
> tim
>
> On Mon, Feb 10, 2020 at 8:44 PM Walter Underwood
> wrote
Hi Pratik,
Shingle filter should do that.
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> On 10 Feb 2020, at 18:57, Pratik Patel wrote:
>
> Thanks for the reply Emir.
>
> I will be exploring the option of creating a custom filter.
Walter,
When you do the query, what is the sort of the results?
tim
On Mon, Feb 10, 2020 at 8:44 PM Walter Underwood
wrote:
> I’ll back up a bit, since it is sort of an X/Y problem.
>
> I have an index with four shards and 17 million documents. I want to dump
> all the docs in JSON, label each one with a classifier, then load them back in
> with the labels.
I’ll back up a bit, since it is sort of an X/Y problem.
I have an index with four shards and 17 million documents. I want to dump all
the docs in JSON, label each one with a classifier, then load them back in with
the labels. This is a one-time (or rare) bootstrap of the classified data. This
w
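The dump loop Walter describes is the classic cursorMark pattern. A minimal sketch, assuming a unique sort field like "id asc" (the fetch function below is a hypothetical stand-in for the real HTTP call to /select, so the loop logic can be shown without a live Solr):

```python
def fetch_page(cursor, page_size, _docs=tuple(range(10))):
    # Stand-in for Solr: in reality this would be an HTTP GET to /select with
    # q=*:*&sort=id asc&rows=<page_size>&cursorMark=<cursor>.
    start = 0 if cursor == "*" else int(cursor)
    page = list(_docs[start:start + page_size])
    return page, str(start + len(page))

def dump_all(page_size=3):
    docs, cursor = [], "*"
    while True:
        page, next_cursor = fetch_page(cursor, page_size)
        docs.extend(page)
        if next_cursor == cursor:  # Solr signals the end by repeating the mark
            break
        cursor = next_cursor
    return docs
```

In real use the first cursorMark is the literal `*`, and iteration stops when Solr returns the same nextCursorMark it was given, exactly as simulated above.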
Possibly worth mentioning, although it might not be appropriate for
your use case: if the fields you're interested in are configured with
docValues, you could use streaming expressions (or directly handle
thread-per-shard connections to the /export handler) and get
everything in a single shot without paging.
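For the docValues route, the request is a plain GET against the /export handler; a sketch of building that URL (collection and field names here are hypothetical):

```python
from urllib.parse import urlencode

def export_url(base, collection, fl, sort, q="*:*"):
    # /export streams the entire result set in one pass, but every field in fl
    # and the sort field must be configured with docValues.
    params = urlencode({"q": q, "sort": sort, "fl": fl, "wt": "json"})
    return f"{base}/{collection}/export?{params}"

url = export_url("http://localhost:8983/solr", "mycollection", "id,title", "id asc")
```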
Any field that’s unique per doc would do, but yeah, that’s usually an ID.
Hmmm, I don’t see why separate queries for 0-f are necessary if you’re firing
at individual replicas. Each replica should have multiple UUIDs that start with
0-f.
Unless I misunderstand and you’re just firing off, say, 16 parallel queries.
> On Feb 10, 2020, at 2:24 PM, Walter Underwood wrote:
>
> Not sure if range queries work on a UUID field, ...
A search for id:0* took 260 ms, so it looks like they work just fine. I’ll try
separate queries for 0-f.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org
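The 0-f idea amounts to partitioning the UUID space by leading hex digit; a quick sketch of both the query list and why the partition is clean (every UUID lands in exactly one bucket):

```python
import uuid

HEX_DIGITS = "0123456789abcdef"

def prefix_queries(field="id"):
    # One wildcard query per leading hex digit: id:0*, id:1*, ..., id:f*
    return [f"{field}:{d}*" for d in HEX_DIGITS]

def bucket_by_prefix(ids):
    # Each UUID falls into exactly one of the 16 buckets, so the per-prefix
    # queries can run in parallel with no overlap and no gaps.
    buckets = {d: [] for d in HEX_DIGITS}
    for i in ids:
        buckets[str(i)[0]].append(i)
    return buckets
```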
I’ll give that a shot.
Not sure if range queries work on a UUID field, but I have thought of
segmenting the ID space and running parallel queries on those.
Right now it is sucking over 1.6 million docs per hour, so that is bearable.
Making it 4X or 16X faster would be nice, though.
wunder
Walter Underwood
Not sure whether cursorMark respects distrib=false, although I can easily see
there being “complications” here.
Hmmm, whenever I try to use distrib=false, I usually fire the query at the
specific replica rather than use the shards parameter. IDK whether that’ll make
any difference.
https://nod
Hi All - Getting this error when trying to split a shard. HDFS has
space available, but it looks like it is using the local disk storage
value instead of available HDFS disk space. Is there a workaround?
Thanks!
{
  "responseHeader": {
    "status": 0,
    "QTime": 6
  },
  "Op
I tried to get fancy and dump our content with one thread per shard, but it did
distributed search anyway. I specified the shard using the “shards” param and
set distrib=false.
Is this a bug or expected behavior in 6.6.2? I did not see it mentioned in the
docs.
It is working fine with a single
Thanks for the reply Emir.
I will be exploring the option of creating a custom filter. It's good to
know that we can consume more than one token from the previous filter and emit
a different number of tokens. Do you know of any existing filter in Solr
which does something similar? It would be greatly helpful.
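As a plain-Python sketch of the kind of filter being discussed, here is a generator that consumes more than one token from the upstream stream and emits a different number of tokens — roughly what ShingleFilter does with shingle size 2 (the underscore separator is an arbitrary choice for the example):

```python
def bigram_shingles(tokens):
    # Consumes adjacent token pairs and emits one joined shingle per pair,
    # so n input tokens become n-1 output tokens.
    prev = None
    for tok in tokens:
        if prev is not None:
            yield f"{prev}_{tok}"
        prev = tok
```

A real Lucene TokenFilter would do the equivalent with incrementToken() and attribute copying, but the consume-many/emit-different-count shape is the same.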
Hi all
We run Solr 8.2.0
* with Amazon Corretto 11.0.5.10.1 SDK (java arguments shown in [1]),
* on Ubuntu 18.04
* on AWS EC2 m5.2xlarge with 8 CPUs and 32GB of RAM
* with -Xmx16g [1].
We have migrated from Solr 3.5, and on replicas of a big (16GB) core we have
started to suffer degraded service. The r
You’ve misconfigured the startup. Although looking at the
script help it is a little confusing.
The -z parameter should be the _ensemble_. Pointing each Solr
instance three times at the same ZK instance is not at all what
you need to do.
You should start them up with the “-z” parameter set to so
I have created three different SolrCloud instances running on three different
ports, with an external three-instance ZooKeeper ensemble linked to them. When
I load data into one SolrCloud instance, it is successfully accessible from
all three SolrCloud instances.
E.g:
Zookeeper
./zkServer start zoo.cfg
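To illustrate the ensemble point (hostnames below are hypothetical): every Solr node should get the same -z value naming the whole ZooKeeper ensemble, not one ZK host repeated per node.

```python
# Hypothetical ZooKeeper hosts; every Solr node gets the SAME -z string.
zk_hosts = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]
ensemble = ",".join(f"{h}:2181" for h in zk_hosts)

# The startup commands for the three Solr instances then all share that value:
start_commands = [
    f"bin/solr start -cloud -p {port} -z {ensemble}" for port in (8983, 8984, 8985)
]
```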
There isn’t really an “industry standard”, since the reasons someone
wants this kind of behavior vary from situation to situation. That said,
Solr has RerankQParserPlugin that’s designed for this.
Best,
Erick
> On Feb 10, 2020, at 4:23 AM, Nitin Arora wrote:
>
> I am looking for an efficient way of setting the MM (minimum should match)
> parameter for my Solr search queries.
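A minimal sketch of a request using the rerank parser (the query strings and numbers are purely illustrative): the main query is scored as usual, and the top reRankDocs hits are rescored by the query in $rqq.

```python
from urllib.parse import urlencode

params = {
    "q": "cheap laptop",                                            # main query
    "rq": "{!rerank reRankQuery=$rqq reRankDocs=200 reRankWeight=3}",
    "rqq": "category:refurbished",                                  # rescoring query
}
query_string = urlencode(params)
```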
I am looking for an efficient way of setting the MM (minimum should match)
parameter for my Solr search queries. As we go from MM=100% to MM=0%, we
move from lots of zero-result queries on one hand to too many irrelevant
results (which may then get boosted by other factors) on the other. I can
think
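One common way to frame the trade-off is a graduated mm that is strict for short queries and relaxed for long ones; edismax accepts conditional mm specs of that shape (e.g. `2<-1 6<75%`). The thresholds below are purely illustrative, not a recommendation:

```python
def graduated_mm(num_terms):
    # Illustrative policy only: require every term for very short queries,
    # allow one miss for mid-length queries, and 75% for long queries.
    if num_terms <= 2:
        return num_terms
    if num_terms <= 4:
        return num_terms - 1
    return int(num_terms * 0.75)
```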