Hi,
There's no Mongolian stemmer in Snowball, the stemmer project Lucene
uses. I found one paper discussing how one might lemmatize Mongolian:
https://www.researchgate.net/publication/220229332_A_lemmatization_method_for_Mongolian_and_its_application_to_indexing_for_information_retrieval
https:
Hey,
I’m going to bump our schema version from 1.5 to 1.6 to get the implicit
useDocValuesAsStored=true. Would this require a reindex?
Thanks
Karl
This e-mail is sent on behalf of Auto Trader Group Plc, Registered Office: 1
Tony Wilson Place, Manchester, Lancashire, M15 4FN (Registered in Englan
Hi Erick,
thanks very much for this information, it was immensely useful, I always
had the same question!
I'm now seeing the Analysis page and finally I don't have to rely on an
external online stemmer to see what solr *probably* stemmed the term to!!
But I still can't make the asterisk and questio
Hi,
We’re periodically seeing an ASYNC task to RELOADCOLLECTION never complete,
it’s just permanently “running”:
❯ curl -s http://solr.search-solr.prod.k8.atcloud.io/solr/admin/collections\?action\=REQUESTSTATUS\&requestid\=1581585716 | jq .
{
"responseHeader": {
"status": 0,
"QTime":
When performing a rolling restart we see:
09:43:31.890
[OverseerThreadFactory-42-thread-5-processing-n:solr-5.search-solr.prod.k8.atcloud.io:80_solr]
ERROR org.apache.solr.cloud.OverseerTaskProcessor -
:org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode
= Session exp
Folks,
I am seeing very strange (bad) wildcard behavior (solr 8).
"kinase" finds hits as expected.
"kin*ase" and "kin*se" find 0 results. "kinase*" matches only values like
"kinase," and "kinase-" but not "kinase"
I have done the analysis as Erick suggested (thanks!) but it is not helping
Hi,
I could be wrong, but I'm starting to think that it has to do with the
fieldType. In our case, wildcards don't seem to work at all with text_en
types, but they do work with string types.
On Thu, Feb 13, 2020 at 1:52 PM Fischer, Stephen <
sfisc...@pennmedicine.upenn.edu> wrote:
> Folks,
>
> I
Also, if helpful, here is our solrconfig.xml:
https://github.com/VEuPathDB/SolrDeployment/blob/master/configsets/site-search/conf/solrconfig.xml
Thanks again, from a Solr Newbie,
steve
-Original Message-
From: Fischer, Stephen
Sent: Thursday, February 13, 2020 7:52 AM
To: solr-user@lu
Thanks Erick, a follow-up question for RerankQParser:
How complex can the rerank query itself be? Can we add multiple boost
factors based on different conditions - say, if category is X boost by 2,
if brand is Y boost by 3, etc.?
On Mon, 10 Feb 2020 at 18:12, Erick Erickson
wrote:
> There isn’t
It can be basically anything you can do with a standard Solr query.
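As a rough illustration of the point above, here is a minimal Python sketch that assembles request parameters for a rerank query carrying two conditional boosts. The field names category and brand and the values X and Y come from the question; the main query "laptop" and the reRankDocs and reRankWeight values are assumptions chosen for illustration, and nothing is sent to a server.

```python
from urllib.parse import urlencode


def build_rerank_params(main_query, boosts, rerank_docs=1000, rerank_weight=3):
    """Build Solr request parameters that rerank the top hits of main_query.

    boosts maps (field, value) pairs to a boost factor, for example
    {("category", "X"): 2, ("brand", "Y"): 3}.
    """
    # Each clause is an ordinary term query with a ^boost suffix.
    clauses = [f"{field}:{value}^{factor}" for (field, value), factor in boosts.items()]
    return {
        "q": main_query,
        # The rerank parser reads the actual rerank query from the rqq parameter.
        "rq": f"{{!rerank reRankQuery=$rqq reRankDocs={rerank_docs} reRankWeight={rerank_weight}}}",
        "rqq": " ".join(clauses),
    }


# Illustrative values only: "laptop" is a made-up main query.
params = build_rerank_params("laptop", {("category", "X"): 2, ("brand", "Y"): 3})
query_string = urlencode(params)  # URL-encoded form, ready to append to a /select URL
```

Keeping the rerank query in its own rqq parameter means it can grow into any standard query without fighting local-params quoting.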
> On Feb 13, 2020, at 9:09 AM, Nitin Arora wrote:
>
> Thanks Erick, a follow-up question for RerankQParser:
> How complex can the rerank query itself be? Can we add multiple boost
> factors based on different conditions - say,
That should be OK. There were no code changes necessary for that upgrade. See
SOLR-13363
> On Feb 12, 2020, at 5:34 PM, Rahul Goswami wrote:
>
> Hello,
> We are running a SolrCloud (7.2.1) cluster and upgrading to Solr 7.7.2. We
> run a separate multi node zookeeper ensemble which currently run
Be aware that if you search a field with stemming, then the index will only
contain the stems, e.g. «cars» and «caring» may both be indexed as «car», and when you
do a wildcard search, all analysis is skipped, so you are only targeting the
exact tokens that happen to be in that field. Thus a search for
Remove the stopword and stemmer filters from your schema and reindex.
Removing stopwords means you can never match “vitamin a”.
Stemming interferes with wildcard matches. Either stem or do wildcards on a
field, not both.
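To make the stemming-versus-wildcard conflict concrete, here is a small Python sketch of a toy index. It is a simplified model, not Solr's implementation: toy_stem is a hypothetical stand-in for a real stemmer, and the wildcard match simply compares the raw pattern against the stemmed terms that were indexed.

```python
from fnmatch import fnmatchcase


def toy_stem(token):
    """Hypothetical stand-in for a real stemmer: strips a few common suffixes."""
    for suffix in ("ing", "es", "e", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def index_terms(text):
    """Index-time analysis: lowercase, split on whitespace, then stem."""
    return {toy_stem(tok) for tok in text.lower().split()}


def term_search(terms, word):
    """Plain term query: the query word goes through the same stemming."""
    return toy_stem(word.lower()) in terms


def wildcard_search(terms, pattern):
    """Wildcard query: the pattern is not stemmed, so it only sees indexed stems."""
    return any(fnmatchcase(term, pattern) for term in terms)


terms = index_terms("protein kinase activity")  # "kinase" is stored as "kinas"

found_plain = term_search(terms, "kinase")          # query is stemmed to "kinas" too
found_infix = wildcard_search(terms, "kin*ase")     # no indexed term ends in "ase"
found_prefix = wildcard_search(terms, "kinase*")    # no indexed term starts with "kinase"
found_short = wildcard_search(terms, "kin*")        # matches the stem "kinas"
```

In this model the plain query succeeds while "kin*ase" and "kinase*" fail, which mirrors the behavior reported in the thread for a stemmed field.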
Also, what do your users expect to get with wildcard matches? Those are a
: We think this is a bug (silently dropping commits even if the client
: requested "waitForSearcher"), or at least a missing feature (commits being
: the only UpdateRequests not reporting the achieved RF), which should be
: worth a JIRA Ticket.
Thanks for your analysis Michael -- I agree someth
Erick,
Sorry I didn't see this response, for some reason solr-users has stopped being
delivered to my mail box.
The script that adds a field based on the value(s) in some other field doesn't
add a large number of different fields to the index.
The pool_f field only has a total of 11 different v
I also had issues with this factory when creating atomic updates inside it.
They worked, but searchers were never closed, and new ones were opened and stayed
open, with all the issues related to that. Maybe one needs to look into that in
more detail. However - it is a script in the end so
Solr 5.5.4. I have a collection with a single shard and two replicas. Both
are reporting down. No shard leader exists. Each replica is on a different
node. Should it be safe to attempt an ADDREPLICA command? Since there's no
leader I don't know if that will work. This is the cluster state for the
c
Thanks Erick. Also, thanks for that little pointer about the JIRA. Just to
make sure I also checked for the upgrade JIRAs for the other two intermediate
Zookeeper versions 3.4.11 and 3.4.13 between Solr 7.2.1 and Solr 7.7.2 and
they didn't seem to contain any Solr code changes either.
On Thu, Feb 13, 2
Yeah, 3.4.x upgrades were pretty straightforward.
The 3.5.5 upgrade was trickier, but the problems were in the
admin UI. The admin UI uses several “4 letter words” to do its
ZooKeeper reporting, and that required explicit permissions, but IIRC
that all only affected the admin UI reporting about
Adding a new replica won’t do you much good. Since there’s
no leader, it won’t (well, shouldn’t) sync the index.
Did you try the collections API FORCELEADER? It was put in as
a last resort for this kind of situation.
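For reference, a FORCELEADER call is a Collections API request that names the collection and the shard. The Python sketch below only builds the URL; the host, port, collection name, and shard name are placeholders (assumptions), not values from this thread.

```python
from urllib.parse import urlencode


def force_leader_url(base_url, collection, shard):
    """Build a Collections API FORCELEADER request URL for one shard."""
    params = {"action": "FORCELEADER", "collection": collection, "shard": shard}
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder values: substitute your own Solr base URL, collection, and shard.
url = force_leader_url("http://localhost:8983/solr", "mycollection", "shard1")
```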
Best,
Erick
> On Feb 13, 2020, at 3:22 PM, tedsolr wrote:
>
> Solr 5.5.4. I h
You have a 64GB heap. That is extremely unusual. You can only do that if the
instance has 80 GB or more of RAM. If you don’t have enough RAM, the JVM will
start using swap space and cause extremely long GC pauses.
How much RAM do you have?
How did you choose these GC settings?
We have been usi
Robert:
My concern with fixing by adding memory is that it may just be kicking the can
down the road. Assuming there really is a leak, the leaked objects will eventually accumulate
and you’ll hit another OOM. If that were the case, I’d expect a cursory look at
your memory usage to just keep increasing over t
What Walter said. Also, you _must_ leave quite a bit of free RAM for the OS due
to Lucene using MMapDirectory space, see:
https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
Basically until you can get your GC pauses under control, you’ll have an
unstable collection.
How b
Hi there,
We are using AWS EMR as our big data processing cluster. We have like 3TB
of text files where each line denotes a json record which I want to be
indexed into Solr.
I have tried this by batching them and pushing them to the Solr index using
SolrJClient, but I feel that's really slow.
My doubt is
Indexing rates scale pretty linearly with the number of shards, so one
way to increase throughput is to simply create a collection with
more shards. For the initial bulk-indexing operations, you can
go with a 1-replica-per-shard scenario then ADDREPLICA if you need
to build things out.
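A minimal Python sketch of that sequence follows: create the collection with many shards and one replica each for the bulk load, then add replicas shard by shard afterwards. It only builds Collections API URLs. The base URL, the collection name "bulk_docs", and the shard count of 8 are assumptions for illustration, and the shard names assume Solr's default shard1..shardN naming.

```python
from urllib.parse import urlencode


def collections_api_url(base_url, action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{base_url}/admin/collections?{query}"


BASE = "http://localhost:8983/solr"  # assumed Solr base URL
NUM_SHARDS = 8                       # assumed shard count for the bulk load

# Step 1: many shards, a single replica per shard, for the initial bulk indexing.
create_url = collections_api_url(
    BASE, "CREATE", name="bulk_docs", numShards=NUM_SHARDS, replicationFactor=1
)

# Step 2: once indexing is done, add one extra replica to each shard.
add_replica_urls = [
    collections_api_url(BASE, "ADDREPLICA", collection="bulk_docs", shard=f"shard{i}")
    for i in range(1, NUM_SHARDS + 1)
]
```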
However… t
The server has 256 GB of total memory, and the following applications run on it:
Application1 50 GB
Application2 30 GB
Application3 8 GB
Application4 2 GB
Solr shard1 64 GB
Solr shard2 replica 64 GB
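A quick back-of-the-envelope check of these figures, as a Python sketch. It assumes that every process listed runs at the same time, that each figure is memory the process actually holds, and that the shard1 entry is 64 GB; none of that is confirmed by the message.

```python
TOTAL_RAM_GB = 256

# Figures as listed above; the shard1 value of 64 GB is an assumption.
allocations_gb = {
    "Application1": 50,
    "Application2": 30,
    "Application3": 8,
    "Application4": 2,
    "Solr shard1": 64,
    "Solr shard2 replica": 64,
}

committed_gb = sum(allocations_gb.values())
# Whatever is left is all the OS has for its page cache, which Lucene's
# MMapDirectory depends on for fast index access.
left_for_os_gb = TOTAL_RAM_GB - committed_gb
```

Under those assumptions 218 GB is committed and 38 GB is left for the operating system and its file cache.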
Note: Solr shard2 and shard1 repl