Hi Faraz,
It is a bit worse than that - it also needs to calculate the score, so for each
matching doc of one query part it has to check whether it appears in the results of
the other query parts. If you use the term query parser, you avoid calculating the
score - all docs will have score 1.
Solr is based on Lucene, which
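A minimal sketch of the constant-score alternative being described here; the field
names are illustrative, not from this thread:
  q=*:*&fq={!term f=category}laptop&fq={!term f=brand}acme
Each {!term ...} clause is a straight term lookup in a filter, so no scores are
computed and every match effectively has score 1.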
Uff... I see.. thx for the explanation :)
On 30.11.2017 at 3:13 PM, "Emir Arnautović" <
emir.arnauto...@sematext.com> wrote:
> Hi Faraz,
> It is a bit worse than that - it also needs to calculate score, so for
> each matching doc of one query part it has to check if it appears in
> results of other query parts.
No, it doesn't. The payload parsers currently just do simple tokenization, with no
special syntax supported.
Erik
> On Nov 30, 2017, at 02:41, John Anonymous wrote:
>
> I would like to use wildcards and fuzzy search with the payload_check query
> parser. Are these supported?
>
> {!payload_check f
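For context, a minimal sketch of the syntax in question (field name and payload
values are illustrative, not from the thread); the query terms after the closing
brace are plainly tokenized, which is why wildcard and fuzzy syntax is not
interpreted:
  q={!payload_check f=payloads_dpf payloads='VERB NOUN'}searching stuff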
Hi,
I am trying to calculate the total number of soft commits, auto commits and
hard commits from the Solr logs. Can you please check whether the below
commands are correct?
Let me know how to find the total softcommit, hardcommit and autocommit
from the logs.
*1. totalcommit=`cat $solrlogfile | g
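A rough sketch of one way to count them, assuming the commit log lines in your Solr
version contain a softCommit=true/false flag (verify the exact strings against a
sample of your own solr.log before trusting the counts):
  softcommits=$(grep -c 'softCommit=true'  "$solrlogfile")
  hardcommits=$(grep -c 'softCommit=false' "$solrlogfile")
  # distinguishing auto commits usually means matching on the logger/class that
  # triggered them -- check what an auto-commit line looks like in your log and
  # grep for that string separately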
Hello, I have a use case where I need to dedupe documents in each group based
on a particular field:
example:
doc1 = { field_a=1 field_b=2 }
doc2 = { field_a=1 field_b=2 }
doc3 = { field_a=1 field_b=3 }
doc4 = { field_a=2 field_b=3 }
doc5 = { field_a=2 field_b=3 }
and I want to run "Group by
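One possible direction, sketched with the collapse query parser; this assumes a
dedupe_key field populated at index time as the concatenation of field_a and
field_b, which is not something the poster already has:
  q=*:*&fq={!collapse field=dedupe_key}
The collapse filter keeps one document per (field_a, field_b) pair; a group-by on
field_a (regular result grouping or faceting) can then be applied on top of the
deduplicated result set.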
Can somebody help me understand how Solr wildcard search works?
If I search for the term “ship*” I get many strings in the results,
like “Shipping Weight”, “Ship From”, “Shipping Calculator”, etc.
But if I search for “shipp*” I don’t get any results.
In the best we trust
Georgy
I have an issue with the spellcheck.q parameter. I think it is a bug.
If I search without specifying the spellcheck.q parameter, then I get
spellcheck suggestions.
Query: /select?q=text_en-us:baring&spellcheck.dictionary=en-us&spellcheck=on
Result: 1 11 17
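For comparison, the two request shapes under discussion (same handler, field and
dictionary as in the original query); the second passes the raw term to the
spellchecker explicitly via spellcheck.q:
  /select?q=text_en-us:baring&spellcheck=on&spellcheck.dictionary=en-us
  /select?q=text_en-us:baring&spellcheck.q=baring&spellcheck=on&spellcheck.dictionary=en-us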
George,
When you get those results it could be due to stemming.
Wildcard processing expands your term to multiple terms, OR'd together. It also
takes you down a different analysis pathway, as many analysis components do not
work with multiple terms. Look into the Solr Admin console, and use the a
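A worked illustration of that point, assuming a Porter-stemmed field (which later
messages in the thread confirm):
  index time:  "Shipping Weight" -> tokenize/lowercase/stem -> [ship] [weight]
  query time:  ship*  -> prefix query on the literal token "ship"  -> matches [ship]
               shipp* -> prefix query on the literal token "shipp" -> matches nothing
Wildcard terms bypass the stemmer, so the query-side token never shrinks to "ship".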
I wish to understand whether I can do something to get the term "shipping" in the
results when searching for "shipp*".
Here is the field definition:
Is anything else important? Most configuration parameters are at their Apache
Solr 7.1.0 defaults.
In t
As Rick raised, the most important aspect here is that the phrase is broken
into multiple terms ORed together.
I believe that if the use case requires performing wildcard searches on phrases,
we would need to store the entire phrase as a single term in the index,
which is probably not happening right now a
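A sketch of what "the entire phrase as a single term" could look like in the schema;
the field and type names are illustrative, not the poster's actual config:
  <fieldType name="text_whole" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
  <field name="title_whole" type="text_whole" indexed="true" stored="false"/>
  <copyField source="title" dest="title_whole"/>
Against title_whole, "Shipping Weight" is indexed as the single token
"shipping weight", so a prefix query like title_whole:shipp* would match it.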
The initial question wasn't about a phrasal search, but I largely agree that
different query parsers handle the analysis chain differently for multi-term queries.
Yes, Porter is crazily aggressive. USE WITH CAUTION!
As has been pointed out, use the Solr admin window and the "debug" in the query
option to see
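For the record, the quickest way to see that is to append debug output to the query
(the field name here is illustrative):
  /select?q=title:shipp*&debug=query
The parsed query shows up in the debug section of the response, so you can see
exactly what the wildcard term was rewritten to.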
I understand the stemming reason. Thank you.
What do you suggest using for stemming instead of "Porter"? I guess it
wasn't chosen intentionally.
In the best we trust
Georgy Nevsky
-Original Message-
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Thursday, November 30, 2017 8
On 11/30/2017 4:36 AM, Puppy Linux Distros wrote:
I am trying to calculate the total number of soft commits, auto commits and
hard commits from the Solr logs. Can you please check whether the below
commands are correct?
Let me know how to find the total softcommit, hardcommit and autocommit
from th
At the very least the English possessive filter, which you have. Great!
Depending on what your query log analysis finds -- perhaps users are pretty
much only searching on nouns? -- you might consider
EnglishMinimalStemFilterFactory.
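A rough sketch of a field type along those lines; the name and tokenizer choice are
assumptions, not the poster's schema:
  <fieldType name="text_en_min" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EnglishPossessiveFilterFactory"/>
      <filter class="solr.EnglishMinimalStemFilterFactory"/>
    </analyzer>
  </fieldType>
With the minimal stemmer, "shipping" stays "shipping" in the index, so the earlier
shipp* prefix query would match.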
I wouldn't say that Porter was or wasn't chosen intentionally
A slightly more refined answer... In my experience with the systems I've
worked with, Porter and other stemmers can be useful as a "fallback field" with
a really low boost, but you should be really careful if you're only searching
on one field.
Cannot recommend Doug Turnbull and John Berryman'
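One way that fallback-field idea tends to be wired up, sketched with edismax and
made-up field names (text_min lightly stemmed, text_porter Porter-stemmed):
  q=shipping&defType=edismax&qf=text_min^2.0 text_porter^0.1
The Porter field can still recall documents the lightly stemmed field misses, but
its low boost keeps it from dominating the ranking.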
I'm running into an issue with the initial CDCR bootstrapping of an existing
index. In short, after turning on CDCR only the leader replica in the target
data center has the documents replicated; they do not exist in any of
the follower replicas in the target data center. All subsequent
Thanks Shawn, it all mainly made sense.
I took the hint and looked at both solr.in.cmd and solr.in.sh. Clearly setting
ZK_HOST is a first step. I am sure this is explained somewhere, but I
overlooked it.
From here, once I have Solr installed, I can run the Control Script to upload a
config se
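For reference, the Control Script steps being described look roughly like this
(ZooKeeper hosts, config name and collection name are placeholders):
  bin/solr zk upconfig -z zk1:2181,zk2:2181,zk3:2181 -n myconfig -d /path/to/conf
  bin/solr create -c mycollection -n myconfig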
Ok, thanks. Do you know if there are any plans to support special syntax
in the future?
On Thu, Nov 30, 2017 at 5:04 AM, Erik Hatcher
wrote:
> No, it doesn't. The payload parsers currently just do simple tokenization,
> with no special syntax supported.
>
> Erik
>
> > On Nov 30, 2017, at 02:41, John
Hello,
We already discussed this problem five years ago [1]. In short: documents in
foreign languages are scored higher for some terms.
It was solved back then by using docCount instead of maxDoc when calculating
idf, and it worked really well! But, probably due to index changes, the problem is
ba
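For context (this is the stock Lucene behaviour, not necessarily whatever custom
similarity is in play here, and the exact formula varies by Lucene version and
similarity), the idf term is roughly
  idf(t) = 1 + log(docCount / (docFreq(t) + 1))
where docCount counts only documents that actually have a value in the field being
scored, while maxDoc counts every document in the index. With maxDoc, a sparsely
populated per-language field gets an inflated idf, which is the kind of over-scoring
of foreign-language documents being described.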
Hi Tom,
I see what you are saying and I too think this is a bug, but I will confirm
once I look at the code. Bootstrapping should happen on all the nodes of the
target.
Meanwhile, can you index more than 100 documents in the source and do the
exact same experiment again? Followers will not copy the entir
I’ve occasionally considered using Unicode language tags (U+E0001 and friends)
on each term. That would make a term specific to a language, so we would get
[en]LaserJet, [fr]LaserJet, [de]LaserJet, and so on. But that is a pretty big
hammer, because it restricts matches to the same language. If t
This is unfortunately not what we want. Some customers use filters to restrict
language, but some customers don't. They want to be able to find documents in
all languages, so we use user preference to get their local language on top.
Except for very relevant documents in foreign languages, hence
Hi Amrit,
Starting with more documents doesn't appear to have made a difference. This
time I tried with >1000 docs. Here are the steps I took:
1. Deleted the collection on both the source and target DCs.
2. Recreated the collections.
3. Indexed >1000 documents on the source data center, hard committed
Expanding the query to use both the tagged and untagged term might work. I’m
not sure the effect would be a lot different than boosting the preferred
language.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 30, 2017, at 8:35 AM, Markus Jelsma
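A sketch of the boost-the-preferred-language variant, with an assumed language field
and edismax (not necessarily how this index is actually set up):
  q=laserjet&defType=edismax&bq=language:fr^1.5
All languages remain matchable; the user's preferred language just gets a nudge
upward in the ranking.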
Tom,
This is very useful:
> I found a way to get the follower replicas to receive the documents from
> the leader in the target data center, I have to restart the solr instance
> running on that server. Not sure if this information helps at all.
You have to issue a hard commit on the target after the
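For concreteness, an explicit hard commit against the target collection would look
like this (host and collection name are placeholders):
  curl 'http://target-host:8983/solr/mycollection/update?commit=true'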
Shawn,
Thanks for the response! Yes, that was it, an older version unexpectedly in the
classpath.
And for the benefit of anyone who searches the list archive with a similar
debugging need, it's pretty easy to print out the classpath from ant's
build.xml:
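One common way to do that, assuming the classpath is already defined as an Ant path
with id="classpath" (this is an illustration, not necessarily what was actually used):
  <property name="classpath.echo" refid="classpath"/>
  <echo message="classpath = ${classpath.echo}"/>
Assigning a property from the path's refid flattens it to a printable string.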
Hi all
I have just been looking at solr-security-proxy, which seems to be a great
little app to put in front of Solr (link below). But would it make more sense
to use a whitelist of Solr parameters instead of a blacklist?
Thanks
Rick
https://github.com/dergachev/solr-security-proxy
Hi Amrit, I tried issuing hard commits to the various nodes in the target
cluster and it does not appear to cause the follower replicas to receive the
initial index. The only way I can get the replicas to see the original index is
by restarting those nodes (and take care not to restart the leade
Hi Walter,
I read the following line in the reference docs; what does it mean by "as long
as the global similarity allows it"?
"
A field type may optionally specify a <similarity/> that will be used when
scoring documents that refer to fields with this type, as long as the
"global" similarity for the collection al
This JIRA also throws some light. There is a discussion of encoding norms
during indexing. The contributor eventually comments that "norms" encoded
by different similarities are compatible with each other.
On Thu, Nov 30, 2017 at 5:12 PM, Nawab Zada Asad Iqbal
wrote:
> Hi Walter,
>
> I read the follo
Tom,
(and take care not to restart the leader node otherwise it will replicate
> from one of the replicas which is missing the index).
How is this possible? OK, I will look more into it. I'd appreciate it if someone
else also chimes in if they have a similar issue.
Amrit Sarkar
Search Engineer
Lucidworks,