I changed the attributes and reloaded the collection, but scores are not changing;
also norm(content_text) is not changing.
I did reindex the documents, but scores are still not changing.
Steps I followed:
1. Created fields using the default similarity.
2. Created the content_text field type without a similarity section.
Hi all,
I am experimenting with different parameters of the BM25 and SweetSpot
similarities.
I changed the Solr field type definition as given below.
I need clarification on whether changing the similarity in a field type
requires reindexing or not.
{"replace-field-type":{
"name":"content_text&
https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-13500
https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-14397
Even though cosine similarity can be us
Hi all,
I've started to look at image similarity. For each image in my documents I
have a couple of vectors that represent the color and shape similarity.
As a pointer to begin with, a colleague suggested I start with this project:
https://github.com/moshebla/solr-vector-scoring
but the pr
Hello,
We are trying to use NGramFilterFactory for approximate search with Solr
7.
We usually use a similarity with no tf and no idf (our similarity extends
ClassicSimilarity, with the tf and idf functions always returning 1).
For ngram search, though, it seems inappropriate since it scores a word
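A minimal sketch of the kind of flat similarity described above, for
Lucene/Solr 7.x; the class name here is made up:

[code]
import org.apache.lucene.search.similarities.ClassicSimilarity;

// A flat ClassicSimilarity: term frequency and inverse document frequency
// are ignored, so scores are driven by boosts and norms only.
public class FlatClassicSimilarity extends ClassicSimilarity {

    @Override
    public float tf(float freq) {
        return 1.0f; // a match counts once, no matter how often the term occurs
    }

    @Override
    public float idf(long docFreq, long docCount) {
        return 1.0f; // rare and common terms get the same weight
    }
}
[/code]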
> different scores primarily influenced by the Term Frequency component.
> Since I am using a threshold to filter the results for a matched record
> based off the SOLR score, a somewhat normalized score is needed.
> Are there any similarity classes th
normalized score is needed.
Are there any similarity classes that are more suitable to my needs?
Thanks,
Tanu
Can someone please look into the following -
https://stackoverflow.com/questions/51978351/solr-custom-similarity-using-a-field-from-the-indexed-document
Thanks!
: So I have the following at the bottom of my schema.xml file
:
: The documentation says "top level element" - so should that actually be
: outside the schema tag?

No, the schema tag is the "root" level element; its direct children are
the "top level elements"
(the wording m
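For reference, a minimal sketch of where a global similarity declaration sits,
i.e. as a direct child of the schema element (the factory class is just an
example):

[code]
<schema name="example" version="1.6">

  <!-- field, fieldType and copyField declarations ... -->

  <!-- a "top level" element: a direct child of <schema> -->
  <similarity class="solr.ClassicSimilarityFactory"/>

</schema>
[/code]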
I'm using SOLR 7.1 and I'm trying to set the similarity factory back to
ClassicSimilarityFactory so that it will behave like SOLR 6 or before.
In the document
https://lucene.apache.org/solr/guide/7_1/other-schema-elements.html#similarity
It says
This default behavior can be ove
Thanks for your answer Rahul. I think I have explained similarity with the
example, assuming the natural order.
I would assume this is a common task for people who use Solr and build
search-based systems.
I am basically looking for any design patterns that people use to achieve the
results as
How do you define similarity? There are various methods that work for
different scenarios. In Solr, depending on which index-time analyzer / tokenizer
you are using, it will treat one company name as similar in one scenario and
not in another.
This seems like a case of data
Hi Team
This is what I want to do:
1. I have 2 datasets with the schema id-number and company-name
2. I want to ultimately be able to link (join or any other means) the 2 data
sets based on the similarity between the company-name fields of the 2 data sets.
Example:
Dataset 1
Id | Company
raries
>Action Technical Temporar
>
>If I search
>
>IDX_CompanyName: (Action AND Technical AND Temporaries AND t/a AND CTR
>AND Corporation)
>
>Under 4.10.2 I see all three in the results
>
>Under 7.1, with the default BM25 similarity, I only see the first
>resul
Hi Rick,
I don't think the issue is BM25 vs TFIDF (the old similarity), it seems more
due to the "matching" logic.
you are asking to match:
"(Action AND Technical AND Temporaries AND t/a AND CTR AND Corporation)"
This (in theory) means that you want to retrieve **
R AND
Corporation)
Under 4.10.2 I see all three in the results
Under 7.1, with the default BM25 similarity, I only see the first result.
Someone on the list suggested that to make 7.1 go back to the similarity
factory used in 4.10.2 I add the following to the schema.xml.
That brings all three re
This JIRA also throws some light. There is a discussion of encoding norms
during indexing. The contributor eventually comments that "norms" encoded
by different similarities are compatible with each other.
On Thu, Nov 30, 2017 at 5:12 PM, Nawab Zada Asad Iqbal
wrote:
> Hi Walter,
&
Hi Walter,
I read the following line in the reference docs; what does it mean by "as long
as the global similarity allows it":
"
A field type may optionally specify a <similarity/> that will be used when
scoring documents that refer to fields with this type, as long as the
"global" similarity for
Thanks Walter
On Mon, Nov 20, 2017 at 4:59 PM Walter Underwood
wrote:
> Similarity is query time.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>
> > On Nov 20, 2017, at 4:57 PM, Nawab Zada Asad Iqbal
> wro
Similarity is query time.
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
> On Nov 20, 2017, at 4:57 PM, Nawab Zada Asad Iqbal wrote:
>
> Hi,
>
> I want to switch to Classic similarity instead of BM25 (default in solr7).
> Do I n
Hi,
I want to switch to Classic similarity instead of BM25 (default in solr7).
Do I need to reindex all cores after this? Or is it only a query time
setting?
Thanks
Nawab
I think this can actually be a good idea, and I think it would require a new
feature type implementation.
Specifically, I think you could leverage the existing data structures (such as
TermVectors) to calculate the matrix and then perform the calculations you
need.
Or maybe there is space for even a
Hi,
I'm trying to extract several similarity measures from Solr for use in a
learning to rank model. Doing this mathematically involves taking the dot
product of several different matrices, which is extremely fast for non-huge
data sets (e.g., millions of documents and queries). Howeve
In addition to that, I still believe More Like This is a better option for
you.
The reason is that the MLT is able to evaluate the interesting terms from
your document (title is the only field of interest for you), and boost them
accordingly.
Related your "80% of similarity", this is m
11.583146 = sum of:
  11.583146 = max of:
    11.583146 = weight(abstract:titl in 1934) [], result of:
      11.583146 = score(doc=1934,freq=1.0 = termFreq=1.0), product of:
        2.0 = boost
        5.503748 = idf(docFreq=74, docCount=18297)
        1.0522962 = tfNorm, computed fro
3.816711E-5 = score(doc=1934,freq=1.0 = termFreq=1.0), product of:
  3.0 = boost
  1.4457239E-5 = idf(docFreq=34584, docCount=34584)
  0.88 = tfNorm, computed from:
    1.0 = termFreq=1.0
    1.2 = parameter k1
    0.75 = parameter b
    3.0 = avgFieldLength
    4.0 = fieldLength
dding &debug=query to the URL? The parsed
query will be especially illuminating.
Best,
Erick
On Mon, Aug 28, 2017 at 4:37 AM, Emir Arnautovic
wrote:
Hi Darko,
The issue is the wrong expectations: title-1-end is parsed to 3 tokens
(guessing) and mm=99% of 3 tokens is 2.97, and it is rounded dow
u have to play with your tokenizers and define what is similarity match
> percentage (if you want to stick with mm).
>
> Regards,
> Emir
>
>
>
> On 28.08.2017 09:17, Darko Todoric wrote:
>>
>> Hm... I cannot make this DisMax work on my Solr...
>>
>
l result in zero (or
one match if you do not filter out original document).
You have to play with your tokenizers and define what is similarity
match percentage (if you want to stick with mm).
Regards,
Emir
On 28.08.2017 09:17, Darko Todoric wrote:
Hm... I cannot make that this DisMax wo
ce/display/solr/The+DisMax+Query+Parser
And then set the minimum match ratio of the OR clauses to 90%.
/JZ
-Original Message-
From: Darko Todoric [mailto:todo...@mdpi.com]
Sent: Friday, August 25, 2017 5:49 PM
To: solr-user@lucene.apache.org
Subject: Search by similarity?
Hi,
I have 90
Yes, that is roughly how MLT works as well. You can also do a full OR-search on
the terms using LuceneQParser.
Markus
-Original message-
> From:Junte Zhang
> Sent: Friday 25th August 2017 18:38
> To: solr-user@lucene.apache.org
> Subject: RE: Search by similarity?
then set the minimum match ratio of the OR clauses to 90%.
/JZ
-Original Message-
From: Darko Todoric [mailto:todo...@mdpi.com]
Sent: Friday, August 25, 2017 5:49 PM
To: solr-user@lucene.apache.org
Subject: Search by similarity?
Hi,
I have 90.000.000 documents in Solr and I need to
Hi,
I have 90,000,000 documents in Solr and I need to compare the "title" of
this document and get all documents with more than 80% similarity. PHP
has "similar_text", but it's not so smart to insert 90M documents into an
array...
Can I do some query in Solr which wil
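A minimal sketch of the DisMax approach suggested in the replies above; the
field name and the 80% threshold are placeholders:

[code]
q=<title text of the source document>
defType=dismax
qf=title
mm=80%
fl=id,title,score
[/code]

Here mm controls how many of the query's optional clauses must match, so it
approximates "at least 80% of the title terms in common" rather than a true
string-similarity percentage.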
Hi Michael,
Using your example, if you have 5 different fields, you could create 5
individual SolrFeatures against those fields. The one tricky thing here is
that you want to use different similarity scoring mechanisms against your
fields. By default, Solr uses a single Similarity class
<ht
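A hedged sketch of what those per-field features could look like in the LTR
feature store; the feature and field names and the ${user_query} external
feature variable are assumptions, not something from the original setup:

[code]
[
  {
    "name": "titleClassicScore",
    "class": "org.apache.solr.ltr.feature.SolrFeature",
    "params": { "q": "{!dismax qf=title_classic}${user_query}" }
  },
  {
    "name": "titleBM25Score",
    "class": "org.apache.solr.ltr.feature.SolrFeature",
    "params": { "q": "{!dismax qf=title_bm25}${user_query}" }
  }
]
[/code]

Each field (title_classic, title_bm25) would be backed by its own field type
carrying the desired similarity, which works around the single default
Similarity mentioned above.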
-170724_shard1_replica1:
Can't load schema schema.xml: Error loading class
'com.sial.similarity.SialBM25Similarity'
What am I missing? Is this plugins api limited to components and request
handlers? If so that's a HUGE limitation, makes it just about useless.
I need it for this simila
Hi all,
I recently prototyped a learning to rank system in Python that produced
promising results, so I'm now looking into how to replicate that process in
our Solr setup. For my Python implementation, I was using a number of
features that were per field text comparisons, e.g.:
1. tfidf_case_t
I have a question about moving a trivial (?) Similarity class from version
4.x to 6.x.
I have queries along these lines: field1:somevalue^0.55
field2:anothervalue^1.4
The score for a document is simply the sum of weights. A hit on field1
alone scores 0.55. Field2 alone scores 1.4. Both fields
Farida:
Can you define a DSL (domain-specific language)?
If you stand a RESTful API on the DSL, you can write a BM25F similarity, with
conditionals maybe controlled by a func query accessing field payloads?
:-)
On 1/21/2017 6:13 AM, Markus Jelsma wrote:
No, this is not possible. A similarity
No, this is not possible. A similarity is an index-time setting because it can
have index-time properties. There is no way to control this at query time.
M
-Original message-
> From:Farida Sabry
> Sent: Saturday 21st January 2017 9:58
> To: solr-user@lucene.apache.org
> Su
Is there a way to set the similarity in a SolrJ query search like we do for a
Lucene IndexSearcher, e.g. searcher.setSimilarity(sim)?
I need to define a custom similarity and at the same time get the fields
provided by SolrInputDocument in the returned results in the
SolrDocumentList results
: Fwiw, here's example of per field similarity
: https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements
Similarities can be defined per field *TYPE* ... not per field (as shown
in the examples on that page)
-Hoss
http://www.lucidworks.com/
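A minimal sketch of a per-field-type similarity; this generally only takes
effect when the global similarity is solr.SchemaSimilarityFactory, and the
field type name and parameter values here are placeholders:

[code]
<similarity class="solr.SchemaSimilarityFactory"/>

<fieldType name="text_bm25" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <similarity class="solr.BM25SimilarityFactory">
    <float name="k1">1.2</float>
    <float name="b">0.75</float>
  </similarity>
</fieldType>
[/code]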
Fwiw, here's example of per field similarity
https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements
On 20 June 2016 at 2:49, "Shawn Heisey"
wrote:
> On 6/19/2016 4:09 AM, dirmanhafiz wrote:
> > Hi , Im Dirman and im trying experiment solr w
On 6/19/2016 4:09 AM, dirmanhafiz wrote:
> Hi , Im Dirman and im trying experiment solr with sweetspot similarity,,
> can someone tell me min max in sweetspot similarity is about lengthdoc ?
> i have average lengthdoc 3-205 tokens / doc and why while i wrote in my
> schema.xml paramet
Hi,
Sweet spot is designed to punish too long or too short documents.
Did you reindex?
Can you see the mention of sweet spot in debugQuery=true response?
Ahmet
On Sunday, June 19, 2016 2:18 PM, dirmanhafiz wrote:
Hi , Im Dirman and im trying experiment solr with sweetspot similarity,,
can
Hi, I'm Dirman and I'm experimenting with Solr's SweetSpot similarity.
Can someone tell me whether min and max in SweetSpot similarity refer to the
document length? I have an average document length of 3-205 tokens per doc, and
why didn't the SweetSpot similarity parameters I wrote in my schema.xml work?
10
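For illustration only, the length-norm sweet spot is configured on the field
type roughly like the sketch below, and because it changes how norms are
encoded at index time, a reindex is needed before it shows up in scores (hence
the "Did you reindex?" question above). The parameter names and values are
assumptions to be checked against the SweetSpotSimilarityFactory documentation:

[code]
<similarity class="solr.SweetSpotSimilarityFactory">
  <int name="lengthNormMin">3</int>
  <int name="lengthNormMax">205</int>
  <float name="lengthNormSteepness">0.5</float>
</similarity>
[/code]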
Reranking is done after the collapse. So you'll get the original score in
cscore()
Joel Bernstein
http://joelsolr.blogspot.com/
On Thu, May 26, 2016 at 12:56 PM, aanilpala wrote:
> with cscore() in collapse, will I get the similarity score from lucene or
> the
> reranked score b
with cscore() in collapse, will I get the similarity score from Lucene or the
reranked score from the reranker if I am using a plugin that reranks the
results? I guess the answer depends on which of fq or rq is applied first.
unctionQueries-UsingFunctionQuery
>
> On Thursday, May 26, 2016 1:59 PM, aanilpala wrote:
> is it allowed to provide a sort function (sortspec) that is using
> similarity
> score. for example something like in the following:
>
> sort=product(2,score) desc
>
>
t function (sortspec) that is using similarity
score. for example something like in the following:
sort=product(2,score) desc
seems that it won't work. is there an alternative way to achieve this?
using solr6
thanks in advance.
Is it allowed to provide a sort function (sortspec) that uses the similarity
score? For example, something like the following:
sort=product(2,score) desc
It seems that it won't work. Is there an alternative way to achieve this?
Using Solr 6.
Thanks in advance.
Hi Ahmet,
Thanks for the pointer. I have similar thoughts on the subject. The risk
assumptions are based on not testing your stuff before taking it in. That
risk is still valid with similarity configuration. And sometimes, it may
not be possible to use multiple similarities (custom or otherwise
://www.zettata.com
On Tue, Mar 8, 2016 at 8:59 PM, Markus Jelsma
wrote:
> Hello, you can not change similarities per request, and this is likely
> never going to be supported for good reasons. You need multiple cores, or
> multiple fields with different similarity defined in the same core.
to be supported for good reasons. You need multiple cores, or
> multiple fields with different similarity defined in the same core.
> Markus
>
> -Original message-
> > From:Parvesh Garg
> > Sent: Tuesday 8th March 2016 5:36
> > To: solr-user@lucene.apa
Hello, you can not change similarities per request, and this is likely never
going to be supported for good reasons. You need multiple cores, or multiple
fields with different similarity defined in the same core.
Markus
-Original message-
> From:Parvesh Garg
> Sent: Tuesday 8th
Hi,
We have a requirement where we want to run an A/B test over multiple
Similarity implementations. Is it possible to define multiple similarity
tags in the schema.xml file and choose one using a URL parameter? We are using
Solr 4.7.
Currently, we are planning to have different cores with different
March 2016 15:02
> > To: solr-user@lucene.apache.org
> > Subject: Re: Override Default Similarity and SolrCloud
> >
> > Thanks Markus - do you just SCP / copy them manually to your solr nodes
> and
> > not through Zookeeper (if you use that)?
> >
> >
Hi - config is stored in ZK, libs must be present on each node and are rsync
there via provisioning.
Markus
-Original message-
> From:Joshan Mahmud
> Sent: Thursday 3rd March 2016 15:02
> To: solr-user@lucene.apache.org
> Subject: Re: Override Default Similarity a
Sent: Thursday 3rd March 2016 14:54
> > To: solr-user@lucene.apache.org
> > Subject: Override Default Similarity and SolrCloud
> >
> > Hi group!
> >
> > I'm having an issue of deploying a custom jar in SolrCloud (v 5.3.0).
> >
> > I have a
We store them in server/solr/lib/.
-Original message-
> From:Joshan Mahmud
> Sent: Thursday 3rd March 2016 14:54
> To: solr-user@lucene.apache.org
> Subject: Override Default Similarity and SolrCloud
>
> Hi group!
>
> I'm having an issue of deploying a cus
Hi group!
I'm having an issue of deploying a custom jar in SolrCloud (v 5.3.0).
I have a working local Solr environment (v 5.3.0 - NOT SolrCloud) whereby I
have:
- a jar containing one class CustomSimilarity which
extends org.apache.lucene.search.similarities.DefaultSimilarity
- the ja
Sorry, clicked send too early :-)
With the above additional code the calculation is done by the default
similarity and the behaviour is as expected.
I think this is an issue with the implementation, but I didn't find one in
Jira. Should I create one?
Cheers
Sascha
On Thu, Feb 25, 2016 at 11:
Hi,
I finally found the source of the problem I'm having with the custom
similarity.
The setting:
- Solr 5.4.1
- the SpecialSimilarity extends ClassicSimilarity
- for one field this similarity is configured. Everything else uses
ClassicSimilarity because of
Result:
- most calculation is do
Hi,
I created a custom similarity and factory which extend
DefaultSimilarity/-Factory to have
To achieve this, my similarity overrides idfExplain like this, and also the
method for an array of terms:
public Explanation idfExplain(CollectionStatistics collectionStats,
TermStatistics termStats
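A minimal sketch of such an override, assuming a Lucene 5.x style API; the
class name and the constant value are made up here, since the poster's actual
logic is not shown in the snippet:

[code]
import org.apache.lucene.search.CollectionStatistics;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.TermStatistics;
import org.apache.lucene.search.similarities.DefaultSimilarity;

// Overrides both idfExplain variants so single terms and phrases
// get the same (flattened) idf contribution.
public class FlatIdfSimilarity extends DefaultSimilarity {

    @Override
    public Explanation idfExplain(CollectionStatistics collectionStats,
                                  TermStatistics termStats) {
        return Explanation.match(1.0f, "idf flattened to 1");
    }

    @Override
    public Explanation idfExplain(CollectionStatistics collectionStats,
                                  TermStatistics[] termStats) {
        return Explanation.match(1.0f, "idf flattened to 1 (multiple terms)");
    }
}
[/code]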
would be impractical to create a dedicated/separate index for each
term weighting model (similarity).
The only assumption here is that their discountOverlaps setting should remain
the same as the one used at index time.
But of course we cannot make any assumptions regarding custom similarity
implementations.
As
There should probably be some doc notes about this stuff, at a minimum
alerting the user to the prospect that changing the similarity for a field
(or the default for all fields) can require reindexing and when it is
likely to require reindexing. The Lucene-level Javadoc should probably say
these
try Kan
> Sent: Tuesday 15th December 2015 19:07
> To: solr-user@lucene.apache.org; Ahmet Arslan
> Subject: Re: similarity as a parameter
>
> Markus, Jack,
>
> I think Ahmet nails it pretty nicely: the similarity functions in question
> are compatible on the index level
a special case...
1) Solr shouldn't make any naive assumptions about whatever
arbitrary (custom) Similarity class a user might provide -- particularly
when it comes to field norms, since all of the Similarity base classes
/ callers have been set up to make it trivial for people to write
Hostetter
wrote:
>
> : I think this is a legitimate request. Majority of the similarities are
> : compatible index wise. I think the only exception is sweet spot
> : similarity.
>
> I think you are grossly underestimating the risk of arbitrarily using diff
> Similarities b
Markus, Jack,
I think Ahmet nails it pretty nicely: the similarity functions in question
are compatible on the index level. So it is not necessary to create a
separate search field.
Ahmet, I like your idea. Will take a look, thanks.
Rgds,
Dmitry
On Tue, Dec 15, 2015 at 7:58 PM, Ahmet Arslan
Arslan
wrote:
Hi Dmitry,
I think this is a legitimate request. Majority of the similarities are
compatible index wise. I think the only exception is sweet spot similarity.
In Lucene, it can be changed on the fly with a new Searcher. It should be
possible to do so in solr.
Thanks,
Ahmet
: I think this is a legitimate request. Majority of the similarities are
: compatible index wise. I think the only exception is sweet spot
: similarity.
I think you are grossly underestimating the risk of arbitrarily using different
Similarities between index time and query time -- particularly in
Hi Dmitry,
I think this is a legitimate request. Majority of the similarities are
compatible index wise. I think the only exception is sweet spot similarity.
In Lucene, it can be changed on the fly with a new Searcher. It should be
possible to do so in solr.
Thanks,
Ahmet
On Tuesday
You would need to define an alternate field which copied a base field but
then had the desired alternate similarity, using SchemaSimilarityFactory.
See:
https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements
-- Jack Krupansky
On Tue, Dec 15, 2015 at 10:02 AM, Dmitry Kan wrote
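A minimal sketch of that alternate-field approach (field and type names are
placeholders); each field type carries its own <similarity> and the global
similarity is solr.SchemaSimilarityFactory, as described above:

[code]
<field name="title"      type="text_classic" indexed="true" stored="true"/>
<field name="title_bm25" type="text_bm25"    indexed="true" stored="false"/>

<copyField source="title" dest="title_bm25"/>
[/code]

At query time you then pick the scoring behaviour by searching either title or
title_bm25 (e.g. via qf).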
solr-user@lucene.apache.org
> Subject: similarity as a parameter
>
> Hi guys,
>
> Is there a way to alter the similarity class at runtime, with a parameter?
>
> --
> Dmitry Kan
> Luke Toolbox: http://github.com/DmitryKey/luke
> Blog: http://dmitrykan.blogspot.com
Hi guys,
Is there a way to alter the similarity class at runtime, with a parameter?
--
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info
Hi again,
Here is a relevant/past discussion :
http://search-lucene.com/m/eHNlTDHKb17MW532
Ahmet
On Thursday, August 20, 2015 2:28 AM, Ahmet Arslan
wrote:
Hi Tom,
computeNorm(FieldInvertState) method is the only place where similarity is tied
to indexing process.
If you want to switch
Hi Tom,
The computeNorm(FieldInvertState) method is the only place where a similarity
is tied to the indexing process.
If you want to switch between different similarities, they should share the
same implementation of that method. For example, subclasses of SimilarityBase
can be used without re-indexing
warning: I'm no expert on other similarities.
Having said that, I'm not aware of similarities being used in the
indexing process - during indexing term frequency, document frequency,
field norms, and so on are all recorded. These are things that the
default similarity (TF/IDF) uses to
Hello all,
The last time I worked with changing Similarities was with Solr 4.1, and at
that time it was possible to simply change the schema to specify the use
of a different Similarity without re-indexing. This allowed me to
experiment with several different ranking algorithms without having to
ed, that the scorePayload method is never called. What could be
wrong?

[code]

class PayloadSimilarity extends DefaultSimilarity {
    @Override
    public float scorePayload(int doc, int start, int end, BytesRef payload) {
        float payloadValue = PayloadHelper.decodeFloat(payload.bytes);
        System.out.println("payloadValue = " + payloadValue);
        return payloadValue;
    }
}

[/code]

Here is how the similarity is injected during indexing:

[code]

PayloadEncoder encoder = new FloatEncoder();
IndexWriterConfig indexWriterConfig = new
    IndexWriterConfig(Version.LUCENE_4_10_4, new PayloadAnalyzer(encoder));
payloadSimilarity = new PayloadSimilarity();
indexWriterConfig.setSimilarity(payloadSimilarity);

[/code]
From all the examples of what you've described, I'm fairly certain all you
really need is a TFIDF-based Similarity where coord(), idf(), tf() and
queryNorm() return 1 always, and you omitNorms from all fields.
Yeah, that's what I did in the very first iteration. It works only fo
, for example, method SimScorer.score is called 3 times for each of
: these documents, total 6 times for both. I have added a
: ThreadLocal<HashSet<String>> to my custom similarity, which is cleared every
: time before a new scoring session (after each query execution). This HashSet
: stores strings consisting of fie
lgorithm .. from what you described, matches on fieldA should
get score "A", matches on fieldB should get score "B" ... why does it matter
which doc is which?
For case #3, for example, method SimScorer.score is called 3 times for
each of these documents, total 6 times for b
: Sure, sorry I did not do it before, I just wanted to take up a minimum of your
: valuable time. So in my custom Similarity class I am trying to implement
: logic where the score calculation is only based on field weight and a field
: match - that's it. In other words, if a field matche
Thank you for your answer, Chris. I will reply with inline comments as
well. Please see below.
: I need to uniquely identify a document inside of a Similarity class during
: scoring. Is it possible to get value of unique key of a document at this
: point?
Can you tell us a bit more about your
: I need to uniquely identify a document inside of a Similarity class during
: scoring. Is it possible to get value of unique key of a document at this
: point?
Can you tell us a bit more about your usecase ... your problem description
is a bit vague, and sounds like it may be an "XY Pr
Good afternoon.
I need to uniquely identify a document inside of a Similarity class
during scoring. Is it possible to get the value of the unique key of a document
at this point?
For some time I thought I could use the internal docID to achieve that.
Method score(int doc, float freq) is called after
o a setting higher than the most frequent misspellings but low
> enough to find rare terms.
>
> It depends on your index.
>
> -Original message-
> > From:Ali Nazemian
> > Sent: Wednesday 4th February 2015 11:15
> > To: solr-user@lucene.apache.org
> > Su
solr-user@lucene.apache.org
> Subject: More Like This similarity tuning
>
> Hi,
> I am looking for a best practice on More Like This parameters. I really
> appreciate if somebody can tell me what is the best value for these
> parameters in an MLT query? Or at least the proper methodology f
Hi,
I am looking for best practices on More Like This parameters. I would really
appreciate it if somebody could tell me what the best values for these
parameters are in an MLT query, or at least the proper methodology for finding
the best value for each of these parameters:
mlt.mintf
mlt.mindf
mlt.maxqt
Thank y
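There is no single best value (as the reply above says, it depends on your
index), but a sketch of a starting request might look like this, assuming the
MoreLikeThis component or handler is enabled; the thresholds are placeholders
to tune, not recommendations:

[code]
q=id:12345
mlt=true
mlt.fl=title,body
mlt.mintf=2
mlt.mindf=5
mlt.maxqt=25
[/code]

mlt.mintf and mlt.mindf drop terms that are too infrequent in the source
document or the index to be meaningful, and mlt.maxqt caps how many interesting
terms end up in the generated query.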
to calculate the similarity.
> It is implemented on the idea of cosine measurement but it modifies the
> cosine formula.
> Please take a look at "Lucene Practical Scoring Function" in the following
> Javadoc:
>
> http://lucene.apache.org/core/4_10_3/core/org/apach
Lucene uses the TFIDFSimilarity class to calculate the similarity.
It is based on the idea of the cosine measure, but it modifies the cosine
formula.
Please take a look at "Lucene Practical Scoring Function" in the following
Javadoc:
http://lucene.apache.org/core/4_10_3/core/org/apa
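For quick reference, the practical scoring function described in that Javadoc
has roughly this shape (see the TFIDFSimilarity documentation for the exact
definition of each factor):

[code]
score(q,d) = coord(q,d) * queryNorm(q)
             * sum over terms t in q of:
                 tf(t in d) * idf(t)^2 * t.getBoost() * norm(t,d)
[/code]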