Fwd: Does change in similarity class needs reindexing

2020-09-02 Thread YOGENDRA SONI
I changed the attributes and reloaded the collection, but the scores are not changing, and norm(content_text) is not changing either. I reindexed the documents, but the scores still do not change. Steps I followed: 1. Created fields using the default similarity: created the content_text field type without a similarity section

Does change in similarity class needs reindexing

2020-09-02 Thread YOGENDRA SONI
Hi all, I am experimenting with different parameters of BM25 and SweetSpot similarity. I changed the Solr field type definition as shown below. I need to clarify whether changing the similarity in a field type requires reindexing. {"replace-field-type":{ "name":"content_text&

Re: Vector Scoring Plugin for Solr : Dot Product and Cosine Similarity

2020-06-22 Thread Vincenzo D'Amore
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-13500 > > https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-1439

Re: Vector Scoring Plugin for Solr : Dot Product and Cosine Similarity

2020-06-20 Thread Edward Ribeiro
https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-13500 https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-14397 Even though cosine similarity can be us

Vector Scoring Plugin for Solr : Dot Product and Cosine Similarity

2020-06-19 Thread Vincenzo D'Amore
Hi all, I've started to look at image similarity. For each image in documents I have a couple of vectors that represent the color and shape similarity. As pointer to begin a colleague suggested me to start with this project: https://github.com/moshebla/solr-vector-scoring but the pr
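
For readers new to the two measures named in the subject line, here is a minimal Java sketch of dot product and cosine similarity over plain double arrays. The class name VectorScoreSketch and its methods are illustrative assumptions; they are not the API of the solr-vector-scoring plugin.

```java
final class VectorScoreSketch {
    /** Sum of element-wise products; both vectors must have the same length. */
    static double dotProduct(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Dot product divided by the product of the Euclidean norms; 0 if either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        double normA = Math.sqrt(dotProduct(a, a));
        double normB = Math.sqrt(dotProduct(b, b));
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dotProduct(a, b) / (normA * normB);
    }

    public static void main(String[] args) {
        // Hypothetical colour vectors for two images.
        double[] imageA = {1.0, 2.0, 3.0};
        double[] imageB = {4.0, 5.0, 6.0};
        System.out.println("dot    = " + dotProduct(imageA, imageB));
        System.out.println("cosine = " + cosine(imageA, imageB));
    }
}
```

Cosine similarity ignores vector magnitude, while the dot product does not, which is usually the deciding factor between the two.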

NGramFilterFactory and Similarity

2018-12-11 Thread elisabeth benoit
Hello, We are trying to use NGramFilterFactory for approximative search with solr 7. We usually use a similarity with no tf, no idf (our similarity extends ClassicSimilarity, with tf and idf functions always returning 1). For ngram search though, it seems inappropriate since it scores a word

Re: Similarity plugins which are normalized

2018-11-29 Thread Tanya Bompi
different scores primarily > influenced > > by the Term Frequency component. Since I am using a threshold to filter > the > > results for a matched record based off the SOLR score, a somewhat > > normalized score is needed. > > Are there any similarity classes th

Re: Similarity plugins which are normalized

2018-11-29 Thread Doug Turnbull
by the Term Frequency component. Since I am using a threshold to filter the > results for a matched record based off the SOLR score, a somewhat > normalized score is needed. > Are there any similarity classes that are more suitable to my needs? > > Thanks, > Tanu > -- *Do

Similarity plugins which are normalized

2018-11-29 Thread Tanya Bompi
normalized score is needed. Are there any similarity classes that are more suitable to my needs? Thanks, Tanu

Solr Custom Similarity - Using a field from the indexed document

2018-08-22 Thread jagadeeshm
Can someone please look into the following - https://stackoverflow.com/questions/51978351/solr-custom-similarity-using-a-field-from-the-indexed-document Thanks! -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: SOLR 7.1 Top level element for similarity factory

2018-07-16 Thread Chris Hostetter
: So I have the following at the bottom of my schema.xml file : : : : : : The documentation says "top level element" - so should that actually be outside the schema tag? No, the schema tag is the "root" level element, it's direct children are the "top level elements" (the wording m

SOLR 7.1 Top level element for similarity factory

2018-07-16 Thread Hodder, Rick
I'm using SOLR 7.1 and I'm trying to set the similarity factory back to ClassicSimilarityFactory so that it will behave like SOLR 6 or before. In the document https://lucene.apache.org/solr/guide/7_1/other-schema-elements.html#similarity It says This default behavior can be ove

Re: Text Similarity

2018-07-15 Thread Aroop Ganguly
Thanks for your answer Rahul. I think I have explained similarity with the example, assuming the natural order. I would assume this is a common action for people who use solr and do search based systems. I am basically looking for any design patterns that people use to achieve the results as

Re: Text Similarity

2018-07-15 Thread Rahul Singh
How do you define similarity? There are various different methods that work for different methods. In solr depending on which index time analyzer / tokenizer you are using, it will treat one company name as similar in one scenario and not in another. This seems like a case of data

Text Similarity

2018-07-11 Thread Aroop Ganguly
Hi Team This is what I want to do: 1. I have 2 datasets of the schema id-number and company-name 2. I want to ultimately be able to link (join or any other means) the 2 data sets based on the similarity between the company-name fields of the 2 data set. Example: Dataset 1 Id | Company

Re: SOLR Similarity Difference

2018-02-27 Thread Rick Leir
raries >Action Technical Temporar > >If I search > >IDX_CompanyName: (Action AND Technical AND Temporaries AND t/a AND CTR >AND Corporation) > >Under 4.10.2 I see all three in the results > >Under 7.1, with the default BM25 similarity. I only see the first >resul

Re:SOLR Similarity Difference

2018-02-27 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
Hi Rick, I don't think the issue is BM25 vs TFIDF (the old similarity), it seems more due to the "matching" logic. you are asking to match: "(Action AND Technical AND Temporaries AND t/a AND CTR AND Corporation)" This (in theory) means that you want to retrieve **

SOLR Similarity Difference

2018-02-26 Thread Hodder, Rick
R AND Corporation) Under 4.10.2 I see all three in the results. Under 7.1, with the default BM25 similarity, I only see the first result. Someone on the list suggested that, to make 7.1 go back to the similarity factory used in 4.10.2, I add the following to the schema.xml. That brings all three re

Re: Do i need to reindex after changing similarity setting

2017-11-30 Thread Nawab Zada Asad Iqbal
This JIRA also throws some light. There is a discussion of encoding norm during indexing. The contributor eventually comments that "norms" encoded by different similarity are compatible to each other. On Thu, Nov 30, 2017 at 5:12 PM, Nawab Zada Asad Iqbal wrote: > Hi Walter, &

Re: Do i need to reindex after changing similarity setting

2017-11-30 Thread Nawab Zada Asad Iqbal
Hi Walter, I read the following line in reference docs, what does it mean by as long as the global similarity allows it: " A field type may optionally specify a that will be used when scoring documents that refer to fields with this type, as long as the "global" similarity for

Re: Do i need to reindex after changing similarity setting

2017-11-22 Thread Nawab Zada Asad Iqbal
Thanks Walter On Mon, Nov 20, 2017 at 4:59 PM Walter Underwood wrote: > Similarity is query time. > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > > On Nov 20, 2017, at 4:57 PM, Nawab Zada Asad Iqbal > wro

Re: Do i need to reindex after changing similarity setting

2017-11-20 Thread Walter Underwood
Similarity is query time. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Nov 20, 2017, at 4:57 PM, Nawab Zada Asad Iqbal wrote: > > Hi, > > I want to switch to Classic similarity instead of BM25 (default in solr7). > Do I n

Do i need to reindex after changing similarity setting

2017-11-20 Thread Nawab Zada Asad Iqbal
Hi, I want to switch to Classic similarity instead of BM25 (default in solr7). Do I need to reindex all cores after this? Or is it only a query time setting? Thanks Nawab

Re: How to Efficiently Extract Learning to Rank Similarity Features From Solr?

2017-10-24 Thread alessandro.benedetti
i think this can be actually a good idea and I think that would require a new feature type implementation. Specifically I think you could leverage the existing data structures ( such TermVector) to calculate the matrix and then perform the calculations you need. Or maybe there is space for even a

How to Efficiently Extract Learning to Rank Similarity Features From Solr?

2017-10-23 Thread Michael Alcorn
Hi, I'm trying to extract several similarity measures from Solr for use in a learning to rank model. Doing this mathematically involves taking the dot product of several different matrices, which is extremely fast for non-huge data sets (e.g., millions of documents and queries). Howeve

Re: Search by similarity?

2017-09-19 Thread alessandro.benedetti
In addition to that, I still believe More Like This is a better option for you. The reason is that the MLT is able to evaluate the interesting terms from your document (title is the only field of interest for you), and boost them accordingly. Related your "80% of similarity", this is m

Re: Search by similarity?

2017-08-29 Thread Josh Lincoln
\n 11.583146 = sum of:\n 11.583146 = max >> of:\n 11.583146 = weight(abstract:titl in 1934) [], result of:\n >> 11.583146 = score(doc=1934,freq=1.0 = termFreq=1.0\n), product of:\n 2.0 >> = boost\n 5.503748 = idf(docFreq=74, docCount=18297)\n 1.0522962 = >> tfNorm, computed fro

Re: Search by similarity?

2017-08-29 Thread Josh Lincoln
f:\n 3.816711E-5 = > score(doc=1934,freq=1.0 = termFreq=1.0\n), product of:\n 3.0 = boost\n > 1.4457239E-5 = idf(docFreq=34584, docCount=34584)\n 0.88 = tfNorm, > computed from:\n 1.0 = termFreq=1.0\n 1.2 = parameter k1\n 0.75 = > parameter b\n 3.0 = avgFieldLength\n 4.0 = fieldLength\n"
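
The numbers in this explain output can be reproduced by hand. The sketch below assumes the BM25 formulas used by Lucene's BM25Similarity in the 6.x/7.x line (idf as ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)), and a tfNorm with k1 + 1 in the numerator); the class and method names are illustrative, and the calculation uses double precision where Lucene uses float.

```java
final class Bm25ExplainCheck {
    /** BM25 inverse document frequency. */
    static double idf(long docFreq, long docCount) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** BM25 term-frequency component with length normalisation. */
    static double tfNorm(double freq, double k1, double b,
                         double fieldLength, double avgFieldLength) {
        double lengthFactor = 1.0 - b + b * fieldLength / avgFieldLength;
        return freq * (k1 + 1.0) / (freq + k1 * lengthFactor);
    }

    public static void main(String[] args) {
        // Values copied from the explain output quoted above.
        double idf = idf(34584, 34584);
        double tfNorm = tfNorm(1.0, 1.2, 0.75, 4.0, 3.0);
        double score = 3.0 * idf * tfNorm; // boost * idf * tfNorm
        System.out.printf("idf=%.7e tfNorm=%.4f score=%.6e%n", idf, tfNorm, score);
    }
}
```

Because the term occurs in every document (docFreq equals docCount), the idf is close to zero, which is why the whole clause contributes almost nothing to the score.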

Re: Search by similarity?

2017-08-29 Thread Darko Todoric
dding &debug=query to the URL? The parsed query will be especially illuminating. Best, Erick On Mon, Aug 28, 2017 at 4:37 AM, Emir Arnautovic wrote: Hi Darko, The issue is the wrong expectations: title-1-end is parsed to 3 tokens (guessing) and mm=99% of 3 tokens is 2.99 and it is rounded dow
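
Emir's point about mm rounding can be illustrated with a small sketch. It assumes the behaviour described in the thread, namely that the clause count derived from a positive mm percentage is rounded down; requiredClauses is a hypothetical helper, not Solr's own code.

```java
final class MinShouldMatchSketch {
    /**
     * Number of optional clauses that must match for a positive percentage mm.
     * Integer division discards the fractional part, i.e. rounds down.
     */
    static int requiredClauses(int optionalClauses, int percent) {
        return (optionalClauses * percent) / 100;
    }

    public static void main(String[] args) {
        // A three-token title with mm=99%: just under 3 clauses, truncated to 2.
        System.out.println(requiredClauses(3, 99));
        // Only mm=100% forces all three tokens to match.
        System.out.println(requiredClauses(3, 100));
    }
}
```

With short fields such as titles, a percentage just below 100 therefore still lets one token go unmatched, so the match threshold has to be chosen with the token count in mind.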

Re: Search by similarity?

2017-08-28 Thread Erick Erickson
u have to play with your tokenizers and define what is similarity match > percentage (if you want to stick with mm). > > Regards, > Emir > > > > On 28.08.2017 09:17, Darko Todoric wrote: >> >> Hm... I cannot make that this DisMax work on my Solr... >> >

Re: Search by similarity?

2017-08-28 Thread Emir Arnautovic
l result in zero (or one match if you do not filter out original document). You have to play with your tokenizers and define what is similarity match percentage (if you want to stick with mm). Regards, Emir On 28.08.2017 09:17, Darko Todoric wrote: Hm... I cannot make that this DisMax wo

Re: Search by similarity?

2017-08-28 Thread Darko Todoric
ce/display/solr/The+DisMax+Query+Parser And then set the minimum match ratio of the OR clauses to 90%. /JZ -Original Message- From: Darko Todoric [mailto:todo...@mdpi.com] Sent: Friday, August 25, 2017 5:49 PM To: solr-user@lucene.apache.org Subject: Search by similarity? Hi, I have 90

RE: Search by similarity?

2017-08-25 Thread Markus Jelsma
Yes, that is roughly how MLT works as well. You can also do a full OR-search on the terms using LuceneQParser. Markus -Original message- > From:Junte Zhang > Sent: Friday 25th August 2017 18:38 > To: solr-user@lucene.apache.org > Subject: RE: Search by similarity?

RE: Search by similarity?

2017-08-25 Thread Junte Zhang
then set the minimum match ratio of the OR clauses to 90%. /JZ -Original Message- From: Darko Todoric [mailto:todo...@mdpi.com] Sent: Friday, August 25, 2017 5:49 PM To: solr-user@lucene.apache.org Subject: Search by similarity? Hi, I have 90.000.000 documents in Solr and I need to

Search by similarity?

2017-08-25 Thread Darko Todoric
Hi, I have 90.000.000 documents in Solr and I need to compare "title" of this document and get all documents with more than 80% similarity. PHP have "similar_text" but it's not so smart inserting 90m documents in the array... Can I do some query in Solr which wil

Re: Per Text Field Similarity Measures for Learning to Rank

2017-08-22 Thread Michael Nilsson
Hi Michael, Using your example, if you have 5 different fields, you could create 5 individual SolrFeatures against those fields. The one tricky thing here is that you want to use different similarity scoring mechanisms against your fields. By default, Solr uses a single Similarity class <ht

Re: Problem loading custom similarity class via blob API

2017-08-10 Thread Webster Homer
roduct-170724_shard1_replica1: > Can't load schema schema.xml: Error loading class 'com.sial.similarity. > SialBM25Similarity' > > What am I missing? Is this plugins api limited to components and request > handlers? If so that's a HUGE limitation, makes it just

Problem loading custom similarity class via blob API

2017-08-10 Thread Webster Homer
-170724_shard1_replica1: Can't load schema schema.xml: Error loading class 'com.sial.similarity.SialBM25Similarity' What am I missing? Is this plugins api limited to components and request handlers? If so that's a HUGE limitation, makes it just about useless. I need it for this simila

Per Text Field Similarity Measures for Learning to Rank

2017-08-04 Thread Michael Alcorn
Hi all, I recently prototyped a learning to rank system in Python that produced promising results, so I'm now looking into how to replicate that process in our Solr setup. For my Python implementation, I was using a number of features that were per field text comparisons, e.g.: 1. tfidf_case_t

Upgrading Similarity from Solr 4.x to 6.x

2017-05-18 Thread Lynn Monson
I have a question about moving a trivial (?) Similarity class from version 4.x to 6.x I have queries along these lines: field1:somevalue^0.55 field2:anothervalue^1.4 The score for a document is simply the sum of weights. A hit on field1 alone scores 0.55. Field2 alone scores 1.4. Both fields
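
The scoring rule described here (a document's score is the sum of the boosts of the fields that matched) can be sketched in plain Java, independent of any Lucene version. The class, its methods, and the map of boosts are assumptions made for illustration; this is not the poster's Similarity class.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class SumOfWeightsSketch {
    /** Boosts from the example query: field1:somevalue^0.55 field2:anothervalue^1.4 */
    static Map<String, Double> exampleBoosts() {
        Map<String, Double> boosts = new LinkedHashMap<>();
        boosts.put("field1", 0.55);
        boosts.put("field2", 1.4);
        return boosts;
    }

    /** Adds up the boost of every query field that matched the document. */
    static double score(Map<String, Double> boosts, Set<String> matchedFields) {
        double total = 0.0;
        for (Map.Entry<String, Double> entry : boosts.entrySet()) {
            if (matchedFields.contains(entry.getKey())) {
                total += entry.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> boosts = exampleBoosts();
        System.out.println(score(boosts, new HashSet<>(Arrays.asList("field1"))));
        System.out.println(score(boosts, new HashSet<>(Arrays.asList("field2"))));
        System.out.println(score(boosts, new HashSet<>(Arrays.asList("field1", "field2"))));
    }
}
```

Reproducing this inside Lucene means neutralising every other scoring factor (term frequency, idf, norms), which is the part that changes between major versions.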

Re: Solrj similarity setting

2017-01-22 Thread Will Martin
Farida: Can you define a dsl? (domain specific model). If you stand a RESTful api on the DSL, you can write a BM25F similarity, with conditionals maybe controlled by a func query accessing field payloads? :-) On 1/21/2017 6:13 AM, Markus Jelsma wrote: No, this is not possible. A similarity

RE: Solrj similarity setting

2017-01-21 Thread Markus Jelsma
No, this is not possible. A similarity is an index time setting because it can have index time properties. There is no way to control this query time. M -Original message- > From:Farida Sabry > Sent: Saturday 21st January 2017 9:58 > To: solr-user@lucene.apache.org > Su

Solrj similarity setting

2017-01-21 Thread Farida Sabry
Is there a way to set the similarity in Solrj query search like we do for lucene IndexSearcher e.g. searcher.setSimilarity(sim); I need to define a custom similarity and at the same time get the fields provided by SolrInputDocument in the returned results by SolrDocumentList results

Re: Can someone explain about Sweetspot Similarity ?

2016-06-20 Thread Chris Hostetter
: Fwiw, here's example of per field similarity : https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements Similarities can be defined per field *TYPE* ... not per field (as shown in the examples on that page) -Hoss http://www.lucidworks.com/

Re: Can someone explain about Sweetspot Similarity ?

2016-06-19 Thread Mikhail Khludnev
Fwiw, here's example of per field similarity https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements 20 июня 2016 г. 2:49 пользователь "Shawn Heisey" написал: > On 6/19/2016 4:09 AM, dirmanhafiz wrote: > > Hi , Im Dirman and im trying experiment solr w

Re: Can someone explain about Sweetspot Similarity ?

2016-06-19 Thread Shawn Heisey
On 6/19/2016 4:09 AM, dirmanhafiz wrote: > Hi , Im Dirman and im trying experiment solr with sweetspot similarity,, > can someone tell me min max in sweetspot similarity is about lengthdoc ? > i have average lengthdoc 3-205 tokens / doc and why while i wrote in my > schema.xml paramet

Re: Can someone explain about Sweetspot Similarity ?

2016-06-19 Thread Ahmet Arslan
Hi, Sweet spot is designed to punish too long or too short documents. Did you reindex? Can you see the mention of sweet spot in debugQuery=true response? Ahmet On Sunday, June 19, 2016 2:18 PM, dirmanhafiz wrote: Hi , Im Dirman and im trying experiment solr with sweetspot similarity,, can
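
How the "punishment" works can be sketched with the length-norm formula given in the Lucene SweetSpotSimilarity Javadoc, 1/sqrt(steepness * (|x - min| + |x - max| - (max - min)) + 1). The class below is a standalone illustration, not Lucene code; the min and max values reuse the 3-205 token range mentioned in the question, and the steepness of 0.5 is an assumed setting.

```java
final class SweetSpotLengthNormSketch {
    /**
     * Length norm that is 1.0 for any length inside [min, max]
     * and falls off the further the length lies outside that plateau.
     */
    static double lengthNorm(int numTerms, int min, int max, double steepness) {
        double distance = Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return 1.0 / Math.sqrt(steepness * distance + 1.0);
    }

    public static void main(String[] args) {
        int min = 3;
        int max = 205;
        double steepness = 0.5; // assumed value for this sketch
        for (int length : new int[] {1, 3, 100, 205, 1000}) {
            System.out.println(length + " tokens -> " + lengthNorm(length, min, max, steepness));
        }
    }
}
```

If every document already falls inside the min/max range, all norms equal 1.0 and the similarity cannot change the ranking through length, which is one reason the settings may appear to have no effect.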

Can someone explain about Sweetspot Similarity ?

2016-06-19 Thread dirmanhafiz
Hi, I'm Dirman and I'm experimenting with Solr and SweetSpot similarity. Can someone tell me whether min and max in SweetSpot similarity refer to document length? My documents are 3-205 tokens long, so why did the SweetSpot similarity parameters I wrote in my schema.xml not work? 10

Re: sort by custom function of similarity score

2016-05-26 Thread Joel Bernstein
Reranking is done after the collapse. So you'll get the original score in cscore() Joel Bernstein http://joelsolr.blogspot.com/ On Thu, May 26, 2016 at 12:56 PM, aanilpala wrote: > with cscore() in collapse, will I get the similarity score from lucene or > the > reranked score b

Re: sort by custom function of similarity score

2016-05-26 Thread aanilpala
With cscore() in collapse, will I get the similarity score from Lucene or the reranked score from the reranker if I am using a plugin that reranks the results? I guess the answer depends on which of fq or rq is applied first. -- View this message in context: http://lucene.472066.n3.nabble.com

Re: sort by custom function of similarity score

2016-05-26 Thread Joel Bernstein
unctionQueries-UsingFunctionQuery > > > > > On Thursday, May 26, 2016 1:59 PM, aanilpala wrote: > is it allowed to provide a sort function (sortspec) that is using > similarity > score. for example something like in the following: > > sort=product(2,score) desc > >

Re: sort by custom function of similarity score

2016-05-26 Thread Ahmet Arslan
t function (sortspec) that is using similarity score. for example something like in the following: sort=product(2,score) desc seems that it won't work. is there an alternative way to achieve this? using solr6 thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com

sort by custom function of similarity score

2016-05-26 Thread aanilpala
is it allowed to provide a sort function (sortspec) that is using similarity score. for example something like in the following: sort=product(2,score) desc seems that it won't work. is there an alternative way to achieve this? using solr6 thanks in advance. -- View this message in co

Re: Multiple custom Similarity implementations

2016-03-10 Thread Parvesh Garg
Hi Ahmet, Thanks for the pointer. I have similar thoughts on the subject. The risk assumptions are based on not testing your stuff before taking it in. That risk is still valid with similarity configuration. And sometimes, it may not be possible to use multiple similarities (custom or otherwise

Re: Multiple custom Similarity implementations

2016-03-10 Thread Ahmet Arslan
://www.zettata.com On Tue, Mar 8, 2016 at 8:59 PM, Markus Jelsma wrote: > Hello, you can not change similarities per request, and this is likely > never going to be supported for good reasons. You need multiple cores, or > multiple fields with different similarity defined in the same core.

Re: Multiple custom Similarity implementations

2016-03-09 Thread Parvesh Garg
to be supported for good reasons. You need multiple cores, or > multiple fields with different similarity defined in the same core. > Markus > > -Original message- > > From:Parvesh Garg > > Sent: Tuesday 8th March 2016 5:36 > > To: solr-user@lucene.apa

RE: Multiple custom Similarity implementations

2016-03-08 Thread Markus Jelsma
Hello, you can not change similarities per request, and this is likely never going to be supported for good reasons. You need multiple cores, or multiple fields with different similarity defined in the same core. Markus -Original message- > From:Parvesh Garg > Sent: Tuesday 8th

Multiple custom Similarity implementations

2016-03-07 Thread Parvesh Garg
Hi, We have a requirement where we want to run an A/B test over multiple Similarity implementations. Is it possible to define multiple similarity tags in schema.xml file and chose one using the URL parameter? We are using solr 4.7 Currently, we are planning to have different cores with different

Re: Override Default Similarity and SolrCloud

2016-03-03 Thread Joshan Mahmud
March 2016 15:02 > > To: solr-user@lucene.apache.org > > Subject: Re: Override Default Similarity and SolrCloud > > > > Thanks Markus - do you just SCP / copy them manually to your solr nodes > and > > not through Zookeeper (if you use that)? > > > >

RE: Override Default Similarity and SolrCloud

2016-03-03 Thread Markus Jelsma
Hi - config is stored in ZK, libs must be present on each node and are rsync there via provisioning. Markus -Original message- > From:Joshan Mahmud > Sent: Thursday 3rd March 2016 15:02 > To: solr-user@lucene.apache.org > Subject: Re: Override Default Similarity a

Re: Override Default Similarity and SolrCloud

2016-03-03 Thread Joshan Mahmud
Sent: Thursday 3rd March 2016 14:54 > > To: solr-user@lucene.apache.org > > Subject: Override Default Similarity and SolrCloud > > > > Hi group! > > > > I'm having an issue of deploying a custom jar in SolrCloud (v 5.3.0). > > > > I have a

RE: Override Default Similarity and SolrCloud

2016-03-03 Thread Markus Jelsma
We store them server/solr/lib/. -Original message- > From:Joshan Mahmud > Sent: Thursday 3rd March 2016 14:54 > To: solr-user@lucene.apache.org > Subject: Override Default Similarity and SolrCloud > > Hi group! > > I'm having an issue of deploying a cus

Override Default Similarity and SolrCloud

2016-03-03 Thread Joshan Mahmud
Hi group! I'm having an issue of deploying a custom jar in SolrCloud (v 5.3.0). I have a working local Solr environment (v 5.3.0 - NOT SolrCloud) whereby I have: - a jar containing one class CustomSimilarity which extends org.apache.lucene.search.similarities.DefaultSimilarity - the ja

Re: Why QueryWeight with Custom Similarity

2016-02-25 Thread Markus, Sascha
Sorry clicked send to early :-) with the above additional code the calculation is done by the default similarity and the behaviour is as expected. I think this is an issue of the implementation but I didn't find one in jira. Should I create one? Cheers Sascha On Thu, Feb 25, 2016 at 11:

Re: Why QueryWeight with Custom Similarity

2016-02-25 Thread Markus, Sascha
Hi, I finally found the source of the problem I'm having with the custom similarity. The setting: - Solr 5.4.1 - the SpecialSimilarity extends ClassicSimilarity - for one field this similarity is configured. Everything else uses ClassicSimilarity because of Result: - most calculation is do

Why QueryWeight with Custom Similarity

2016-02-15 Thread Markus, Sascha
Hi, I created a custom similarity and factory which extends DefaultSimilarity/-Factory to have to achive this I my similarity overwrites idfExplain like this and also the method for an array of terms. public Explanation idfExplain(CollectionStatistics collectionStats, TermStatistics termStats

Re: similarity as a parameter

2015-12-16 Thread Ahmet Arslan
would be impractical to create a dedicated/separate index for each term weighting model (similarity). Only assumption here is, their discountOverlap should remain same with the one used at index time. But of course we cannot make any assumptions regarding custom similarity implementations. As

Re: similarity as a parameter

2015-12-15 Thread Jack Krupansky
There should probably be some doc notes about this stuff, at a minimum alerting the user to the prospect that changing the similarity for a field (or the default for all fields) can require reindexing and when it is likely to require reindexing. The Lucene-level Javadoc should probably say these

RE: similarity as a parameter

2015-12-15 Thread Markus Jelsma
try Kan > Sent: Tuesday 15th December 2015 19:07 > To: solr-user@lucene.apache.org; Ahmet Arslan > Subject: Re: similarity as a parameter > > Markus, Jack, > > I think Ahmet nails it pretty nicely: the similarity functions in question > are compatible on the index level

RE: similarity as a parameter

2015-12-15 Thread Chris Hostetter
a special case... 1) Solr shouldn't make any naive assumptions about whatever arbitrary (custom) Similarity class a user might provide -- particularly when it comes to field norms, since the all of the Similarity base classes / callers have been setup to make it trivial for people to write

Re: similarity as a parameter

2015-12-15 Thread Dmitry Kan
Hostetter wrote: > > : I think this is a legitimate request. Majority of the similarities are > : compatible index wise. I think the only exception is sweet spot > : similarity. > > I think you are grossly underestimating the risk of arbitrarily using diff > Similarities b

Re: similarity as a parameter

2015-12-15 Thread Dmitry Kan
Markus, Jack, I think Ahmet nails it pretty nicely: the similarity functions in question are compatible on the index level. So it is not necessary to create a separate search field. Ahmet, I like your idea. Will take a look, thanks. Rgds, Dmitry On Tue, Dec 15, 2015 at 7:58 PM, Ahmet Arslan

Re: similarity as a parameter

2015-12-15 Thread Ahmet Arslan
Arslan wrote: Hi Dmitry, I think this is a legitimate request. Majority of the similarities are compatible index wise. I think the only exception is sweet spot similarity. In Lucene, it can be changed on the fly with a new Searcher. It should be possible to do so in solr. Thanks, Ahmet

Re: similarity as a parameter

2015-12-15 Thread Chris Hostetter
: I think this is a legitimate request. Majority of the similarities are : compatible index wise. I think the only exception is sweet spot : similarity. I think you are grossly underestimating the risk of arbitrarily using diff Similarities between index time and query time -- particularly in

Re: similarity as a parameter

2015-12-15 Thread Ahmet Arslan
Hi Dmitry, I think this is a legitimate request. Majority of the similarities are compatible index wise. I think the only exception is sweet spot similarity. In Lucene, it can be changed on the fly with a new Searcher. It should be possible to do so in solr. Thanks, Ahmet On Tuesday

Re: similarity as a parameter

2015-12-15 Thread Jack Krupansky
You would need to define an alternate field which copied a base field but then had the desired alternate similarity, using SchemaSimilarityFactory. See: https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements -- Jack Krupansky On Tue, Dec 15, 2015 at 10:02 AM, Dmitry Kan wrote

RE: similarity as a parameter

2015-12-15 Thread Markus Jelsma
solr-user@lucene.apache.org > Subject: similarity as a parameter > > Hi guys, > > Is there a way to alter the similarity class at runtime, with a parameter? > > -- > Dmitry Kan > Luke Toolbox: http://github.com/DmitryKey/luke > Blog: http://dmitrykan.blogspot.com

similarity as a parameter

2015-12-15 Thread Dmitry Kan
Hi guys, Is there a way to alter the similarity class at runtime, with a parameter? -- Dmitry Kan Luke Toolbox: http://github.com/DmitryKey/luke Blog: http://dmitrykan.blogspot.com Twitter: http://twitter.com/dmitrykan SemanticAnalyzer: www.semanticanalyzer.info

Re: Changing Similarity without re-indexing (for example from default to BM25)

2015-08-19 Thread Ahmet Arslan
Hi again, Here is a relevant/past discussion : http://search-lucene.com/m/eHNlTDHKb17MW532 Ahmet On Thursday, August 20, 2015 2:28 AM, Ahmet Arslan wrote: Hi Tom, computeNorm(FieldInvertState) method is the only place where similarity is tied to indexing process. If you want to switch

Re: Changing Similarity without re-indexing (for example from default to BM25)

2015-08-19 Thread Ahmet Arslan
Hi Tom, computeNorm(FieldInvertState) method is the only place where similarity is tied to indexing process. If you want to switch between different similarities, they should share the same implementation for the method. For example, subclasses of SimilarityBase can be used without re-indexing

Re: Changing Similarity without re-indexing (for example from default to BM25)

2015-08-19 Thread Upayavira
warning: I'm no expert on other similarities. Having said that, I'm not aware of similarities being used in the indexing process - during indexing term frequency, document frequency, field norms, and so on are all recorded. These are things that the default similarity (TF/IDF) uses to

Changing Similarity without re-indexing (for example from default to BM25)

2015-08-19 Thread Tom Burton-West
Hello all, The last time I worked with changing Similarities was with Solr 4.1 and at that time, it was possible to simply change the schema to specify the use of a different Similarity without re-indexing. This allowed me to experiment with several different ranking algorithms without having to

Re: payload similarity

2015-04-25 Thread Dmitry Kan
ed, that the scorePayload method is never called. What could be > wrong? > >> > >> > >> [code] > >> > >> class PayloadSimilarity extends DefaultSimilarity { > >> @Override > >> public float scorePayload(int doc, int sta

Re: payload similarity

2015-04-24 Thread Erick Erickson
t;> >> [code] >> >> class PayloadSimilarity extends DefaultSimilarity { >> @Override >> public float scorePayload(int doc, int start, int end, BytesRef >> payload) { >> float payloadValue = PayloadHelper.decodeFloat(payload.bytes); >&

Re: payload similarity

2015-04-24 Thread Dmitry Kan
stem.out.println("payloadValue = " + payloadValue); > return payloadValue; > } > } > > [/code] > > > Here is how the similarity is injected during indexing: > > [code] > > PayloadEncoder encoder = new FloatEncoder(); > IndexWriterConfig indexWriterCo

Re: payload similarity

2015-04-24 Thread Dmitry Kan
eFloat(payload.bytes); > System.out.println("payloadValue = " + payloadValue); > return payloadValue; > } > } > > [/code] > > > Here is how the similarity is injected during indexing: > > [code] > > PayloadEncoder encoder

Re: payload similarity

2015-04-24 Thread Ahmet Arslan
); System.out.println("payloadValue = " + payloadValue); return payloadValue; } } [/code] Here is how the similarity is injected during indexing: [code] PayloadEncoder encoder = new FloatEncoder(); IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUC

payload similarity

2015-04-24 Thread Dmitry Kan
is how the similarity is injected during indexing: [code] PayloadEncoder encoder = new FloatEncoder(); IndexWriterConfig indexWriterConfig = new IndexWriterConfig(Version.LUCENE_4_10_4, new PayloadAnalyzer(encoder)); payloadSimilarity = new PayloadSimilarity(); indexWriterConfig.set

Re: Getting unique key of a document inside of a Similarity class.

2015-02-20 Thread J-Pro
from all the examples of what you've described, i'm fairly certain all you really need is a TFIDF based Similarity where coord(), idf(), tf() and queryNorm() return 1 always, and you omitNorms from all fields. Yeah, that's what I did in the very first iteration. It works only fo

Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread Chris Hostetter
, for example, method SimScorer.score is called 3 times for each of : these documents, total 6 times for both. I have added a : ThreadLocal> to my custom similarity, which is cleared every : time before new scoring session (after each query execution). This HashSet : stores strings consisting of fie

Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread J-Pro
lgorithm .. from what you described, matches on fieldA should get score "A" matches on fieldB should get score "B" ... why does it mater which doc is which? For case #3, for example, method SimScorer.score is called 3 times for each of these documents, total 6 times for b

Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread Chris Hostetter
: Sure, sorry I did not do it before, I just wanted to take minimum of your : valuable time. So in my custom Similarity class I am trying to implement such : a logic, where score calculation is only based on field weight and a field : match - that's it. In other words, if a field matche

Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread J-Pro
Thank you for your answer, Chris. I will reply with inline comments as well. Please see below. : I need to uniquely identify a document inside of a Similarity class during : scoring. Is it possible to get value of unique key of a document at this : point? Can you tell us a bit more about your

Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread Chris Hostetter
: I need to uniquely identify a document inside of a Similarity class during : scoring. Is it possible to get value of unique key of a document at this : point? Can you tell us a bit more about your usecase ... your problem description is a bit vague, and sounds like it may be an "XY Pr

Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread J-Pro
Good afternoon. I need to uniquely identify a document inside of a Similarity class during scoring. Is it possible to get the value of the unique key of a document at this point? For some time I thought I could use the internal docID to achieve that. Method score(int doc, float freq) is called after

Re: More Like This similarity tuning

2015-02-04 Thread Ali Nazemian
o a setting higher than the most frequent misspellings but low > enough to find rare terms. > > It depends on your index. > > -Original message- > > From:Ali Nazemian > > Sent: Wednesday 4th February 2015 11:15 > > To: solr-user@lucene.apache.org > > Su

RE: More Like This similarity tuning

2015-02-04 Thread Markus Jelsma
solr-user@lucene.apache.org > Subject: More Like This similarity tuning > > Hi, > I am looking for a best practice on More Like This parameters. I really > appreciate if somebody can tell me what is the best value for these > parameters in MLT query? Or at lease the proper methodology f

More Like This similarity tuning

2015-02-04 Thread Ali Nazemian
Hi, I am looking for a best practice on More Like This parameters. I would really appreciate it if somebody could tell me the best values for these parameters in an MLT query, or at least the proper methodology for finding the best value for each of these parameters: mlt.mintf mlt.mindf mlt.maxqt Thank y

Re: Lucene cosine similarity score for more like this query

2015-02-03 Thread Ali Nazemian
to calculate the similarity. > It is implemented on the idea of cosine measurement but it modifies the > cosine formula. > Please take a look at "Lucene Practical Scoring Function" in the following > Javadoc: > > http://lucene.apache.org/core/4_10_3/core/org/apach

Re: Lucene cosine similarity score for more like this query

2015-02-03 Thread Koji Sekiguchi
Lucene uses TFIDFSimilarity class to calculate the similarity. It is implemented on the idea of cosine measurement but it modifies the cosine formula. Please take a look at "Lucene Practical Scoring Function" in the following Javadoc: http://lucene.apache.org/core/4_10_3/core/org/apa
