I decided to go with a function query, and to implement one that reads the term frequency for each document from the index. However, I did not find any tutorial that matches my problem well. I would really appreciate it if somebody could point me to a useful tutorial or example for this case. Thank you very much.
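
For reference, here is a minimal sketch of what such a function query might look like for the "hello" example from earlier in this thread. It only builds the request parameters; it does not talk to Solr. The core name "mycore", the host and port, and the idea of returning the value under the alias "document_score" are assumptions for illustration. The function names used (termfreq, product, sum, pow) are standard Solr function queries.

```python
from urllib.parse import urlencode

# Assumed values taken from the example in this thread: fields "a" and "b",
# one important term "hello" with weight 2.0. Not from a real schema.
WEIGHT = 2.0
TERM = "hello"


def score_function(term: str, weight: float) -> str:
    """Build weight*tf(a) + (weight*tf(b))^2 as a Solr function string."""
    tf_a = f"termfreq(a,'{term}')"
    tf_b = f"termfreq(b,'{term}')"
    return f"sum(product({weight},{tf_a}),pow(product({weight},{tf_b}),2))"


def build_params(user_query: str) -> dict:
    """Return the value as a pseudo-field and sort on the same function."""
    func = score_function(TERM, WEIGHT)
    return {
        "q": user_query,
        "fl": f"id,document_score:{func}",  # alias is a hypothetical name
        "sort": f"{func} desc",
    }


params = build_params("*:*")
# Host, port and core name below are placeholders.
print("http://localhost:8983/solr/mycore/select?" + urlencode(params))
```

Note that termfreq matches the term as it is stored in the index, so the term passed in has to be in its analyzed form (for example lowercased, if the field lowercases). The value is still computed at query time, which is the trade-off discussed in the quoted messages.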
On Tue, Jan 13, 2015 at 4:21 PM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> A function query or an update processor to create a separate field are
> still your best options.
>
> -- Jack Krupansky
>
> On Tue, Jan 13, 2015 at 4:18 AM, Ali Nazemian <alinazem...@gmail.com>
> wrote:
>
> > Dear Markus,
> >
> > Unfortunately I cannot use payloads, since I want to return this score
> > to each user as a simple field alongside the other fields, and payloads
> > do not provide that. Also, I don't want to change the default
> > similarity method of Lucene; I just want to have this field to do the
> > sorting in some cases.
> > Best regards.
> >
> > On Mon, Jan 12, 2015 at 10:26 PM, Markus Jelsma
> > <markus.jel...@openindex.io> wrote:
> >
> > > Hi - You mention having a list with important terms, then using
> > > payloads would be the most straightforward I suppose. You still need
> > > a custom similarity and custom query parser. Payloads work for us
> > > very well.
> > >
> > > M
> > >
> > > -----Original message-----
> > > > From: Ahmet Arslan <iori...@yahoo.com.INVALID>
> > > > Sent: Monday 12th January 2015 19:50
> > > > To: solr-user@lucene.apache.org
> > > > Subject: Re: Extending solr analysis in index time
> > > >
> > > > Hi Ali,
> > > >
> > > > Reading your example, if you could somehow replace the idf
> > > > component with your "importance weight", I think your use case
> > > > looks like TFIDFSimilarity. The tf component remains the same.
> > > >
> > > > https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
> > > >
> > > > I also suggest you ask this on the Lucene mailing list. Someone
> > > > familiar with the similarity package can give insight on this.
> > > >
> > > > Ahmet
> > > >
> > > > On Monday, January 12, 2015 6:54 PM, Jack Krupansky
> > > > <jack.krupan...@gmail.com> wrote:
> > > > Could you clarify what you mean by "Lucene reverse index"? That's
> > > > not a term I am familiar with.
> > > >
> > > > -- Jack Krupansky
> > > >
> > > > On Mon, Jan 12, 2015 at 1:01 AM, Ali Nazemian
> > > > <alinazem...@gmail.com> wrote:
> > > >
> > > > > Dear Jack,
> > > > > Thank you very much.
> > > > > Yeah, I was thinking of a function query for sorting, but I have
> > > > > two problems in this case: 1) a function query does the
> > > > > processing at query time, which I don't want; 2) I also want to
> > > > > have the score field for retrieving and showing to users.
> > > > >
> > > > > Dear Alexandre,
> > > > > Here is some more explanation about the business behind the
> > > > > question:
> > > > > I am going to provide a field for each document, let's refer to
> > > > > it as "document_score". I am going to fill this field based on
> > > > > the information that could be extracted from the Lucene reverse
> > > > > index. Assume I have a list of terms, called important terms, and
> > > > > I am going to extract the term frequency for each of the terms
> > > > > inside this list for each document. To be honest, I want to use
> > > > > the term frequency for calculating "document_score".
> > > > > "document_score" should be storable since I am going to retrieve
> > > > > this field for each document. I also want to do sorting on
> > > > > "document_score" in case the user prefers it.
> > > > > I hope I did convey my point.
> > > > > Best regards.
> > > > >
> > > > > On Mon, Jan 12, 2015 at 12:53 AM, Jack Krupansky
> > > > > <jack.krupan...@gmail.com> wrote:
> > > > >
> > > > > > Won't function queries do the job at query time? You can add or
> > > > > > multiply the tf*idf score by a function of the term frequency
> > > > > > of arbitrary terms, using the tf, mul, and add functions.
> > > > > >
> > > > > > See:
> > > > > > https://cwiki.apache.org/confluence/display/solr/Function+Queries
> > > > > >
> > > > > > -- Jack Krupansky
> > > > > >
> > > > > > On Sun, Jan 11, 2015 at 10:55 AM, Ali Nazemian
> > > > > > <alinazem...@gmail.com> wrote:
> > > > > >
> > > > > > > Dear Jack,
> > > > > > > Hi,
> > > > > > > I think you misunderstood my need. I don't want to change the
> > > > > > > default scoring behavior of Lucene (tf-idf); I just want to
> > > > > > > have another field to do sorting for some specific queries
> > > > > > > (not all the search business). I am aware of Lucene payloads,
> > > > > > > however.
> > > > > > > Thank you very much.
> > > > > > >
> > > > > > > On Sun, Jan 11, 2015 at 7:15 PM, Jack Krupansky
> > > > > > > <jack.krupan...@gmail.com> wrote:
> > > > > > >
> > > > > > > > You would do that with a custom similarity (scoring) class.
> > > > > > > > That's an expert feature. In fact a SUPER-expert feature.
> > > > > > > >
> > > > > > > > Start by completely familiarizing yourself with how TF*IDF
> > > > > > > > similarity already works:
> > > > > > > >
> > > > > > > > http://lucene.apache.org/core/4_10_3/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
> > > > > > > >
> > > > > > > > And to use your custom similarity class in Solr:
> > > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/display/solr/Other+Schema+Elements#OtherSchemaElements-Similarity
> > > > > > > >
> > > > > > > > -- Jack Krupansky
> > > > > > > >
> > > > > > > > On Sun, Jan 11, 2015 at 9:04 AM, Ali Nazemian
> > > > > > > > <alinazem...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Hi everybody,
> > > > > > > > >
> > > > > > > > > I am going to add some analysis to Solr at index time.
> > > > > > > > > Here is what I am considering in my mind:
> > > > > > > > > Suppose I have two different fields in the Solr schema,
> > > > > > > > > field "a" and field "b". I am going to use the created
> > > > > > > > > reverse index in a way that some terms are considered as
> > > > > > > > > important ones, and tell Lucene to calculate a value
> > > > > > > > > based on these terms' frequency for each document. For
> > > > > > > > > example, let the word "hello" be considered an important
> > > > > > > > > word with the weight of "2.0". Suppose the term frequency
> > > > > > > > > for this word in field "a" is 3 and in field "b" is 6 for
> > > > > > > > > document 1. Therefore the score value would be
> > > > > > > > > 2*3+(2*6)^2. I want to calculate this score based on
> > > > > > > > > these fields and put it in the index for retrieving. My
> > > > > > > > > question would be: how can I do such a thing? First I did
> > > > > > > > > consider using the term component for calculating this
> > > > > > > > > value from outside and putting it back into the Solr
> > > > > > > > > index, but it seems it is not efficient enough.
> > > > > > > > >
> > > > > > > > > Thank you very much.
> > > > > > > > > Best regards.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > A.Nazemian
> > > > > > >
> > > > > > > --
> > > > > > > A.Nazemian
> > > > >
> > > > > --
> > > > > A.Nazemian
> >
> > --
> > A.Nazemian

--
A.Nazemian
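
The arithmetic in the original question (2*3+(2*6)^2 = 150) can also be worked through outside Solr. The sketch below computes the value on the client before a document is sent for indexing, so that it could be stored in an ordinary sortable field. This is a client-side stand-in, not a Solr update processor and not the term component approach. The whitespace tokenizer only approximates whatever analyzer the real fields use, and summing the per-term contributions when there is more than one important term is an assumption, since the thread gives only a single-term example.

```python
from collections import Counter

# Assumed weights; the thread only gives "hello" with weight 2.0.
IMPORTANT_TERMS = {"hello": 2.0}


def term_frequencies(text: str) -> Counter:
    """Count terms with a naive lowercase/whitespace tokenizer (assumption)."""
    return Counter(text.lower().split())


def document_score(doc: dict) -> float:
    """Sum weight*tf_a + (weight*tf_b)^2 over the important terms."""
    tf_a = term_frequencies(doc.get("a", ""))
    tf_b = term_frequencies(doc.get("b", ""))
    score = 0.0
    for term, weight in IMPORTANT_TERMS.items():
        score += weight * tf_a[term] + (weight * tf_b[term]) ** 2
    return score


# Document 1 from the example: "hello" occurs 3 times in "a", 6 times in "b".
doc = {"id": "1", "a": "hello " * 3, "b": "hello " * 6}
doc["document_score"] = document_score(doc)  # 2*3 + (2*6)**2
print(doc["document_score"])
```

Doing the same computation inside Solr would need the analyzed terms rather than a whitespace split, which is why the update processor route suggested above is closer to what the index actually contains.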