Controlling the order of partial matches based on the position
Hi All,

I'm using Solr/Lucene to index a few keywords in a multivalued field. The data stored in the index has already been mined to remove noise and occurrences and is very precise. All the text mining and filtering steps are performed before indexing.

Suppose a user searches for a specific keyword, e.g. the query "user feedback", and partial matches such as "user testing" and "search user" occur in 2 documents. I want to control the rank of the documents based on the position at which the match happened: "user testing" should always appear before "search user".

I would also like to understand whether it is possible to drop partial matches that happen at any position other than the first.

Can somebody point out how we can achieve both of these things?

Thanks
Nitin

--
View this message in context: http://lucene.472066.n3.nabble.com/Controlling-the-order-of-partial-matches-based-on-the-position-tp3413867p3413867.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Controlling the order of partial matches based on the position
Guys,

It's been almost a week, but there are no replies to the question that I posted. If it's a small problem that has already been answered somewhere, please point me to that post. Otherwise, please suggest any pointers for handling the requirement mentioned in the question.

Nitin

--
View this message in context: http://lucene.472066.n3.nabble.com/Controlling-the-order-of-partial-matches-based-on-the-position-tp3413867p3429823.html
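A minimal sketch of the ordering asked about in this thread, written as plain Python over already-retrieved keyword values rather than as a Solr or Lucene configuration. The function names and the `first_only` flag are hypothetical, and whitespace tokenization is an assumption. On the Lucene side, `SpanFirstQuery` is the class that restricts a span match to the leading positions of a field; whether it fits this multivalued setup is untested here.

```python
def match_position(keyword, term):
    """Return the 0-based token position of term in keyword, or None."""
    tokens = keyword.lower().split()
    term = term.lower()
    return tokens.index(term) if term in tokens else None


def rank_by_position(keywords, query, first_only=False):
    """Order partially matching keywords by the earliest matching position.

    With first_only=True, matches at any position other than the first
    are dropped instead of being ranked lower.
    """
    query_terms = query.lower().split()
    ranked = []
    for keyword in keywords:
        positions = [
            pos
            for pos in (match_position(keyword, t) for t in query_terms)
            if pos is not None
        ]
        if not positions:
            continue  # no query term occurs in this keyword
        best = min(positions)
        if first_only and best != 0:
            continue
        ranked.append((best, keyword))
    # sorted() is stable, so ties keep their input order
    return [kw for _, kw in sorted(ranked, key=lambda pair: pair[0])]
```

Calling `rank_by_position(["search user", "user testing"], "user feedback")` puts "user testing" ahead of "search user"; passing `first_only=True` keeps only "user testing".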
Incorrect Search Results showing up
Hi Group,

I've defined a type "text" in the Solr schema as shown below. A multivalued field is defined to use the type defined above.

I index some content such as:
- Google REST API
- Facebook REST API
- Software Architecture
- Design Documents
- Xml Web Services
- Web API design

When I issue a search query like content:"rest api"~4, the matches that I get are:
- Google REST API (which is fine)
- Facebook REST API (which is fine)
- *Web API design* (which is not fine, because the query was a phrase query and "rest" and "api" should be within 4 words of each other)

Does anybody see the 3rd search result as a correct result to be returned? If yes, what is the explanation for that result based on the schema defined? In my view, the 3rd result should not be returned as part of the search results. If somebody can point out anything wrong in my schema, it will be a great help to me.

Thanks
Nitin

--
View this message in context: http://lucene.472066.n3.nabble.com/Incorrect-Search-Results-showing-up-tp3452810p3452810.html
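The schema is not visible in this archive, so the cause cannot be confirmed. One possible explanation, offered as an assumption only: if several values live in the same multivalued field of one document and the field type's `positionIncrementGap` is small, a sloppy phrase query can match across value boundaries. The sketch below simulates that with a simplified in-order slop check (not Lucene's actual sloppy phrase scorer); the two-value document is hypothetical.

```python
def index_positions(values, position_increment_gap):
    """Assign token positions across the values of one multivalued field.

    Each value after the first starts position_increment_gap positions
    further along than it would if the values were simply concatenated.
    """
    positions = {}
    next_pos = 0
    for i, value in enumerate(values):
        if i > 0:
            next_pos += position_increment_gap
        for token in value.lower().split():
            positions.setdefault(token, []).append(next_pos)
            next_pos += 1
    return positions


def sloppy_in_order_match(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` positions between."""
    return any(
        0 <= p2 - p1 - 1 <= slop
        for p1 in positions.get(first, [])
        for p2 in positions.get(second, [])
    )


# Hypothetical document: "rest" and "api" occur in different values.
values = ["REST clients", "Web API design"]
no_gap = index_positions(values, position_increment_gap=0)
big_gap = index_positions(values, position_increment_gap=100)
```

With a gap of 0, `sloppy_in_order_match(no_gap, "rest", "api", 4)` is true because "api" sits only two positions after "rest"; with a gap of 100 the same check fails, since the second value starts far beyond the slop window.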
Sorting results within the fields
I need to implement sorting of search results, where the sorting is based on the fields that are matched for a query and on the score associated with each term in the field, which is generated by application logic.

E.g. there are 3 fields being queried, and the final query after applying the boosts is field1^8 OR field2^4 OR field3^2. There are 2 documents (D1, D2) matching field1, 2 documents (D3, D4) matching field2, and 3 documents (D5, D6, D7) matching field3.

The 2 search results matching field1 need to be sorted by the score generated by my application, then the next 2 results matching field2 need to be sorted by that score, and then the 3 results matching field3 need to be sorted by that score, so that the final results of the query look like (D1, D2) (D3, D4) (D5, D6, D7).

The fields being matched are dynamic fields. The score value generated by the application is the score of each term that is added to the dynamic field of the document.

Can somebody suggest how this can be achieved using Solr and Lucene?

Thanks
Nitin

--
View this message in context: http://lucene.472066.n3.nabble.com/Sorting-results-within-the-fields-tp3656049p3656049.html
Re: Sorting results within the fields
It's been almost a week and there is no response to the question that I asked. Does the question lack detail, or is there no way to achieve this in Lucene?

--
View this message in context: http://lucene.472066.n3.nabble.com/Sorting-results-within-the-fields-tp3656049p3666983.html
Re: Sorting results within the fields
Hi Jan,

Thanks for the reply. Here is a concrete explanation of the problem that I'm trying to solve.

*SOLR Schema*

Here is the definition of the Solr schema.

*There are 3 dynamic fields*

*There are 4 searchable fields*

*Description*: Data in this field is whitespace tokenized, stemmed, and lowercased.
*Description*: Data in this field is only lowercased, and the Keyword Tokenizer is applied. So, data is not changed when stored in this field.
*Description*: Head terms are encoded in the format HEAD$Value
*Description*: Tail terms are encoded in the format TAIL$Value

The data that we store in these fields is cleaned-up data from large text: generally 1-word, 2-word, or 3-word values.

D1 -> UI, UI Design, UI Programming, UI Design Document
D2 -> UI Mockup, UI development
D3 -> UI

When somebody queries *UI*, the internal query that is generated is

concepts_headtermencoded_concept:HEAD$ui^100.0 concepts:ui^50.0 concepts_tailtermencoded_concept:TAIL$ui^10.0

so that a document matching on the head term is ranked higher than one with a partial match.

The current implementation, without the score, ranks the documents like D1 > D2 > D3 (because Lucene uses TF-IDF when scoring documents).

Now we have created an *application specific score* for each concept and want to sort the results based on that score while preserving the boost on the field defined in the query, e.g.

D1 -> UI=90, UI Design=45, UI Programming=40, UI Design Document=85, Project Wolverine=40
D2 -> UI Mockup=55, UI Development=74, Project Management=39
D3 -> UI=95, Project Wolverine=35
D4 -> UI Dev=75, Video Project=42

1. If a match is found and only exact matches were found, then sorting will happen based on the score value that we have defined for the term.
2. If a match is found and there are both exact and partial matches, then the exactly matched documents should be on top, followed by the partially matched documents sorted among themselves based on score.
*Examples*

*Search*: UI
*Desired Results*: D3 > D1 > D4 > D2, where D3 and D1 contain an exact match and are therefore sorted among themselves by score. (D4 and D2 both have a head match, but the score of the head match in D4 is greater than in D2.)

*Search*: Project
*Desired Results*: D1 > D2 > D3 > D4, where D1, D2 and D3 are head term matches sorted among themselves by score, and D4 is a tail term match (even though it has a better score, the tail term boost is 1/10th of the head term boost).

So, in all, we want to override Lucene's TF-IDF scoring and do the scoring based on our concept-specific score, while still giving higher preference to exact matches and then to partial matches.

I hope I have explained the problem. Let me know if you have any specific questions.

Thanks
Nitin

--
View this message in context: http://lucene.472066.n3.nabble.com/Sorting-results-within-the-fields-tp3656049p3668047.html
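The desired ordering in this message can be sketched as a two-level sort: first by match tier (exact, then head, then tail), then by the application score of the best-matching concept. This is a plain Python illustration over the example data from the message, not a Solr or Lucene scoring configuration; the function names, the `TIER_RANK` table, and the single-term query restriction are all assumptions made for the sketch.

```python
# Example concept scores copied from the message above.
DOCS = {
    "D1": {"UI": 90, "UI Design": 45, "UI Programming": 40,
           "UI Design Document": 85, "Project Wolverine": 40},
    "D2": {"UI Mockup": 55, "UI Development": 74, "Project Management": 39},
    "D3": {"UI": 95, "Project Wolverine": 35},
    "D4": {"UI Dev": 75, "Video Project": 42},
}

# Hypothetical tier ordering: a higher rank always wins over any score.
TIER_RANK = {"exact": 3, "head": 2, "tail": 1}


def classify(concept, query):
    """Classify how a single-term query matches a concept, or return None."""
    tokens = concept.lower().split()
    term = query.lower()
    if tokens == [term]:
        return "exact"
    if tokens and tokens[0] == term:
        return "head"
    if tokens and tokens[-1] == term:
        return "tail"
    return None


def best_match(concepts, query):
    """Return the best (tier rank, application score) pair for one document."""
    candidates = []
    for concept, score in concepts.items():
        tier = classify(concept, query)
        if tier is not None:
            candidates.append((TIER_RANK[tier], score))
    return max(candidates) if candidates else None


def rank(docs, query):
    """Sort matching documents by tier first, then by application score."""
    scored = []
    for doc_id, concepts in docs.items():
        match = best_match(concepts, query)
        if match is not None:
            scored.append((match, doc_id))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [doc_id for _, doc_id in scored]
```

With the data above, `rank(DOCS, "UI")` yields D3, D1, D4, D2 and `rank(DOCS, "Project")` yields D1, D2, D3, D4, which are the two desired orderings stated in the message. Comparing tuples keeps the tier strictly dominant, so a tail match with a high score (D4 for "Project") can never overtake a head match.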
Re: Highlighting more than 1 term
Hi Tim,

Can you share the "text_en" type definition? Do check whether you have a stemmer configured in the type definition. If not, that might be the reason "scheduled" is not matching "scheduling".

Thanks
Nitin

--
View this message in context: http://lucene.472066.n3.nabble.com/Highlighting-more-than-1-term-tp3670862p3671004.html
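To illustrate why a stemmer matters here, the toy suffix stripper below reduces both words to the same token, so an index and a query that both pass through it would agree. It is a deliberately crude stand-in, not the stemmer Solr would apply, and `toy_stem` is a hypothetical name.

```python
def toy_stem(word):
    """Strip a trailing "ing" or "ed" from words long enough to keep a stem."""
    word = word.lower()
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def same_stem(left, right):
    """True if both words reduce to the same toy stem."""
    return toy_stem(left) == toy_stem(right)
```

Here `same_stem("scheduled", "scheduling")` is true, whereas a plain string comparison of the two words is not, which is the mismatch described above when no stemmer is configured.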
Re: How to boost the relevancy of a field
Hi Dean,

You can use query-time boosting, where you specify the boost value in the query itself, e.g.

title:solr^2 OR body:solr

Thanks
Nitin

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-boost-the-relevancy-of-a-field-tp3671020p3671118.html