Thanks for the support, Erick. Not using the “qf” parameter at all seems to 
give me valid query results now. The query debug information:

"debug":{
  "rawquerystring":"claims_en:(An English sentence) description_en:(An English sentence) claims_de:(Ein Deutscher Satz) description_de:(Ein Deutscher Satz)",
  "querystring":"claims_en:(An English sentence) description_en:(An English sentence) claims_de:(Ein Deutscher Satz) description_de:(Ein Deutscher Satz)",
  "parsedquery":"+((claims_en:english claims_en:sentenc) (description_en:english description_en:sentenc) (claims_de:deutsch claims_de:satz) (description_de:deutsch description_de:satz))",
  "parsedquery_toString":"+((claims_en:english claims_en:sentenc) (description_en:english description_en:sentenc) (claims_de:deutsch claims_de:satz) (description_de:deutsch description_de:satz))"

But this way it now seems like the “tie” parameter no longer has any impact. 
Wanting something between a sum and a max query was the original reason why I 
intended to use an edismax query. Also, since my queries are full sentences, I 
thought it would be a good idea to use the phrase query feature at a later 
stage.
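
For reference, the "between a sum and a max" behaviour of tie can be sketched as below. This is a minimal illustration and not Solr code: the function name and the per-field scores are made up, and it only mirrors the documented disjunction-max rule that the final score is the best field score plus tie times the sum of the remaining field scores. The parsed query above appears to contain only plain boolean clauses and no disjunction-max clause, which would be consistent with tie having no visible effect.

```python
def dismax_score(field_scores, tie=0.0):
    """Combine per-field scores the way a disjunction-max query does:
    best score + tie * (sum of all other scores)."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie * others


# Hypothetical scores of one document in three fields.
scores = [4.0, 2.0, 1.0]

print(dismax_score(scores, tie=0.0))  # behaves like a pure max
print(dismax_score(scores, tie=1.0))  # behaves like a pure sum
print(dismax_score(scores, tie=0.5))  # something in between
```

With tie=0 only the best-matching field counts, with tie=1 every field contributes fully, and values in between reward hits in several fields without letting them dominate.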

If the edismax query is not the way to achieve my goal, do you see a proper way 
to do this? The only alternative I see is running two separate edismax queries, 
one for the English fields and one for the German fields, and then recombining 
the results. But in that case I don’t know whether the resulting scores are 
comparable. Can I assume a score of 15 from the English edismax is better than 
a score of 13 from the German edismax?

Best regards
David


On 5 Jun 2020, at 19:39, Erick Erickson <erickerick...@gmail.com> wrote:

Let’s see the results of adding &debug=query to the query, in particular the 
parsed version.
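
A request with the debug flag could be built roughly as follows. This is only a sketch: the host, port and core name are placeholders, and the query string is a shortened version of the one from the original post. The parameter debug=query asks Solr to include the parsed query in the response.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; replace host, port and core name with your own.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def build_debug_url(q, def_type="edismax"):
    """Return a select URL that asks Solr to report how it parsed q."""
    params = {
        "defType": def_type,
        "q": q,
        "debug": "query",  # adds the query section of the debug output
    }
    return SOLR_SELECT + "?" + urlencode(params)


url = build_debug_url(
    "text_part1_en:(A sentence in English) text_part1_de:(Ein Satz auf Deutsch)"
)
print(url)
```

The parsed form then shows up under debug.parsedquery in the JSON response.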

Because what you’re reporting doesn’t really make sense. edismax should be 
totally ignoring the “qf” parameter since you’re specifically qualifying all 
the clauses with a field. Unless you’re not really enclosing the search text in 
parentheses (or quotes if they should be phrases).

Also, if you’re willing to form separate clauses like this, there's no reason 
to even use edismax since its purpose is to automatically distribute search 
terms over multiple fields and you’re explicitly specifying the fields.

Best,
Erick

On Jun 5, 2020, at 10:10 AM, David Zimmermann <david.zimmerm...@usi.ch> wrote:

I could use some advice on how to handle a particular cross-language search 
with Solr. I posted it on Stack Overflow 2 months ago, but could not find a 
solution.
I have documents in 3 languages (English, German, French). For simplicity let's 
assume it's just two languages (English and German). The documents are 
standardised in the sense that they contain the same parts (text_part1 and 
text_part2), just the language they are written in is different. The language 
of the documents is known. In my index schema I use one core with different 
fields for each language.

For a German document the index will look something like this:

*   text_part1_en: empty
*   text_part2_en: empty
*   text_part1_de: German text
*   text_part2_de: Another German text

For an English document it will be the other way around.

What I want to achieve: A user entering a query in English should receive both 
English and German documents that are relevant to their search. Further 
conditions are:

*   I want results with hits in text_part1 and text_part2 to be higher ranked 
than results with hits only in one field (tie value > 0).
*   The queries will not be single words, but full sentences (stop word removal 
needed and partial hits [only a few words out of the sentences] must be valid).
*   English and German documents must end up in one ranking. I need to be 
able to compare the relevance of an English document to the relevance of a 
German document.
*   The text parts need to stay separate, because I want to boost the 
importance of one part (let's say part1) over the other.

My general approach so far has been to get a German translation of the user's 
query by sending it to a translation API. Then I want to use an edismax query, 
since it seems to fulfill all of my requirements. The problem is that I cannot 
manage to search for the German query in the German fields and the English 
query in the English fields only. The Solr edismax 
documentation<https://lucene.apache.org/solr/guide/6_6/the-extended-dismax-query-parser.html>
 states that it supports the full Lucene query parser syntax, but I can't find 
a way to address different fields with different inputs. I tried:

q=text_part1_en: (A sentence in English) text_part1_de: (Ein Satz auf Deutsch) 
text_part2_en: (A sentence in English) text_part2_de: (Ein Satz auf Deutsch)
qf=text_part1_en text_part2_en text_part1_de text_part2_de


This syntax should be in line with what MatsLindh wrote in this 
thread<https://stackoverflow.com/questions/53371028/different-search-term-on-different-fields-using-edismax-query-parser-in-solr>.
 I tried different versions of writing this q, but whatever I do, Solr always 
searches for the full q string in all four fields given by qf, which totally 
messes up the result. Am I just making mistakes in the query syntax, or is it 
even possible to do what I'm trying to do using edismax?
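
For illustration, the fielded q described above could be assembled programmatically along these lines. This is a hedged sketch, not a confirmed fix for the problem: the helper names are invented, the field names follow the text_part1/text_part2 naming from this post, and the optional boost uses the standard ^ syntax on a grouped clause. Special query characters in the user's sentence are escaped so that they are treated as plain text.

```python
import re

# Characters with special meaning in the Lucene/Solr query syntax.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape_query_text(text):
    """Backslash-escape query syntax characters in free text."""
    return _SPECIAL.sub(r"\\\1", text)


def fielded_query(text_by_lang, part_boosts):
    """Build one field:(text) clause per language and text part.

    text_by_lang: language suffix -> query text in that language
    part_boosts:  field prefix -> boost factor (1 means no boost)
    """
    clauses = []
    for lang, text in text_by_lang.items():
        escaped = escape_query_text(text)
        for part, boost in part_boosts.items():
            suffix = f"^{boost}" if boost != 1 else ""
            clauses.append(f"{part}_{lang}:({escaped}){suffix}")
    return " ".join(clauses)


q = fielded_query(
    {"en": "A sentence in English", "de": "Ein Satz auf Deutsch"},
    {"text_part1": 2, "text_part2": 1},  # hypothetical boost for part1
)
print(q)
```

Note that the clauses are written without a space between the colon and the opening parenthesis, matching the form field:(text).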

Any help would be highly appreciated.
