Hi Martin,

fq means filter query. Maybe you want to use the qf (query fields) parameter of edismax?
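To make the difference concrete: fq expects a complete query (typically field:value), so fq=original filters on documents that contain the term "original" in the default field, which is why nothing comes back. Below is a minimal sketch of how the request could look with edismax and qf instead. The host and core name are copied from your URL and the field names from your schema; the helper name edismax_url is purely illustrative, and I have not run this against your index.

```python
from urllib.parse import urlencode

# Assumption: host and core name are taken from the URL in the original post.
SOLR_SELECT = "http://localhost:8983/solr/windex/select"


def edismax_url(user_query, field):
    """Build a select URL that searches only `field` via edismax's qf parameter."""
    params = {
        "q": user_query,
        "defType": "edismax",  # use the extended dismax query parser
        "qf": field,           # query fields: where the terms of q are searched
        "wt": "json",
        "indent": "true",
    }
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    # One URL per search type: standard, stop words removed, synonym-expanded.
    for field in ("original", "stopwords_removed", "expanded"):
        print(edismax_url("Sprachen", field))
```

With the default query parser, the same restriction can also be written directly in q, for example q=expanded:Sprachen.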
On Wednesday, March 25, 2015 9:23 PM, Martin Wunderlich <martin...@gmx.net> wrote:

Hi all,

I am wondering what the process is for applying Tokenizers and Filters (as defined in the FieldType definition) to field contents that result from CopyFields. To be more specific, in my Solr instance, I would like to support query expansion by two means: removing stop words and adding inflected word forms as synonyms.

To use a specific example, let's say I have the following sentence to be indexed (from a Wittgenstein manuscript):

"Was zum Wesen der Welt gehört, kann die Sprache nicht ausdrücken."

This sentence will be indexed in a field called "original" that is defined as follows:

  <field name="original" type="text_original" indexed="true" stored="true" required="true"/>
  <fieldType name="text_windex_original" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
    </analyzer>
  </fieldType>

Then, in order to create fields for the two types of query expansion, I have set up specific fields for this:

- one field where stopwords are removed both from the indexed content and from the query. So, if the user is searching for a phrase like "der Sprache", Solr should still find the segment above, because the determiners ("der" and "die") are removed prior to indexing and prior to querying, respectively. This field is defined as follows:

  <field name="stopwords_removed" type="text_stopwords_removed" indexed="true" stored="true" required="true"/>
  <fieldType name="text_stopwords_removed" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_de.txt" format="snowball"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_de.txt" format="snowball"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

- a second field where synonyms are added to the query so that more segments will be found. For instance, if the user is searching for the plural form "Sprachen", Solr should return the segment above, due to this entry in the synonyms file: "Sprache,Sprach,Sprachen". This field is defined as follows:

  <field name="expanded" type="text_multiplied" indexed="true" stored="true" required="true"/>
  <fieldType name="text_expanded" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_de.txt" format="snowball"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_de.txt" format="snowball"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms_de.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Finally, to avoid having to specify three fields with identical content in the import documents, I am defining the two fields for query expansion as copyFields:

  <copyField source="original" dest="stopwords_removed"/>
  <copyField source="original" dest="expanded"/>

Now, my expectation would be as follows:

- during import, two temporary fields are created by copying content from the original field
- these two temporary fields are then pre-processed as per the definitions above
- the pre-processed version of the text is added to the index
- then, the user can search for "Sprache", "sprache", "Sprachen" or "der Sprache" and will always get the segment above as a matching result.

However, what actually happens is that I get matches only for "Sprache" and "sprache".

The other thing that strikes me as odd is that when I restrict the search to only one of the fields using the "fq" parameter, I get no results. For instance:

http://localhost:8983/solr/windex/select?q=Sprache&fq=original&wt=json&indent=true

will return no matches. I would have expected that, using the fq parameter, the user can specify what type of search (s)he would like to carry out: a standard search (field original) or an expanded search (one of the other two fields).

For debugging, I have checked the analysis and the results seem OK (posted below).

Apologies for the long post, but I am really a bit stuck here (even after doing a lot of reading and googling). It is probably something simple that I am missing.

Thanks a lot in advance for any help.

Cheers,

Martin

ST   Was zum Wesen der Welt gehört kann die Sprache nicht ausdrücken
SF   Was zum Wesen Welt gehört kann die Sprache nicht ausdrücken
LCF  was zum wesen welt gehört kann die sprache nicht ausdrücken