expand=true in ManagedSynonymFilterFactory

2014-06-27 Thread Mingchun Zhao
Hello,

How can I add comma-separated equivalent synonyms
 to ManagedSynonymFilterFactory via the REST API?
In addition, I'd like to set expand=true so that each synonym is expanded
 to all of its equivalent synonyms.
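
To illustrate the expansion I mean, here is a minimal Python sketch. It
only builds a payload and sends nothing. The function name
expand_synonyms and the sample terms are hypothetical, and the
term-to-list JSON shape is an assumption about what the managed
synonyms endpoint accepts, not something I have confirmed.

```python
import json


def expand_synonyms(line):
    """Map every term of a comma-separated group to the whole group.

    This mirrors what expand=true means for a line such as "a, b, c"
    in a plain synonyms file: each term expands to all equivalents.
    """
    terms = [t.strip() for t in line.split(",") if t.strip()]
    return {term: list(terms) for term in terms}


# Hypothetical sample group; the JSON layout is an assumed payload shape.
payload = json.dumps(expand_synonyms("tv, television, telly"), indent=2)
print(payload)
```

With this explicit form, every term carries the full group, so the result
would not depend on an expand flag being honoured on the server side.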

Best regards,
Mingchun Zhao


Re: (Edge)NGramFilterFactory and highlight

2014-12-20 Thread Mingchun Zhao
Hi Bjørn,

Since Solr 4.4, the end-offset behavior of EdgeNGramFilterFactory
has changed because of the following issue:
https://issues.apache.org/jira/browse/LUCENE-3907
The relevant source code from this patch is shown below:
==
+  if (version.onOrAfter(Version.LUCENE_44)) {
+    // Never update offsets
+    updateOffsets = false;
+  } else {
+    // if length by start + end offsets doesn't match the term text then assume
+    // this is a synonym and don't adjust the offsets.
+    updateOffsets = (tokStart + curTermLength) == tokEnd;
+  }
==
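
As a rough illustration of what this branch means for the offsets, here
is a small Python model of front edge n-grams. It is a simplified sketch,
not the Lucene implementation: the function edge_ngrams and its
parameters are hypothetical names, and only the updateOffsets decision
is taken from the patch above.

```python
def edge_ngrams(term, tok_start, tok_end, min_gram=1, max_gram=20,
                lucene_44_or_later=True):
    """Return (gram, start_offset, end_offset) tuples for front edge n-grams."""
    if lucene_44_or_later:
        # Matches the new branch: offsets are never updated.
        update_offsets = False
    else:
        # Matches the old branch: adjust offsets only when the term length
        # agrees with the original token offsets (i.e. not a synonym).
        update_offsets = (tok_start + len(term)) == tok_end

    grams = []
    for size in range(min_gram, min(max_gram, len(term)) + 1):
        if update_offsets:
            start, end = tok_start, tok_start + size
        else:
            # Every gram keeps the offsets of the whole original token.
            start, end = tok_start, tok_end
        grams.append((term[:size], start, end))
    return grams


for flag in (True, False):
    print(flag, edge_ngrams("test", 0, 4, lucene_44_or_later=flag))
```

Under these assumptions, the model gives end offsets 4, 4, 4, 4 for "test"
with the new behavior and 1, 2, 3, 4 with the old one, which agrees with
the analysis output you quoted.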

It seems that there is no property for restoring the previous
offset behavior of LUCENE_43.
Therefore, you might have to set luceneMatchVersion to deal with it, as
you mentioned.
However, it would be better to apply luceneMatchVersion only to the
EdgeNGramFilterFactory, as below:
==

==
Setting LUCENE_43 globally in
solrconfig.xml
would also affect the rest of the configuration.

Regards,
Mingchun


2014-12-19 23:26 GMT+09:00 Bjørn Hjelle:
> Hi,
>
> based on this example:
> http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/
> I have earlier successfully implemented highlight of terms in
> (Edge)NGram-analyzed fields.
>
> In a new project, however, with Solr 4.10.2 it does not work.
>
> In the Solr admin analysis page I see the following in Solr 4.10.2 
> (simplified):
>
> ENGTF  text   t  te  tes  test
>        start  0  0   0    0
>        end    4  4   4    4
>
> But if I change to LUCENE_43 in solrconfig.xml, and reload the
> analysis page I get this:
>
> ENGTF  text   t  te  tes  test
>        start  0  0   0    0
>        end    1  2   3    4
>
> So, in 4.10.2 it is not able to find the correct end-positions and the
> highlighter will instead highlight the complete word ("test" in this
> case).
>
>
> To reproduce  this:
> 1. download Solr 4.10.2
> 2. In the collection1 schema.xml, add field type:
>
>
> 
> 
>  mapping="mapping-ISOLatin1Accent.txt"/>
> 
>  generateWordParts="1" generateNumberParts="1" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> 
>  maxGramSize="20" minGramSize="1"/>
>  pattern="([^\w\d\*æøåÆØÅ ])" replacement="" replace="all"/>
> 
> 
>  mapping="mapping-ISOLatin1Accent.txt"/>
> 
>  generateWordParts="0" generateNumberParts="0" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
> 
>  pattern="([^\w\d\*æøåÆØÅ ])" replacement="" replace="all"/>
>  pattern="^(.{20})(.*)?" replacement="$1" replace="all"/>
> 
> 
>
> 3. Start Solr and, in the analysis page, add "Test" to the Field Value (Index)
> field and check the output.
> 4. Then change to this in solrconfig.xml
>
>   LUCENE_43
>
> 5. reload the core and reload the analysis page.
> 6. you will now see that the end-positions are correct.
>
>
>
> Any ideas on how to make this work with Solr 4.10.2 without resorting
> to changing lucene version in solrconfig.xml?
>
>
> Thanks,
> Bjørn