Hi,

I'm wondering whether the Solr ttf function query supports n-grams
(compound words, n >= 2)?

I'm using
http://localhost:8983/solr/collection1/select?q=*:*&fl=ttf(content,%22apple%20banana%22)&rows=1
to query the total term frequency of a bigram token in the "content" field
across the whole index.
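
As a minimal sketch of the request above, here is how it could be assembled in Python so the quoting of the bigram is unambiguous. The helper name build_ttf_url is made up for illustration, and the host and collection are simply the ones from my URL:

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_ttf_url(base, field, term):
    """Build a select URL that asks for ttf(field,"term") on one row.

    `base` is the collection URL, e.g. http://localhost:8983/solr/collection1.
    This is a hypothetical helper, not part of Solr or any client library.
    """
    params = {
        "q": "*:*",                        # match every document
        "fl": f'ttf({field},"{term}")',    # quoted so the space stays in one term
        "rows": 1,                         # ttf is index-wide, one row is enough
    }
    # urlencode percent-encodes quotes and encodes the space as '+'
    return f"{base}/select?{urlencode(params)}"


url = build_ttf_url("http://localhost:8983/solr/collection1", "content", "apple banana")

if __name__ == "__main__":
    print(url)
    # Decode it again to confirm what the server should receive
    print(parse_qs(urlsplit(url).query)["fl"][0])
```

The encoded form differs slightly from the URL I typed by hand (the space becomes "+" rather than "%20"), but both should decode to the same fl value.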

However, the value returned (20) is not consistent with the result of the
query
http://localhost:8983/solr/tatasteel/select?q=content:%22apple%20banana%22.
I checked manually, and the actual number of occurrences is 15.

What is the actual behaviour of the ttf function query (I'm using Solr
5.3.0)? The reference guide does not explain the details.

Does it run a full-text query against the index for this field, or does it
rely on the tf values exposed by the term vector component (tvComponent)?

I have configured the content field with the following textField type:

<fieldType name="text_tr_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ShingleFilterFactory"
                minShingleSize="2" maxShingleSize="5"
                outputUnigrams="true" outputUnigramsIfNoShingles="false"
                tokenSeparator=" "/>
    </analyzer>
    <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.SynonymFilterFactory"
                synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
    </analyzer>
</fieldType>
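
To make explicit what I expect the index analyzer above to emit, here is a rough Python sketch of the shingle step. It is a simplification under my own assumptions: it splits on whitespace and lowercases, ignores stop words and position increments, and shingles is a hypothetical helper, not Solr code.

```python
def shingles(text, min_size=2, max_size=5, output_unigrams=True, sep=" "):
    """Approximate the tokens a shingle filter would emit for `text`.

    Defaults mirror the minShingleSize, maxShingleSize, outputUnigrams and
    tokenSeparator values in the field type above.
    """
    words = text.lower().split()
    tokens = []
    for start in range(len(words)):
        if output_unigrams:
            tokens.append(words[start])
        # Emit every shingle that starts at this word, shortest first
        for size in range(min_size, max_size + 1):
            if start + size > len(words):
                break
            tokens.append(sep.join(words[start:start + size]))
    return tokens


if __name__ == "__main__":
    for token in shingles("Apple banana cherry"):
        print(repr(token))
```

Under this approximation "apple banana" ends up in the index as a single term next to the unigrams "apple" and "banana", which is the term I would expect ttf(content,"apple banana") to look up.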

Any ideas?

Thanks,
Jerry
