Thanks.
Haven't I done this here?
<fieldType name="text_field" class="solr.TextField"
positionIncrementGap="100" omitNorms="false" >
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.ClassicFilterFactory"/>
<filter class="solr.LengthFilterFactory" min="2" max="20"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"/>
</analyzer>
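
For comparison, here is a minimal sketch of what the suggestion quoted below (dropping stop words entirely) might look like for this field type. It assumes the StopFilterFactory is removed from both the index and the query analyzer and that the data is re-indexed afterwards; everything else is copied from the schema quoted below. Whether this resolves the problem is untested. Note that LengthFilterFactory with min="2" would still discard single-character tokens such as "a", so removing the stop filter may not be enough on its own.

```xml
<!-- Sketch only: text_field with the stop-word filter removed from both analyzers. -->
<fieldType name="text_field" class="solr.TextField"
           positionIncrementGap="100" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ClassicFilterFactory"/>
    <!-- Still drops tokens shorter than 2 or longer than 20 characters. -->
    <filter class="solr.LengthFilterFactory" min="2" max="20"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- StopFilterFactory removed here. -->
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PatternTokenizerFactory"
               pattern="[^a-zA-Z0-9/._:]"/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^[/._:]+" replacement=""/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[/._:]+$" replacement=""/>
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[_]" replacement=" "/>
    <filter class="solr.LengthFilterFactory" min="2" max="20"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- StopFilterFactory removed here as well. -->
  </analyzer>
</fieldType>
```
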
> On 5 Nov 2019, at 14:15, David Hastings <[email protected]> wrote:
>
> Fwd to another server
>
> The first thing you should do is remove any reference to stop words and
> never use them, then re-index your data and try it again.
>
> On Tue, Nov 5, 2019 at 9:14 AM Guilherme Viteri <[email protected]> wrote:
>
>> Hi,
>>
>> I am performing a search to match a name (text_field), but the term
>> contains 'and' and 'a', and the search doesn't return any records. If I
>> remove 'a', then it works.
>> e.g.
>> Search Term: lymphoid and a non-lymphoid cell
>> doesn't work:
>> https://dev.reactome.org/content/query?q=lymphoid+and+a+non-lymphoid+cell&species=Homo+sapiens&species=Entries+without+species&cluster=true
>>
>> Search term: lymphoid and non-lymphoid cell
>> works:
>> https://dev.reactome.org/content/query?q=lymphoid+and+non-lymphoid+cell&species=Homo+sapiens&species=Entries+without+species&cluster=true
>> I am interested in the first result.
>>
>> schema.xml
>> <field name="name" type="text_field"
>> indexed="true" stored="true" omitNorms="false" required="true"
>> multiValued="false"/>
>>
>> <fieldType name="text_field" class="solr.TextField"
>> positionIncrementGap="100" omitNorms="false" >
>> <analyzer type="index">
>> <tokenizer class="solr.StandardTokenizerFactory"/>
>> <filter class="solr.ClassicFilterFactory"/>
>> <filter class="solr.LengthFilterFactory" min="2" max="20"/>
>> <filter class="solr.LowerCaseFilterFactory"/>
>> <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt"/>
>> </analyzer>
>> <analyzer type="query">
>> <tokenizer class="solr.PatternTokenizerFactory"
>> pattern="[^a-zA-Z0-9/._:]"/>
>> <filter class="solr.PatternReplaceFilterFactory"
>> pattern="^[/._:]+" replacement=""/>
>> <filter class="solr.PatternReplaceFilterFactory"
>> pattern="[/._:]+$" replacement=""/>
>> <filter class="solr.PatternReplaceFilterFactory"
>> pattern="[_]" replacement=" "/>
>> <filter class="solr.LengthFilterFactory" min="2" max="20"/>
>> <filter class="solr.LowerCaseFilterFactory"/>
>> <filter class="solr.StopFilterFactory" ignoreCase="true"
>> words="stopwords.txt"/>
>> </analyzer>
>> </fieldType>
>>
>> stopwords.txt
>> #Standard english stop words taken from Lucene's StopAnalyzer
>> a
>> b
>> c
>> ....
>> an
>> and
>> are
>>
>> Running Solr 6.6.2.
>>
>> Is there anything I could do to prevent this?
>>
>> Thanks
>> Guilherme