Thanks.

Haven't I done this here?

<fieldType name="text_field" class="solr.TextField" positionIncrementGap="100" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ClassicFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="2" max="20"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
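
For comparison, taking the suggestion quoted below literally (remove every reference to stop words, then re-index) would look roughly like the following. This is only a sketch under assumptions: it is the same text_field definition from the original message with the two StopFilterFactory lines dropped and nothing else changed. It has not been tested against Solr 6.6.2, and whether it resolves the failing query is not confirmed anywhere in this thread.

```xml
<!-- Sketch only: text_field from the original message, minus stop words. -->
<fieldType name="text_field" class="solr.TextField" positionIncrementGap="100" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ClassicFilterFactory"/>
    <!-- Still drops single-character tokens such as "a" (length below min="2"). -->
    <filter class="solr.LengthFilterFactory" min="2" max="20"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- StopFilterFactory removed here. -->
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^a-zA-Z0-9/._:]"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[/._:]+" replacement=""/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[/._:]+$" replacement=""/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[_]" replacement=" "/>
    <!-- Same length limits as the index analyzer, so "a" is dropped at query time too. -->
    <filter class="solr.LengthFilterFactory" min="2" max="20"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- StopFilterFactory removed here as well. -->
  </analyzer>
</fieldType>
```

Note that with this change "and" would be indexed and searched as an ordinary term, while "a" would still be discarded by the LengthFilterFactory in both analyzers. Existing documents would need to be re-indexed for the index-time change to take effect.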

> On 5 Nov 2019, at 14:15, David Hastings <hastings.recurs...@gmail.com> wrote:
>
> Fwd to another server
>
> The first thing you should do is remove any reference to stop words and
> never use them, then re-index your data and try it again.
>
> On Tue, Nov 5, 2019 at 9:14 AM Guilherme Viteri <gvit...@ebi.ac.uk> wrote:
>
>> Hi,
>>
>> I am performing a search to match a name (text_field), however this term
>> contains 'and' and 'a' and it doesn't return any records. If I remove 'a'
>> then it works.
>> e.g.
>> Search term: lymphoid and a non-lymphoid cell
>> doesn't work:
>> https://dev.reactome.org/content/query?q=lymphoid+and+a+non-lymphoid+cell&species=Homo+sapiens&species=Entries+without+species&cluster=true
>>
>> Search term: lymphoid and non-lymphoid cell
>> works:
>> https://dev.reactome.org/content/query?q=lymphoid+and+non-lymphoid+cell&species=Homo+sapiens&species=Entries+without+species&cluster=true
>> interested in the first result
>>
>> schema.xml
>> <field name="name" type="text_field"
>>        indexed="true" stored="true" omitNorms="false" required="true"
>>        multiValued="false"/>
>>
>> <analyzer type="query">
>>   <tokenizer class="solr.PatternTokenizerFactory"
>>              pattern="[^a-zA-Z0-9/._:]"/>
>>   <filter class="solr.PatternReplaceFilterFactory"
>>           pattern="^[/._:]+" replacement=""/>
>>   <filter class="solr.PatternReplaceFilterFactory"
>>           pattern="[/._:]+$" replacement=""/>
>>   <filter class="solr.PatternReplaceFilterFactory"
>>           pattern="[_]" replacement=" "/>
>>   <filter class="solr.LengthFilterFactory" min="2" max="20"/>
>>   <filter class="solr.LowerCaseFilterFactory"/>
>>   <filter class="solr.StopFilterFactory" ignoreCase="true"
>>           words="stopwords.txt"/>
>> </analyzer>
>>
>> <fieldType name="text_field" class="solr.TextField"
>>            positionIncrementGap="100" omitNorms="false">
>>   <analyzer type="index">
>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>     <filter class="solr.ClassicFilterFactory"/>
>>     <filter class="solr.LengthFilterFactory" min="2" max="20"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords.txt"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.PatternTokenizerFactory"
>>                pattern="[^a-zA-Z0-9/._:]"/>
>>     <filter class="solr.PatternReplaceFilterFactory"
>>             pattern="^[/._:]+" replacement=""/>
>>     <filter class="solr.PatternReplaceFilterFactory"
>>             pattern="[/._:]+$" replacement=""/>
>>     <filter class="solr.PatternReplaceFilterFactory"
>>             pattern="[_]" replacement=" "/>
>>     <filter class="solr.LengthFilterFactory" min="2" max="20"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords.txt"/>
>>   </analyzer>
>> </fieldType>
>>
>> stopwords.txt
>> #Standard english stop words taken from Lucene's StopAnalyzer
>> a
>> b
>> c
>> ....
>> an
>> and
>> are
>>
>> Running Solr 6.6.2.
>>
>> Is there anything I could do to prevent this?
>>
>> Thanks
>> Guilherme