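The first thing to do is remove every reference to stop words from your schema (the StopFilterFactory entries in both the index and query analyzers) and stop using them altogether. Then re-index your data and try the query again.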
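
For illustration, here is a rough sketch of how the text_field type from the schema quoted below might look with stop words taken out. It is only the quoted definition with the two StopFilterFactory lines deleted; I have not run it against your data, so treat it as a starting point rather than a tested fix.

```xml
<!-- Sketch only: the text_field type from the quoted schema.xml with the
     StopFilterFactory lines removed from both analyzers. Everything else
     is copied unchanged from the original message. -->
<fieldType name="text_field" class="solr.TextField" positionIncrementGap="100" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ClassicFilterFactory"/>
    <!-- Unchanged from the original; min="2" still drops one-character tokens such as "a". -->
    <filter class="solr.LengthFilterFactory" min="2" max="20"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="[^a-zA-Z0-9/._:]"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[/._:]+" replacement=""/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[/._:]+$" replacement=""/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="[_]" replacement=" "/>
    <filter class="solr.LengthFilterFactory" min="2" max="20"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the index-time analyzer changes as well, the existing documents need to be re-indexed before the new definition has any effect on search results.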

On Tue, Nov 5, 2019 at 9:14 AM Guilherme Viteri <gvit...@ebi.ac.uk> wrote:
> Hi,
>
> I am performing a search to match a name (text_field), however this term
> contains 'and' and 'a' and it doesn't return any records. If i remove 'a'
> then it works.
> e.g
> Search Term: lymphoid and a non-lymphoid cell
> doesn't work:
> https://dev.reactome.org/content/query?q=lymphoid+and+a+non-lymphoid+cell&species=Homo+sapiens&species=Entries+without+species&cluster=true
>
> Search term: lymphoid and non-lymphoid cell
> works:
> https://dev.reactome.org/content/query?q=lymphoid+and+non-lymphoid+cell&species=Homo+sapiens&species=Entries+without+species&cluster=true
> interested in the first result
>
> schema.xml
> <field name="name" type="text_field" indexed="true" stored="true" omitNorms="false" required="true" multiValued="false"/>
>
> <analyzer type="query">
>   <tokenizer class="solr.PatternTokenizerFactory" pattern="[^a-zA-Z0-9/._:]"/>
>   <filter class="solr.PatternReplaceFilterFactory" pattern="^[/._:]+" replacement=""/>
>   <filter class="solr.PatternReplaceFilterFactory" pattern="[/._:]+$" replacement=""/>
>   <filter class="solr.PatternReplaceFilterFactory" pattern="[_]" replacement=" "/>
>   <filter class="solr.LengthFilterFactory" min="2" max="20"/>
>   <filter class="solr.LowerCaseFilterFactory"/>
>   <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> </analyzer>
>
> <fieldType name="text_field" class="solr.TextField" positionIncrementGap="100" omitNorms="false" >
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.ClassicFilterFactory"/>
>     <filter class="solr.LengthFilterFactory" min="2" max="20"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.PatternTokenizerFactory" pattern="[^a-zA-Z0-9/._:]"/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="^[/._:]+" replacement=""/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="[/._:]+$" replacement=""/>
>     <filter class="solr.PatternReplaceFilterFactory" pattern="[_]" replacement=" "/>
>     <filter class="solr.LengthFilterFactory" min="2" max="20"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>   </analyzer>
> </fieldType>
>
> stopwords.txt
> #Standard english stop words taken from Lucene's StopAnalyzer
> a
> b
> c
> ....
> an
> and
> are
>
> Running SolR 6.6.2.
>
> Is there anything I could do to prevent this ?
>
> Thanks
> Guilherme