Schema.xml, copyField, Slash, ignoreCase ?

Bruno Mannina Fri, 11 Jan 2019 01:40:15 -0800

Hello,



I'm facing a problem with the default field text (Solr 5.4) and
queries which contain / (slash).



I need a default text field with:

- ignoreCase,

- no auto truncation,

- process slash char



I would like to query only the field text.

Queries can contain a code, keywords, or both.



I have 2 fields named symbol and title, and 1 alias ti (an old field that I
can't delete or modify).



* Symbol contains a code with a slash (e.g. A62C21/02)

<field name="symbol" type="string_ci" multiValued="false" indexed="true"
required="true" stored="true"/>



* Title contains English text and also symbols

    <field name="title" type="text_en" multiValued="true" indexed="true"
stored="true" termVectors="true" termPositions="true" termOffsets="true"/>



{ "symbol": "B65D81/20",

"title": [

 "under vacuum or superatmospheric pressure, or in a special atmosphere,
e.g. of inert gas  {(B65D81/28  takes precedence; containers with
pressurising means for maintaining ball pressure A63B39/025)} "

]}



* Ti is an alias of title

    <field name="ti" type="text_general" multiValued="true" indexed="true"
stored="true" termVectors="true" termPositions="true" termOffsets="true"/>



* Text is

<field name="text" type="text_general" indexed="true" stored="false"
multiValued="true"/>



- The aliases are:



    <copyField source="title"  dest="ti"/>

    <!-- ALIAS TEXT -->

    <copyField source="title"  dest="text"/>

    <copyField source="symbol" dest="text"/>





If I do these queries :



* ti:airbag                  -> it's OK

* title:airbag               -> not good for me because it also finds
airbags

* ti:b65D81/28               -> not good, debug shows ti:b65d81 OR ti:28

* ti:"b65D81/28"             -> it's OK

* symbol:b65D81/28           -> it's OK (even without quotes)



NOW with text field

* b65D81/28                  -> not good, debug shows text:b65d81 OR
text:28

* airbag                     -> it's OK

* "b65D81/28"                -> it's OK



It would be great if I could enter a symbol without quotes.



Could you help me define a text field which solves this problem? (Please
find all my field definitions below.)



Many thanks for your help.



string_ci is my own definition:



    <fieldType name="string_ci" class="solr.TextField"
sortMissingLast="true" omitNorms="true">

    <analyzer>

      <tokenizer class="solr.KeywordTokenizerFactory"/>

      <filter class="solr.LowerCaseFilterFactory"/>

    </analyzer>

    </fieldType>



    <fieldType name="text_general" class="solr.TextField"
positionIncrementGap="100" multiValued="true">

      <analyzer type="index">

        <tokenizer class="solr.StandardTokenizerFactory"/>

        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" />

        <filter class="solr.LowerCaseFilterFactory"/>

      </analyzer>

      <analyzer type="query">

        <tokenizer class="solr.StandardTokenizerFactory"/>

        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" />

        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>

        <filter class="solr.LowerCaseFilterFactory"/>

      </analyzer>

    </fieldType>



    <fieldType name="text_en" class="solr.TextField"
positionIncrementGap="100">

      <analyzer type="index">

        <tokenizer class="solr.StandardTokenizerFactory"/>

        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="lang/stopwords_en.txt"/>

        <filter class="solr.LowerCaseFilterFactory"/>

        <filter class="solr.EnglishPossessiveFilterFactory"/>

        <filter class="solr.KeywordMarkerFilterFactory"
protected="protwords.txt"/>

        <filter class="solr.PorterStemFilterFactory"/>

      </analyzer>

      <analyzer type="query">

        <tokenizer class="solr.StandardTokenizerFactory"/>

        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>

        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="lang/stopwords_en.txt"/>

        <filter class="solr.LowerCaseFilterFactory"/>

        <filter class="solr.EnglishPossessiveFilterFactory"/>

        <filter class="solr.KeywordMarkerFilterFactory"
protected="protwords.txt"/>

        <filter class="solr.PorterStemFilterFactory"/>

      </analyzer>

    </fieldType>
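
For illustration only, here is a minimal, untested sketch of a field type that could meet the three requirements listed above (case-insensitive, no stemming, slash preserved). The name text_code is hypothetical and not part of the schema shown here. The sketch assumes that splitting on whitespace alone is acceptable for the title text, and it drops the stop word and synonym filters of text_general for brevity.

```xml
<!-- Hypothetical field type; "text_code" is an assumed name, not an existing one -->
<fieldType name="text_code" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so B65D81/28 stays a single token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Strip punctuation stuck to either end of a token, e.g. {(B65D81/28 or A63B39/025)} -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$" replacement="" replace="all"/>
    <!-- Case-insensitive matching; no stemmer, so airbag does not match airbags -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- The default field would then use the new type; the copyField rules stay unchanged -->
<field name="text" type="text_code" indexed="true" stored="false" multiValued="true"/>
```

With a single analyzer for both index and query time, an unquoted query such as b65D81/28 should be analyzed to the one token b65d81/28, in the same way the symbol field already behaves. Two caveats: the pattern also trims trailing periods (so "e.g." becomes "e.g"), and a token made only of punctuation becomes empty. A change like this needs a full reindex, and the output is worth checking on the Analysis screen of the Solr admin UI before relying on it.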





Best Regards

Bruno





