Erick: I have tried what you said, but I need some clarification. My
question is below.
Say I have this field type:
<fieldType name="Synonymdata" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="org.apache.solr.orchsynonym.OrchSynonymFilter"
            synonyms="BODYTaxonomy.txt,PalpClinLocObsTaxo.txt,MacroscopicTaxonomy.txt,MicroscopicTaxonomy.txt,SpecimenTaxonomy.txt,ParameterTaxonomy.txt,StrainTaxonomy.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="org.apache.solr.orchsynonym.OrchSynonymFilter"
            synonyms="BODYTaxonomy.txt,PalpClinLocObsTaxo.txt,MacroscopicTaxonomy.txt,MicroscopicTaxonomy.txt,SpecimenTaxonomy.txt,ParameterTaxonomy.txt,StrainTaxonomy.txt"
            ignoreCase="true" expand="false"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>
The data indexed in this field is:
sentence 1: "tissue devitalization was noted in hepalocytes of liver"
sentence 2: "Necrosis not found in liver"
Synonyms:
necrosis, tissue devitalization, cellular necrosis
How do the whitespace tokenizer and the synonym filter behave together? I am
not able to work it out from the analysis page. Please let me know whether it
works the way I describe here, and correct me if I am wrong.
sentence 1: "tissue devitalization was noted in hepalocytes of liver"
Whitespace tokenizer output:
tissue
devitalization
was
noted
in
hepalocytes
of
liver
Synonyms for the tokens:
No synonym for tissue, no synonym for devitalization, and so on.
So will "tissue devitalization" not be treated as a synonym for Necrosis,
even though it is listed in the synonyms? If it is added as a synonym, then
how is the sentence split and the filter applied? Which happens first?
sentence 2: "Necrosis not found in liver"
Whitespace tokenizer output:
Necrosis
not
found
in
liver
Synonyms for the tokens:
Synonyms for Necrosis: tissue devitalization, cellular necrosis; no synonym
for not, no synonym for found, and so on.
Is this correct?
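To make the question concrete, here is a small Python sketch of the two behaviours I can imagine. It is not Solr code: the function names are made up for illustration, lowercasing stands in for ignoreCase="true", and I do not know which model (if either) OrchSynonymFilter actually follows.

```python
# Hypothetical sketch, NOT Solr code: two ways a synonym step could treat
# the tokens produced by a whitespace tokenizer.

SYNONYMS = ["necrosis", "tissue devitalization", "cellular necrosis"]


def whitespace_tokenize(text):
    """Split on whitespace; lowercasing stands in for ignoreCase="true"."""
    return text.lower().split()


def match_per_token(tokens, synonyms):
    """Model A: each token is looked up on its own.

    Multi-word entries such as "tissue devitalization" can never match,
    because no single token equals the whole entry.
    """
    entries = set(synonyms)
    return [t for t in tokens if t in entries]


def match_token_sequences(tokens, synonyms):
    """Model B: entries are split into tokens and matched as sequences.

    At each position the longest matching entry wins.
    """
    entry_tokens = [s.split() for s in synonyms]
    hits = []
    i = 0
    while i < len(tokens):
        best = None
        for entry in entry_tokens:
            if tokens[i:i + len(entry)] == entry:
                if best is None or len(entry) > len(best):
                    best = entry
        if best is not None:
            hits.append(" ".join(best))
            i += len(best)  # skip past the matched tokens
        else:
            i += 1
    return hits


sentence_1 = "tissue devitalization was noted in hepalocytes of liver"
sentence_2 = "Necrosis not found in liver"

for sentence in (sentence_1, sentence_2):
    tokens = whitespace_tokenize(sentence)
    print(sentence)
    print("  per token      :", match_per_token(tokens, SYNONYMS))
    print("  token sequences:", match_token_sequences(tokens, SYNONYMS))
```

Under model A, sentence 1 gets no synonym match at all, which is what I described above. Under model B, "tissue devitalization" is matched as a two-token entry. Both models find "necrosis" in sentence 2.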
My main concern is when I have three pieces of data like this:
tissue devitalization was observed in hepalocytes of liver
necrosis was observed in liver
Necrosis not found in liver
When I search for "Necrosis not found", I need to get only the last sentence.
I cannot work out which tokenizers and filters I need to apply in order to
achieve this.
Awaiting your reply,
Rajani Maski
On Tue, Jun 14, 2011 at 3:13 PM, roySolr <[email protected]> wrote:
> Maybe you can try to escape the synonyms so they are not tokenized by whitespace:
>
> Private\ schools,NGO\ Schools,Unaided\ schools
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Query-on-Synonyms-feature-in-Solr-tp3058197p3062392.html