Good morning - I will explain my current config/functionality.

I have 4 fields in my index...

1) Doc Title - a text field
2) Keyword Phrase, e.g. fund manager, a text field (with some edge n-gram 
functionality at index time)
3) Keyword Phrase, e.g. fund manager, a string field (for faceting)
4) Content Field, i.e. my full document text, a text field
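
To illustrate what I mean by edge n-gram functionality on field 2, here is a rough Python sketch. It is not my actual Solr config: the function names and the gram sizes (minimum 2, maximum 15) are assumptions for illustration only.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Return the leading-edge prefixes of a token, e.g. 'ma', 'man', ..."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(phrase, min_gram=2, max_gram=15):
    """Lowercase, split on whitespace, then emit edge n-grams per token."""
    terms = []
    for token in phrase.lower().split():
        terms.extend(edge_ngrams(token, min_gram, max_gram))
    return terms


if __name__ == "__main__":
    # A partial word such as "ma" can match because it is one of the grams.
    print(index_terms("fund manager"))
```

With grams like these in the index, a partly typed word can match the start of a keyword phrase token.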

I have a nice bit of auto-complete functionality in my UI which works as 
follows:

user searches -> fund ma

My service layer calls Solr asking for all docs containing both fund and ma. 
The search results are fine. In the same query I also ask for facets and 
counts on field (3) above, so that I can use them in my auto-complete.

This allows me to use the facets and counts to show a nice auto-complete each 
time a user hits a key.
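
As a sketch of the kind of request my service layer builds, here it is in Python. The field names keyword_phrase_text and keyword_phrase_facet and the /select path are placeholders, not my real schema, and searching only the keyword text field is an assumption. The q, rows, facet, facet.field, facet.mincount and wt parameters are standard Solr request parameters.

```python
from urllib.parse import urlencode


def build_autocomplete_query(user_input,
                             text_field="keyword_phrase_text",
                             facet_field="keyword_phrase_facet"):
    """Build a select URL that ANDs the typed words and facets on one field."""
    words = user_input.split()
    # Require every typed word (or partial word) to match the text field.
    q = " AND ".join(f"{text_field}:{w}" for w in words)
    params = [
        ("q", q),
        ("rows", "10"),
        ("facet", "true"),
        ("facet.field", facet_field),  # string field used for facet values
        ("facet.mincount", "1"),       # hide phrases with no matching docs
        ("wt", "json"),
    ]
    return "/select?" + urlencode(params)


if __name__ == "__main__":
    print(build_autocomplete_query("fund ma"))
```

The facet values and counts that come back from a request like this are what I show in the auto-complete list.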

Ok so far. I have a nice auto-complete based upon business domain Keyword 
Phrases.

Now on to synonyms. For example, fund manager and fund lead are the same 
thing in my business domain.

I was planning on simply adding the synonyms as normal entries in fields 2 
and 3 (both multi-valued fields) so that they would be inserted into the index 
and be available for my auto-complete. This would be OK and, to clarify, would 
have nothing to do with the synonyms.txt file at this point.

However, as Solr has synonym processing, I should take advantage of it. (Also, 
with the plan above, my synonym fund lead would not find its way into field 4, 
the full text of the document, where fund manager appears in the content.)

So I believe I should do something like...

fund manager, fund lead 

...in my synonym file, which I only want to process at index time (so it 
appears in my auto-complete), with expansion on. Wherever fund manager or fund 
lead is found, I want the index to contain both fund manager and fund lead.
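
To make the expansion behaviour I am after concrete, here is a small Python sketch of how I read a comma-separated synonym line. It is a simplified model and not Solr's parser; parse_synonym_line is a name I made up. With expansion on, every phrase maps to the whole group. With expansion off, every phrase maps to the first phrase only.

```python
def parse_synonym_line(line, expand=True):
    """Turn 'a, b' into a mapping from each phrase to its replacement list."""
    phrases = [p.strip().lower() for p in line.split(",") if p.strip()]
    if expand:
        # Every phrase is replaced by all phrases in the group.
        return {p: list(phrases) for p in phrases}
    # Without expansion, every phrase collapses to the first one.
    return {p: [phrases[0]] for p in phrases}


if __name__ == "__main__":
    print(parse_synonym_line("fund manager, fund lead"))
    print(parse_synonym_line("fund manager, fund lead", expand=False))
```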

As I have expansion on and have multi-word synonyms (phrases as both source 
and target), using the synonym file at index time seems best.

However, I am very confused at this point.

I can see how the synonym file would be processed correctly for field 3 (a 
string field), and both terms fund manager and fund lead should go into the 
index OK.

But I can't see how it would work for the text fields (2 and 4).

My index-time filter chain has synonym processing as per the default text field 
processing (after whitespace tokenisation), so I can't see how my terms fund 
manager and fund lead can be found by the synonym filter.
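
The only way I can picture this working after whitespace tokenisation is if the filter compares runs of consecutive tokens against the synonym phrases, rather than looking at one token at a time. A rough Python sketch of that idea follows. It is my guess at the behaviour, not Solr's actual code, and expand_synonyms is a made-up name.

```python
def expand_synonyms(tokens, groups):
    """Scan a token list and swap any run of tokens that spells a synonym
    phrase for the whole synonym group (longest match first)."""
    rules = {}
    for group in groups:
        for phrase in group:
            rules[tuple(phrase.split())] = list(group)
    max_len = max((len(key) for key in rules), default=0)

    out = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest possible run of tokens first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in rules:
                out.append(rules[key])
                i += n
                matched = True
                break
        if not matched:
            out.append([tokens[i]])
            i += 1
    return out


if __name__ == "__main__":
    tokens = "our fund lead retired".split()  # whitespace tokenisation
    print(expand_synonyms(tokens, [["fund manager", "fund lead"]]))
```

If the filter does work on token runs like this, then whitespace tokenisation alone would not stop a phrase such as fund lead from being recognised at index time.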

I've looked in the book by Eric Pugh, which says that for multi-word synonyms 
to work you must use synonyms at index time and with expansion. It says you 
can't do synonym processing at query time, as synonym phrases aren't recognised 
after whitespace parsing. But my index chain (and the default Solr config for 
text fields) also whitespace parses.

It would be great to take advantage of synonym processing by Solr instead of 
my original plan, but I am confused about how multi-word synonyms can be 
recognised at index time and added to the index. Am I missing something about 
index-time processing of synonyms here?

Many Thanks for any help/advice.

Jason.





