It may be a tokenization issue: the apostrophe causes a word break, so
"Int'l" is split into separate tokens and your custom stem is never
matched.

What does this give you: cts:tokenize(cts:stem("Int'l"))?

Do things work as you expect for a custom stem that doesn't have a  
punctuation character in it?
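
One way to check the tokenization directly is to list each token of
"Int'l" with its class. This is only a sketch to run in Query Console;
the "en" language argument is an assumption based on your dictionary
being English. If the apostrophe is a word break, you should see more
than one token rather than a single word.

```xquery
xquery version "1.0-ml";

(: List each token of the abbreviation together with its token class. :)
for $token in cts:tokenize("Int'l", "en")
return
  typeswitch ($token)
    case cts:word return fn:concat("word: ", $token)
    case cts:punctuation return fn:concat("punctuation: ", $token)
    case cts:space return fn:concat("space: ", $token)
    default return fn:concat("other: ", $token)
```

Running the same query on an entry without punctuation gives you a
baseline to compare against.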

A workaround is to define a field with a custom tokenizer override that
treats the apostrophe as a word character. The override applies only to
that field, however, and not to word queries in general.
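
A rough sketch of that override using the Admin API follows. The
database name "my-database" and field name "my-field" are hypothetical
placeholders, the field is assumed to exist already, and you should
check the Admin API documentation for your release before relying on
these calls.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
    at "/MarkLogic/admin.xqy";

(: Hypothetical names: substitute your own database and field. :)
let $config := admin:get-configuration()
let $dbid := xdmp:database("my-database")

(: Treat the apostrophe as a word character within this field only. :)
let $override := admin:database-tokenizer-override("'", "word")

let $config := admin:database-add-field-tokenizer-override(
    $config, $dbid, "my-field", $override)
return admin:save-configuration($config)
```

After the change you would need to reindex, and only field queries such
as cts:field-word-query("my-field", "Int'l") would see the new
tokenization.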

Regardless, you should probably report this as a bug to MarkLogic Support.

//Mary

On Wed, 22 Jul 2015 08:02:33 -0700, Rhodes, David (LNG-CON)  
<[email protected]> wrote:

> I am trying to use a custom dictionary to extend the set of stemmed  
> words.
>
> I am using MarkLogic 7.0, and have been following the documentation  
> guides in Chapters 17 and 18:
> http://docs.marklogic.com/7.0/guide/search-dev/stemming
> http://docs.marklogic.com/7.0/guide/search-dev/custom-dictionaries
>
> I noted that there are two ways to see if words are resolving to their  
> stems:
>
> cts:stem(word) returns the stems of word
>
> and
>
> cts:contains(word, stem) returns true if these two terms resolve to the  
> same stem
>
> I confirmed that both of these work for terms that are in the default  
> dictionary (e.g., run and running, bite and bitten)
>
> I have added a custom dictionary that adds "Int'l" as a word with  
> "International" as its stem.
>
> cdict:dictionary-write("en",$dict)
>
> With that dictionary added as the custom dictionary for English,  
> cts:stem works but cts:contains does not.
> cts:stem("Int'l") returns International
> cts:contains("Int'l", "International") returns false
>
> I reindexed my database, since I understand that my dictionary entry  
> means that all documents containing "Int'l" should now be indexed under  
> "International".
>
> cts:contains("Int'l", "International") still returns false
> Furthermore, in the real search work flow that I am doing, searches for  
> "Int'l" do not return documents containing "International" (But searches  
> for "bitten" do return documents containing "bite").
>
> My database indexes are set to Stemmed Searches = Basic, and Word  
> Searches = False.
>
> I think that stemming can be a powerful feature for my work flow, if I  
> can just get it to work. Thank you for any advice you can offer.
>
> David

