Thanks Erick, I will explain the scenario in detail so you might suggest a solution. I want to annotate a medical document based only on a medical dictionary; I don't need to annotate the non-medical words in the document at all. The medical dictionary contains terms made up of multiple words, and the words of a term taken together have a specific medical meaning. For example, in "back pain", "back" and "pain" are two separate words, but together they mean something different. These terms might appear with their words in a different order in a sentence while keeping the same meaning, e.g. "breast cancer" and "cancer in breast" should be considered the same. We also have terms that are more than 6 words long.
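To make the matching behaviour concrete, here is a rough Python sketch of what I am after. It is not a Solr solution, just an illustration under some assumptions: the stopword list, the `max_window` limit, and all function names are made up for this example, and treating a term as an unordered set of words is only one possible way to handle the reordering.

```python
import re

# Hypothetical stopword list; a real one would need tuning for medical text.
STOPWORDS = {"in", "of", "the", "a", "an", "on", "at"}


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_key(words):
    """Order-independent key: the set of non-stopword tokens."""
    return frozenset(w for w in words if w not in STOPWORDS)


def build_index(terms):
    """Map each order-independent key to the first dictionary term that produced it."""
    index = {}
    for term in terms:
        key = term_key(tokenize(term))
        if key:
            index.setdefault(key, term)
    return index


def annotate(text, index, max_window=8):
    """Return (start, end, term) for each token window whose key is in the index.

    start and end are token offsets, with end exclusive. Windows that begin
    or end on a stopword are skipped so the same match is not reported twice.
    """
    tokens = tokenize(text)
    matches = []
    for start in range(len(tokens)):
        if tokens[start] in STOPWORDS:
            continue
        last = min(start + max_window, len(tokens))
        for end in range(start + 1, last + 1):
            if tokens[end - 1] in STOPWORDS:
                continue
            key = term_key(tokens[start:end])
            if key in index:
                matches.append((start, end, index[key]))
    return matches


index = build_index(["breast cancer", "back pain"])
print(annotate("Patient reports back pain and cancer in breast.", index))
```

With this approach each window costs one hash lookup, so the number of lookups depends on the document length and `max_window`, not on the number of dictionary records. The obvious downside is that ignoring word order completely may match word combinations that are not really the term.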
So the question is: I have a document of around 700 words, and I need to annotate it against a medical terminology of about 3 million records. Any idea how to do this?

--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-index-document-with-multiple-words-phrases-and-words-permutation-tp4224919p4224970.html
Sent from the Solr - User mailing list archive at Nabble.com.