On 30-May-08, at 9:51 PM, Dallan Quass wrote:

One more clarification -- I don't need to do this for every token in the
text; just for "place" fields in the document. Each document has 1-3 place
fields that need to be converted to standard form when the document is
indexed.

There is a special set of (~1M) "Place" documents that contain information
about alternative/abbreviated place names, how places are nested inside each
other, etc. Either before or during tokenization of the regular documents, I
want to query these "Place" documents to determine how to standardize the
place fields in the regular documents.

Perhaps you could separate the problem by putting this info in a separate index or Solr core.
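
As a rough sketch of that separation (not a Solr implementation), the place data can be treated as its own lookup that is consulted for just the place fields before each regular document is indexed. Everything here is hypothetical: `PLACE_INDEX` is an in-memory stand-in for the separate index or core, and the field names and sample entries are made up for illustration.

```python
# Hypothetical stand-in for the separate "Place" index/core: maps a
# normalized alternative or abbreviated name to its standard form.
PLACE_INDEX = {
    "nyc": "New York City, New York, United States",
    "new york city": "New York City, New York, United States",
    "sf": "San Francisco, California, United States",
}


def normalize(name):
    """Lowercase and collapse whitespace so lookups tolerate formatting noise."""
    return " ".join(name.lower().split())


def standardize_place(raw, index=PLACE_INDEX):
    """Return the standard form of a place name, or the input unchanged if unknown."""
    return index.get(normalize(raw), raw)


def standardize_document(doc, place_fields, index=PLACE_INDEX):
    """Return a copy of doc with only the given place fields standardized."""
    out = dict(doc)
    for field in place_fields:
        if field in out:
            out[field] = standardize_place(out[field], index)
    return out
```

In a real setup the dictionary lookup would presumably be replaced by a query against the separate index, and only the 1-3 place fields per document would trigger it, so the other fields are tokenized as usual.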

-Mike
