On 30-May-08, at 9:51 PM, Dallan Quass wrote:
One more clarification -- I don't need to do this for every token in the
text; just for "place" fields in the document. Each document has 1-3
place fields that need to be converted to standard form when the
document is indexed. There is a special set of (~1M) "Place" documents
that contain information about alternative/abbreviated place names, how
places are nested inside each other, etc. Either before or during
tokenization of the regular documents I want to query these "Place"
documents to determine how to standardize the place fields in the
regular documents.
Perhaps you could separate the problem, putting this info in a separate
index or Solr core.
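
A rough sketch of that separation, in Python. This is only an illustration
under assumptions: the in-memory dict stands in for the separate place index
or core, and the field names, sample entries, and function names are all
hypothetical. A real setup would replace the dict lookup with a query
against the place index before each regular document is sent for indexing.

```python
# Hypothetical stand-in for the separate "Place" index or Solr core:
# maps a normalized alternative/abbreviated name to its standard form.
# A real setup would query the place index instead of this dict.
PLACE_INDEX = {
    "nyc": "New York, New York, United States",
    "new york city": "New York, New York, United States",
    "sf": "San Francisco, California, United States",
}

# Hypothetical names for the 1-3 place fields in a regular document.
PLACE_FIELDS = ("birth_place", "death_place")


def standardize_place(raw, index=PLACE_INDEX):
    """Return the standard form of a place name, or the input unchanged if unknown."""
    # Normalize case and whitespace so "  NYC " and "nyc" hit the same entry.
    key = " ".join(raw.lower().split())
    return index.get(key, raw)


def standardize_document(doc, place_fields=PLACE_FIELDS, index=PLACE_INDEX):
    """Return a copy of doc with only its place fields standardized."""
    out = dict(doc)
    for field in place_fields:
        if field in out:
            out[field] = standardize_place(out[field], index)
    return out
```

The lookup runs as a separate step before indexing, so the tokenizer for the
regular documents never has to touch the place data, and only the place
fields are rewritten.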
-Mike