In 4.x and trunk there is a close() method on Tokenizers and Filters. In
the currently released versions up to 4.3, there is instead a
reset(stream) method, which is how a Tokenizer or Filter is reset for
the next document in the same upload.
In both cases I had to track the first time the tokens are consumed, and
do all of the setup then. If you do this, then reset(stream) can release
the native resources and let you re-load them on the next consume.
Look at the LUCENE-2899 patch, in OpenNLPTokenizer.java and
OpenNLPFilter.java, to see what I had to do.
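In case it helps, here is a rough sketch of the lazy-init pattern I'm
describing, in plain Java with no Lucene dependency. The class and
method names (LazyNativeTokenizer, NativeTagger, hasNativeResources)
are invented for illustration; this is not the actual LUCENE-2899 code,
just the shape of it: allocate on first consume, release in
reset(stream).

```java
import java.io.Reader;
import java.io.StringReader;

public class LazyNativeTokenizer {
    // Stand-in for a natively backed resource, e.g. a loaded model handle.
    // In the real code this would wrap whatever JNI/native state you hold.
    static class NativeTagger {
        void close() { /* free native memory here */ }
    }

    private Reader input;
    private NativeTagger tagger; // null until the first token is consumed
    private boolean started;

    public LazyNativeTokenizer(Reader input) {
        this.input = input;
    }

    // Analogous to TokenStream.incrementToken(): do ALL expensive setup
    // lazily, the first time tokens are actually consumed.
    public boolean incrementToken() {
        if (!started) {
            tagger = new NativeTagger();
            started = true;
        }
        // ... real tokenization of this.input would go here ...
        return false; // this sketch produces no tokens
    }

    // Pre-4.x style reset(stream): release the native resources so they
    // get re-loaded lazily when the next document is consumed.
    public void reset(Reader newInput) {
        if (tagger != null) {
            tagger.close();
            tagger = null;
        }
        started = false;
        this.input = newInput;
    }

    // Test hook so the lifecycle is observable from outside.
    public boolean hasNativeResources() {
        return tagger != null;
    }

    public static void main(String[] args) {
        LazyNativeTokenizer t =
            new LazyNativeTokenizer(new StringReader("doc one"));
        System.out.println(t.hasNativeResources()); // nothing loaded yet
        t.incrementToken();
        System.out.println(t.hasNativeResources()); // loaded on first consume
        t.reset(new StringReader("doc two"));
        System.out.println(t.hasNativeResources()); // released by reset
    }
}
```

The point is that reset(stream) is the only per-document hook you get
before 4.x, so it has to double as your cleanup point; anything it
misses (e.g. the last document ever analyzed) is what the finalizer
below is for.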
But yes, to be absolutely sure, you need to add a finalizer.
On 06/12/2013 04:34 AM, Benson Margulies wrote:
Could I have some help on the combination of these two? Right now, it
appears that I'm stuck with a finalizer to chase after native
resources in a Tokenizer. Am I missing something?