[ 
https://issues.apache.org/jira/browse/LUCENE-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17426035#comment-17426035
 ] 

Dawid Weiss commented on LUCENE-5572:
-------------------------------------

bq. A GUI application running outside our context is interrupting the thread in 
order to cancel a long-running operation. This is still, to my knowledge at 
least, the only remaining way to do this in Java. 

My experience is that anything involving Thread.interrupt() will cause you 
headaches either due to bugs in other code, like resources not released 
properly (for example thread pools, open file handles), or due to infrequent 
corner cases like this one. Whether you call it a design issue or an 
unfortunate series of events is secondary to the fact that I don't think there 
is a reliable way to ensure everything works correctly once threads get 
interrupted. Please read on.

bq. There's a general expectation in Java that code will behave correctly when 
an interrupt occurs.

Maybe there is such an expectation. My experience says it's not the case. If 
you interrupt threads at unexpected places, things will go wild. We also use 
interrupts - to try to harness deadlocked tests, after a timeout passes. 
Typically this leads to a situation where the main test thread returns but 
there are tons of forked threads that just happily hang there - thread pools 
are the typical offenders. If you do this repeatedly, you will run out of 
resources eventually.

bq. Our library is shielding the GUI application from needing to know that 
they're using Lucene. What is Lucene doing to shield its users from this quirky 
interrupt behaviour of Java?

The code in this Lucene class initializes its (required) resources in a static 
initializer and uses its own class loader to do so. I do think it is a 
reasonable assumption that classpath resources are always available for classes 
- an I/O exception there to me is unrecoverable (for whatever reason). If we 
wanted to "fix" this then an alternative to a static initializer is lazy 
initialization but this entails implementing some form of singleton creation - 
either racy static variable initialization or a lock somewhere. Neither is 
pretty and neither is really required in 99.99% of cases (your use case 
accounts for the rest). 
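
For illustration only, the lock-based lazy variant could look roughly like the 
sketch below. LazyResource and Loader are hypothetical names, not Lucene 
classes, and the loader stands in for reading a classpath resource. The point 
of the sketch is that a failed load leaves the instance unset, so a later call 
can try again, which a failed static initializer does not allow.

```java
import java.io.IOException;

/** Lazily created singleton guarded by a lock; a failed load can be retried later. */
final class LazyResource {
    /** Hypothetical loader abstraction standing in for reading a classpath resource. */
    interface Loader {
        byte[] load() throws IOException;
    }

    private static final Object LOCK = new Object();
    // volatile so the unlocked fast-path read sees a fully constructed instance.
    private static volatile LazyResource instance;

    private final byte[] data;

    private LazyResource(byte[] data) {
        this.data = data;
    }

    static LazyResource getInstance(Loader loader) throws IOException {
        LazyResource local = instance;
        if (local == null) {
            synchronized (LOCK) {
                local = instance;
                if (local == null) {
                    // If load() throws, instance stays null and the next caller retries.
                    local = new LazyResource(loader.load());
                    instance = local;
                }
            }
        }
        return local;
    }

    int size() {
        return data.length;
    }
}
```

The cost is the one named above: every caller now goes through getInstance 
and has to deal with a checked IOException, where the static initializer 
version needed neither.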

What I'm saying is that I still don't think it's a bug - I understand your use 
case and frustration but I don't think it requires fixing on the Lucene side. 
If you know your library is used in the circumstances you describe above, shield your 
users by preloading those classes that have I/O in static initializers - this 
is a very easy thing to do via Class.forName and will ensure everything is in 
place before those GUI threads even have a chance to interrupt anything.
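
A minimal sketch of such preloading, assuming a small helper of your own 
(ClassPreloader is a hypothetical name, not part of Lucene). It passes 
initialize=true to Class.forName so that the static initializers run 
immediately, and it reports the names that failed instead of throwing.

```java
import java.util.ArrayList;
import java.util.List;

/** Eagerly loads and initializes classes so their static initializers run up front. */
final class ClassPreloader {
    private ClassPreloader() {}

    /**
     * Loads and initializes each named class with the given class loader.
     * Returns the names that could not be loaded or initialized.
     */
    static List<String> preload(ClassLoader loader, String... classNames) {
        List<String> failed = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=true forces the static initializer to run now.
                Class.forName(name, true, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                // LinkageError covers ExceptionInInitializerError and NoClassDefFoundError.
                failed.add(name);
            }
        }
        return failed;
    }
}
```

You would call this once, from a thread that nobody interrupts, before handing 
control to the GUI. The names to pass are an assumption here: presumably the 
fully qualified names of TokenInfoDictionary, UnknownDictionary and 
ConnectionCosts as they appear in your Lucene version. Keep in mind that a 
class whose static initializer has failed once stays unusable in that class 
loader, so the preloading has to succeed the first time.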

Finally, I don't mean to preach since I know you're a seasoned engineer... but 
in reality thread interrupts won't really do much if your blocked code is 
purely computational - not touching the I/O or monitors. Whenever I had to 
implement interrupting "long running operations" I resorted to delegating jobs 
to a background thread and returning the calling thread to the application 
immediately when the user canceled the operation, leaving the background job to 
run its course (and hopefully release the resources!). If this wasn't feasible 
or was too costly, we broke up the job and checked some form of cancellation flag 
manually - this can be done even for third-party libraries with tools like 
bytecode injectors (aspectj or the like) but then you know where you insert 
cancellation checks and have some form of control over what's happening.
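
The manual cancellation flag can be sketched as follows. CancellableJob is a 
hypothetical class written for this illustration, and the Runnable stands in 
for one chunk of the real work. The flag is only checked between chunks, so 
the job always stops at a point you chose.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** A job split into small steps that checks a cancellation flag between steps. */
final class CancellableJob {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    /** Requests cancellation; the job stops before starting its next step. */
    void cancel() {
        cancelled.set(true);
    }

    /**
     * Runs up to the given number of steps, checking the flag before each one.
     * Returns the number of steps that were actually executed.
     */
    int run(int steps, Runnable step) {
        int done = 0;
        for (int i = 0; i < steps; i++) {
            if (cancelled.get()) {
                break; // a known, safe point to stop
            }
            step.run(); // stand-in for one chunk of real work
            done++;
        }
        return done;
    }
}
```

In practice run would be executed on a background thread and cancel called 
from the GUI thread. No Thread.interrupt() is involved, so blocking I/O inside 
a step is never cut short halfway.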


> JapaneseTokenizer is sensitive to interrupts
> --------------------------------------------
>
>                 Key: LUCENE-5572
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5572
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 3.6.2
>            Reporter: Anthony Rasmussen
>            Priority: Minor
>
> The constructor for JapaneseTokenizer gets the following singleton instances: 
> TokenInfoDictionary, UnknownDictionary, and ConnectionCosts. I am finding 
> that the associated getInstance methods are particularly sensitive to 
> IOExceptions.
> Perhaps, in the static initializers of these 3 singletons, there could be 
> some sort of retry effort before throwing a RuntimeException?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
