stevedlawrence commented on code in PR #1514:
URL: https://github.com/apache/daffodil/pull/1514#discussion_r2229265305
##########
daffodil-schematron/src/main/scala/org/apache/daffodil/validation/schematron/Schematron.scala:
##########
@@ -48,11 +48,15 @@ object Schematron {
}
final class Schematron private (reader: XMLReader, templates: Templates) {
Review Comment:
I think this XMLReader actually isn't used in a thread-safe way?
`SchematronValidatorFactory.makeValidator` calls `Schematron.fromRules(...)`
to create a single `Schematron` instance that is used whenever `validateXML` is
called. But `fromRules` calls `xmlReader.get()`, so that single `Schematron`
instance always uses the same `xmlReader` that it got when it created it. So
even though xmlReader is a ThreadLocal, we always use the same instance
regardless of the thread, which means multiple threads could use the same
xmlreader and cause issues.
I think part of what is making this difficult to make sense of is the
Schematron class, where some of its parameters aren't thread safe. I wonder if
things would be more clear if we just removed it, and moved its logic and
thread locals into the SchematronValidator? Since SchematronValidator must be
thread-safe it's essentially a requirement that its parameters must all be
thread safe, and any local members must either be thread safe or thread locals.
So if we removed Schematron, we could do something like this:
```scala
// Template is thread safe so can be class member
class SchematronValidator(template: Template, svrlPath: Option[URI]) {
// XMLReader and Transformer are not thread safe so are ThreadLocal's.
Alternatively they could be
// created every time validateXML, but ThreadLocal allows reuse of objects
that are potentially expensive to create
val readerTransformerTL = new ThreadLocal[(XMLReader, Transformer)] = {
override def initialValue(): (XMLReader, Transformer) = {
val reader = DaffodilSaxParserFactory()
...
val transformer = template.newTransformer
(reader, transformer)
}
}
override def validateXML(...) = {
val (reader, transformer) = readerTransformerTL.get
val writer = ...
transformer.transform(new SAXSource(reader, ...), writer)
...
}
```
Note that this puts XMLReader and Transformer in the same ThreadLocal as a
tuple, so we only need to do one ThreadLocal lookup per validateXML.
Also note there's sort of an issue that multiple threads could write to the
same svrlPath if it was provided, but maybe we should just document in
package-info.java that schematron.svrl.file is not thread safe and will be
overwritten
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]