jira-importer opened a new issue, #585:
URL: https://github.com/apache/maven-invoker-plugin/issues/585

   **[Mikkel 
Kjeldsen](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=JIRAUSER303566)**
 opened 
**[MINVOKER-351](https://issues.apache.org/jira/browse/MINVOKER-351?redirect=false)**
 and commented
   
   Neither the Maven Invoker plugin's implementation of `<writeJunitReport>` 
nor the underlying XML infrastructure directly protect against the presence of 
character literals prohibited by the XML specification, meaning such literals 
can appear in the JUnit report and render it unreadable. **I would appreciate 
if the Maven Invoker plugin could learn to strip prohibited literals to protect 
its users from creative plugins.** I argue that this is a safe and expected 
transformation that is not materially lossy.
   
   ---
   
   ## Background
   
   MINVOKER-196 added the `<writeJunitReport>` option [back in 
maven-invoker-plugin-3.2.1](https://github.com/apache/maven-invoker-plugin/blob/maven-invoker-plugin-3.2.1/src/main/java/org/apache/maven/plugins/invoker/AbstractInvokerMojo.java#L1878-L1946).
 As of [maven-invoker-plugin-3.6.0 the effective implementation of the JUnit 
report remains effectively 
unchanged](https://github.com/apache/maven-invoker-plugin/blob/maven-invoker-plugin-3.6.0/src/main/java/org/apache/maven/plugins/invoker/AbstractInvokerMojo.java#L1695-L1754).
   
   The JUnit report includes a `<system-out>` element ([example 
documentation](https://github.com/testmoapp/junitxml)) whose value Maven 
Invoker populates with the raw build log contents. I've observed that this 
value is XML-escaped, which I imagine is well understood in the implementation, 
although I can't immediately find documentation to support that.
   
   However, escaping notwithstanding, a number of character literals are 
outright prohibited by the XML specifications. These literals cannot be 
escaped, and their presence renders an XML document not well formed. The exact 
set of prohibited characters varies by XML version; the report produced by the 
Maven Invoker plugin is XML version 1.0. When the Maven Invoker plugin reads in 
the build log it does not strip these character literals and neither does the 
XML writer the Maven Invoker plugin relies on. Consequently, if a build log 
ends up including a prohibited character the resulting JUnit report will not be 
well formed.
   
   The set of prohibited characters is the complement of [the XML 
specification's definition of `Char`](https://www.w3.org/TR/xml/#NT-Char).
   
   ## Example
   
   Among the literals prohibited by XML version 1.0 is `^H` (backspace). When 
[pitest runs via Maven](https://pitest.org/quickstart/maven/) it prints a 
spinner to standard out, and the implementation uses backspace to render the 
spinner in place. I have used the Maven Invoker plugin with 
`<writeJunitReport>` to verify a pitest configuration, whereby I discovered 
this limitation.
   
   ## Remediation
   
   ### Blame plugins
   
   Perhaps pitest should not behave this way but we can't change pitest, and 
even if pitest could be changed that offers no protection against any other 
plugin, so blaming plugins is an ineffective course of action.
   
   ### Work-arounds
   
   The user can manually clean the build log in-place via 
`<postBuildHookScript>`. This is technically fairly easy to do, and makes the 
transformation very explicit, but it requires considerable local work to 
address an issue many would find obscure and the transformation is permanently 
lossy unless the user also backs up the raw log to another file name.
   
   ### Strip prohibited literals inside Maven Invoker plugin
   
   If the Maven Invoker plugin learns to strip offending character literals 
in-between reading the build log and writing to the `<system-out>` value then 
`<writeJunitReport>` will Just Work™, which I assert is what a user will 
typically expect. Although the `<system-out>` value will no longer exactly 
match the build log contents, this lossy translation is acceptable: the 
prohibited characters are overwhelmingly unprintable to begin with and 
therefore cannot be meaningfully rendered in a static context, and the raw 
build log remains unchanged in the event that the user needs to investigate or 
assert against the raw output.
   
   This change would be backwards compatible, because any existing user that 
would be affected by it would already have unparseable JUnit reports.
   
   * I _believe_ that Java's `j.u.r.Pattern` can trivially express the 
complement of allowed characters but there may exist more efficient solutions.
   * Consider also applying this transformation to the 2 uses of 
`buildJob.getFailureMessage()`.
   
   #### Replace prohibited literals inside Maven Invoker plugin
   
   As a variation of stripping prohibited character literals, the Maven Invoker 
plugin could substitute sentinel values for prohibited character literals. This 
approach has the downside that it requires additional decision making for 
determining suitable substitution(s) but is otherwise comparable.
   
   
   ---
   
   **Attachments:**
   - 
[minvoker-351.tar.gz](https://issues.apache.org/jira/secure/attachment/13068809/minvoker-351.tar.gz)
 (_2.16 kB_)
   
   **Issue Links:**
   - [MINVOKER-196](https://issues.apache.org/jira/browse/MINVOKER-196) Support 
for JUnit report style
    (_**"is caused by"**_)
   - [MINVOKER-348](https://issues.apache.org/jira/browse/MINVOKER-348) Build 
job report is truncated
   
   
   **Remote Links:**
   - [GitHub Pull Request #242
   ](https://github.com/apache/maven-invoker-plugin/pull/242)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@maven.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to