[ 
https://issues.apache.org/jira/browse/MINVOKER-351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17847593#comment-17847593
 ] 

ASF GitHub Bot commented on MINVOKER-351:
-----------------------------------------

michael-o commented on PR #242:
URL: 
https://github.com/apache/maven-invoker-plugin/pull/242#issuecomment-2118994193

   I have now tested the IT with the patched Plexus XML. Output looks fine now:
   ```
   [INFO] Running example.minvoker351.ExampleTest
   � - name: NULL
    - name: START OF HEADING
    - name: START OF TEXT
    - name: END OF TEXT
    - name: END OF TRANSMISSION
    - name: ENQUIRY
    - name: ACKNOWLEDGE
    - name: BELL
    - name: BACKSPACE
         - name: CHARACTER TABULATION
   
    - name: LINE FEED (LF)
    - name: LINE TABULATION
    - name: FORM FEED (FF)
   
    - name: CARRIAGE RETURN (CR)
    - name: SHIFT OUT
    - name: SHIFT IN
    - name: DATA LINK ESCAPE
    - name: DEVICE CONTROL ONE
    - name: DEVICE CONTROL TWO
    - name: DEVICE CONTROL THREE
    - name: DEVICE CONTROL FOUR
    - name: NEGATIVE ACKNOWLEDGE
    - name: SYNCHRONOUS IDLE
    - name: END OF TRANSMISSION BLOCK
    - name: CANCEL
    - name: END OF MEDIUM
    - name: SUBSTITUTE
    - name: ESCAPE
    - name: INFORMATION SEPARATOR FOUR
    - name: INFORMATION SEPARATOR THREE
    - name: INFORMATION SEPARATOR TWO
    - name: INFORMATION SEPARATOR ONE
     - name: SPACE
   ! - name: EXCLAMATION MARK
   " - name: QUOTATION MARK
   # - name: NUMBER SIGN
   $ - name: DOLLAR SIGN
   % - name: PERCENT SIGN
   & - name: AMPERSAND
   ' - name: APOSTROPHE
   ( - name: LEFT PARENTHESIS
   ) - name: RIGHT PARENTHESIS
   * - name: ASTERISK
   + - name: PLUS SIGN
   , - name: COMMA
   - - name: HYPHEN-MINUS
   . - name: FULL STOP
   / - name: SOLIDUS
   0 - name: DIGIT ZERO
   1 - name: DIGIT ONE
   2 - name: DIGIT TWO
   3 - name: DIGIT THREE
   4 - name: DIGIT FOUR
   5 - name: DIGIT FIVE
   6 - name: DIGIT SIX
   7 - name: DIGIT SEVEN
   8 - name: DIGIT EIGHT
   9 - name: DIGIT NINE
   : - name: COLON
   ; - name: SEMICOLON
   < - name: LESS-THAN SIGN
   = - name: EQUALS SIGN
   > - name: GREATER-THAN SIGN
   ? - name: QUESTION MARK
   @ - name: COMMERCIAL AT
   A - name: LATIN CAPITAL LETTER A
   B - name: LATIN CAPITAL LETTER B
   C - name: LATIN CAPITAL LETTER C
   D - name: LATIN CAPITAL LETTER D
   E - name: LATIN CAPITAL LETTER E
   F - name: LATIN CAPITAL LETTER F
   G - name: LATIN CAPITAL LETTER G
   H - name: LATIN CAPITAL LETTER H
   I - name: LATIN CAPITAL LETTER I
   J - name: LATIN CAPITAL LETTER J
   K - name: LATIN CAPITAL LETTER K
   L - name: LATIN CAPITAL LETTER L
   M - name: LATIN CAPITAL LETTER M
   N - name: LATIN CAPITAL LETTER N
   O - name: LATIN CAPITAL LETTER O
   P - name: LATIN CAPITAL LETTER P
   Q - name: LATIN CAPITAL LETTER Q
   R - name: LATIN CAPITAL LETTER R
   S - name: LATIN CAPITAL LETTER S
   T - name: LATIN CAPITAL LETTER T
   U - name: LATIN CAPITAL LETTER U
   V - name: LATIN CAPITAL LETTER V
   W - name: LATIN CAPITAL LETTER W
   X - name: LATIN CAPITAL LETTER X
   Y - name: LATIN CAPITAL LETTER Y
   Z - name: LATIN CAPITAL LETTER Z
   [ - name: LEFT SQUARE BRACKET
   \ - name: REVERSE SOLIDUS
   ] - name: RIGHT SQUARE BRACKET
   ^ - name: CIRCUMFLEX ACCENT
   _ - name: LOW LINE
   ` - name: GRAVE ACCENT
   a - name: LATIN SMALL LETTER A
   b - name: LATIN SMALL LETTER B
   c - name: LATIN SMALL LETTER C
   d - name: LATIN SMALL LETTER D
   e - name: LATIN SMALL LETTER E
   f - name: LATIN SMALL LETTER F
   g - name: LATIN SMALL LETTER G
   h - name: LATIN SMALL LETTER H
   i - name: LATIN SMALL LETTER I
   j - name: LATIN SMALL LETTER J
   k - name: LATIN SMALL LETTER K
   l - name: LATIN SMALL LETTER L
   m - name: LATIN SMALL LETTER M
   n - name: LATIN SMALL LETTER N
   o - name: LATIN SMALL LETTER O
   p - name: LATIN SMALL LETTER P
   q - name: LATIN SMALL LETTER Q
   r - name: LATIN SMALL LETTER R
   s - name: LATIN SMALL LETTER S
   t - name: LATIN SMALL LETTER T
   u - name: LATIN SMALL LETTER U
   v - name: LATIN SMALL LETTER V
   w - name: LATIN SMALL LETTER W
   x - name: LATIN SMALL LETTER X
   y - name: LATIN SMALL LETTER Y
   z - name: LATIN SMALL LETTER Z
   { - name: LEFT CURLY BRACKET
   | - name: VERTICAL LINE
   } - name: RIGHT CURLY BRACKET
   ~ - name: TILDE
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.080 s
   ```

> Prevent XML-prohibited characters from entering JUnit report
> ------------------------------------------------------------
>
>                 Key: MINVOKER-351
>                 URL: https://issues.apache.org/jira/browse/MINVOKER-351
>             Project: Maven Invoker Plugin
>          Issue Type: Bug
>            Reporter: Mikkel Kjeldsen
>            Assignee: Slawomir Jaranowski
>            Priority: Major
>             Fix For: 3.7.0
>
>         Attachments: minvoker-351.tar.gz
>
>
> Neither the Maven Invoker plugin's implementation of {{<writeJunitReport>}} 
> nor the underlying XML infrastructure directly protect against the presence 
> of character literals prohibited by the XML specification, meaning such 
> literals can appear in the JUnit report and render it unreadable. *I would 
> appreciate it if the Maven Invoker plugin could learn to strip prohibited 
> literals to protect its users from creative plugins.* I argue that this is a 
> safe and expected transformation that is not materially lossy.
> ----
> h2. Background
> MINVOKER-196 added the {{<writeJunitReport>}} option [back in 
> maven-invoker-plugin-3.2.1|https://github.com/apache/maven-invoker-plugin/blob/maven-invoker-plugin-3.2.1/src/main/java/org/apache/maven/plugins/invoker/AbstractInvokerMojo.java#L1878-L1946].
>  As of [maven-invoker-plugin-3.6.0 the implementation of the JUnit 
> report remains effectively 
> unchanged|https://github.com/apache/maven-invoker-plugin/blob/maven-invoker-plugin-3.6.0/src/main/java/org/apache/maven/plugins/invoker/AbstractInvokerMojo.java#L1695-L1754].
> The JUnit report includes a {{<system-out>}} element ([example 
> documentation|https://github.com/testmoapp/junitxml]) whose value Maven 
> Invoker populates with the raw build log contents. I've observed that this 
> value is XML-escaped, which I imagine is well understood in the 
> implementation, although I can't immediately find documentation to support 
> that.
> However, escaping notwithstanding, a number of character literals are 
> outright prohibited by the XML specifications. These literals cannot be 
> escaped, and their presence renders an XML document not well formed. The 
> exact set of prohibited characters varies by XML version; the report produced 
> by the Maven Invoker plugin is XML version 1.0. When the Maven Invoker plugin 
> reads in the build log it does not strip these character literals and neither 
> does the XML writer the Maven Invoker plugin relies on. Consequently, if a 
> build log ends up including a prohibited character the resulting JUnit report 
> will not be well formed.
> The set of prohibited characters is the complement of [the XML 
> specification's definition of {{Char}}|https://www.w3.org/TR/xml/#NT-Char].
> h2. Example
> Among the literals prohibited by XML version 1.0 is {{^H}} (backspace). When 
> [pitest runs via Maven|https://pitest.org/quickstart/maven/] it prints a 
> spinner to standard out, and the implementation uses backspace to render the 
> spinner in place. I have used the Maven Invoker plugin with 
> {{<writeJunitReport>}} to verify a pitest configuration, whereby I discovered 
> this limitation.
> h2. Remediation
> h3. Blame plugins
> Perhaps pitest should not behave this way but we can't change pitest, and 
> even if pitest could be changed that offers no protection against any other 
> plugin, so blaming plugins is an ineffective course of action.
> h3. Work-arounds
> The user can manually clean the build log in-place via 
> {{<postBuildHookScript>}}. This is technically fairly easy to do, and makes 
> the transformation very explicit, but it requires considerable local work to 
> address an issue many would find obscure and the transformation is 
> permanently lossy unless the user also backs up the raw log to another file 
> name.
> h3. Strip prohibited literals inside Maven Invoker plugin
> If the Maven Invoker plugin learns to strip offending character literals 
> in-between reading the build log and writing to the {{<system-out>}} value 
> then {{<writeJunitReport>}} will Just Work™, which I assert is what a user 
> will typically expect. Although the {{<system-out>}} value will no longer 
> exactly match the build log contents, this lossy translation is acceptable: 
> the prohibited characters are overwhelmingly unprintable to begin with and 
> therefore cannot be meaningfully rendered in a static context, and the raw 
> build log remains unchanged in the event that the user needs to investigate 
> or assert against the raw output.
> This change would be backwards compatible, because any existing user that 
> would be affected by it would already have unparseable JUnit reports.
> * I _believe_ that Java's {{j.u.r.Pattern}} can trivially express the 
> complement of allowed characters but there may exist more efficient solutions.
> * Consider also applying this transformation to the 2 uses of 
> {{buildJob.getFailureMessage()}}.
> h4. Replace prohibited literals inside Maven Invoker plugin
> As a variation of stripping prohibited character literals, the Maven Invoker 
> plugin could substitute sentinel values for prohibited character literals. 
> This approach has the downside that it requires additional decision making 
> for determining suitable substitution(s) but is otherwise comparable.
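
For illustration, the stripping approach described in the issue could look roughly like the sketch below. It is an assumption-laden example, not the change made in PR #242 or in Plexus XML: the class name `XmlCharStripper` and the method `strip` are hypothetical. The pattern is the complement of the XML 1.0 `Char` production (`#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]`), expressed with `java.util.regex.Pattern`.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Maven Invoker plugin or Plexus XML.
final class XmlCharStripper {

    // Matches every code point that is NOT an XML 1.0 Char:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    private static final Pattern PROHIBITED = Pattern.compile(
            "[^\\x{9}\\x{A}\\x{D}\\x{20}-\\x{D7FF}\\x{E000}-\\x{FFFD}\\x{10000}-\\x{10FFFF}]");

    private XmlCharStripper() {
    }

    // Returns the input with all XML 1.0 prohibited characters removed.
    static String strip(String text) {
        return PROHIBITED.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // A spinner drawn in place with backspaces, followed by a NUL.
        String log = "spinner |\b/\b- done\u0000";
        System.out.println(strip(log)); // prints "spinner |/- done"
    }
}
```

Under this sketch the raw build log would stay untouched on disk; only the string written to `<system-out>` (and, as the issue suggests, possibly the failure message) would pass through `strip`. Replacing with a sentinel instead of stripping would only change the second argument to `replaceAll`.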



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
