[jira] [Commented] (DOXIA-542) Markdown module converts all apostrophes to quotation marks

2017-02-06 Thread Wolfgang Illmeyer (JIRA)

[ 
https://issues.apache.org/jira/browse/DOXIA-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15855066#comment-15855066
 ] 

Wolfgang Illmeyer commented on DOXIA-542:
-

Yes, there is. Unicode is about *characters*, not about glyphs. See section 2.2 
of the Unicode Standard, »Characters, not glyphs«. The first of your linked 
articles completely misses this distinction. An apostrophe character conveys the 
semantics of an apostrophe as used in many languages (see e.g. 
https://en.wikipedia.org/wiki/Apostrophe: »The apostrophe looks the same as a 
closing single quotation mark, although they have different meanings.«). 
Notice how the character is called »APOSTROPHE« and not »Prime« or »Single 
quotation mark«, and how the »RIGHT SINGLE QUOTATION MARK« is not called 
»Apostrophe«. The second-hand quote of the Unicode standard in the other 
article is of course valid, but it is about *typesetting*, not storage (»When 
text is set, […]«). The recommendation to use a single quotation mark is only 
relevant when you are a browser or a DTP package, not for semantics-bearing 
Unicode text (even if it is in HTML).
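
To make the distinction concrete, here is a small Java sketch that asks the 
JDK's own Unicode data for the names of the two code points. The class name 
`ApostropheNames` and the helper `describe` are made up for illustration; the 
only library call is `Character.getName(int)`, available since Java 7.

```java
final class ApostropheNames {
    // Formats a code point as "U+XXXX NAME" using the JDK's Unicode tables.
    static String describe(int codePoint) {
        return String.format("U+%04X %s", codePoint, Character.getName(codePoint));
    }

    public static void main(String[] args) {
        // The character typed in the Markdown source.
        System.out.println(describe(0x0027)); // U+0027 APOSTROPHE
        // The character that ends up in the generated output.
        System.out.println(describe(0x2019)); // U+2019 RIGHT SINGLE QUOTATION MARK
        // The two strings are different text, not just different renderings.
        System.out.println("don't".equals("don\u2019t")); // false
    }
}
```

So a search for »don't« typed with U+0027 will not match text that was 
rewritten to use U+2019.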

At the very least, this »feature« has to be optional, preferably off by 
default. And I have high hopes that the upcoming pegdown replacement will not 
do this kind of damage to my documentation at all.
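
A minimal sketch of what »optional, off by default« could look like at the 
level of the extension bitmask. Everything here is hypothetical: the class 
`MarkdownExtensionsSketch`, the method `extensionsFor`, and the flag values are 
stand-ins chosen for illustration, not the constants from pegdown's 
`Extensions` interface and not an existing Doxia option.

```java
final class MarkdownExtensionsSketch {
    // Stand-in flag values for illustration only (assumed, not pegdown's).
    static final int SMARTS = 0x01;                 // dashes, ellipses
    static final int QUOTES = 0x02;                 // "smart" quotes and apostrophes
    static final int SMARTYPANTS = SMARTS | QUOTES; // both typographic features
    static final int HARDWRAPS = 0x08;
    static final int ALL = 0xFFFF;

    // Builds the extension mask; typographic rewriting is opt-in.
    static int extensionsFor(boolean smartTypography) {
        int extensions = ALL & ~HARDWRAPS;  // current behaviour: everything but hard wraps
        if (!smartTypography) {
            extensions &= ~SMARTYPANTS;     // clear both typographic bits
        }
        return extensions;
    }

    public static void main(String[] args) {
        int byDefault = extensionsFor(false);
        int optedIn = extensionsFor(true);
        System.out.println((byDefault & SMARTYPANTS) == 0);         // true
        System.out.println((optedIn & SMARTYPANTS) == SMARTYPANTS); // true
    }
}
```

The resulting mask would then be handed to the processor in place of the 
hard-coded expression, with the boolean coming from whatever configuration 
mechanism the module offers.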

> Markdown module converts all apostrophes to quotation marks
> ---
>
>                 Key: DOXIA-542
>                 URL: https://issues.apache.org/jira/browse/DOXIA-542
>             Project: Maven Doxia
>          Issue Type: Bug
>          Components: Module - Markdown
>    Affects Versions: 1.4, 1.7
>            Reporter: Wolfgang Illmeyer
>              Labels: close-pending
>
> Whenever there is some text in a markdown file containing an apostrophe 
> (U+0027, e.g. »don't«), it is seemingly unconditionally replaced by a »right 
> single quotation mark« (U+2019).
> The problem seems to be an out-of-whack »smart« feature of the underlying 
> pegdown library, which is supposed to perform all kinds of typographic black 
> magic. I'd suggest disabling that (or at least making it configurable), because 
> apostrophes are not quotation marks and modern keyboard layouts already make 
> all the fancy typographic characters, such as dashes of different lengths, 
> ellipses, and all sorts of quotation marks, easily available.
> The fix is relatively trivial:
> {code}
> --- a/doxia-modules/doxia-module-markdown/src/main/java/org/apache/maven/doxia/module/markdown/MarkdownParser.java
> +++ b/doxia-modules/doxia-module-markdown/src/main/java/org/apache/maven/doxia/module/markdown/MarkdownParser.java
> @@ -70,7 +70,7 @@ public class MarkdownParser
>       * The {@link PegDownProcessor} used to convert Pegdown documents to HTML.
>       */
>      protected static final PegDownProcessor PEGDOWN_PROCESSOR =
> -        new PegDownProcessor( Extensions.ALL & ~Extensions.HARDWRAPS, Long.MAX_VALUE );
> +        new PegDownProcessor( Extensions.ALL & ~Extensions.HARDWRAPS & ~Extensions.SMARTYPANTS, Long.MAX_VALUE );
>  
>      /**
>       * Regex that identifies a multimarkdown-style metadata section at the start of the document
> {code}
> But this makes some tests fail and I didn't have the time to figure out how 
> to fix them.
> Also, the resulting apostrophes probably need to be escaped in the HTML.
> I tested the patch with 1.7.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DOXIA-542) Markdown module converts all apostrophes to quotation mark

2016-07-05 Thread Wolfgang Illmeyer (JIRA)
Wolfgang Illmeyer created DOXIA-542:
---

             Summary: Markdown module converts all apostrophes to quotation mark
                 Key: DOXIA-542
                 URL: https://issues.apache.org/jira/browse/DOXIA-542
             Project: Maven Doxia
          Issue Type: Bug
          Components: Module - Markdown
    Affects Versions: 1.7, 1.4
            Reporter: Wolfgang Illmeyer


Whenever there is some text in a markdown file containing an apostrophe 
(U+0027, e.g. »don't«), it is seemingly unconditionally replaced by a »right 
single quotation mark« (U+2019).

The problem seems to be an out-of-whack »smart« feature of the underlying 
pegdown library, which is supposed to perform all kinds of typographic black 
magic. I'd suggest disabling that (or at least making it configurable), because 
apostrophes are not quotation marks and modern keyboard layouts already make 
all the fancy typographic characters, such as dashes of different lengths, 
ellipses, and all sorts of quotation marks, easily available.

The fix is relatively trivial:

{code}
--- a/doxia-modules/doxia-module-markdown/src/main/java/org/apache/maven/doxia/module/markdown/MarkdownParser.java
+++ b/doxia-modules/doxia-module-markdown/src/main/java/org/apache/maven/doxia/module/markdown/MarkdownParser.java
@@ -70,7 +70,7 @@ public class MarkdownParser
      * The {@link PegDownProcessor} used to convert Pegdown documents to HTML.
      */
     protected static final PegDownProcessor PEGDOWN_PROCESSOR =
-        new PegDownProcessor( Extensions.ALL & ~Extensions.HARDWRAPS, Long.MAX_VALUE );
+        new PegDownProcessor( Extensions.ALL & ~Extensions.HARDWRAPS & ~Extensions.SMARTYPANTS, Long.MAX_VALUE );
 
     /**
      * Regex that identifies a multimarkdown-style metadata section at the start of the document
{code}

But this makes some tests fail and I didn't have the time to figure out how to 
fix them.
Also, the resulting apostrophes probably need to be escaped in the HTML.

I tested the patch with 1.7.
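
On the escaping point, a rough sketch of what I mean, assuming the apostrophe 
is emitted as a numeric character reference. The class `HtmlApostropheEscape` 
and its `escapeHtml` method are hypothetical helpers written for this example; 
they are not code from Doxia or pegdown, and the real serializer may escape 
differently.

```java
final class HtmlApostropheEscape {
    // Escapes the five characters that are special in HTML text and attributes.
    // U+0027 becomes the numeric reference &#39;, which decodes back to U+0027,
    // so the character identity is preserved (unlike a rewrite to U+2019).
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("don't <b>")); // don&#39;t &lt;b&gt;
    }
}
```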



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DOXIA-542) Markdown module converts all apostrophes to quotation marks

2016-07-05 Thread Wolfgang Illmeyer (JIRA)

 [ 
https://issues.apache.org/jira/browse/DOXIA-542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wolfgang Illmeyer updated DOXIA-542:

Summary: Markdown module converts all apostrophes to quotation marks  (was: 
Markdown module converts all apostrophes to quotation mark)



