sc/qa/unit/data/functions/text/fods/clean.fods |   24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

New commits:
commit 4a23511cdad3a82f7c628426ede3eb0928a3a325
Author:     Stephan Bergmann <[email protected]>
AuthorDate: Tue May 17 09:49:16 2022 +0200
Commit:     Stephan Bergmann <[email protected]>
CommitDate: Tue May 17 15:27:22 2022 +0200

    At least make CppunitTest_sc_text_functions_test more resilient to ICU version
    
    61f4250ee9f43902107e4d2e6322cbf54f52dd8e "Make CLEAN fully compliant woth ODFF
    v1.3" has changed lcl_ScInterpreter_IsPrintable
    (sc/source/core/tool/interpr1.cxx) to use ICU's u_isdefined to check for Unicode
    code points of category Cn (i.e., noncharacter or reserved).  This is at least
    questionable, as assignment of code points to that category varies with Unicode
    versions.  And while
    <https://docs.oasis-open.org/office/OpenDocument/v1.3/os/part4-formula/OpenDocument-v1.3-os-part4-formula.html#__RefHeading__1017856_715980110>
    "Open Document Format for Office Applications (OpenDocument) Version 1.3.
    Part 4: Recalculated Formula (OpenFormula) Format: 1.4 Normative References"
    references "The Unicode Standard, Version 5.2.0" (so one might expect CLEAN to
    use the category classification from that old Unicode version), versions of ICU
    keep being updated with current Unicode versions' category classifications.  For
    example, the currently bundled external/icu's icu4c-70_1-src.tgz uses Unicode 14
    (according to <https://icu.unicode.org/download/70#h.x1orhyniml8k>) for its
    implementation of u_isdefined.  And for --with-system-icu, all that configure.ac
    apparently requires is that "icu-i18n >= 4.6" (i.e., it will potentially allow
    the behavior of u_isdefined to vary over a wide range of Unicode versions).
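
To illustrate how a "not category Cn" check depends on the Unicode data behind it, here is a rough Python sketch. It is not the LibreOffice code: `is_defined` is a hypothetical stand-in for ICU's u_isdefined, built on Python's `unicodedata` module, whose database version depends on the interpreter much as ICU's depends on the ICU release.

```python
import unicodedata


def is_defined(ch: str) -> bool:
    """Hypothetical analogue of ICU's u_isdefined: True unless the category is Cn."""
    return unicodedata.category(ch) != "Cn"


if __name__ == "__main__":
    # The answer for recently assigned code points depends on this version.
    print("Unicode database version:", unicodedata.unidata_version)
    for cp in (0x0F00, 0xFDCF, 0xFFFF):
        print(f"U+{cp:04X} defined: {is_defined(chr(cp))}")
```

A long-assigned code point such as U+0F00 and a noncharacter such as U+FFFF give the same answer on any interpreter, while the result for U+FDCF depends on the reported database version.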
    
    And case in point, 61f4250ee9f43902107e4d2e6322cbf54f52dd8e also added a test to
    sc/qa/unit/data/functions/text/fods/clean.fods (row 47) that verifies that
    U+FDCF ARABIC LIGATURE SALAAMUHU ALAYNAA does not get cleaned away by CLEAN.
    But U+FDCF is only an assigned code point (thus no longer of category Cn) since
    Unicode 14 (cf.
    <https://www.unicode.org/charts/PDF/Unicode-14.0/U140-FB50.pdf>), so while
    builds against external/icu (covering Unicode 14) succeed, --with-system-icu
    builds like a flatpak build against org.freedesktop.Sdk//21.08, still at ICU 69
    and Unicode 13, fail CppunitTest_sc_text_functions_test with
    
    > Testing load file:///run/build/libreoffice//sc/qa/unit/data/functions/text/fods/clean.fods:
    > /run/build/libreoffice/sc/qa/unit/functions_test.cxx:43:TextFunctionsTest::testTextFormulasFODS
    > double equality assertion failed
    > - Expected: 1
    > - Actual  : 0
    > - Delta   : 1e-14
    
    (<https://flathub.org/builds/#/builders/11/builds/7103>).
    
    Irrespective of whether using ICU's varying u_isdefined in the implementation of
    CLEAN is correct, at least make that "doesn't get CLEAN'ed away" test more
    resilient to what version of ICU is being used, by using U+0F00 TIBETAN
    SYLLABLE OM, which got added all the way back in Unicode 2, rather than U+FDCF
    ARABIC LIGATURE SALAAMUHU ALAYNAA, which only got added in Unicode 14.
    
    (And to add insult to injury, in sc/qa/unit/data/functions/text/fods/clean.fods
    61f4250ee9f43902107e4d2e6322cbf54f52dd8e encoded U+FDCF in text not as the three
    UTF-8 bytes 0xEF 0xB7 0x8F, but rather re-encoded as the six bytes 0xC3 0xAF
    0xC2 0xB7 0xC2 0x8F, i.e., the three characters U+00EF LATIN SMALL LETTER I WITH
    DIAERESIS, U+00B7 MIDDLE DOT, U+008F.  But I assume that was just a mistake, not
    something that I should faithfully copy in the file's new version.)
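
The byte sequences quoted above can be reproduced with a short Python sketch. How the earlier commit actually arrived at the six bytes is not stated here. The sketch assumes the usual cause, namely that the three UTF-8 bytes were misread as Latin-1 and then encoded as UTF-8 a second time, and `double_encode` is a hypothetical helper name.

```python
def double_encode(text: str) -> bytes:
    """Encode to UTF-8, misread the bytes as Latin-1, then encode to UTF-8 again."""
    return text.encode("utf-8").decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    ch = "\uFDCF"
    # Correct encoding: three bytes.
    print(ch.encode("utf-8").hex(" "))
    # Assumed mis-encoding: six bytes, i.e. U+00EF, U+00B7, U+008F in UTF-8.
    print(double_encode(ch).hex(" "))
```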
    
    Change-Id: Icc8d879b1397d8292914cbd31708d0c561f3b06e
    Reviewed-on: https://gerrit.libreoffice.org/c/core/+/134474
    Tested-by: Jenkins
    Reviewed-by: Stephan Bergmann <[email protected]>

diff --git a/sc/qa/unit/data/functions/text/fods/clean.fods b/sc/qa/unit/data/functions/text/fods/clean.fods
index c5531ad364d5..dd74cc334adf 100644
--- a/sc/qa/unit/data/functions/text/fods/clean.fods
+++ b/sc/qa/unit/data/functions/text/fods/clean.fods
@@ -2515,11 +2515,11 @@
      <table:table-cell table:number-columns-repeated="3"/>
     </table:table-row>
     <table:table-row table:style-name="ro5">
-     <table:table-cell table:formula="of:=CLEAN([.J47])" office:value-type="string" office:string-value="﷏Test text﷏" calcext:value-type="string">
-      <text:p>﷏Test text﷏</text:p>
+     <table:table-cell table:formula="of:=CLEAN([.J47])" office:value-type="string" office:string-value="ༀTest textༀ" calcext:value-type="string">
+      <text:p>ༀTest textༀ</text:p>
      </table:table-cell>
-     <table:table-cell table:formula="of:=IF([.E47]=&quot;no clean&quot;; [.$J47];&quot;Test text&quot;)" office:value-type="string" office:string-value="﷏Test text﷏" calcext:value-type="string">
-      <text:p>﷏Test text﷏</text:p>
+     <table:table-cell table:formula="of:=IF([.E47]=&quot;no clean&quot;; [.$J47];&quot;Test text&quot;)" office:value-type="string" office:string-value="ༀTest textༀ" calcext:value-type="string">
+      <text:p>ༀTest textༀ</text:p>
      </table:table-cell>
      <table:table-cell table:style-name="ce22" table:formula="of:=[.A47]=[.B47]" office:value-type="boolean" office:boolean-value="true" calcext:value-type="boolean">
       <text:p>TRUE</text:p>
@@ -2531,17 +2531,17 @@
       <text:p>no clean</text:p>
      </table:table-cell>
      <table:table-cell/>
-     <table:table-cell table:formula="of:=DEC2HEX([.H47])" office:value-type="string" office:string-value="FDCF" calcext:value-type="string">
-      <text:p>FDCF</text:p>
+     <table:table-cell table:formula="of:=DEC2HEX([.H47])" office:value-type="string" office:string-value="0F00" calcext:value-type="string">
+      <text:p>0F00</text:p>
      </table:table-cell>
-     <table:table-cell table:style-name="ce16" office:value-type="float" office:value="64975" calcext:value-type="float">
-      <text:p>64975</text:p>
+     <table:table-cell table:style-name="ce16" office:value-type="float" office:value="3840" calcext:value-type="float">
+      <text:p>3840</text:p>
      </table:table-cell>
-     <table:table-cell table:style-name="ce28" table:formula="of:=UNICHAR([.H47])" office:value-type="string" office:string-value="﷏" calcext:value-type="string">
-      <text:p>﷏</text:p>
+     <table:table-cell table:style-name="ce28" table:formula="of:=UNICHAR([.H47])" office:value-type="string" office:string-value="ༀ" calcext:value-type="string">
+      <text:p>ༀ</text:p>
      </table:table-cell>
-     <table:table-cell table:formula="of:=[.I47] &amp; &quot;Test text&quot; &amp; [.I47]" office:value-type="string" office:string-value="﷏Test text﷏" calcext:value-type="string">
-      <text:p>﷏Test text﷏</text:p>
+     <table:table-cell table:formula="of:=[.I47] &amp; &quot;Test text&quot; &amp; [.I47]" office:value-type="string" office:string-value="ༀTest textༀ" calcext:value-type="string">
+      <text:p>ༀTest textༀ</text:p>
      </table:table-cell>
      <table:table-cell office:value-type="string" calcext:value-type="string">
       <text:p>Tdf#97706</text:p>
