When _M_get_sys_info seeds a Zone line by looking up the active rule
just before info.begin, the previous code interpreted each rule in
isolation against ri.offset() (the line's standard offset alone),
ignoring the running save accumulated by earlier rules in the same
year.  For most zones this gives the right answer because the search
only matters when no rule has fired yet, but for zones whose rule
set has wall-time rules whose effective firing time depends on a
prior rule's save it produces wrong answers.

Canonical case: Europe/Paris around 1945.  France's rules

  R Fr 1945 o - Apr 2  2 2 M
  R Fr 1945 o - Sep 16 3 0 -

both use plain wall time.  In Paris's stdoff=1 frame, the September
rule's at_time of 03:00 wall translates to UT Sep 16 02:00 if no
prior save is applied, but to UT Sep 16 00:00 once the running save
of 2h from the April rule is taken into account.  When seeding a
sys_info whose info.begin falls between those two values, the simple
search picks the April rule (save=2 → CEMT, total offset 3h) when
the correct answer is the September rule (save=0 → CET, total offset
1h).  The harness reports this as a sustained CEMT stretch where
zic and libc agree on CET.

To address above the finding algorithm, is now expanded to collect
tree rule transitions around specified time t, while continuing to
ignore the save.
* curr_tran: transition happening before or at time t
* prev_tran: transition preceding above transition
* next_tran: transition happening after tiem t.

This collects sufficient information to adjust the start_time (if
Wall time is used) for curr_tran (save of prev_tran) and next_tran
(save of curr_tran). Assuming that applying save value does not
change order of transition (cascadding save would be ill-defined
otherwise), after the adjustment the actual active rule is:
 * next_tran.rule: if the adjustment pushed next_tran.when to
   time before or at t, which happen for positive save (see test_paris),
 * prev_tran.rule: if adjustment pushed curr_tran.when to time
   after time t, which happens for negative save (see test_negative),
 * curr_tran.rule.

The fallback "earliest STD rule" logic is preserved for the case
where no rule has fired yet, but is extracted to separate function.
This lookup is optmized, by seraching the rules by name, from, and
save in that order, grouping std rules in given year together.

        PR libstdc++/124853

libstdc++-v3/ChangeLog:

        * src/c++20/tzdb.cc
        (time_zone::_M_get_sys_info): Extract code blocks to
        seprate functions, and invoke them.
        (<unnamed>::find_active_rule): Modify algorithm to
        handle cascading saves.
        (<unnamed>::find_first_std): Simplified implementation
        benfiting from reordering of rules.
        (chrono::reload_tzdb): Sort rules by name, from and save.
        * testsuite/std/time/time_zone/wall_cascade.cc: New test.

Co-authored-by: Álvaro Begué <[email protected]>
Signed-off-by: Tomasz Kamiński <[email protected]>
Signed-off-by: Álvaro Begué <[email protected]>
---
v3 takes a different approach for implementing find_active_rule,
avoiding collecting all rule transitions. Expanded test cases to
include a synthetic test with cascading negative save.

Testing on x86_64-linux. std/time/time_zone* test passed already.
OK for trunk when all test passes?

Not sure regarding backport now, would preffer to finish the
remaining patches, and see if there are any new regerssion when
comparing with zic.


 libstdc++-v3/src/c++20/tzdb.cc                | 181 ++++++++++++------
 .../std/time/time_zone/wall_cascade.cc        | 115 +++++++++++
 2 files changed, 239 insertions(+), 57 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/std/time/time_zone/wall_cascade.cc

diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc
index 5793155b6d8..a044620cc8f 100644
--- a/libstdc++-v3/src/c++20/tzdb.cc
+++ b/libstdc++-v3/src/c++20/tzdb.cc
@@ -684,6 +684,119 @@ namespace std::chrono
       }
 #endif
     };
+
+    const Rule*
+    find_active_rule(span<const Rule> rules, sys_seconds t, seconds std_offset)
+    {
+      struct Transition {
+       const Rule* rule;
+       sys_seconds when;
+      };
+
+      const year_month_day date(chrono::floor<days>(t));
+      // Rule specifying start time as Wall time, should apply
+      // running save accumulated by eariel rules. To handle
+      // that we firstly collect transitions surrounding specified
+      // time t, ignoring the save:
+      // * curr_tran - rule active directly before or at t,
+      // * prev_tran - rule transition before curr_tran
+      // * next_tran - rule transition directly after t
+      Transition prev_tran{nullptr, sys_seconds::min()};
+      Transition curr_tran{nullptr, sys_seconds::min()};
+      Transition next_tran{nullptr, sys_seconds::max()};
+      for (const auto& rule : rules)
+       {
+         year y = date.year();
+
+         if (y > rule.to) // rule no longer applies at time t
+           continue;
+         if (y < rule.from) // rule doesn't apply yet at time t
+           break; // rules are ordered by from in ascending other
+
+         seconds offset{}; // appropriate for at_time::Universal
+         if (rule.when.indicator == at_time::Wall
+             || rule.when.indicator == at_time::Standard)
+           offset = std_offset;
+
+         // Time the rule takes effect this year:
+         const sys_seconds rule_start = rule.start_time(y, offset);
+         
+         // Times at which rule takes affect before (or equal) t,
+         // and after t, respectivelly.
+         sys_seconds start_before, start_after;
+         if (rule_start <= t)
+           {
+             start_before = rule_start;
+             start_after = rule.start_time(++y, offset);
+           }
+         else
+           {
+             start_after = rule_start;
+             start_before = rule.start_time(--y, offset);
+           }
+
+         if (curr_tran.when < start_before)
+           {
+             prev_tran = curr_tran;
+             curr_tran = {&rule, start_before};
+           }
+         else if (prev_tran.when < start_before)
+           prev_tran = {&rule, start_before};
+
+         if (start_after < next_tran.when)
+           next_tran = {&rule, start_after};
+       }
+
+      // No rule was active at the time of t, running save
+      // cannot change this output, as we have no save to apply.
+      if (!curr_tran.rule)
+       return nullptr;
+     
+      auto cascade_save = [](const Rule* from, Transition& to)
+      {
+       if (!from || from->save == seconds(0))
+         return false;
+       if (!to.rule || to.rule->when.indicator != at_time::Wall)
+          return false;
+       to.when -= from->save;
+       return true;
+      };
+ 
+      if (cascade_save(curr_tran.rule, next_tran))
+       // Running save move what we considered next_tran to time
+       // before or at t, in that case next_tran is active rule.
+       if (next_tran.when <= t)
+         return next_tran.rule;
+
+      if (cascade_save(prev_tran.rule, curr_tran))
+       // Running save moved what we consider curr_tran to
+       // time after t, in that case prev_tran is active rule.
+       if (curr_tran.when > t)
+         return prev_tran.rule;
+
+      return curr_tran.rule;
+    }
+
+    const Rule*
+    find_first_std(span<const Rule> rules)
+    {
+      auto is_std = [](const Rule& rule) { return !rule.save.count(); };
+      // Rules with same name are sorted by year and then save in ascending 
order.
+      auto it = ranges::find_if(rules, is_std);
+      if (it == rules.end())
+       return nullptr;
+
+      const Rule* first = &*it;
+      const year y = first->from;
+      for (const Rule& next : span<const Rule>(++it, rules.end()))
+       {
+         if (!is_std(next) || next.from > y)
+           break;
+         if (next.start_time(y, {}) < first->start_time(y, {}))
+           first = &next;
+       }
+      return first;
+    }
   } // namespace
 #endif // TZDB_DISABLED
 
@@ -872,70 +985,17 @@ namespace std::chrono
 
     if (letters.empty())
       {
-       sys_seconds t = info.begin - seconds(1);
-       const year_month_day date(chrono::floor<days>(t));
-
        // Try to find a Rule active before this time, to get initial
        // SAVE and LETTERS values. There may not be a Rule for the period
        // before the first DST transition, so find the earliest DST->STD
        // transition and use the LETTERS from that.
-       const Rule* active_rule = nullptr;
-       sys_seconds active_rule_start = sys_seconds::min();
-       const Rule* first_std = nullptr;
-       for (const auto& rule : rules)
-         {
-           if (rule.save == minutes(0))
-             {
-               if (!first_std)
-                 first_std = &rule;
-               else if (rule.from < first_std->from)
-                 first_std = &rule;
-               else if (rule.from == first_std->from)
-                 {
-                   if (rule.start_time(rule.from, {})
-                         < first_std->start_time(first_std->from, {}))
-                     first_std = &rule;
-                 }
-             }
-
-           year y = date.year();
-
-           if (y > rule.to) // rule no longer applies at time t
-             continue;
-           if (y < rule.from) // rule doesn't apply yet at time t
-             continue;
-
-           sys_seconds rule_start;
-
-           seconds offset{}; // appropriate for at_time::Universal
-           if (rule.when.indicator == at_time::Wall)
-             offset = info.offset;
-           else if (rule.when.indicator == at_time::Standard)
-             offset = ri.offset();
-
-           // Time the rule takes effect this year:
-           rule_start = rule.start_time(y, offset);
-
-           if (rule_start >= t && rule.from < y)
-             {
-               // Try this rule in the previous year.
-               rule_start = rule.start_time(--y, offset);
-             }
-
-           if (active_rule_start < rule_start && rule_start < t)
-             {
-               active_rule_start = rule_start;
-               active_rule = &rule;
-             }
-         }
-
-       if (active_rule)
+       if (const Rule* active_rule = find_active_rule(rules, info.begin - 
seconds(1), ri.offset()))
          {
            info.offset = ri.offset() + active_rule->save;
            info.save = chrono::duration_cast<minutes>(active_rule->save);
            letters = active_rule->letters;
          }
-       else if (first_std)
+       else if (const Rule* first_std = find_first_std(rules))
          letters = first_std->letters;
       }
 
@@ -1752,7 +1812,14 @@ namespace
 
     ranges::sort(node->db.zones, {}, &time_zone::name);
     ranges::sort(node->db.links, {}, &time_zone_link::name);
-    ranges::stable_sort(node->rules, {}, &Rule::name);
+    ranges::sort(node->rules, [](const Rule& lhs, const Rule& rhs)
+    {
+      if (auto result = lhs.name <=> rhs.name; result != 0)
+       return result < 0;
+      if (auto result = lhs.from <=> rhs.from; result != 0)
+       return result < 0;
+      return lhs.save < rhs.save;
+    });
 
     return Node::_S_replace_head(std::move(head), std::move(node));
 #else
@@ -2365,7 +2432,7 @@ namespace
            {
              if (c = in.get(); c == '<' || c == '>')
                if (in.get() == '=')
-                 if (unsigned d; (in >> d) && (d <= 31)) [[likely]]
+                 if (unsigned d; (in >> d) && (d <= 31)) [[likely]]
                    {
                      on.kind = c == '<' ? LessEq : GreaterEq;
                      on.day_of_week = w.wd.c_encoding();
diff --git a/libstdc++-v3/testsuite/std/time/time_zone/wall_cascade.cc 
b/libstdc++-v3/testsuite/std/time/time_zone/wall_cascade.cc
new file mode 100644
index 00000000000..1b937c6f248
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/time/time_zone/wall_cascade.cc
@@ -0,0 +1,115 @@
+// { dg-do run { target c++20 } }
+// { dg-require-effective-target tzdb }
+// { dg-require-effective-target cxx11_abi }
+// { dg-xfail-run-if "no weak override on AIX" { powerpc-ibm-aix* } }
+
+// Wall-time rules in the same rule set whose effective firing time
+// depends on a prior rule's save (Europe/Paris 1945):
+//   1945 Apr 2  02:00 wall  save=2  M
+//   1945 Sep 16 03:00 wall  save=0  -
+// In the (stdoff=1, save=2) frame the September rule fires at
+// Sep 16 00:00 UT, not Sep 16 02:00 UT.
+
+#include <chrono>
+#include <fstream>
+#include <testsuite_hooks.h>
+
+static bool override_used = false;
+
+namespace __gnu_cxx
+{
+  const char* zoneinfo_dir_override() {
+    override_used = true;
+    return "./";
+  }
+}
+
+using namespace std::chrono;
+
+void
+test_paris()
+{
+  // Line 1 ends at "1945 Sep 16 1u" (Universal time, no save shenanigans),
+  // so info.begin for line 2 is exactly 1945-09-16 01:00 UT.
+  //
+  // Two-line zone whose second line begins at 1945 Sep 16 01:00 UT,
+  // between the cascaded firing time (Sep 16 00:00 UT) and the
+  // non-cascaded firing time (Sep 16 02:00 UT) of the September rule.
+  // The seeding must pick the September rule (save=0, CET) at info.begin.
+  std::ofstream("tzdata.zi") << R"(# version test_wall_cascade
+R Fr 1945 o - Apr 2  2 2 M
+R Fr 1945 o - Sep 16 3 0 -
+Z Test/Paris 0  -  X     1945 Sep 16 1u
+             1  Fr CE%sT
+)";
+
+  const auto& db = reload_tzdb();
+  VERIFY( override_used ); // If this fails then XFAIL for the target.
+  VERIFY( db.version == "test_wall_cascade" );
+
+  auto* tz = locate_zone("Test/Paris");
+
+  // Line 2 begins at exactly 1945-09-16 01:00 UT.  Sample one second
+  // after the boundary, well inside line 2's first sys_info.
+  auto info = tz->get_info(sys_seconds{
+      sys_days(1945y/September/16) + 1h + 1s});
+  VERIFY( info.offset == 1h );
+  VERIFY( info.save == 0min );
+  VERIFY( info.abbrev == "CET" );
+
+  // The boundary instant itself is in the new line.
+  auto at_boundary
+    = tz->get_info(sys_seconds{sys_days(1945y/September/16) + 1h});
+  VERIFY( at_boundary.offset == 1h );
+  VERIFY( at_boundary.save == 0min );
+
+  // Sample later still in line 2 (winter): unchanged.
+  auto winter = tz->get_info(sys_days(1945y/December/1));
+  VERIFY( winter.offset == 1h );
+  VERIFY( winter.save == 0min );
+}
+
+void
+test_negative()
+{
+  // This is synthetic version of above example, with negative
+  // running save.
+  // 
+  // Two-line zone whose second line begins at 1945 Sep 16 01:00 UT,
+  // at the cascaded firing time (Sep 16 01:00 UT), but after
+  // non-cascaded firing time (Sep 16 00:00 UT) of the September rule.
+  // The seeding must pick the April rule (save=-2, CEST) at info.begin.
+  std::ofstream("tzdata.zi") << R"(# version test_negative_cascade
+R Fr 1945 o - Apr 2  2 -2 M
+R Fr 1945 o - Sep 16 0 0 -
+Z Test/Negative 0  -  X     1945 Sep 16 1u
+             1  Fr CE%sT
+)";
+
+  const auto& db = reload_tzdb();
+  VERIFY( override_used ); // If this fails then XFAIL for the target.
+  VERIFY( db.version == "test_negative_cascade" );
+
+  auto* tz = locate_zone("Test/Negative");
+
+  // Line 2 begins at exactly 1945-09-16 01:00 UT, sample
+  // one second after.
+  auto info = tz->get_info(sys_seconds{
+      sys_days(1945y/September/16) + 1h + 1s});
+  VERIFY( info.offset == -1h );
+  VERIFY( info.save == -2h );
+  VERIFY( info.abbrev == "CEMT" );
+
+  // The boundary instant.
+  auto at_boundary
+    = tz->get_info(sys_seconds{sys_days(1945y/September/16) + 1h});
+  VERIFY( at_boundary.offset == -1h );
+  VERIFY( at_boundary.save == -2h );
+}
+
+int
+main()
+{
+  test_paris();
+  test_negative();
+}
-- 
2.54.0

Reply via email to