h-vetinari opened a new issue, #44455:
URL: https://github.com/apache/arrow/issues/44455

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   While [testing](https://github.com/conda-forge/arrow-cpp-feedstock/pull/1432) arrow 18.0.0rc0, I'm getting a new batch of failing tests on Windows:
   ```
   FAILED pyarrow/tests/test_compute.py::test_strftime - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'CET': oops: bad dow name: il
   FAILED pyarrow/tests/test_compute.py::test_extract_datetime_components - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'UTC': oops: bad dow name: il
   FAILED pyarrow/tests/test_compute.py::test_assume_timezone - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'UTC': oops: bad dow name: il
   FAILED pyarrow/tests/test_compute.py::test_round_temporal[nanosecond] - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'Asia/Kolkata': oops: bad dow name: il
   FAILED pyarrow/tests/test_compute.py::test_round_temporal[microsecond] - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'Asia/Kolkata': oops: bad dow name: il
   FAILED pyarrow/tests/test_compute.py::test_round_temporal[millisecond] - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'Asia/Kolkata': oops: bad dow name: il
   FAILED pyarrow/tests/test_compute.py::test_round_temporal[second] - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'Asia/Kolkata': oops: bad dow name: il
   FAILED pyarrow/tests/test_compute.py::test_round_temporal[minute] - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'Asia/Kolkata': oops: bad dow name: il
   FAILED pyarrow/tests/test_compute.py::test_round_temporal[hour] - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'Asia/Kolkata': oops: bad dow name: il
   FAILED pyarrow/tests/test_compute.py::test_round_temporal[day] - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'Asia/Kolkata': oops: bad dow name: il
   FAILED pyarrow/tests/test_convert_builtin.py::test_sequence_timestamp_from_int_with_unit - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'UTC': oops: bad dow name: il
   FAILED pyarrow/tests/test_scalars.py::test_timestamp_scalar - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'UTC': oops: bad dow name: il
   FAILED pyarrow/tests/test_scalars.py::test_cast_timestamp_to_string - pyarrow.lib.ArrowInvalid: Cannot locate timezone 'UTC': oops: bad dow name: il
   ```
   The "Cannot locate timezone" message seems to be coming from
   
https://github.com/apache/arrow/blob/16ad97938f48c9a12833f61f7553fbcf8dbaf9ca/cpp/src/arrow/compute/kernels/temporal_internal.h#L51
   and that code hasn't changed recently. In particular, my understanding is that
   
https://github.com/apache/arrow/blob/16ad97938f48c9a12833f61f7553fbcf8dbaf9ca/cpp/src/arrow/compute/kernels/temporal_internal.h#L20
   doesn't yet refer to the C++20 standard library (whose specification includes time zones), but rather to a [vendored](https://github.com/apache/arrow/blob/main/cpp/src/arrow/vendored/datetime/tz.h) variant of https://github.com/HowardHinnant/date (cf. https://github.com/apache/arrow/commit/641c6990cdc702c736921c78448cf3a175c97863, https://github.com/apache/arrow/commit/2bf44214bf6ce9d3e275cd52b03e178b2a478105, #29200, etc.).
   
   AFAIU, the new failures are most likely due to the tests getting a new/updated decorator:
   
https://github.com/apache/arrow/blob/16ad97938f48c9a12833f61f7553fbcf8dbaf9ca/python/pyarrow/tests/test_compute.py#L2508
   whose condition is ultimately determined by
   
https://github.com/apache/arrow/blob/16ad97938f48c9a12833f61f7553fbcf8dbaf9ca/python/pyarrow/tests/util.py#L425-L437
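
To illustrate the kind of check such a decorator might rely on, here is a minimal sketch. It is not copied from pyarrow: the function name `has_tzdata` and the `fallback_dir` parameter are hypothetical, and the only assumption taken from this issue is that the `PYARROW_TZDATA_PATH` environment variable points at the timezone database directory.

```python
import os


def has_tzdata(env=None, fallback_dir=None):
    """Return True if a timezone database directory appears to exist.

    Hypothetical helper: looks at PYARROW_TZDATA_PATH first, then at an
    optional fallback directory supplied by the caller.
    """
    env = os.environ if env is None else env
    candidate = env.get("PYARROW_TZDATA_PATH")
    if candidate and os.path.isdir(candidate):
        return True
    # No usable environment variable: try the caller-provided fallback.
    return bool(fallback_dir) and os.path.isdir(fallback_dir)
```

A guard of this shape only verifies that a directory exists, not that its contents can be parsed. If the real check behaves similarly, that would be consistent with the tests running (rather than being skipped) and then failing while the database is read.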
   
   It's not documented exactly _what_ needs to be under that path, but the path does seem to get forwarded to the vendored date library here
   
https://github.com/apache/arrow/blob/16ad97938f48c9a12833f61f7553fbcf8dbaf9ca/cpp/src/arrow/config.cc#L93
   which will then [iterate](https://github.com/HowardHinnant/date/blob/dd8affc6de5755e07638bf0a14382d29549d6ee9/src/tz.cpp#L3046) through the directory's contents.
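
Since it is unclear what the library expects to find, one way to debug this is to look at what is actually in the directory before pointing Arrow at it. The following is only a diagnostic sketch; `summarize_tzdata_dir` is a hypothetical helper and does not replicate the parsing done in `tz.cpp`.

```python
from pathlib import Path


def summarize_tzdata_dir(path):
    """List the top-level files and subdirectories of a candidate tzdata dir."""
    root = Path(path)
    entries = list(root.iterdir())
    return {
        "files": sorted(p.name for p in entries if p.is_file()),
        "dirs": sorted(p.name for p in entries if p.is_dir()),
    }
```

Comparing this summary between a working and a failing environment may help narrow down which files the vendored library trips over.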
   
   As such, conda-forge (which ships its own `tzdata`) should be able to set `"PYARROW_TZDATA_PATH=%PREFIX%\share\zoneinfo"` and be done with it. I'm mainly opening this issue now to document the rabbit holes I chased down for this, which might help someone else (and if I'm wrong, we'll be able to discuss further as well).
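
For a test harness written in Python rather than a batch script, the same setting could be sketched as follows. The helper names are hypothetical, and whether `share\zoneinfo` holds the format the vendored library expects is exactly the open question of this issue.

```python
import os
from pathlib import PureWindowsPath


def tzdata_path_for_prefix(prefix):
    """Build the share/zoneinfo path under a prefix using Windows path rules."""
    return str(PureWindowsPath(prefix) / "share" / "zoneinfo")


def export_tzdata_path(prefix, env=None):
    """Store the path in PYARROW_TZDATA_PATH (in os.environ by default)."""
    env = os.environ if env is None else env
    env["PYARROW_TZDATA_PATH"] = tzdata_path_for_prefix(prefix)
    return env["PYARROW_TZDATA_PATH"]
```

The variable would need to be set before the test session starts so that it is visible when the check runs.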
   
   ### Component(s)
   
   C++, Packaging

