jonkeane opened a new issue, #46123:
URL: https://github.com/apache/arrow/issues/46123

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
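   We got an email from CRAN asking us to resolve this before XXX. The sanitizer reports look limited to our hash join implementation, plus one trace through the group-by path (`GroupByNode` / `GrouperFastImpl`) that hits the same row-comparison code.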
   
   https://www.stats.ox.ac.uk/pub/bdr/M1-SAN/arrow/tests/testthat.Rout
   
   
   <details>
   <summary>Full logs</summary>
   
   ```
   R Under development (unstable) (2025-04-11 r88138) -- "Unsuffered 
Consequences"
   Copyright (C) 2025 The R Foundation for Statistical Computing
   Platform: aarch64-apple-darwin24.3.0
   
   R is free software and comes with ABSOLUTELY NO WARRANTY.
   You are welcome to redistribute it under certain conditions.
   Type 'license()' or 'licence()' for distribution details.
   
   R is a collaborative project with many contributors.
   Type 'contributors()' for more information and
   'citation()' on how to cite R or R packages in publications.
   
   Type 'demo()' for some demos, 'help()' for on-line help, or
   'help.start()' for an HTML browser interface to help.
   Type 'q()' to quit R.
   
   > # Licensed to the Apache Software Foundation (ASF) under one
   > # or more contributor license agreements.  See the NOTICE file
   > # distributed with this work for additional information
   > # regarding copyright ownership.  The ASF licenses this file
   > # to you under the Apache License, Version 2.0 (the
   > # "License"); you may not use this file except in compliance
   > # with the License.  You may obtain a copy of the License at
   > #
   > #   http://www.apache.org/licenses/LICENSE-2.0
   > #
   > # Unless required by applicable law or agreed to in writing,
   > # software distributed under the License is distributed on an
   > # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   > # KIND, either express or implied.  See the License for the
   > # specific language governing permissions and limitations
   > # under the License.
   > 
   > library(testthat)
   > library(arrow)
   Some features are not enabled in this build of Arrow. Run `arrow_info()` for 
more information.
   The repository you retrieved Arrow from did not include all of Arrow's 
features.
   You can install a fully-featured version by running:
   `install.packages('arrow', repos = 'https://apache.r-universe.dev')`.
   
   Attaching package: 'arrow'
   
   The following object is masked from 'package:testthat':
   
       matches
   
   The following object is masked from 'package:utils':
   
       timestamp
   
   > library(tibble)
   > 
   > verbose_test_output <- identical(tolower(Sys.getenv("ARROW_R_DEV", 
"false")), "true") ||
   +   identical(tolower(Sys.getenv("ARROW_R_VERBOSE_TEST", "false")), "true")
   > 
   > if (verbose_test_output) {
   +   arrow_reporter <- MultiReporter$new(list(CheckReporter$new(), 
LocationReporter$new()))
   + } else {
   +   arrow_reporter <- check_reporter()
   + }
   > test_check("arrow", reporter = arrow_reporter)
   
/Users/ripley/R/packages/tests-SAN/arrow/tools/cpp/src/arrow/compute/row/compare_internal.cc:284:30:
 runtime error: load of misaligned address 0x000150040c01 for type 'const 
uint64_t *' (aka 'const unsigned long long *'), which requires 8 byte alignment
   0x000150040c01: note: pointer points here
    00 00 00  6a 69 68 67 66 65 64 00  00 00 00 00 00 00 00 00  00 00 00 00 00 
00 00 00  00 00 00 00 00
                 ^ 
       #0 0x00011ac3da60 in void 
arrow::compute::KeyCompare::CompareVarBinaryColumnToRowHelper<true, 
true>(unsigned int, unsigned int, unsigned int, unsigned short const*, unsigned 
int const*, arrow::compute::LightContext*, arrow::compute::KeyColumnArray 
const&, arrow::compute::RowTableImpl const&, unsigned char*)+0xa44 
(arrow.so:arm64+0x4fd9a60)
       #1 0x00011ac2ddc4 in 
arrow::compute::KeyCompare::CompareColumnsToRows(unsigned int, unsigned short 
const*, unsigned int const*, arrow::compute::LightContext*, unsigned int*, 
unsigned short*, std::__1::vector<arrow::compute::KeyColumnArray, 
std::__1::allocator<arrow::compute::KeyColumnArray>> const&, 
arrow::compute::RowTableImpl const&, bool, unsigned char*)+0x780 
(arrow.so:arm64+0x4fc9dc4)
       #2 0x00011ac60c68 in 
std::__1::__function::__func<arrow::compute::(anonymous 
namespace)::GrouperFastImpl::Make(std::__1::vector<arrow::TypeHolder, 
std::__1::allocator<arrow::TypeHolder>> const&, 
arrow::compute::ExecContext*)::'lambda'(int, unsigned short const*, unsigned 
int const*, unsigned int*, unsigned short*, void*), 
std::__1::allocator<arrow::compute::(anonymous 
namespace)::GrouperFastImpl::Make(std::__1::vector<arrow::TypeHolder, 
std::__1::allocator<arrow::TypeHolder>> const&, 
arrow::compute::ExecContext*)::'lambda'(int, unsigned short const*, unsigned 
int const*, unsigned int*, unsigned short*, void*)>, void (int, unsigned short 
const*, unsigned int const*, unsigned int*, unsigned short*, 
void*)>::operator()(int&&, unsigned short const*&&, unsigned int const*&&, 
unsigned int*&&, unsigned short*&&, void*&&)+0x1cc (arrow.so:arm64+0x4ffcc68)
       #3 0x00011abddbcc in arrow::compute::SwissTable::run_comparisons(int, 
unsigned short const*, unsigned char const*, unsigned int const*, int*, 
unsigned short*, std::__1::function<void (int, unsigned short const*, unsigned 
int const*, unsigned int*, unsigned short*, void*)> const&, void*) const+0x2bc 
(arrow.so:arm64+0x4f79bcc)
       #4 0x00011abde1a8 in arrow::compute::SwissTable::find(int, unsigned int 
const*, unsigned char*, unsigned char const*, unsigned int*, 
arrow::util::TempVectorStack*, std::__1::function<void (int, unsigned short 
const*, unsigned int const*, unsigned int*, unsigned short*, void*)> const&, 
void*) const+0x294 (arrow.so:arm64+0x4f7a1a8)
       #5 0x00011ac5d50c in arrow::compute::(anonymous 
namespace)::GrouperFastImpl::ConsumeImpl(arrow::compute::ExecSpan 
const&)+0x1af0 (arrow.so:arm64+0x4ff950c)
       #6 0x00011ac55b38 in arrow::compute::(anonymous 
namespace)::GrouperFastImpl::Consume(arrow::compute::ExecSpan const&, long 
long, long long)+0x570 (arrow.so:arm64+0x4ff1b38)
       #7 0x0001170d3a84 in arrow::acero::aggregate::GroupByNode::Merge()+0x4d0 
(arrow.so:arm64+0x146fa84)
       #8 0x0001170d6fd8 in 
arrow::acero::aggregate::GroupByNode::OutputResult(bool)+0x4c8 
(arrow.so:arm64+0x1472fd8)
       #9 0x0001170d8808 in 
arrow::acero::aggregate::GroupByNode::InputReceived(arrow::acero::ExecNode*, 
arrow::compute::ExecBatch)+0xa8c (arrow.so:arm64+0x1474808)
       #10 0x0001172f9950 in 
arrow::acero::MapNode::InputReceived(arrow::acero::ExecNode*, 
arrow::compute::ExecBatch)+0x348 (arrow.so:arm64+0x1695950)
       #11 0x0001172f9950 in 
arrow::acero::MapNode::InputReceived(arrow::acero::ExecNode*, 
arrow::compute::ExecBatch)+0x348 (arrow.so:arm64+0x1695950)
       #12 0x000117384a08 in 
std::__1::__function::__func<arrow::acero::(anonymous 
namespace)::SourceNode::SliceAndDeliverMorsel(arrow::compute::ExecBatch 
const&)::'lambda'(), std::__1::allocator<arrow::acero::(anonymous 
namespace)::SourceNode::SliceAndDeliverMorsel(arrow::compute::ExecBatch 
const&)::'lambda'()>, arrow::Status ()>::operator()()+0xa78 
(arrow.so:arm64+0x1720a08)
       #13 0x000117323af4 in 
std::__1::__bind_return<arrow::detail::ContinueFuture, 
std::__1::tuple<arrow::Future<arrow::internal::Empty>, 
std::__1::function<arrow::Status ()>>, std::__1::tuple<>, 
__is_valid_bind_return<arrow::detail::ContinueFuture, 
std::__1::tuple<arrow::Future<arrow::internal::Empty>, 
std::__1::function<arrow::Status ()>>, std::__1::tuple<>>::value>::type 
std::__1::__bind<arrow::detail::ContinueFuture, 
arrow::Future<arrow::internal::Empty>&, std::__1::function<arrow::Status 
()>>::operator()[abi:ne190102]<>()+0xf0 (arrow.so:arm64+0x16bfaf4)
       #14 0x00011b6d9aac in arrow::internal::FnOnce<void ()>::operator()() 
&&+0x14c (arrow.so:arm64+0x5a75aac)
       #15 0x00011b6ec834 in void* 
std::__1::__thread_proxy[abi:ne190102]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct,
 std::__1::default_delete<std::__1::__thread_struct>>, 
arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_0>>(void*)+0x3c8 
(arrow.so:arm64+0x5a88834)
       #16 0x00010358a4a4 in asan_thread_start(void*)+0x4c 
(libclang_rt.asan_osx_dynamic.dylib:arm64e+0x3a4a4)
       #17 0x00019fefdc08 in _pthread_start+0x84 
(libsystem_pthread.dylib:arm64e+0x6c08)
       #18 0x00019fef8b7c in thread_start+0x4 
(libsystem_pthread.dylib:arm64e+0x1b7c)
   
   SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
/Users/ripley/R/packages/tests-SAN/arrow/tools/cpp/src/arrow/compute/row/compare_internal.cc:284:30
 
   
/Users/ripley/R/packages/tests-SAN/arrow/tools/cpp/src/arrow/acero/source_node.cc:76:
 An input buffer was poorly aligned.  This could lead to crashes or poor 
performance on some hardware.  Please ensure that all Acero sources generate 
aligned buffers, or change the unaligned buffer handling configuration to 
silence this warning.
   
/Users/ripley/R/packages/tests-SAN/arrow/tools/cpp/src/arrow/compute/light_array_internal.cc:618:25:
 runtime error: load of misaligned address 0x000124040601 for type 'const 
uint64_t *' (aka 'const unsigned long long *'), which requires 8 byte alignment
   0x000124040601: note: pointer points here
    00 00 00  41 42 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 
00 00 00  00 00 00 00 00
                 ^ 
       #0 0x00011abfeea8 in 
arrow::compute::ExecBatchBuilder::AppendSelected(std::__1::shared_ptr<arrow::ArrayData>
 const&, arrow::compute::ResizableArrayData*, int, unsigned short const*, 
arrow::MemoryPool*)::$_8::operator()(int, unsigned char const*, int) 
const+0x300 (arrow.so:arm64+0x4f9aea8)
       #1 0x00011abf5254 in 
arrow::compute::ExecBatchBuilder::AppendSelected(std::__1::shared_ptr<arrow::ArrayData>
 const&, arrow::compute::ResizableArrayData*, int, unsigned short const*, 
arrow::MemoryPool*)+0x3bfc (arrow.so:arm64+0x4f91254)
       #2 0x00011abf9fdc in 
arrow::compute::ExecBatchBuilder::AppendSelected(arrow::MemoryPool*, 
arrow::compute::ExecBatch const&, int, unsigned short const*, int, int 
const*)+0x364 (arrow.so:arm64+0x4f95fdc)
       #3 0x0001173bec5c in 
arrow::acero::JoinResultMaterialize::Append(arrow::compute::ExecBatch const&, 
int, unsigned short const*, unsigned int const*, unsigned int const*, 
int*)+0x1f4 (arrow.so:arm64+0x175ac5c)
       #4 0x0001173cd08c in arrow::acero::JoinProbeProcessor::OnNextBatch(long 
long, arrow::compute::ExecBatch const&, arrow::util::TempVectorStack*, 
std::__1::vector<arrow::compute::KeyColumnArray, 
std::__1::allocator<arrow::compute::KeyColumnArray>>*)+0x15f0 
(arrow.so:arm64+0x176908c)
       #5 0x0001173d41c4 in arrow::acero::SwissJoin::ProbeSingleBatch(unsigned 
long, arrow::compute::ExecBatch)+0x490 (arrow.so:arm64+0x17701c4)
       #6 0x0001172cb330 in 
arrow::acero::HashJoinNode::Init()::'lambda'(unsigned long, long 
long)::operator()(unsigned long, long long) const+0x1fc 
(arrow.so:arm64+0x1667330)
       #7 0x000117421278 in 
arrow::acero::TaskSchedulerImpl::ExecuteTask(unsigned long, int, long long, 
bool*)+0x210 (arrow.so:arm64+0x17bd278)
       #8 0x00011742b624 in 
std::__1::__function::__func<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned
 long, int)::$_0, 
std::__1::allocator<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned 
long, int)::$_0>, arrow::Status (unsigned long)>::operator()(unsigned 
long&&)+0x268 (arrow.so:arm64+0x17c7624)
       #9 0x0001173255d0 in 
std::__1::__function::__func<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status
 (unsigned long)>, std::__1::basic_string_view<char, 
std::__1::char_traits<char>>)::$_0, 
std::__1::allocator<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status
 (unsigned long)>, std::__1::basic_string_view<char, 
std::__1::char_traits<char>>)::$_0>, arrow::Status ()>::operator()()+0x150 
(arrow.so:arm64+0x16c15d0)
       #10 0x000117323af4 in 
std::__1::__bind_return<arrow::detail::ContinueFuture, 
std::__1::tuple<arrow::Future<arrow::internal::Empty>, 
std::__1::function<arrow::Status ()>>, std::__1::tuple<>, 
__is_valid_bind_return<arrow::detail::ContinueFuture, 
std::__1::tuple<arrow::Future<arrow::internal::Empty>, 
std::__1::function<arrow::Status ()>>, std::__1::tuple<>>::value>::type 
std::__1::__bind<arrow::detail::ContinueFuture, 
arrow::Future<arrow::internal::Empty>&, std::__1::function<arrow::Status 
()>>::operator()[abi:ne190102]<>()+0xf0 (arrow.so:arm64+0x16bfaf4)
       #11 0x00011b6d9aac in arrow::internal::FnOnce<void ()>::operator()() 
&&+0x14c (arrow.so:arm64+0x5a75aac)
       #12 0x00011b6ec834 in void* 
std::__1::__thread_proxy[abi:ne190102]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct,
 std::__1::default_delete<std::__1::__thread_struct>>, 
arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_0>>(void*)+0x3c8 
(arrow.so:arm64+0x5a88834)
       #13 0x00010358a4a4 in asan_thread_start(void*)+0x4c 
(libclang_rt.asan_osx_dynamic.dylib:arm64e+0x3a4a4)
       #14 0x00019fefdc08 in _pthread_start+0x84 
(libsystem_pthread.dylib:arm64e+0x6c08)
       #15 0x00019fef8b7c in thread_start+0x4 
(libsystem_pthread.dylib:arm64e+0x1b7c)
   
   SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
/Users/ripley/R/packages/tests-SAN/arrow/tools/cpp/src/arrow/compute/light_array_internal.cc:618:25
 
   
/Users/ripley/R/packages/tests-SAN/arrow/tools/cpp/src/arrow/compute/light_array_internal.cc:618:20:
 runtime error: store to misaligned address 0x000130020401 for type 'uint64_t 
*' (aka 'unsigned long long *'), which requires 8 byte alignment
   0x000130020401: note: pointer points here
    00 00 00  42 0a 02 30 01 00 00 00  00 00 00 00 00 00 00 00  40 bf 56 dd 6c 
67 02 00  c0 b0 d6 a9 c8
                 ^ 
       #0 0x00011abfee90 in 
arrow::compute::ExecBatchBuilder::AppendSelected(std::__1::shared_ptr<arrow::ArrayData>
 const&, arrow::compute::ResizableArrayData*, int, unsigned short const*, 
arrow::MemoryPool*)::$_8::operator()(int, unsigned char const*, int) 
const+0x2e8 (arrow.so:arm64+0x4f9ae90)
       #1 0x00011abf5254 in 
arrow::compute::ExecBatchBuilder::AppendSelected(std::__1::shared_ptr<arrow::ArrayData>
 const&, arrow::compute::ResizableArrayData*, int, unsigned short const*, 
arrow::MemoryPool*)+0x3bfc (arrow.so:arm64+0x4f91254)
       #2 0x00011abf9fdc in 
arrow::compute::ExecBatchBuilder::AppendSelected(arrow::MemoryPool*, 
arrow::compute::ExecBatch const&, int, unsigned short const*, int, int 
const*)+0x364 (arrow.so:arm64+0x4f95fdc)
       #3 0x0001173bd6c8 in 
arrow::acero::JoinResultMaterialize::AppendProbeOnly(arrow::compute::ExecBatch 
const&, int, unsigned short const*, int*)+0x1f0 (arrow.so:arm64+0x17596c8)
       #4 0x0001173cd60c in arrow::acero::JoinProbeProcessor::OnNextBatch(long 
long, arrow::compute::ExecBatch const&, arrow::util::TempVectorStack*, 
std::__1::vector<arrow::compute::KeyColumnArray, 
std::__1::allocator<arrow::compute::KeyColumnArray>>*)+0x1b70 
(arrow.so:arm64+0x176960c)
       #5 0x0001173d41c4 in arrow::acero::SwissJoin::ProbeSingleBatch(unsigned 
long, arrow::compute::ExecBatch)+0x490 (arrow.so:arm64+0x17701c4)
       #6 0x0001172cb330 in 
arrow::acero::HashJoinNode::Init()::'lambda'(unsigned long, long 
long)::operator()(unsigned long, long long) const+0x1fc 
(arrow.so:arm64+0x1667330)
       #7 0x000117421278 in 
arrow::acero::TaskSchedulerImpl::ExecuteTask(unsigned long, int, long long, 
bool*)+0x210 (arrow.so:arm64+0x17bd278)
       #8 0x00011742b624 in 
std::__1::__function::__func<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned
 long, int)::$_0, 
std::__1::allocator<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned 
long, int)::$_0>, arrow::Status (unsigned long)>::operator()(unsigned 
long&&)+0x268 (arrow.so:arm64+0x17c7624)
       #9 0x0001173255d0 in 
std::__1::__function::__func<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status
 (unsigned long)>, std::__1::basic_string_view<char, 
std::__1::char_traits<char>>)::$_0, 
std::__1::allocator<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status
 (unsigned long)>, std::__1::basic_string_view<char, 
std::__1::char_traits<char>>)::$_0>, arrow::Status ()>::operator()()+0x150 
(arrow.so:arm64+0x16c15d0)
       #10 0x000117323af4 in 
std::__1::__bind_return<arrow::detail::ContinueFuture, 
std::__1::tuple<arrow::Future<arrow::internal::Empty>, 
std::__1::function<arrow::Status ()>>, std::__1::tuple<>, 
__is_valid_bind_return<arrow::detail::ContinueFuture, 
std::__1::tuple<arrow::Future<arrow::internal::Empty>, 
std::__1::function<arrow::Status ()>>, std::__1::tuple<>>::value>::type 
std::__1::__bind<arrow::detail::ContinueFuture, 
arrow::Future<arrow::internal::Empty>&, std::__1::function<arrow::Status 
()>>::operator()[abi:ne190102]<>()+0xf0 (arrow.so:arm64+0x16bfaf4)
       #11 0x00011b6d9aac in arrow::internal::FnOnce<void ()>::operator()() 
&&+0x14c (arrow.so:arm64+0x5a75aac)
       #12 0x00011b6ec834 in void* 
std::__1::__thread_proxy[abi:ne190102]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct,
 std::__1::default_delete<std::__1::__thread_struct>>, 
arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_0>>(void*)+0x3c8 
(arrow.so:arm64+0x5a88834)
       #13 0x00010358a4a4 in asan_thread_start(void*)+0x4c 
(libclang_rt.asan_osx_dynamic.dylib:arm64e+0x3a4a4)
       #14 0x00019fefdc08 in _pthread_start+0x84 
(libsystem_pthread.dylib:arm64e+0x6c08)
       #15 0x00019fef8b7c in thread_start+0x4 
(libsystem_pthread.dylib:arm64e+0x1b7c)
   
   SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
/Users/ripley/R/packages/tests-SAN/arrow/tools/cpp/src/arrow/compute/light_array_internal.cc:618:20
 
   
/Users/ripley/R/packages/tests-SAN/arrow/tools/cpp/src/arrow/compute/row/compare_internal.cc:284:30:
 runtime error: load of misaligned address 0x00010c022f41 for type 'const 
uint64_t *' (aka 'const unsigned long long *'), which requires 8 byte alignment
   0x00010c022f41: note: pointer points here
    00 00 00  61 61 61 62 62 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 
00 00 00  00 00 00 00 00
                 ^ 
       #0 0x00011ac3f180 in void 
arrow::compute::KeyCompare::CompareVarBinaryColumnToRowHelper<false, 
true>(unsigned int, unsigned int, unsigned int, unsigned short const*, unsigned 
int const*, arrow::compute::LightContext*, arrow::compute::KeyColumnArray 
const&, arrow::compute::RowTableImpl const&, unsigned char*)+0x9fc 
(arrow.so:arm64+0x4fdb180)
       #1 0x00011ac2de44 in 
arrow::compute::KeyCompare::CompareColumnsToRows(unsigned int, unsigned short 
const*, unsigned int const*, arrow::compute::LightContext*, unsigned int*, 
unsigned short*, std::__1::vector<arrow::compute::KeyColumnArray, 
std::__1::allocator<arrow::compute::KeyColumnArray>> const&, 
arrow::compute::RowTableImpl const&, bool, unsigned char*)+0x800 
(arrow.so:arm64+0x4fc9e44)
       #2 0x00011739ec00 in 
arrow::acero::RowArray::Compare(arrow::compute::ExecBatch const&, int, int, 
int, unsigned short const*, unsigned int const*, unsigned int*, unsigned 
short*, arrow::util::TempVectorStack*, 
std::__1::vector<arrow::compute::KeyColumnArray, 
std::__1::allocator<arrow::compute::KeyColumnArray>>&, unsigned char*)+0x168 
(arrow.so:arm64+0x173ac00)
       #3 0x0001173ae7dc in 
arrow::acero::SwissTableWithKeys::EqualCallback(int, unsigned short const*, 
unsigned int const*, unsigned int*, unsigned short*, void*)+0xc28 
(arrow.so:arm64+0x174a7dc)
       #4 0x00011abddb08 in arrow::compute::SwissTable::run_comparisons(int, 
unsigned short const*, unsigned char const*, unsigned int const*, int*, 
unsigned short*, std::__1::function<void (int, unsigned short const*, unsigned 
int const*, unsigned int*, unsigned short*, void*)> const&, void*) const+0x1f8 
(arrow.so:arm64+0x4f79b08)
       #5 0x00011abde114 in arrow::compute::SwissTable::find(int, unsigned int 
const*, unsigned char*, unsigned char const*, unsigned int*, 
arrow::util::TempVectorStack*, std::__1::function<void (int, unsigned short 
const*, unsigned int const*, unsigned int*, unsigned short*, void*)> const&, 
void*) const+0x200 (arrow.so:arm64+0x4f7a114)
       #6 0x0001173b1acc in 
arrow::acero::SwissTableWithKeys::Map(arrow::acero::SwissTableWithKeys::Input*, 
bool, unsigned int const*, unsigned char*, unsigned int*)+0x728 
(arrow.so:arm64+0x174dacc)
       #7 0x0001173cc474 in arrow::acero::JoinProbeProcessor::OnNextBatch(long 
long, arrow::compute::ExecBatch const&, arrow::util::TempVectorStack*, 
std::__1::vector<arrow::compute::KeyColumnArray, 
std::__1::allocator<arrow::compute::KeyColumnArray>>*)+0x9d8 
(arrow.so:arm64+0x1768474)
       #8 0x0001173d41c4 in arrow::acero::SwissJoin::ProbeSingleBatch(unsigned 
long, arrow::compute::ExecBatch)+0x490 (arrow.so:arm64+0x17701c4)
       #9 0x0001172cb330 in 
arrow::acero::HashJoinNode::Init()::'lambda'(unsigned long, long 
long)::operator()(unsigned long, long long) const+0x1fc 
(arrow.so:arm64+0x1667330)
       #10 0x000117421278 in 
arrow::acero::TaskSchedulerImpl::ExecuteTask(unsigned long, int, long long, 
bool*)+0x210 (arrow.so:arm64+0x17bd278)
       #11 0x00011742b624 in 
std::__1::__function::__func<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned
 long, int)::$_0, 
std::__1::allocator<arrow::acero::TaskSchedulerImpl::ScheduleMore(unsigned 
long, int)::$_0>, arrow::Status (unsigned long)>::operator()(unsigned 
long&&)+0x268 (arrow.so:arm64+0x17c7624)
       #12 0x0001173255d0 in 
std::__1::__function::__func<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status
 (unsigned long)>, std::__1::basic_string_view<char, 
std::__1::char_traits<char>>)::$_0, 
std::__1::allocator<arrow::acero::QueryContext::ScheduleTask(std::__1::function<arrow::Status
 (unsigned long)>, std::__1::basic_string_view<char, 
std::__1::char_traits<char>>)::$_0>, arrow::Status ()>::operator()()+0x150 
(arrow.so:arm64+0x16c15d0)
       #13 0x000117323af4 in 
std::__1::__bind_return<arrow::detail::ContinueFuture, 
std::__1::tuple<arrow::Future<arrow::internal::Empty>, 
std::__1::function<arrow::Status ()>>, std::__1::tuple<>, 
__is_valid_bind_return<arrow::detail::ContinueFuture, 
std::__1::tuple<arrow::Future<arrow::internal::Empty>, 
std::__1::function<arrow::Status ()>>, std::__1::tuple<>>::value>::type 
std::__1::__bind<arrow::detail::ContinueFuture, 
arrow::Future<arrow::internal::Empty>&, std::__1::function<arrow::Status 
()>>::operator()[abi:ne190102]<>()+0xf0 (arrow.so:arm64+0x16bfaf4)
       #14 0x00011b6d9aac in arrow::internal::FnOnce<void ()>::operator()() 
&&+0x14c (arrow.so:arm64+0x5a75aac)
       #15 0x00011b6ec834 in void* 
std::__1::__thread_proxy[abi:ne190102]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct,
 std::__1::default_delete<std::__1::__thread_struct>>, 
arrow::internal::ThreadPool::LaunchWorkersUnlocked(int)::$_0>>(void*)+0x3c8 
(arrow.so:arm64+0x5a88834)
       #16 0x00010358a4a4 in asan_thread_start(void*)+0x4c 
(libclang_rt.asan_osx_dynamic.dylib:arm64e+0x3a4a4)
       #17 0x00019fefdc08 in _pthread_start+0x84 
(libsystem_pthread.dylib:arm64e+0x6c08)
       #18 0x00019fef8b7c in thread_start+0x4 
(libsystem_pthread.dylib:arm64e+0x1b7c)
   
   SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior 
/Users/ripley/R/packages/tests-SAN/arrow/tools/cpp/src/arrow/compute/row/compare_internal.cc:284:30
 
   [ FAIL 0 | WARN 0 | SKIP 94 | PASS 6631 ]
   
   ══ Skipped tests (94) 
══════════════════════════════════════════════════════════
   • ARROW-17043 (date/datetime arithmetic with integers) (1):
     'test-compute-arith.R:132:3'
   • ARROW-18101 (1): 'test-udf.R:302:3'
   • Arrow C++ not built with gcs (1): 'test-gcs.R:18:1'
   • Arrow C++ not built with gzip (9): 'test-compressed.R:21:3',
     'test-compressed.R:36:3', 'test-compressed.R:52:3', 'test-csv.R:174:3',
     'test-csv.R:608:3', 'test-dataset-csv.R:131:3', 'test-feather.R:209:3',
     'test-parquet.R:68:3', 'test-parquet.R:310:3'
   • Arrow C++ not built with s3 (2): 'test-s3-minio.R:18:1', 'test-s3.R:18:1'
   • Arrow C++ not built with substrait (1): 'test-query-engine.R:96:3'
   • Flight server is not running (1): 'test-python-flight.R:84:5'
   • GH-33708: timestamp_parsers don't appear to be working properly (1):
     'test-dataset-csv.R:585:3'
   • Implement more aggressive implicit casting for scalars (ARROW-11402) (1):
     'test-dataset-dplyr.R:96:3'
   • Ingest_POSIXct only implemented for REALSXP (1): 'test-Array.R:297:5'
   • Need halffloat support: https://issues.apache.org/jira/browse/ARROW-3802 
(1):
     'test-Array.R:420:3'
   • On CRAN (65): 'test-Array.R:209:3', 'test-Array.R:216:3',
     'test-Array.R:1113:3', 'test-Array.R:1178:3', 'test-Array.R:1216:3',
     'test-Array.R:1248:3', 'test-Array.R:1300:3', 'test-RecordBatch.R:516:3',
     'test-RecordBatch.R:525:3', 'test-Table.R:507:3',
     'test-chunked-array.R:120:3', 'test-csv.R:729:3',
     'test-dataset-dplyr.R:326:3', 'test-dataset-write.R:591:3',
     'test-dataset.R:866:3', 'test-dplyr-across.R:229:3',
     'test-dplyr-eval.R:59:5', 'test-dplyr-filter.R:290:3',
     'test-dplyr-funcs-conditional.R:23:1', 'test-dplyr-funcs-datetime.R:26:1',
     'test-dplyr-funcs-math.R:22:1', 'test-dplyr-funcs-string.R:21:1',
     'test-dplyr-funcs-type.R:24:1', 'test-dplyr-funcs.R:19:1',
     'test-dplyr-glimpse.R:22:3', 'test-dplyr-glimpse.R:28:3',
     'test-dplyr-glimpse.R:34:3', 'test-dplyr-glimpse.R:40:3',
     'test-dplyr-glimpse.R:46:3', 'test-dplyr-glimpse.R:70:3',
     'test-dplyr-glimpse.R:88:3', 'test-dplyr-glimpse.R:96:3',
     'test-dplyr-join.R:125:3', 'test-dplyr-mutate.R:155:3',
     'test-dplyr-mutate.R:513:3', 'test-dplyr-query.R:626:3',
     'test-dplyr-slice.R:107:3', 'test-dplyr-summarize.R:328:3',
     'test-dplyr-summarize.R:835:3', 'test-dplyr-summarize.R:1287:3',
     'test-duckdb.R:19:1', 'test-extension.R:43:3', 'test-extension.R:214:3',
     'test-extra-package-roundtrip.R:18:1', 'test-feather.R:143:3',
     'test-feather.R:262:3', 'test-feather.R:332:3', 'test-filesystem.R:138:3',
     'test-filesystem.R:146:3', 'test-filesystem.R:155:3',
     'test-filesystem.R:167:3', 'test-filesystem.R:178:3',
     'test-filesystem.R:193:3', 'test-io.R:71:3', 'test-ipc-stream.R:44:3',
     'test-ipc-stream.R:48:3', 'test-parquet.R:116:3', 'test-parquet.R:485:3',
     'test-python.R:19:3', 'test-safe-call-into-r.R:21:3',
     'test-safe-call-into-r.R:36:3', 'test-safe-call-into-r.R:51:3',
     'test-type.R:61:3', 'test-udf.R:62:3', 'test-util.R:37:3'
   • Parquet test data missing (1): 'test-parquet.R:473:3'
   • TODO (ARROW-16630): make sure BottomK can handle NA ordering (1):
     'test-dplyr-collapse.R:182:3'
   • TODO: (if anyone uses RangeEquals) (1): 'test-Array.R:139:3'
   • Table with 0 cols doesn't know how many rows it should have (1):
     'test-Table.R:114:3'
   • Work around masking of data type functions (ARROW-12322) (1):
     'test-type.R:116:3'
   • environment variable ARROW_LARGE_MEMORY_TESTS (1): 'test-Table.R:669:3'
   • https://issues.apache.org/jira/browse/ARROW-7653 (1): 
'test-dataset.R:518:3'
   • pyarrow not available for testing (1): 'test-python.R:38:1'
   • tolower(Sys.info()[["sysname"]]) != "windows" is TRUE (1):
     'test-compressed.R:27:3'
   
   [ FAIL 0 | WARN 0 | SKIP 94 | PASS 6631 ]
   > 
   > proc.time()
      user  system elapsed 
    85.378  15.202 101.602 
   ```
   
   </details>
   
   
   Definition of the job: https://www.stats.ox.ac.uk/pub/bdr/M1-SAN/README.txt
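
For anyone triaging: all four reports in the log are 8-byte (`uint64_t`) loads or stores through pointers at odd addresses (for example `0x000150040c01`), at `compare_internal.cc:284` and `light_array_internal.cc:618`. One common way to make that kind of access well-defined is to go through `std::memcpy` instead of dereferencing a cast pointer. The following is only a minimal sketch of that general technique, not the actual Arrow code or a proposed patch; the helper names (`LoadU64Unaligned`, `StoreU64Unaligned`, `Words8Equal`, `DemoOddOffsetRoundTrip`) are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not Arrow's code): read 8 bytes from an address
// that may not be 8-byte aligned. memcpy has no alignment requirement,
// so this is well-defined where `*reinterpret_cast<const uint64_t*>(p)`
// would not be.
inline uint64_t LoadU64Unaligned(const uint8_t* p) {
  uint64_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// Hypothetical helper: write 8 bytes to a possibly misaligned address.
inline void StoreU64Unaligned(uint8_t* p, uint64_t v) {
  std::memcpy(p, &v, sizeof(v));
}

// Hypothetical helper: compare two 8-byte words that may each start at
// any byte offset, e.g. inside variable-length binary data.
inline bool Words8Equal(const uint8_t* a, const uint8_t* b) {
  return LoadU64Unaligned(a) == LoadU64Unaligned(b);
}

// Round-trips a value through an odd address, similar in shape to the
// addresses in the sanitizer reports (they end in ...01 or ...41).
inline bool DemoOddOffsetRoundTrip() {
  alignas(8) uint8_t buf[24] = {};
  uint8_t* odd = buf + 1;  // deliberately misaligned for uint64_t
  StoreU64Unaligned(odd, 0x0102030405060708ULL);
  assert(buf[0] == 0);  // the byte before the word is untouched
  return LoadU64Unaligned(odd) == 0x0102030405060708ULL;
}
```

Compilers typically lower a fixed-size `memcpy` like this to a single load or store on targets that permit unaligned access, so the change usually has no runtime cost on arm64 or x86-64.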
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
