Query crash with 15.5 on debian bookworm/armv8

2023-12-25 Thread Clemens Eisserer
Hi,

I've just updated my Raspberry Pi 3 from postgresql-13.3 on
bullseye/armv6 to postgresql-15.5 on debian-bookworm/armv8.

However, after the upgrade I experience reproducible crashes when querying
the following table:

CREATE TABLE public.smartmeter (
   leistungsfaktor real,
   momentanleistung integer,
   spannungl1 real,
   spannungl2 real,
   spannungl3 real,
   stroml1 real,
   stroml2 real,
   stroml3 real,
   wirkenergien real,
   wirkenergiep real,
   ts timestamp with time zone NOT NULL
);
CREATE INDEX smartmeter_ts_idx ON public.smartmeter USING brin (ts);

with the following query:
SELECT floor(extract(epoch from ts)/60)*60 AS "time", AVG(spannungL1)
as l1, AVG(spannungL2) as l2, AVG(spannungL3) as l3 FROM smartmeter
WHERE ts BETWEEN '2023-12-01T13:01:30.514Z' AND
'2023-12-25T19:01:30.514Z' GROUP BY time order by time;

Any ideas how to diagnose the issue further?
Is this a known problem?

Thanks & best regards, Clemens

Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
0x007ff6eb7fe0 in __GI_epoll_pwait (epfd=4, events=0xea2d20,
maxevents=1, timeout=timeout@entry=-1, set=set@entry=0x0) at
../sysdeps/unix/sysv/linux/epoll_pwait.c:40
40  ../sysdeps/unix/sysv/linux/epoll_pwait.c: No such file or directory.
(gdb) c
Continuing.

Program received signal SIGUSR1, User defined signal 1.
0x007ff6ea7f58 in __libc_pread64 (fd=25,
buf=buf@entry=0x7feb754880, count=count@entry=8192,
offset=offset@entry=16384) at ../sysdeps/unix/sysv/linux/pread64.c:25
25  ../sysdeps/unix/sysv/linux/pread64.c: No such file or directory.
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x007fe5e6a9f0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
(gdb) bt full
#0  0x007fe5e6a9f0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#1  0x007fe59bb49c in llvm::raw_ostream::write(char const*,
unsigned long) () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#2  0x007fe6d71048 in
llvm::MCContext::createTempSymbol(llvm::Twine const&, bool) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#3  0x007fe6d713f0 in llvm::MCContext::createTempSymbol() () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#4  0x007fe6d95c6c in
llvm::MCObjectStreamer::emitCFIEndProcImpl(llvm::MCDwarfFrameInfo&) ()
from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#5  0x007fe619f4c0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#6  0x007fe6180b6c in llvm::AsmPrinter::emitFunctionBody() () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#7  0x007fe72a4ba4 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#8  0x007fe5d3122c in
llvm::MachineFunctionPass::runOnFunction(llvm::Function&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#9  0x007fe5b14390 in
llvm::FPPassManager::runOnFunction(llvm::Function&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#10 0x007fe5b1af70 in
llvm::FPPassManager::runOnModule(llvm::Module&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#11 0x007fe5b14d98 in
llvm::legacy::PassManagerImpl::run(llvm::Module&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#12 0x007fe7187d70 in
llvm::orc::SimpleCompiler::operator()(llvm::Module&) () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#13 0x007fe71dc138 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#14 0x007fe71dbf44 in
llvm::orc::IRCompileLayer::emit(std::unique_ptr >,
llvm::orc::ThreadSafeModule) ()
  from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#15 0x007fe71dc634 in
llvm::orc::IRTransformLayer::emit(std::unique_ptr >,
llvm::orc::ThreadSafeModule) ()
  from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#16 0x007fe71dc634 in
llvm::orc::IRTransformLayer::emit(std::unique_ptr >,
llvm::orc::ThreadSafeModule) ()
  from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#17 0x007fe71e2648 in
llvm::orc::BasicIRLayerMaterializationUnit::materialize(std::unique_ptr >) ()
  from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#18 0x007fe7199c18 in llvm::orc::MaterializationTask::run() ()
from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#19 0x007fe71a4ea0 in ?? () from /lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#20 0x007fe719bad0 in
llvm::orc::ExecutionSession::dispatchOutstandingMUs() () from
/lib/aarch64-linux-gnu/libLLVM-14.so.1
No symbol table info available.
#21 0x007fe719ea84 in
llvm::orc::ExecutionSession::OL_completeLookup(std::unique_ptr >,
std::s

Re: Query crash with 15.5 on debian bookworm/armv8

2023-12-25 Thread Clemens Eisserer
Hi Adrian,

> How did you upgrade?

A fresh install based on the "Raspberry Pi OS Lite" image (which is based
on debian bookworm), with the data migrated via pg_dumpall & psql -f.




Re: Query crash with 15.5 on debian bookworm/armv8

2023-12-25 Thread Clemens Eisserer
> Does that install Postgres as part of the image or did you get it from
> somewhere else?

I installed it via "apt-get install postgresql" and it downloaded
postgresql-15_15.5-0+deb12u1_arm64.deb - which seems to be the current
package shipped with debian bookworm for arm64:
https://packages.debian.org/bookworm/arm64/postgresql-15/download

best regards, Clemens




Re: Query crash with 15.5 on debian bookworm/armv8

2023-12-26 Thread Clemens Eisserer
Hi Tom,

> FWIW, since this crash is inside LLVM you could presumably dodge the bug
> by setting "jit" to off.

Thanks, this indeed solved the crash.
Just to make sure this crash doesn't have anything to do with my
setup/config (I'd changed quite a few settings in postgresql.conf),
I gave it a try on a fresh bookworm install and it also crashed immediately.
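
For reference, a minimal sketch of the workaround, assuming it is applied per session before running the query from the first message. The persistent variant in the comments is an assumption about how one might keep the setting across sessions; it was not part of what was tested in this thread.

```sql
-- Disable JIT compilation for the current session only.
SET jit = off;

-- The query from the original report, now executed without LLVM.
SELECT floor(extract(epoch from ts)/60)*60 AS "time",
       AVG(spannungL1) AS l1, AVG(spannungL2) AS l2, AVG(spannungL3) AS l3
FROM smartmeter
WHERE ts BETWEEN '2023-12-01T13:01:30.514Z' AND '2023-12-25T19:01:30.514Z'
GROUP BY time ORDER BY time;

-- Optional (assumption): persist the setting server-wide until a fixed
-- package is available. Requires superuser privileges.
-- ALTER SYSTEM SET jit = off;
-- SELECT pg_reload_conf();
```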

> As for an actual fix, perhaps a newer version of LLVM is needed?
> I don't see a problem testing this query on my RPI with Ubuntu 23.10
> (LLVM 16).

I also gave Ubuntu 23.10 a try (15.4 built with llvm-15) and it worked
as expected, explain analyze even mentioned the JIT was active.
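
A sketch of how the JIT check can be repeated, assuming the same table and query as in the first message. When JIT was used, the plan output ends with a "JIT:" section listing the number of compiled functions and the timings; when it was not used, that section is absent.

```sql
-- Confirm that JIT is enabled in this session.
SHOW jit;

-- Run the query and print the executed plan, including any JIT section.
EXPLAIN (ANALYZE)
SELECT floor(extract(epoch from ts)/60)*60 AS "time",
       AVG(spannungL1) AS l1, AVG(spannungL2) AS l2, AVG(spannungL3) AS l3
FROM smartmeter
WHERE ts BETWEEN '2023-12-01T13:01:30.514Z' AND '2023-12-25T19:01:30.514Z'
GROUP BY time ORDER BY time;
```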

I've filed a debian bug report with a link to this discussion and a
plea to build postgresql against llvm >= 15:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1059476

To be honest I don't know why llvm-14 was chosen, as 15 is also
available in bookworm.

Thanks & best regards, Clemens




Re: Query crash with 15.5 on debian bookworm/armv8

2023-12-28 Thread Clemens Eisserer
Hi Thomas,

In case it is helpful for analyzing what's causing the crash, I've
uploaded the db dump I experience the crash on to:
https://drive.google.com/file/d/1H9Y3FaoBafakHwXhpT3s8NNQ1UNJLpJY/view?usp=sharing

The only steps I had to do to trigger the crash were:
- Start with a fresh Raspberry Pi OS bookworm 64-bit image
- install postgresql (packages are pulled from debian and also match
debian's md5 sums so I guess there should be no difference caused by
using raspberry pi os base image)
- import the linked db export with psql -f (I had to generate the de_AT locale first)
- execute the query

Best regards, Clemens

On Tue, 26 Dec 2023 at 23:16, Thomas Munro wrote:
>
> On Wed, Dec 27, 2023 at 5:17 AM Clemens Eisserer wrote:
> > > FWIW, since this crash is inside LLVM you could presumably dodge the bug
> > > by setting "jit" to off.
> >
> > Thanks, this indeed solved the crash.
> > Just to make sure this crash doesn't have anything to do with my
> > setup/config (I'd changed quite a few settings in postgresql.conf),
> > I gave it a try on a fresh bookworm install and it also crashed immediately.
> >
> > > As for an actual fix, perhaps a newer version of LLVM is needed?
> > > I don't see a problem testing this query on my RPI with Ubuntu 23.10
> > > (LLVM 16).
> >
> > I also gave Ubuntu 23.10 a try (15.4 built with llvm-15) and it worked
> > as expected, explain analyze even mentioned the JIT was active.
>
> I can't reproduce this on LLVM 14 on an aarch64 Mac FWIW (after
> setting jit_*_cost to 0, as required since the table is empty).
>
> > I've filed a debian bug report with a link to this discussion and a
> > plea to build postgresql against llvm >= 15:
> > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1059476
>
> I doubt they'll change that, and in any case we'll need to get to the
> bottom of this.  Perhaps an assertion build of LLVM will fail in some
> illuminating internal assertion?  Unfortunately it's a non-trivial
> business to get a debug build of LLVM going (it takes oodles of disk
> and CPU and a few confusing-to-me steps)...
>
> . o O ( It would be wonderful if assertion-enabled packages were
> readily available for a common platform like Debian.  I've finally
> been spurred on to reach out to the maintainer of apt.llvm.org to ask
> about that.  It'd also be very handy for automated next-version
> monitoring. )
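
For anyone trying to reproduce this on an empty or small table, the "jit_*_cost to 0" step mentioned in the quoted reply could look roughly like the sketch below. It assumes that the three cost thresholds are what was meant; with all of them at zero the planner uses JIT (including inlining and optimization) regardless of the estimated query cost.

```sql
-- Force JIT compilation even for cheap queries (session-level settings).
SET jit = on;
SET jit_above_cost = 0;
SET jit_inline_above_cost = 0;
SET jit_optimize_above_cost = 0;

-- Re-run the query; on an affected system this is where the crash
-- would be expected to occur.
SELECT floor(extract(epoch from ts)/60)*60 AS "time",
       AVG(spannungL1) AS l1, AVG(spannungL2) AS l2, AVG(spannungL3) AS l3
FROM smartmeter
WHERE ts BETWEEN '2023-12-01T13:01:30.514Z' AND '2023-12-25T19:01:30.514Z'
GROUP BY time ORDER BY time;
```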