Re: ZFS deadlock in 14

2023-08-24 Thread Alexander Motin
Martin, The PR was just merged to upstream master. Merge to zfs-2.2-release should follow shortly: https://github.com/openzfs/zfs/pull/15204 , same as some other 2.2 fixes: https://github.com/openzfs/zfs/pull/15205 . Can't wait to get back in sync with ZFS master in FreeBSD main. ;) On 22.0

Re: ZFS deadlock in 14

2023-08-24 Thread Mark Millard
On Aug 23, 2023, at 13:37, Mark Millard wrote: > > On Aug 23, 2023, at 11:40, Alexander Motin wrote: > >> On 22.08.2023 14:24, Mark Millard wrote: >>> Alexander Motin wrote on >>> Date: Tue, 22 Aug 2023 16:18:12 UTC : I am waiting for final test results from George Wilson and then will >>

Re: ZFS deadlock in 14

2023-08-23 Thread Mark Millard
On Aug 23, 2023, at 11:40, Alexander Motin wrote: > On 22.08.2023 14:24, Mark Millard wrote: >> Alexander Motin wrote on >> Date: Tue, 22 Aug 2023 16:18:12 UTC : >>> I am waiting for final test results from George Wilson and then will >>> request quick merge of both to zfs-2.2-release branch. Un

Re: ZFS deadlock in 14

2023-08-23 Thread Alexander Motin
On 22.08.2023 14:24, Mark Millard wrote: Alexander Motin wrote on Date: Tue, 22 Aug 2023 16:18:12 UTC : I am waiting for final test results from George Wilson and then will request quick merge of both to zfs-2.2-release branch. Unfortunately there are still not many reviewers for the PR, since

Re: ZFS deadlock in 14

2023-08-22 Thread Mark Millard
Alexander Motin wrote on Date: Tue, 22 Aug 2023 16:18:12 UTC : > I am waiting for final test results from George Wilson and then will > request quick merge of both to zfs-2.2-release branch. Unfortunately > there are still not many reviewers for the PR, since the code is not > trivial, but at

Re: ZFS deadlock in 14

2023-08-22 Thread Alexander Motin
Hi Martin, I am waiting for final test results from George Wilson and then will request quick merge of both to zfs-2.2-release branch. Unfortunately there are still not many reviewers for the PR, since the code is not trivial, but at least with the test reports Brian Behlendorf and Mark Mayb

Re: ZFS deadlock in 14

2023-08-22 Thread Martin Matuska
Hi Alexander, as 15107 is a prerequisite for 15122, would it be possible to have https://github.com/openzfs/zfs/pull/15107 merged into the OpenZFS zfs-2.2-release branch (and of course later 15122)? If the patches help I can cherry-pick them into main. Cheers, mm Alexander Motin wrote:

Re: ZFS deadlock in 14

2023-08-20 Thread Mark Millard
Dag-Erling Smørgrav wrote on Date: Sun, 20 Aug 2023 13:00:27 UTC : > Alexander Motin writes: > > Unfortunately I think the current code in main should still suffer > > from this specific deadlock. cd25b0f740 fixes some deadlocks in this > > area, maybe that is why you are getting issues less of

Re: ZFS deadlock in 14 [USE_TMPFS=no poudriere messed up from the start, lots of "vlruwk"]

2023-08-19 Thread Mark Millard
On Aug 19, 2023, at 16:27, Mark Millard wrote: > On Aug 19, 2023, at 15:41, Mark Millard wrote: > >> On Aug 19, 2023, at 13:41, Mark Millard wrote: >> >>> [I forgot to adjust USE_TMPFS for the purpose of the test. >>> So I'll later be starting over.] >>> >>> . . . >> >> I finally got around

Re: ZFS deadlock in 14 [USE_TMPFS=no poudriere messed up from the start, lots of "vlruwk"]

2023-08-19 Thread Mark Millard
On Aug 19, 2023, at 15:41, Mark Millard wrote: > On Aug 19, 2023, at 13:41, Mark Millard wrote: > >> [I forgot to adjust USE_TMPFS for the purpose of the test. >> So I'll later be starting over.] >> >> . . . > > I finally got around to starting a from-scratch bulk -a > again (based on USE_TMP

Re: ZFS deadlock in 14 [USE_TMPFS=no poudriere messed up from the start, lots of "vlruwk"]

2023-08-19 Thread Mark Millard
On Aug 19, 2023, at 13:41, Mark Millard wrote: > [I forgot to adjust USE_TMPFS for the purpose of the test. > So I'll later be starting over.] > > . . . I finally got around to starting a from-scratch bulk -a again (based on USE_TMPFS=no this time). This is with 15107.patch and 15122.patch appl

Re: ZFS deadlock in 14

2023-08-19 Thread Mark Millard
[I forgot to adjust USE_TMPFS for the purpose of the test. So I'll later be starting over.] On Aug 19, 2023, at 12:18, Mark Millard wrote: > On Aug 19, 2023, at 11:40, Mark Millard wrote: > >> We will see how long the following high load average bulk -a >> configuration survives a build attemp

Re: ZFS deadlock in 14

2023-08-19 Thread Mark Millard
On Aug 19, 2023, at 11:40, Mark Millard wrote: > We will see how long the following high load average bulk -a > configuration survives a build attempt, using a non-debug kernel > for this test. > > I've applied: > > # fetch -o- https://github.com/openzfs/zfs/pull/15107.patch | git -C > /usr/ma

Re: ZFS deadlock in 14

2023-08-19 Thread Mark Millard
We will see how long the following high load average bulk -a configuration survives a build attempt, using a non-debug kernel for this test. I've applied: # fetch -o- https://github.com/openzfs/zfs/pull/15107.patch | git -C /usr/main-src/ am --dir=sys/contrib/openzfs -
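The patch-application step quoted above can be sketched as a small dry-run helper. This is only an illustration: the function names `pr_patch_url` and `show_apply_cmd` are hypothetical, the script prints the pipeline instead of running it, and because the archived command is cut off after `--dir=sys/contrib/openzfs`, the sketch stops at the visible part and spells the option out as `--directory`.

```shell
#!/bin/sh
# Dry-run sketch; helper names are hypothetical and nothing is fetched or applied.

# Build the .patch URL that GitHub serves for an OpenZFS pull request.
pr_patch_url() {
    printf 'https://github.com/openzfs/zfs/pull/%s.patch\n' "$1"
}

# Print the fetch | git am pipeline for one pull request.
# $1 = PR number, $2 = FreeBSD source tree (the message uses /usr/main-src/).
show_apply_cmd() {
    printf 'fetch -o- %s | git -C %s am --directory=sys/contrib/openzfs\n' \
        "$(pr_patch_url "$1")" "$2"
}

# Elsewhere in the thread 15107 is described as a prerequisite for 15122,
# so it is listed first.
for pr in 15107 15122; do
    show_apply_cmd "$pr" /usr/main-src/
done
```

Printing the commands first makes it easy to confirm the order and the target tree before anything touches the source checkout.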

Re: ZFS deadlock in 14

2023-08-19 Thread Alexander Motin
On 18.08.2023 18:34, Dag-Erling Smørgrav wrote: Dag-Erling Smørgrav writes: Plot twist: c47116e909 _without_ the patches also appears to be working fine. The last kernel I know for sure deadlocks is b36f469a15, so I'm going to test cd25b0f740 and 28d2e3b5de. c47116e909 with cd25b0f740 and 28

ZFS deadlock in 14: with ZIL :-)

2023-08-19 Thread Graham Perrin
On 19/08/2023 11:31, Dimitry Andric wrote: On 19 Aug 2023, at 09:36, Graham Perrin wrote: … no ZIL; I never used the feature. The ZIL always exists, but it can be stored on a separate device for performance reasons. See zpoolconc

Re: ZFS deadlock in 14

2023-08-19 Thread Dag-Erling Smørgrav
Dag-Erling Smørgrav writes: > c47116e909 with cd25b0f740 and 28d2e3b5de reverted deadlocks, see > attached ddb.txt. I'm going to see if reverting only 28d2e3b5de but not > cd25b0f740 changes anything. c47116e909 with only 28d2e3b5de reverted also deadlocked, but in both cases it took much longer

Re: ZFS deadlock in 14: without ZIL

2023-08-19 Thread Dimitry Andric
On 19 Aug 2023, at 09:36, Graham Perrin wrote: > > On 19/08/2023 00:03, Mark Millard wrote: >> I believe the below quoted messages were reports of deadlocks >> that occurred after the following 2 MFVs were in place in their >> environments: >> >> Thu, 10 Aug 2023 >> . . . >> • git: cd25b0f740f8 -

ZFS deadlock in 14: without ZIL

2023-08-19 Thread Graham Perrin
On 19/08/2023 00:03, Mark Millard wrote: I believe the below quoted messages were reports of deadlocks that occurred after the following 2 MFVs were in place in their environments: Thu, 10 Aug 2023 . . . • git: cd25b0f740f8 - main - zfs: cherry-pick fix from openzfs Martin Matuska • git: 2

Re: ZFS deadlock in 14

2023-08-18 Thread Mark Millard
I believe the below quoted messages were reports of deadlocks that occurred after the following 2 MFVs were in place in their environments: Thu, 10 Aug 2023 . . . • git: cd25b0f740f8 - main - zfs: cherry-pick fix from openzfs Martin Matuska • git: 28d2e3b5dedf - main - zfs: cherry-pick fix fr

Re: ZFS deadlock in 14

2023-08-18 Thread Dag-Erling Smørgrav
Dag-Erling Smørgrav writes: > Plot twist: c47116e909 _without_ the patches also appears to be working > fine. The last kernel I know for sure deadlocks is b36f469a15, so I'm > going to test cd25b0f740 and 28d2e3b5de. c47116e909 with cd25b0f740 and 28d2e3b5de reverted deadlocks, see attached ddb.

Re: ZFS deadlock in 14

2023-08-18 Thread Mark Millard
[I had sent to the wrong list. Just fixing that here.] On Aug 18, 2023, at 09:26, Mark Millard wrote: > Dag-Erling Smørgrav wrote on > Date: Fri, 18 Aug 2023 14:16:12 UTC : > >> Dag-Erling Smørgrav writes: >>> A kernel built from c47116e909 plus these two patches has so far built >>> 2,285 pa

Re: ZFS deadlock in 14

2023-08-18 Thread Dag-Erling Smørgrav
Dag-Erling Smørgrav writes: > A kernel built from c47116e909 plus these two patches has so far built > 2,285 packages without a hitch, whereas normally it would have > deadlocked well before reaching 500 packages. I'll do another run > without the patches tomorrow just to be sure. Plot twi

Re: ZFS deadlock in 14

2023-08-17 Thread Dag-Erling Smørgrav
Alexander Motin writes: > I don't have a FreeBSD branch, but these two patches apply clean and > build on top of today's FreeBSD main branch: > > https://github.com/openzfs/zfs/pull/15107 > https://github.com/openzfs/zfs/pull/15122 A kernel built from c47116e909 plus these two patches has so far

Re: ZFS deadlock in 14

2023-08-17 Thread Alexander Motin
On 17.08.2023 15:41, Dag-Erling Smørgrav wrote: Alexander Motin writes: Trying to run your test (so far without reproduction) I see it producing a substantial amount of ZIL writes. The range of commits you reduced the scope to so far includes my ZIL locking refactoring, where I know for sure a

Re: ZFS deadlock in 14

2023-08-17 Thread Dag-Erling Smørgrav
Alexander Motin writes: > Dag, That's not my name, Al. > Trying to run your test (so far without reproduction) I see it > producing a substantial amount of ZIL writes. The range of commits > you reduced the scope to so far includes my ZIL locking refactoring, > where I know for sure are some de

Re: ZFS deadlock in 14

2023-08-17 Thread Alexander Motin
On 17.08.2023 14:57, Alexander Motin wrote: On 15.08.2023 12:28, Dag-Erling Smørgrav wrote: Mateusz Guzik writes: Going through the list may or may not reveal other threads doing something in the area and it very well may be they are deadlocked, which then results in other processes hanging on

Re: ZFS deadlock in 14

2023-08-17 Thread Alexander Motin
On 15.08.2023 12:28, Dag-Erling Smørgrav wrote: Mateusz Guzik writes: Going through the list may or may not reveal other threads doing something in the area and it very well may be they are deadlocked, which then results in other processes hanging on them. Just like in your case the process re

Re: ZFS deadlock in 14

2023-08-15 Thread Dag-Erling Smørgrav
The attached script successfully deadlocks 9228ac3a69c4. DES -- Dag-Erling Smørgrav - d...@freebsd.org

#!/bin/sh
: ${n:=$(nproc)}
: ${pool:=zroot}
basefs="${pool}/zfsdl"
set -eu
zfs destroy -r "${basefs}" >/dev/null 2>&1 || true
zfs create -o com.sun:auto-snapshot=false "${basefs}"
basedir="$

Re: ZFS deadlock in 14

2023-08-15 Thread Dag-Erling Smørgrav
Mateusz Guzik writes: > Going through the list may or may not reveal other threads doing > something in the area and it very well may be they are deadlocked, > which then results in other processes hanging on them. > > Just like in your case the process reported as hung is a random victim > and wh

Re: ZFS deadlock in 14

2023-08-15 Thread Mateusz Guzik
On 8/15/23, Dag-Erling Smørgrav wrote: > Mateusz Guzik writes: >> Given that the custom reproducer failed I think the most prudent >> course of action is to reproduce again with poudriere, but this time >> arrange to have all stacktraces dumped. > > Why? What more information do you need? > Goi

Re: ZFS deadlock in 14

2023-08-15 Thread Dag-Erling Smørgrav
Mateusz Guzik writes: > Given that the custom reproducer failed I think the most prudent > course of action is to reproduce again with poudriere, but this time > arrange to have all stacktraces dumped. Why? What more information do you need? DES -- Dag-Erling Smørgrav - d...@freebsd.org

Re: ZFS deadlock in 14

2023-08-15 Thread Mateusz Guzik
On 8/15/23, Dag-Erling Smørgrav wrote: > Dag-Erling Smørgrav writes: >> I managed to get a deadlock with 4e8d558c9d1c. Its predecessor >> 5ca7f02946 appears to be working. I'm going to try to come up with a >> more efficient way to reproduce the deadlock than running poudriere. > > I wrote a s

Re: ZFS deadlock in 14

2023-08-15 Thread Dag-Erling Smørgrav
Dag-Erling Smørgrav writes: > I managed to get a deadlock with 4e8d558c9d1c. Its predecessor > 5ca7f02946 appears to be working. I'm going to try to come up with a > more efficient way to reproduce the deadlock than running poudriere. I wrote a script that creates multiple filesystems, snapsho

Re: ZFS deadlock in 14

2023-08-14 Thread Dag-Erling Smørgrav
Dag-Erling Smørgrav writes: > Trying to narrow this range down, I did not get a deadlock with > 4e8d558c9d1c (10 June) but I did with b7198dcfc039 (16 June) [...] > Perhaps I should try 4e8d558c9d1c again. I managed to get a deadlock with 4e8d558c9d1c. Its predecessor 5ca7f02946 appears to be w

Re: ZFS deadlock in 14

2023-08-12 Thread Cy Schubert
On August 12, 2023 7:11:10 AM PDT, "Dag-Erling Smørgrav" wrote: >Dag-Erling Smørgrav writes: >> At some point between 42d088299c (4 May) and f0c9703301 (26 June), a >> deadlock was introduced in ZFS. > >Trying to narrow this range down, I did not get a deadlock with >4e8d558c9d1c (10 June) but I

Re: ZFS deadlock in 14

2023-08-12 Thread Dag-Erling Smørgrav
Dag-Erling Smørgrav writes: > At some point between 42d088299c (4 May) and f0c9703301 (26 June), a > deadlock was introduced in ZFS. Trying to narrow this range down, I did not get a deadlock with 4e8d558c9d1c (10 June) but I did with b7198dcfc039 (16 June), albeit after building ~1800 packages.
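The narrowing described here is a manual bisection between a known-good and a known-bad commit. As a rough sketch, `git bisect` could drive the same search; the helper name `bisect_plan` below is hypothetical, and the function only prints the command so that nothing is run against a real tree.

```shell
#!/bin/sh
# Dry-run sketch; bisect_plan is a hypothetical helper that only prints a command.

# $1 = first commit known to deadlock (bad), $2 = last commit believed good.
bisect_plan() {
    printf 'git bisect start %s %s\n' "$1" "$2"
}

# Endpoints taken from the original report: f0c9703301 (26 June) deadlocks,
# 42d088299c (4 May) does not.
bisect_plan f0c9703301 42d088299c
```

Because the deadlock only appears after many packages have been built, a run that survives does not prove a commit is good. A later message in the thread reports a deadlock on 4e8d558c9d1c after an earlier run on it had passed, so every "good" verdict in such a bisection deserves a second, longer run.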

Re: ZFS deadlock in 14

2023-08-11 Thread Cy Schubert
The poudriere build machine building amd64 packages also panicked. But with: Dumping 2577 out of 8122 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91 % __curthread () at /opt/src/git-src/sys/amd64/include/pcpu_aux.h:59 59 __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struc

Re: ZFS deadlock in 14

2023-08-10 Thread Graham Perrin
On 11/08/2023 04:32, Kevin Bowling wrote: Spoke too soon still seeing zfs lockups Kevin, do you have a log device? (ZIL) under heavy poudriere workload … Can you tell what was building when the problem occurred? For example, qutebrowser /usr/local/poudriere/data/logs/bulk/main-default/la

Re: ZFS deadlock in 14

2023-08-10 Thread Kevin Bowling
Spoke too soon still seeing zfs lockups under heavy poudriere workload after the MFVs. Regression time matches what has been reported here. On Thu, Aug 10, 2023 at 4:33 PM Cy Schubert wrote: > I haven't experienced any problems (yet) either. > > > -- > Cheers, > Cy Schubert > FreeBSD UNIX:

Re: ZFS deadlock in 14

2023-08-10 Thread Cy Schubert
I haven't experienced any problems (yet) either. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0 In message , Kevin Bowling writes: > The two MFVs on head have improved/fixed stability with p

Re: ZFS deadlock in 14

2023-08-10 Thread Kevin Bowling
The two MFVs on head have improved/fixed stability with poudriere for me on 48-core bare metal. On Thu, Aug 10, 2023 at 6:37 AM Cy Schubert wrote: > > In message , Kevin Bowling writes: > > Possibly https://github.com/openzfs/zfs/commit/2cb992a99ccadb78d97049b40bd442eb4fdc549d > > > > O

Re: ZFS deadlock in 14

2023-08-10 Thread Cy Schubert
In message , Kevin Bowling writes: > Possibly https://github.com/openzfs/zfs/commit/2cb992a99ccadb78d97049b40bd442eb4fdc549d > > On Tue, Aug 8, 2023 at 10:08 AM Dag-Erling Smørgrav wrote: > > > > At some point between 42d088299c (4 May) and f0c9703301 (26 June), a > > dea

Re: ZFS deadlock in 14

2023-08-10 Thread Kurt Jaeger
Hi! > > At some point between 42d088299c (4 May) and f0c9703301 (26 June), a > > deadlock was introduced in ZFS. It is still present as of 9c2823bae9 (4 > > August) and is 100% reproducible just by starting poudriere bulk in a > > 16-core VM and waiting a few hours until deadlkres kicks in. > >

Re: ZFS deadlock in 14

2023-08-10 Thread Kurt Jaeger
Hi! > At some point between 42d088299c (4 May) and f0c9703301 (26 June), a > deadlock was introduced in ZFS. It is still present as of 9c2823bae9 (4 > August) and is 100% reproducible just by starting poudriere bulk in a > 16-core VM and waiting a few hours until deadlkres kicks in. I have a amd

Re: ZFS deadlock in 14

2023-08-09 Thread Kevin Bowling
Possibly https://github.com/openzfs/zfs/commit/2cb992a99ccadb78d97049b40bd442eb4fdc549d On Tue, Aug 8, 2023 at 10:08 AM Dag-Erling Smørgrav wrote: > > At some point between 42d088299c (4 May) and f0c9703301 (26 June), a > deadlock was introduced in ZFS. It is still present as of 9c2823bae9 (4 >

Re: ZFS deadlock in 14

2023-08-08 Thread Graham Perrin
maybe.

Re: ZFS deadlock in 14

2023-08-08 Thread Dag-Erling Smørgrav
Alan Somers writes: > Do you have ZFS block cloning enabled on your pool? There were a lot > of bugs associated with that feature. I think that was merged on > 3-April. No, and this deadlock did not appear until May. DES -- Dag-Erling Smørgrav - d...@freebsd.org

Re: ZFS deadlock in 14

2023-08-08 Thread Alan Somers
On Tue, Aug 8, 2023 at 10:08 AM Dag-Erling Smørgrav wrote: > > At some point between 42d088299c (4 May) and f0c9703301 (26 June), a > deadlock was introduced in ZFS. It is still present as of 9c2823bae9 (4 > August) and is 100% reproducible just by starting poudriere bulk in a > 16-core VM and wa

ZFS deadlock in 14

2023-08-08 Thread Dag-Erling Smørgrav
At some point between 42d088299c (4 May) and f0c9703301 (26 June), a deadlock was introduced in ZFS. It is still present as of 9c2823bae9 (4 August) and is 100% reproducible just by starting poudriere bulk in a 16-core VM and waiting a few hours until deadlkres kicks in. In the latest instance, d