Adding xen-devel back.

On Wed, Dec 29, 2021 at 01:44:18AM +0800, G.R. wrote:
> On Tue, Dec 28, 2021 at 3:05 AM Roger Pau Monné <[email protected]> wrote:
> >
> > On Sun, Dec 26, 2021 at 02:06:55AM +0800, G.R. wrote:
> > > > > Thanks. I've raised this on freensd-net for advice [0]. IMO netfront
> > > > > shouldn't receive an mbuf that crosses a page boundary, but if that's
> > > > > indeed a legit mbuf I will figure out the best way to handle it.
> > > > >
> > > > > I have a clumsy patch (below) that might solve this, if you want to
> > > > > give it a try.
> > > >
> > > > Applied the patch and it worked like a charm!
> > > > Thank you so much for your quick help!
> > > > Wish you a wonderful holiday!
> > >
> > > I may have said too quickly...
> > > With the patch I can attach the iscsi disk and neither the dom0 nor
> > > the NAS domU complains this time.
> > > But when I attempt to mount the attached disk it reports I/O errors
> > > randomly.
> > > By randomly I mean different disks behave differently...
> > > I don't see any error logs from kernels this time.
> > > (most of the iscsi disks are NTFS FS and mounted through the user mode
> > > fuse library)
> > > But since I have a local backup copy of the image, I can confirm that
> > > mounting that backup image does not result in any I/O error.
> > > Looks like something is still broken here...
> >
> > Indeed. That patch was likely too simple, and didn't properly handle
> > the split of mbuf data buffers.
> >
> > I have another version based on using sglist, which I think it's also
> > a worthwhile change for netfront. Can you please give it a try? I've
> > done a very simple test and seems fine, but you certainly have more
> > interesting cases.
> >
> > You will have to apply it on top of a clean tree, without any of the
> > other patches applied.
>
> Unfortunately this new version is even worse.
> It not only does not fix the known issue on iSCSI, but also creating
> regression on NFS.
> The regression on NFS is kind of random that it takes a
> non-deterministic time to show up.
> Here is a stack trace for reference:
> db:0:kdb.enter.default> bt
> Tracing pid 1696 tid 100622 td 0xfffff800883d5740
> kdb_enter() at kdb_enter+0x37/frame 0xfffffe009f80d900
> vpanic() at vpanic+0x197/frame 0xfffffe009f80d950
> panic() at panic+0x43/frame 0xfffffe009f80d9b0
> xn_txq_mq_start_locked() at xn_txq_mq_start_locked+0x5bc/frame
> 0xfffffe009f80da50

I think this is hitting a KASSERT. Could you paste the text printed as
part of the panic (not just the backtrace)?

Sorry this is taking a bit of time to solve.

Thanks!
