Today I reproduced this same problem on a freshly installed and
fully up-to-date Ubuntu Trusty 14.04.3 64-bit system on a Dell PowerEdge R300
with the SAS 6/iR controller. The system has 2 SAS disks configured as a
RAID1 volume on the SAS 6/iR.
Switching to kernel linux-generic-lts-wily did not solv
** Tags removed: kernel-key
** Tags added: kernel-da-key
** Tags removed: performing-bisect
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1276705
Title:
Kernel 3.13 fail to boot with LSI SAS1068E (
Hello, Marco.
The problem handled by this entry is about mptsas_probe() hitting
scsi4: error handler thread failed to spawn, error = -12
mptsas: ioc0: WARNING - Unable to register controller with SCSI subsystem
BUG: unable to handle kernel NULL pointer dereference at 0060
due t
For us the problem is not solved! We can't install Ubuntu on hardware
with an LSI SAS1068E. Now we get the next problem; see the attachment. We have
done an additional test:
We installed Ubuntu 13.10, upgraded to 14.04 with the latest kernel, and
set rootdelay=120. We must restart the server multiple times to b
** Changed in: linux (Ubuntu Trusty)
Status: Confirmed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1276705
Title:
Kernel 3.13 fail to boot with LSI SAS1068E (Dell SAS 6/iR)
I see that trusty has now a kernel with the fix included:
$ cat changelog.Debian
linux (3.13.0-21.43) trusty; urgency=low
[...]
[ Tetsuo Handa ]
* SAUCE: kthread: Do not leave kthread_create() immediately upon SIGKILL.
[...]
After an apt-get dist-upgrade to this kernel, I've successfully boot
I've tested this new kernel; it boots without issue on the server (as
usual, I booted the kernel three times to make sure it always works).
--
Sorry, typo in my link. The kernel is at:
http://kernel.ubuntu.com/~jsalisbury/lp1276705
--
I built a test kernel based on the Trusty master-next branch. The
kernel is available at:
http://kernel.ubuntu.com/~jsalisbury/l;1276705
Can you test this kernel and see if it resolves this bug?
--
(a) Linux kernel guys think that a hardcoded timeout is a systemd bug.
https://lkml.org/lkml/2014/3/23/42
(b) The systemd guys think that kernel module loading taking more than
30 seconds is a kernel module bug. But Linux kernel guys won't
be able to fix it immediately. Also, solutio
Joseph,
The kernel freeze is planned in 7 days, which will arrive very fast. Do you
think we could have a fix committed before this deadline?
I still haven't tested the firmware upgrade. I held off in order to keep a
machine which exhibits the bug... upgrading firmware is okay with a local
machine, but
I've tested with kernel from comment #56.
The kernel generated too many logs for the IPMI serial console (which
produced too much garbage), so I switched to a real serial console (and
at 115 kbaud).
I've attached an archive with 3 runs (the last run is the most
interesting, I think):
First run with s
Just a note; I've got a lot of experience with the 1068E controller.
Make sure you're running a recent firmware.
I have 1.33.00, "Phase 21", flashed on most of my 1068Es of various
vendor brands.
http://lime-technology.com/forum/index.php?topic=12767.0
Versions at Phase 20 (1.32.00 and below) ha
I built a test kernel that has a debugging patch from upstream[0].
The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1276705
Can you test that kernel and report back ?
[0] https://lkml.org/lkml/2014/3/19/494
--
Tetsuo,
Thanks so much for taking the time to work on this bug. I sent an email
to the systemd mailing list asking for some feedback.
--
Message sent to systemd mailing list:
http://lists.freedesktop.org/archives/systemd-devel/2014-March/018006.html
--
PierreF wrote:
> Applied patch on tag v3.14-rc6 (fa389e2), ran the kernel four times, all
> worked.
Thank you!
Now we have proved that systemd-udevd's 30-second timeout is the trigger of
this problem. It would be best if we could fix the systemd side.
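The trigger described here, a supervisor that sends SIGKILL to a worker once a deadline passes, can be modeled in user space. This is only a sketch of the pattern and makes no claim about how systemd-udevd is implemented; the function name `run_worker_with_timeout` and the use of `sleep()` to stand in for a slow module probe are assumptions made for illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a worker that "works" for work_secs seconds
 * and kill it with SIGKILL if it is still running after timeout_secs.
 * Returns the terminating signal number, 0 if the worker exited on its
 * own, or -1 on error. */
static int run_worker_with_timeout(unsigned int work_secs,
                                   unsigned int timeout_secs)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        sleep(work_secs);   /* stands in for a slow driver probe */
        _exit(0);
    }

    /* Poll once per second until the worker exits or the deadline passes. */
    for (unsigned int waited = 0; waited < timeout_secs; waited++) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        if (r < 0)
            return -1;
        sleep(1);
    }

    /* Deadline passed: the supervisor gives up and kills the worker. */
    kill(pid, SIGKILL);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling `run_worker_with_timeout(5, 1)` from a `main()` models a probe that outlasts its deadline; the worker is then reported as terminated by SIGKILL rather than as having exited normally.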
Joseph, is there any possibility that systemd-udevd's ti
Applied patch on tag v3.14-rc6 (fa389e2), ran the kernel four times, all
worked.
We saw in the output (full output attached):
[ 5.537193] mousedev: PS/2 mouse device common for all mice
[ 9.776032] floppy0: no floppy controllers found
[ 36.823538] Ignored SIGKILL by systemd-udevd
[ 38.3560
Pierre, would you give me a hand? I proposed the final patch but
I'm unable to prove that SIGKILL sent by systemd-udevd's 30 seconds
timeout is the trigger of this problem, for I don't have a real
machine which takes very long time upon initialization.
According to https://lkml.org/lkml/2014/3/18/
I reproduced a similar result using test patch shown below.
-- test patch start --
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 5653e50..eaaa5e2 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1412,6 +14
I've tested the final patch against both 786235ee and tag v3.14-rc6
(fa389e22). It still works.
--
Great!
I updated this patch to be more OOM killer friendly.
I will propose this patch for 3.14-final.
** Patch added: "Final patch"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705/+attachment/4026689/+files/kthread-Do-not-leave-kthread_create-immediately.patch
--
Yes, it is working!
With this new patch applied (on 786235ee), the server boots without any
issue.
I've attached the console output (which shows no error).
As with the other tests, I've booted 5 times on this kernel to be sure it was
not by luck that it works.
Thanks for this fix.
** Attachment added: "Cons
I changed this patch to call wait_for_completion() again only if
wait_for_completion_timeout() returned 0, for
wait_for_completion_timeout() will return non-0 if completed.
** Patch added: "kthread: defer leaving kthread_create() upon SIGKILL. (v2)"
https://bugs.launchpad.net/ubuntu/+source/l
Thank you. I missed that we are not allowed to call wait_for_completion() again
if wait_for_completion_timeout() succeeded, for do_wait_for_common() does
x->done-- which cancels x->done++ done by complete(). I must update this patch.
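The counter bookkeeping behind this remark can be illustrated with a toy user-space model. This is only a sketch of the `x->done++` / `x->done--` accounting described above, not the kernel's implementation; the `toy_` names are hypothetical and the model does no real blocking.

```c
#include <stdbool.h>

/* Toy model of the counter inside a completion. */
struct toy_completion {
    unsigned int done;
};

/* Models complete(): x->done++ */
static void toy_complete(struct toy_completion *x)
{
    x->done++;
}

/* Models a successful wait, which consumes one completion (x->done--).
 * Returns false in the situation where a real waiter would have to
 * block because nothing is left to consume. */
static bool toy_try_wait(struct toy_completion *x)
{
    if (x->done == 0)
        return false;
    x->done--;
    return true;
}
```

After a single `toy_complete()`, the first `toy_try_wait()` succeeds and the second one fails, which corresponds to a second wait blocking with no further `complete()` to wake it.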
--
** Attachment added: "Console output with kthread-defer-leaving patch applied
on 786235ee"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705/+attachment/4026609/+files/serial-ouput-patch-kthread-defer-leaving.txt
--
I've tested the following:
* v3.14-rc6-trusty from comment #38: still fails with the same error.
* Kernel 786235eeba0e1e85e5cbbb9f97d1087ad03dfa21 with patch check-sigkill:
still got the failure to spawn the thread. I will attach the full output from the serial
console.
* Kernel 786235eeba0e1e85e5cbbb9f97d1087ad0
Would you try this patch?
** Patch added: "kthread: defer leaving kthread_create() upon SIGKILL."
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705/+attachment/4026192/+files/kthread-defer-leaving.patch
--
OK. I read this thread.
I'm sure that somebody is sending SIGKILL to the systemd-udevd process
that is doing the finit_module() system call, after waiting for 30 seconds.
However, since the probe function takes more than 30 seconds, the probe
function already received SIGKILL by the moment scsi_host_al
Tetsuo, see my comments above for the diagnosis of two different bugs.
I've already identified (b) as well and it's completely orthogonal to
the bug in question. No idea about (a), though, your comment is
interesting.
--
That return statement is called only when wait_for_completion_killable()
returned an error. That is, the caller received SIGKILL while waiting for
kthreadd to create a kernel thread.
That matches your bisection result because commit 786235ee changed to return to
the caller when the caller received
Just for completeness, can you test the latest mainline kernel, which can be
downloaded from:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.14-rc6-trusty/
--
Link to message sent upstream:
https://lkml.org/lkml/2014/3/14/475
--
** Tags added: kernel-bug-exists-upstream
--
Thanks for the detailed feedback, Pierre. I'll ask for some feedback
from upstream about reverting commit:
786235eeba0e1e85e5cbbb9f97d1087ad03dfa21
** Also affects: linux (Ubuntu Trusty)
Importance: High
Status: Confirmed
--
Without much hope, I've tried the latest kernels from:
* trusty: linux 3.13.0-16.36 (linux-image-3.13.0-16-generic)
* trusty-proposed (downloaded from launchpad directly) : linux 3.13.0-17.37
Both still have the bug.
--
If it helps, I've made another change (against git hash 786235ee):
diff --git a/kernel/kthread.c b/kernel/kthread.c
index b5ae3ee..25a4780 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -298,7 +298,7 @@ struct task_struct *kthread_create_on_node(int
(*threadfn)(void *data),
I've attached the debdiff patch for trusty.
I'm building a backport for precise to test whether slapd can start with this
patch applied (the server on which the issue occurs is running precise).
--
By more testing, do you just mean rebooting several times on this kernel to
check that the issue does not appear occasionally?
During my bisect, I always booted 3 times on a good kernel to make sure it
was not by "luck" that the kernel worked. I also booted the kernel from
comment #28 three times.
To double ch
** Tags added: patch
--
Hmm, if you think there is a race condition, then there should be more
testing done with the kernel posted in comment #28. If we can be
certain that reverting commit 786235eeba0e1e85e5cbbb9f97d1087ad03dfa21
fixes this bug, then I can send a message upstream and to the bug author
to get their feeba
Any update?
If I can help with something, tell me, but I don't know the kernel and can't
debug it by myself.
I've tried to identify which ENOMEM causes the issue by adding printk calls
(one before the first ENOMEM, one before the second ENOMEM, one after
both ENOMEM)... but with just this change
Yes, this version is working:
Linux version 3.13.0-12-generic (root@gomeisa) (gcc version 4.8.2
(Ubuntu 4.8.2-15ubuntu3) ) #32 SMP Mon Feb 24 18:50:37 UTC 2014 (Ubuntu
3.13.0-12.32-generic 3.13.4)
--
I built a test kernel with commit
786235eeba0e1e85e5cbbb9f97d1087ad03dfa21 reverted.
The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1276705
Note there is a linux-image and linux-image-extra package, both need to
be installed.
Can you test that kernel and report ba
Pretty sure that's it. -12 is -ENOMEM, and there are two sites in that
commit that return -ENOMEM, right when we have an error message from the
failure to spawn the SCSI error handler thread.
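The mapping from the logged "error = -12" to ENOMEM can be checked from user space. A minimal sketch, assuming a Linux system (where ENOMEM is defined as 12); the helper names `describe_kernel_error` and `is_enomem` are hypothetical and not part of any kernel API.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: turn a negative kernel-style return code
 * (such as the -12 in "error = -12") into its errno description. */
static const char *describe_kernel_error(int code)
{
    /* Kernel functions return -errno, so negate before the lookup. */
    return strerror(-code);
}

/* Returns 1 if the code printed in the log corresponds to ENOMEM. */
static int is_enomem(int code)
{
    return -code == ENOMEM;
}
```

Printing `describe_kernel_error(-12)` from a `main()` gives the C library's description for ENOMEM on such a system.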
Note that the oops/backtrace is a red herring. There is a secondary,
unrelated bug in the mptsas code that
Bisect finished. The first bad commit is
786235eeba0e1e85e5cbbb9f97d1087ad03dfa21. It seems more likely, as this
commit concerns kthread (and the first error is "scsi4: error handler
thread failed to spawn, error = -12").
I've also attached my bisect log in case it's needed.
** Attachment added: "bisect.log"
OK, I've restarted a bisect without the limitation to drivers/scsi (git
bisect start v3.13-rc1 v3.12).
Git tells me it's 13 steps, which will take some time, but by the middle of
next week we should have the bad commit.
--
Well, Pierre gave the full output, so the answer to your question is yes
(that's really easy to verify).
The merge commit doesn't look like it has any relevant code changes, so
this is a dead end.
I actually think that limiting to drivers/scsi was a bad idea from the
start. The v3.12..v3.13-rc1 d
Hmm, commit 8ceafbfa91ffbdbb2afaea5c24ccb519ffb8b587 is a merge commit,
so the bisect should have gone deeper. Did the bisect specifically tell
you that commit 8ceafbfa91ffbdbb2afaea5c24ccb519ffb8b587 was the first
bad commit?
--
I cannot test this kernel; it was only built for i386. The server is
installed with amd64 :(
Because of the timezone difference we can only test one kernel per day. To speed up
the bisect, I've done one myself; the result is the following:
$ git bisect log
# bad: [6ce4eac1f600b34f2f7f58f9cd8f0
I built the next test kernel, up to the following commit:
2f466d33f5f60542d3d82c0477de5863b22c94b9
The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1276705
Can you test that kernel and report back if it has the bug or not. I
will build the next test kernel based on y
This kernel version is also good:
Linux version 3.12.0-031200rc5-generic (jsalisbury@gomeisa) (gcc version
4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201402131403 SMP Thu Feb 13
19:04:57 UTC 2014
--
I built the next test kernel, up to the following commit:
323f6226a816f0b01514d25fba5529e0e68636c3
The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1276705
Can you test that kernel and report back if it has the bug or not. I
will build the next test kernel based on y
This one is good, it is working:
[...]
Linux version 3.12.0-031200rc5-generic (jsalisbury@gomeisa) (gcc version 4.6.3
(Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201402131150 SMP Thu Feb 13 16:54:49 UTC 2014
[...]
--
I built the next test kernel, up to the following commit:
53151bbb83f11b358ac94eddd81347c581dc51ea
This kernel is from a new bisect, which will only bisect through the
scsi directory.
The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsalisbury/lp1276705
Can you test that kernel
Thanks for testing and the update. I'm actually going to restart the
bisect to just go through the ~drivers/scsi subsystem. That should
speed up the process considerably.
I'll build the next test kernel and post it shortly.
--
Tested this kernel. It is NOT working, it has the issue.
Extract of console log:
[...]
Linux version 3.12.0-031200-generic (jsalisbury@gomeisa) (gcc version 4.6.3
(Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201402101715 SMP Thu Feb 13 14:58:01 UTC 2014
[...]
[ 42.455969] scsi4: error handler thread fail
I started a kernel bisect between v3.12 final and v3.13-rc1. The kernel
bisect will require testing of about 7-10 test kernels.
I built the first test kernel, up to the following commit:
5cbb3d216e2041700231bcfc383ee5f8b7fc8b74
The test kernel can be downloaded from:
http://kernel.ubuntu.com/~jsa
** Tags added: kernel-key
--
We're also experiencing the same issue when booting the trusty installer
on a Dell PowerEdge R610 with a Dell SAS 6/iR controller (LSI SAS1068E),
resulting in the inability to install/run this system. This is an
extremely popular controller, prevalent in a significant portion of Dell
systems (or sys
Yes, I confirm that 3.12-saucy works.
--
Can you confirm the bug does not exist in the 3.12 final kernel:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.12-saucy/
If that is the case, we can bisect between v3.12 final and v3.13-rc1.
--
None of them worked. All had the same issue.
Tested:
* v3.13-rc3
* v3.13-rc2
* v3.13-rc1 (http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13-rc1-trusty/)
--
Can you test the following kernels and report back? We are looking for
the first kernel version that exhibits this bug:
v3.13-rc3: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13-rc3-trusty/
If v3.13-rc3 does not exhibit the bug then test v3.13-rc6:
v3.13-rc6: http://kernel.ubuntu.com/~kerne
It still occurs with 3.14.0-031400rc1-generic.
--
Thanks for all the testing. So it sounds like this bug was introduced
in the 3.13 kernel. In order to bisect, we need to narrow down further
and identify which v3.13 release candidate introduced the regression.
However, it might be good to first test the latest mainline kernel to
see if this has
I booted the kernel with the following common options: "ro console=tty0
console=ttyS1,57600".
When booted with rootdelay, it's "rootdelay=45".
The results are the following:
* 3.13.0-7-generic, rootdelay => error
* 3.13.0-7-generic, no rootdelay => Ok
* 3.6, rootdelay => Ok
* 3.12, rootdelay => Ok, t
I'd like to perform a bisect to figure out what commit caused this
regression. We need to identify the earliest kernel where the issue
started happening as well as the latest kernel that did not have this
issue.
Can you test the following kernels and report back? We are looking for
the first kerne
** Changed in: linux (Ubuntu)
Importance: Undecided => High
** Tags added: performing-bisect
--
** Attachment added: "dmesg from a running system (no rootdelay, press
control-d in initramfs)"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705/+attachment/3970066/+files/dmesg.txt
--
** Attachment added: "lspci -vnn on the server"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705/+attachment/3970067/+files/lspci.txt
--