On 15.11.2011 01:00, Sebastian Chmielewski wrote:
On Tue, 15 Nov 2011 00:39:52 +0200
Alexander Motin <m...@freebsd.org> wrote:
A SATA device can be dropped because of an error during the
reset/probe/initialization sequence, or because the controller reported
a disconnection. Verbose boot messages (boot -v from the loader prompt)
should give more information about what happened there. Please show the
full verbose dmesg.
Using rc_debug="YES" in rc.conf, I've found that my device is dropped during
sysctl_start. With an empty sysctl.conf my device is not lost. The contents
of the file seem quite innocent:
# Uncomment this to prevent users from seeing information about processes that
# are being run under another UID.
security.bsd.see_other_uids=1
# Enable/disable coredump
kern.coredump=1
# Up the maxfiles to 4x default
kern.maxfiles=49312
kern.ipc.shmmax=67108864
kern.ipc.shmall=32768
# Allow users to mount CD's
vfs.usermount=1
vfs.hirunningspace=8388608
vfs.lorunningspace=1048576
kern.corefile="/var/coredumps/%U/%N.core"
# Do not truncate command line arguments in ps(1) listing
kern.ps_arg_cache_limit=10000
# Tune for desktop usage
kern.sched.preempt_thresh=224
# Increase default setting - recommended for 2 GB of RAM
kern.maxvnodes=400000
dev.acpi_ibm.0.lcd_brightness=6
dev.acpi_ibm.0.lcd_brightness=3
net.link.tap.user_open=1
net.link.tap.up_on_open=1
The device is also lost when sysctl is restarted with the new file after
booting has finished (I ran service sysctl restart from an X session).
# sysctl debug.bootverbose=1
# service sysctl restart
# dmesg
ahcich1: DISCONNECT requested
ahcich1: AHCI reset...
ahcich1: SATA connect timeout time=10000us status=00000000
ahcich1: AHCI reset: device not found
(ada1:ahcich1:0:0:0): lost device
(pass1:ahcich1:0:0:0): lost device
(pass1:ahcich1:0:0:0): removing device entry
Crazy, isn't it?
It is. I've never heard of such a thing.
The reset status looks as if the device was indeed disconnected or powered
down. I don't even know how one would do that deliberately, at least on
Intel chipsets. My laptop's BIOS has a bug that disables the SATA port after
suspend/resume, but there the reset status shows that the port was
explicitly disabled. I have only one crazy idea: by setting the screen
brightness you are calling ACPI code, which is a black box by definition and
can do whatever it wants with the hardware, including using any possible
custom power control interfaces.
Was the second disk originally planned for this laptop? Laptop vendors,
more than desktop ones, tend to hardcode things.
I would try two things:
- bisect the list of sysctls to find the one that causes this;
- try to enable SATA interface power management for the device. If
power management was somehow enabled on the device behind the OS's back, it
may cause false DISCONNECT messages, although it still should not cause
such a reset status. Setting hint.ahcich.1.pm_level=1 in loader.conf will
make the ahci(4) driver ignore link loss events. If the device is indeed
lost, you should see command timeouts and only then the device loss.
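For the bisection, a minimal sh sketch that may save some typing. The
function name split_conf and the output file names are made up for
illustration; it only splits the active (non-comment, non-blank) lines of a
sysctl.conf-style file into two halves and does not apply anything itself.

```sh
#!/bin/sh
# split_conf <conf-file> <output-prefix>   (hypothetical helper name)
# Writes the first half of the active settings to <output-prefix>.a
# and the second half to <output-prefix>.b.
split_conf() {
    conf=$1
    prefix=$2

    # Keep only real settings: drop comment lines and blank lines.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$conf" > "$prefix.all"

    # Arithmetic expansion strips the padding some wc(1) versions print.
    total=$(( $(wc -l < "$prefix.all") ))
    if [ "$total" -eq 0 ]; then
        rm -f "$prefix.all"
        echo "split_conf: no active settings in $conf" >&2
        return 1
    fi

    # With an odd number of settings, the first half gets the extra line.
    half=$(( (total + 1) / 2 ))
    head -n "$half" "$prefix.all" > "$prefix.a"
    tail -n +"$(( half + 1 ))" "$prefix.all" > "$prefix.b"
    rm -f "$prefix.all"
}
```

One way to use it: run split_conf /etc/sysctl.conf /tmp/bisect, install
/tmp/bisect.a as sysctl.conf, run service sysctl restart and look in dmesg
for "lost device". Then repeat on whichever half triggers the loss until a
single line is left. Since the disk is gone once the loss has happened, a
reboot between attempts is probably needed.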
--
Alexander Motin
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"