Having recently installed and got running an original Radeon QD card, I've discovered a simple and reliable way to lock the card up: I run the xscreensaver-demo version of gears with -delay 0 -fps -wireframe. I've seen the same lockup with flightgear once, with one of the fastest aircraft, and nowhere else - it doesn't happen with gears without the -wireframe option. With -wireframe, it takes at most a minute or so to lock up.
This happens with one OpenGL program running - I haven't tried with
more than one.
I can run Q3A timedemos without locking up - setting com_maxfps to
130 makes no difference to this (aside from upping the average
framerate).
The lockup starts out being incomplete: the mouse pointer still
moves normally, but the window manager stops responding: windows
won't redraw and none of the keyboard shortcuts work. After a bit
of time, or if any new program is started, or
if Alt-SysRq-b is hit, it locks up hard, and needs a hardware reset
to reboot. I tried sshing in, and succeeded in logging on and
getting a shell, but it locked up solid immediately after that.
Hitting Alt-SysRq-t shows gears in R state, in sys_ioctl(), where
the call into filp->f_op->ioctl is made.
Running strace on gears shows it making lots of ioctls, and then
hitting a point where the ioctl() calls fail with -EBUSY. The
transition point of the strace output:
ioctl(4, 0x6444, 0) = 0
ioctl(4, 0x6444, 0) = 0
ioctl(4, 0x6444, 0) = 0
ioctl(4, 0x6444, 0) = 0
ioctl(4, 0x6444, 0) = 0
ioctl(4, 0x6447, 0) = 0
write(3, "+\1\1\0", 4) = 4
read(3, "\1\2A\6\0\0\0\0\1\0@\5\0\0\0\0\1\0\0\0\0\0\0\0\1\0\0\0\330\10\363\10", 32) = 32
ioctl(3, 0x541b, [0]) = 0
ioctl(4, 0x4008642a, 0xbffff304) = 0
ioctl(4, 0x40186448, 0xbffff324) = 0
ioctl(4, 0x4010644f, 0xbfffebc4) = 0
ioctl(4, 0xc0286429, 0xbfffeba4) = 0
ioctl(4, 0x4008642a, 0xbfffec34) = 0
ioctl(4, 0x4010644f, 0xbfffec04) = 0
ioctl(4, 0x4008642b, 0xbfffec74) = 0
ioctl(4, 0x4008642a, 0xbfffec04) = 0
ioctl(4, 0x4010644f, 0xbfffebd4) = 0
ioctl(4, 0x6444, 0) = -1 EBUSY (Device or resource busy)
There was a long list of these - the lockup didn't happen at this
point, but after another 900 or so failed ioctls.
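
For anyone reading the trace, here is a minimal sketch of how the request numbers can be picked apart and the failures counted. It assumes the usual Linux ioctl encoding on x86 (8 bits of command number, 8 bits of type, 14 bits of size, 2 bits of direction) and strace output in the format shown above. The names decode_ioctl and summarize are my own, not from any tool. The 'd' type byte is the DRM ioctl base; turning the command numbers into named requests needs the drm.h from the kernel tree in use, which this sketch doesn't attempt.

```python
import re
from collections import Counter

# Direction bits as defined by the x86 Linux ioctl encoding:
# "write" means user space passes data to the kernel, "read" the reverse.
DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(request):
    """Split an ioctl request number into (direction, type, nr, size)."""
    nr = request & 0xFF                   # bits 0-7: command number
    type_char = chr((request >> 8) & 0xFF)  # bits 8-15: subsystem letter
    size = (request >> 16) & 0x3FFF       # bits 16-29: argument size in bytes
    direction = DIRECTIONS[(request >> 30) & 0x3]  # bits 30-31
    return direction, type_char, nr, size


# Matches lines such as:
#   ioctl(4, 0x6444, 0) = -1 EBUSY (Device or resource busy)
IOCTL_LINE = re.compile(
    r"ioctl\((\d+), (0x[0-9a-fA-F]+), .*?\)\s*=\s*(-?\d+)(?: (\w+))?"
)


def summarize(lines):
    """Count ioctl calls per (request, outcome); outcome is 'ok' or an errno name."""
    tally = Counter()
    for line in lines:
        match = IOCTL_LINE.match(line.strip())
        if not match:
            continue  # not an ioctl line (write, read, wrapped output, ...)
        request = int(match.group(2), 16)
        outcome = match.group(4) or "ok"
        tally[(request, outcome)] += 1
    return tally


if __name__ == "__main__":
    sample = [
        "ioctl(4, 0x4008642a, 0xbffff304) = 0",
        "ioctl(4, 0x6444, 0) = 0",
        "ioctl(4, 0x6444, 0) = -1 EBUSY (Device or resource busy)",
    ]
    for (request, outcome), count in sorted(summarize(sample).items()):
        print(hex(request), decode_ioctl(request), outcome, count)
```

Run over the full strace log, this would show which request first starts returning EBUSY and how many times it fails before the hard lockup.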
Loading the radeon.o module with the debug option enabled dumps a
load of gunk all over my /var/log/kern.log file . . . It overwrites
the file way before the point where it would have started writing,
so something strange is happening there . . .
I've tried looking through the code, but I can't make head nor tail
of it, so I don't have the foggiest notion how to work out what's
going wrong here. I'm willing to try any
suggestions/patches/whatever in order to track it down.
The details of my system: I'm running kernel 2.4.18-pre9. I have
XFree86 4.2 installed from CVS, and the latest DRI CVS code. I have an
ASUS K7M motherboard - this has the AMD 751 northbridge. The card is
a Radeon QD: lspci reports
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon QD (prog-if 00 [VGA])
Subsystem: ATI Technologies Inc: Unknown device 008a
Flags: bus master, stepping, 66Mhz, medium devsel, latency 64, IRQ 11
Memory at d8000000 (32-bit, prefetchable) [size=128M]
I/O ports at 9800 [size=256]
Memory at efe80000 (32-bit, non-prefetchable) [size=512K]
Expansion ROM at efe60000 [disabled] [size=128K]
Capabilities: [58] AGP version 2.0
Capabilities: [50] Power Management version 2
for the card, and
00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-751 [Irongate] AGP Bridge (rev 01) (prog-if 00 [Normal decode])
Flags: bus master, 66Mhz, medium devsel, latency 64
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
I/O behind bridge: 00008000-00009fff
Memory behind bridge: efd00000-efefffff
Prefetchable memory behind bridge: d7b00000-e7bfffff
for the AGP chipset. The motherboard's BIOS is out of date - an
update is out there, which I was unable to apply because of a dead
floppy drive. The update was a features update, rather than a bugfix
one (unless the docs for it were incomplete).
I have a K7 550 in there, running at 550MHz. I have the
mem=nopentium thing on the kernel command line.
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 1
model name : AMD-K7(tm) Processor
stepping : 2
cpu MHz : 553.899
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat mmx syscall mmxext 3dnowext 3dnow
bogomips : 1104.28
Finally, I've had a good year/18 months of successful DRI use with
the G400 I had before - no lockups, basically no problems at all.
As I said, I'd like to track this down, but I don't know enough to
do so myself - I'll do whatever people ask . . .
Simon Fowler
--
PGP public key Id 0x144A991C, or ftp://bg77.anu.edu.au/pub/himi/himi.asc
(crappy) Homepage: http://bg77.anu.edu.au
doe #237 (see http://www.lemuria.org/DeCSS)
My DeCSS mirror: ftp://bg77.anu.edu.au/pub/mirrors/css/
