Jon Steel <[EMAIL PROTECTED]> writes:

> I forgot to add:
> 
> In the log of pmap.c I found
> 
> revision 1.97
> date: 2007/02/20 21:15:01;  author: tom;  state: Exp;  lines: +204 -500
> Revert PAE pmap for now, until the strange bug is found.  This stops
> the freezes many of us are seeing (especially on amd64 machines running
> OpenBSD/i386).
> 
> Much testing by nick@ (as always - thanks!), hugh@, ian@, kettenis@
> and Sam Smith (s (at) msmith (dot) net).
> 
> Requested by, input from, and ok deraadt@  ok art@, kettenis@, miod@
> 
> 
> What is "the strange bug"?

Most likely the things you've been seeing, although that's not certain.
Some people have been seeing the bug even after the PAE pmap was removed,
but definitely not as many (one of the developers rather than 5-10).

//art

> Thanks again
> 
> 
> Jon Steel wrote:
> > Hi
> >
> > I've finally got the current version running and the problem below has
> > disappeared. I was wondering, however, whether the problem has actually
> > been solved.
> >
> > The line of code that I'm crashing on is line 3005 of pmap.c in version 4.0:
> >
> > 3005                if (pve->pv_ptp && (PDE(pve->pv_pmap,
> > 3006                     pdei(pve->pv_va)) & PG_FRAME) !=
> > 3007                     VM_PAGE_TO_PHYS(pve->pv_ptp)) {
> >
> > Specifically, it's crashing on PDE(pve->pv_pmap, pdei(pve->pv_va)) because
> > of a page fault. This code has disappeared in -current, but does anybody
> > who was working on this section of code know why I was having this
> > problem or if it's been fixed?
> >
> > Thank you
> >
> > Jonathan  Steel
> >
> >
> > Jon Steel wrote:
> >   
> >> Hi
> >>
> >> I'm having a problem very similar to the one reported in Bug Query 5374.
> >> I'm trying to solve the problem but I'm finding it very hard to even get
> >> started. Is there somewhere besides the code that I can start to try and
> >> understand how SMP is being handled?
> >>
> >> http://cvs.openbsd.org/cgi-bin/query-pr-wrapper?full=yes&numbers=5374
> >>
> >> I can usually duplicate the crash by running the following script several
> >> times concurrently.
> >>
> >> #!/usr/bin/perl
> >>
> >> system("tcpdump -i em1 -w /var/crashTest1.pcap&");
> >> system("tcpdump -i em1 -w /var/crashTest2.pcap&");
> >> system("tcpdump -i em1 -w /var/crashTest3.pcap&");
> >> system("tcpdump -i em1 -w /var/crashTest4.pcap&");
> >> system("tcpdump -i em1 -w /var/crashTest5.pcap&");
> >> system("tcpdump -i em1 -w /var/crashTest6.pcap&");
> >> system("tcpdump -i em1 -w /var/crashTest7.pcap&");
> >>
> >> while (1) {
> >>     system("nmap 192.168.66.90&");
> >> }
> >>
> >> Then, after about an hour, when I try to reboot, I get an error:
> >>
> >> uvm_fault(0x..., 0x..., 0, 1) -> e
> >> kernel: page fault trap, code = 0
> >> stopped at pmap_page_remove_86+0x114:
> >>     0(%eax, %edx, 4), %eax
> >>
> >> The trace output is:
> >>
> >> pmap_page_remove_86(d0d31420,c0,e9b57e2c,d04adeb9,e99f) at 
> >> pmap_page_remove_86+0x114
> >> uvm_vnp_terminate(d8034e04,0,0,0,0,14,0,d7e95004) at uvm_vnpterminate+0x31f
> >> uvm_attach(d8034e04,0,2,0,d7f38378) at uvn_attach+0x2b5
> >> uvm_unmap_detach(d7e959a4,0,d7f3841c,1) at uvm_unmap_detach+0x62
> >> uvmspace_free(d7f38378,6,d08120e0) at uvmspace_free+0xfd
> >> uvm_exit(d7fbb868,14,8,286) at uvm_exit+0x19
> >> reaper(d80df430) at reaper+0x90
> >> Bad frame pointer: 0xd0913eb8
> >>
> >>
> >> A couple of times the error has also occurred on its own, without my
> >> typing 'reboot', when running a ton of nmaps and tcpdumps at the same time.
> >>
> >> This trace is remarkably similar to the one in Bug Query 5374.
> >> Additionally, I am using the same processor as he is. There is an
> >> "unknown Core" message in my dmesg, but both cores seem to be working
> >> correctly. Here is my dmesg:
> >>
> >> OpenBSD 4.0 (GENERIC.MP) #936: Sat Sep 16 19:27:28 MDT 2006
> >>     [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC.MP
> >> cpu0: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz ("GenuineIntel" 686-class)
> >> 2.13 GHz
> >> cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16
> >> real mem  = 2145869824 (2095576K)
> >> avail mem = 1949290496 (1903604K)
> >> using 4256 buffers containing 107397120 bytes (104880K) of memory
> >> mainbus0 (root)
> >> bios0 at mainbus0: AT/286+(e6) BIOS, date 10/30/06, BIOS32 rev. 0 @
> >> 0xfd470, SMBIOS rev. 2.51 @ 0x7feea000 (33 entries)
> >> bios0: Supermicro PDSMi
> >> pcibios0 at bios0: rev 2.1 @ 0xfd470/0xb90
> >> pcibios0: PCI BIOS has 20 Interrupt Routing table entries
> >> pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 82801GB LPC" rev 0x00)
> >> pcibios0: PCI bus #15 is the last bus
> >> bios0: ROM list: 0xc0000/0xb000 0xcb000/0x1000 0xcc000/0x1000 
> >> 0xcd000/0x1000
> >> ipmi at mainbus0 not configured
> >> mainbus0: Intel MP Specification (Version 1.4) (INTEL    MUKILTEO    )
> >> cpu0 at mainbus0: apid 0 (boot processor)
> >> cpu0: unknown Core FSB_FREQ value 0 (0x42080000)
> >> cpu0: apic clock running at 266 MHz
> >> cpu1 at mainbus0: apid 1 (application processor)
> >> cpu1: Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz ("GenuineIntel" 686-class)
> >> 2.13 GHz
> >> cpu1: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,CX16
> >> mainbus0: bus 0 is type PCI
> >> mainbus0: bus 9 is type PCI
> >> mainbus0: bus 10 is type PCI
> >> mainbus0: bus 13 is type PCI
> >> mainbus0: bus 14 is type PCI
> >> mainbus0: bus 15 is type PCI
> >> mainbus0: bus 16 is type ISA
> >> ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 24 pins
> >> ioapic1 at mainbus0: apid 3 pa 0xfec10000, version 20, 24 pins
> >> pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
> >> pchb0 at pci0 dev 0 function 0 "Intel E7230 MCH" rev 0xc0
> >> ppb0 at pci0 dev 1 function 0 "Intel E7230 PCIE" rev 0xc0
> >> pci1 at ppb0 bus 1
> >> ppb1 at pci0 dev 28 function 0 "Intel 82801GB PCIE" rev 0x01
> >> pci2 at ppb1 bus 9
> >> ppb2 at pci2 dev 0 function 0 "Intel PCIE-PCIE" rev 0x09
> >> pci3 at ppb2 bus 10
> >> em0 at pci3 dev 1 function 0 "Intel PRO/1000GT (82541GI)" rev 0x05: apic
> >> 3 int 0  (irq 11), address 00:0e:0c:b6:80:9e
> >> "Intel IOxAPIC" rev 0x09 at pci2 dev 0 function 1 not configured
> >> ppb3 at pci0 dev 28 function 4 "Intel 82801G PCIE" rev 0x01
> >> pci4 at ppb3 bus 13
> >> em1 at pci4 dev 0 function 0 "Intel PRO/1000MT (82573E)" rev 0x03: apic
> >> 2 int 16  (irq 11), address 00:30:48:8a:ca:f8
> >> ppb4 at pci0 dev 28 function 5 "Intel 82801G PCIE" rev 0x01
> >> pci5 at ppb4 bus 14
> >> em2 at pci5 dev 0 function 0 "Intel PRO/1000MT (82573L)" rev 0x00: apic
> >> 2 int 17  (irq 11), address 00:30:48:8a:ca:f9
> >> uhci0 at pci0 dev 29 function 0 "Intel 82801GB USB" rev 0x01: apic 2 int
> >> 23 (irq  10)
> >> usb0 at uhci0: USB revision 1.0
> >> uhub0 at usb0
> >> uhub0: Intel UHCI root hub, rev 1.00/1.00, addr 1
> >> uhub0: 2 ports with 2 removable, self powered
> >> uhci1 at pci0 dev 29 function 1 "Intel 82801GB USB" rev 0x01: apic 2 int
> >> 19 (irq  11)
> >> usb1 at uhci1: USB revision 1.0
> >> uhub1 at usb1
> >> uhub1: Intel UHCI root hub, rev 1.00/1.00, addr 1
> >> uhub1: 2 ports with 2 removable, self powered
> >> uhci2 at pci0 dev 29 function 2 "Intel 82801GB USB" rev 0x01: apic 2 int
> >> 18 (irq  5)
> >> usb2 at uhci2: USB revision 1.0
> >> uhub2 at usb2
> >> uhub2: Intel UHCI root hub, rev 1.00/1.00, addr 1
> >> uhub2: 2 ports with 2 removable, self powered
> >> uhci3 at pci0 dev 29 function 3 "Intel 82801GB USB" rev 0x01: apic 2 int
> >> 16 (irq  11)
> >> usb3 at uhci3: USB revision 1.0
> >> uhub3 at usb3
> >> uhub3: Intel UHCI root hub, rev 1.00/1.00, addr 1
> >> uhub3: 2 ports with 2 removable, self powered
> >> ehci0 at pci0 dev 29 function 7 "Intel 82801GB USB" rev 0x01: apic 2 int
> >> 23 (irq  10)
> >> usb4 at ehci0: USB revision 2.0
> >> uhub4 at usb4
> >> uhub4: Intel EHCI root hub, rev 2.00/1.00, addr 1
> >> uhub4: 8 ports with 8 removable, self powered
> >> ppb5 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0xe1
> >> pci6 at ppb5 bus 15
> >> vga1 at pci6 dev 0 function 0 "ATI ES1000" rev 0x02
> >> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> >> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> >> ichpcib0 at pci0 dev 31 function 0 "Intel 82801GB LPC" rev 0x01: PM 
> >> disabled
> >> pciide0 at pci0 dev 31 function 1 "Intel 82801GB IDE" rev 0x01: DMA,
> >> channel 0 configured to compatibility, channel 1 configured to
> >> compatibility
> >> atapiscsi0 at pciide0 channel 0 drive 0
> >> scsibus0 at atapiscsi0: 2 targets
> >> cd0 at scsibus0 targ 0 lun 0: <TEAC, CD-224E-N, 1.AA> SCSI0 5/cdrom
> >> removable
> >> cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
> >> pciide0: channel 1 disabled (no drives)
> >> pciide1 at pci0 dev 31 function 2 "Intel 82801GR AHCI SATA" rev 0x01:
> >> DMA, channel 0 wired to native-PCI, channel 1 wired to native-PCI
> >> pciide1: using apic 2 int 19 (irq 11) for native-PCI interrupt
> >> wd0 at pciide1 channel 0 drive 0: <WDC WD3200YS-01PGB0>
> >> wd0: 16-sector PIO, LBA48, 305245MB, 625142448 sectors
> >> wd1 at pciide1 channel 0 drive 1: <WDC WD5000YS-01MPB1>
> >> wd1: 16-sector PIO, LBA48, 476940MB, 976773168 sectors
> >> wd0(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 5
> >> wd1(pciide1:0:1): using PIO mode 4, Ultra-DMA mode 5
> >> wd2 at pciide1 channel 1 drive 0: <WDC WD5000YS-01MPB0>
> >> wd2: 16-sector PIO, LBA48, 476940MB, 976773168 sectors
> >> wd2(pciide1:1:0): using PIO mode 4, Ultra-DMA mode 5
> >> ichiic0 at pci0 dev 31 function 3 "Intel 82801GB SMBus" rev 0x01: apic 2
> >> int 19 (irq 11)
> >> iic0 at ichiic0
> >> lm1 at iic0 addr 0x2d: W83627HF
> >> "unknown" at iic0 addr 0x2f not configured
> >> isa0 at ichpcib0
> >> isadma0 at isa0
> >> pckbc0 at isa0 port 0x60/5
> >> pckbd0 at pckbc0 (kbd slot)
> >> pckbc0: using irq 1 for kbd slot
> >> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> >> pcppi0 at isa0 port 0x61
> >> midi0 at pcppi0: <PC speaker>
> >> spkr0 at pcppi0
> >> lpt0 at isa0 port 0x378/4 irq 7
> >> lm0 at isa0 port 0x290/8: W83627HF
> >> lm1 detached
> >> npx0 at isa0 port 0xf0/16: using exception 16
> >> pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
> >> pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
> >> fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
> >> biomask 0 netmask 0 ttymask 0
> >> ioapic0: pin 16 shares different IPL interrupts (40..50), degraded
> >> performance
> >> pctr: 686-class user-level performance counters enabled
> >> mtrr: Pentium Pro MTRR support
> >> dkcsum: wd0 matches BIOS drive 0x80
> >> dkcsum: wd1 matches BIOS drive 0x82
> >> dkcsum: wd2 matches BIOS drive 0x81
> >> root on wd0a
> >> rootdev=0x0 rrootdev=0x300 rawdev=0x302
> >> cpu1: unknown Core FSB_FREQ value 0 (0x42080000)
> >>
> >> Thank You
> >>
> >> Jonathan Steel
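
For readers following the quoted condition at line 3005 of pmap.c, here
is a rough user-space sketch of what that expression computes on classic
(non-PAE) i386 two-level page tables. It is an illustration under stated
assumptions, not the kernel's code: every sketch_* name and the struct
are hypothetical stand-ins, and the constants (a 22-bit directory shift,
a 0xfffff000 frame mask) are the usual non-PAE values rather than
anything copied from the 4.0 source. The lookup is a plain index into an
array of 4-byte entries, which matches the base + index*4 operand form
0(%eax, %edx, 4) in the fault report. The report alone does not show
which pointer was bad.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants for non-PAE i386 paging (illustrative only). */
#define SKETCH_PDSHIFT    22           /* each directory entry maps 4 MB */
#define SKETCH_PD_ENTRIES 1024         /* entries in one page directory */
#define SKETCH_PG_FRAME   0xfffff000u  /* physical frame bits of an entry */

/* Hypothetical stand-in for the kernel's struct pmap. */
struct sketch_pmap {
    uint32_t *pm_pdir;  /* page directory: 1024 four-byte entries */
};

/* Page directory index of a virtual address: its top 10 bits. */
static uint32_t sketch_pdei(uint32_t va)
{
    return va >> SKETCH_PDSHIFT;
}

/* Stand-in for PDE(pmap, index): an ordinary array lookup.
 * If pm or pm->pm_pdir were stale, this load is where a fault would hit. */
static uint32_t sketch_pde(const struct sketch_pmap *pm, uint32_t index)
{
    assert(index < SKETCH_PD_ENTRIES);
    return pm->pm_pdir[index];
}

/* Mirrors the quoted comparison: nonzero when the directory entry for va
 * does not point at the expected page-table page. */
static int sketch_pde_mismatch(const struct sketch_pmap *pm, uint32_t va,
    uint32_t ptp_phys)
{
    return (sketch_pde(pm, sketch_pdei(va)) & SKETCH_PG_FRAME) != ptp_phys;
}
```

With a directory whose entry 1 holds frame 0x12345000 plus some flag
bits, any address in the second 4 MB region (for example 0x00400abc)
compares equal to 0x12345000 and unequal to any other frame.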
