Hi Matthew,

Stefan is right: you should reduce the verbosity of the DEBUG messages to find the cause of the failure.
I have tried SRP boot only with the Hermon driver (ConnectX), and it worked for me.

Regards,
Itay

On Wed, Jun 23, 2010 at 11:27 AM, Stefan Hajnoczi <[email protected]> wrote:

> On Wed, Jun 23, 2010 at 6:44 AM, M Lowe <[email protected]> wrote:
> > My motherboard doesn't have a serial port, so that's not an
> > option. Unless gPXE supports USB-Serial converters?
>
> gPXE doesn't support USB. Can your BIOS redirect to (USB-)serial?
>
> > Arbel 0x1f7c4 command failed with status 22:
> > 000404c8: 00 00 00 00 00 00 00 00-00 00 00 00 00 (rest of line cut off)
> > 000404d8: cf ec 00 00 00 00 0 (rest of line cut off)
>
> I think this is the error code (from the Linux driver):
>
>     /* HCA local attached memory not present: */
>     MTHCA_CMD_STAT_LAM_NOT_PRE = 0x22,
>
> The gPXE source says this error can be ignored.
>
> > Arbel 0x1f7c4 command failed with status 0a:
> > 0004019c: 00 00 00 00 cf eb f0 00-00 00 00 02 00 00 00 00 : ...............
> > 000401ac: cf ec 00 00 00 00 00 00-0a 00 30 24 : .........0$
> > Arbel 0x1f7c4 could not issue MAD IFC: Input/output error (0x1d714039)
>
> Error code from Linux again:
>
>     /* Index out of range: */
>     MTHCA_CMD_STAT_BAD_INDEX = 0x0a,
>
> I think this happens here:
>
>     /* Update MAD parameters */
>     for ( i = 0 ; i < ARBEL_NUM_PORTS ; i++ )
>         ib_smc_update ( arbel->ibdev[i], arbel_mad );
>
> The driver defines ARBEL_NUM_PORTS to 2, so perhaps it is probing a port
> that doesn't exist. This should be fine, too.
>
> > It seems that running gdbstub halts whatever thread is handling the
> > network IO, making it impossible to connect to gdbstub over udp. After
> > exiting gdbstub, gPXE starts responding to pings and arp requests again.
>
> The gdbstub performs low-level network I/O - it directly polls the network
> device for packets. The network stack will not respond while the gdbstub
> is active. However, the gdbstub implements ARP response directly.
>
> Are you running gdbudp on the NIC you are trying to debug? In order to be
> able to debug the arbel driver, gdbudp needs to use another NIC (e.g. an
> e1000 card). This is because setting breakpoints in the arbel code won't
> work if gdbudp is using the arbel card.
>
> > Any ideas?
>
> I think you are on the right track looking at DBG() messages. You've
> established that transmit is working and the target receives the login
> request.
>
> You might need to reduce the number of DBG() messages in gPXE's
> receive code path when debugging without a serial port. Run without
> the ":3" on the DEBUG= options for less verbose output. You can also
> try commenting out or moving DBG() messages that are too frequent and
> not useful.
>
> The aim would be to find out if the response is being received at each
> layer of the stack (arbel driver, infiniband, srp) and then understand
> the reason for dropping the response.
>
> Michael Brown and Itay Gazit may have better Infiniband and SRP
> debugging ideas. I have CCed them and added the gPXE mailing list
> (the Etherboot-discuss list has been replaced by [email protected]).
>
> Stefan
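
For reference, the two command status values discussed in the quoted message could be decoded with a small lookup like the one below. This is only a sketch under stated assumptions: the two constants and their values are the ones quoted in this thread from the Linux driver, while cmd_status_name() and cmd_status_is_benign() are hypothetical helpers that exist in neither gPXE nor Linux. The "benign" classification only restates the assessment given in this thread (status 22 can be ignored, status 0a "should be fine, too") and is not a general rule.

```c
/* Status values quoted in this thread from the Linux mthca driver. */
#define MTHCA_CMD_STAT_BAD_INDEX   0x0a /* Index out of range */
#define MTHCA_CMD_STAT_LAM_NOT_PRE 0x22 /* HCA local attached memory not present */

/*
 * Hypothetical helper (not part of gPXE or Linux): map a status value
 * from an "Arbel ... command failed with status XX" line to a short name.
 * The log prints the status in hexadecimal, so "status 22" is 0x22.
 */
static const char *cmd_status_name(unsigned int status)
{
    switch (status) {
    case MTHCA_CMD_STAT_BAD_INDEX:
        return "BAD_INDEX";
    case MTHCA_CMD_STAT_LAM_NOT_PRE:
        return "LAM_NOT_PRE";
    default:
        /* Only the two codes discussed in this thread are covered. */
        return "UNKNOWN";
    }
}

/*
 * Hypothetical helper: return 1 for the statuses that this thread
 * judged harmless for this setup, 0 for everything else.
 */
static int cmd_status_is_benign(unsigned int status)
{
    return status == MTHCA_CMD_STAT_LAM_NOT_PRE ||
           status == MTHCA_CMD_STAT_BAD_INDEX;
}
```

With these helpers, cmd_status_name(0x22) yields "LAM_NOT_PRE" and cmd_status_is_benign(0x0a) yields 1. Any other status would be reported as "UNKNOWN" and not benign, so it would still need to be looked up by hand.
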
_______________________________________________
gPXE mailing list
[email protected]
http://etherboot.org/mailman/listinfo/gpxe
