[ipxe-devel] SuperMicro SYS-1028U-TR+ with Intel X710-DR2 nic card

Todd Stansell todd at stansell.org
Wed Mar 30 00:36:23 UTC 2016


On Tue, Mar 29, 2016 at 09:18:06PM +0100, Michael Brown wrote:
> Should now be fixed in:
> 
>   http://git.ipxe.org/ipxe.git/commitdiff/70509e6
> 
> If you retry with current master, you should now get a "No such file or 
> directory" from "show net0/bustype".

Yep.  I can confirm that.

> >Is there some way to help determine if the problem is within iPXE
> >somewhere or the BIOS or the NIC itself?  It feels like all of these
> >problems are the failed handoff of the bus info above, so it can't load
> >proper network drivers, etc.
> 
> The bus info failure is interesting.  It suggests that pxeprefix.S was 
> unable to retrieve the device information.  You should see an error 
> message somewhere in the "PXE->EB" block of text (before the iPXE 
> startup banner is printed).

You're right.  It looks like this:

PXE->EB: !PXE at 9534:0070, entry point at 9534:0106
	 UNDI code segment 9534:4730, data segment 8DBC:7780 (566-615kB)
	 Unable to determine UNDI physical device, type DIX+802.3
	 566kB free base memory after PXE unload
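
As a sanity check on the addresses in that block, here is a minimal Python
sketch of the real-mode segment:offset arithmetic.  It assumes the second
field on the UNDI code/data segment line is a length rather than an offset
(an assumption, though it does reproduce the 566-615kB range printed above).
The function name `linear` is only for illustration and is not part of iPXE.

```python
import math

def linear(segment: int, offset: int) -> int:
    """Convert a real-mode segment:offset pair to a linear address."""
    return (segment << 4) + offset

# Values copied from the PXE->EB output above.
pxe_struct = linear(0x9534, 0x0070)   # location of the !PXE structure
data_start = linear(0x8DBC, 0)
data_end = data_start + 0x7780        # assumption: 7780 is a length
code_start = linear(0x9534, 0)
code_end = code_start + 0x4730        # assumption: 4730 is a length

print(f"!PXE structure at {pxe_struct:#x}")
print(f"UNDI occupies {data_start // 1024}-{math.ceil(code_end / 1024)}kB")
```

Under that assumption the data segment ends exactly where the code segment
begins, so the memory layout itself looks sane; only the device information
is missing.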

We modified the pxeprefix.S code to print out the structure it reads in the
get_physical_device section, and it just returns all zeros.  So there's
definitely something weird with this BIOS not passing any of the PCI bus
information.  Regardless, from our understanding the UNDI driver should
still work (and mostly does).

> You could even push your luck and try adding a PCI_ROM() line for 
> 8086:1572 to drivers/net/intelx.c.  :)

Well, I tried adding it, but I'm not sure it helped.  When I use ipxe.pxe, I
now see net0-net3 (the 4 onboard gigabit ports) but don't see either of the two
PCI card ports on the 8086:1572.  This might all be related to the lack of
PCI bus information for the card, in that the BIOS is not passing it properly.
So, I'm not too surprised we don't see the ports at all.

So, we went back to undionly.kpxe and did some more trial-and-error testing
with more debugging enabled.  We turned on IPv4 debug messages so we could see
what kinds of packets we're getting and see if we could determine anything
more:

    - the "Unrecognized TCP/IP protocol 51" errors we saw were from random
      multicast packets floating around on the network.
    - the "socket not connected" errors we saw were from other random
      broadcast UDP packets flying around on the network.
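
For reference, the protocol numbers in these messages are IANA-assigned IP
protocol numbers: 6 is TCP, 17 is UDP and 51 is AH (the IPsec Authentication
Header).  The destination 224.0.0.18 seen in the capture below is the
multicast group assigned to VRRP, so the protocol 51 packets would be
consistent with VRRP advertisements using AH authentication, though that is
a guess.  A small Python sketch of pulling these fields out of the debug
lines follows; the regular expression is written against the line format in
the capture below, and the helper name `describe` is made up for
illustration.

```python
import re

# IANA-assigned IP protocol numbers relevant to this capture.
IP_PROTOCOLS = {6: "TCP", 17: "UDP", 51: "AH"}

# Matches lines such as:
#   IPv4 RX 224.0.0.18<-10.64.104.2 len 64 proto 51 id 0004 csum 6872
RX_LINE = re.compile(
    r"IPv4 RX (?P<dst>[\d.]+)<-(?P<src>[\d.]+) len \S+ proto (?P<proto>\d+)"
)

def describe(line: str) -> str:
    """Summarize one 'IPv4 RX' debug line as 'src -> dst: protocol N (name)'."""
    m = RX_LINE.search(line)
    if m is None:
        return "not an IPv4 RX line"
    proto = int(m["proto"])
    name = IP_PROTOCOLS.get(proto, "unknown")
    return f"{m['src']} -> {m['dst']}: protocol {proto} ({name})"

print(describe("IPv4 RX 224.0.0.18<-10.64.104.2 len 64 proto 51 id 0004 csum 6872"))
```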

Pretty much every hang, though, comes just after it receives both this
multicast packet and another broadcast packet, as those are the last messages
we see before things halt.  It really feels like something is causing the
internal state to get messed up, so that iPXE thinks it's waiting for
something it has already received and ends up in a deadlock.  We can hit
CTRL-C to cancel the operation and we'll see it send a FIN/ACK to the
webserver to close the connection cleanly.

Here's an example output compiled with DEBUG=httpcore:3,tcpip:3,ipv4:3,tcp:3
(sadly, I only have a screen capture to work from, and some of the dark blue
on black text is too fuzzy to read; unreadable fields are shown as ____):

    TCP 0x2bab4 bound to port 7302
    TCP 0x2bab4 timer fired in SYN_SENT for 6b07ce5a..6b07ce5a 00000000
    TCP 0x2bab4 TX 7302->80 6b07ce5a..6b07ce5b           00000000    0 SYN
    TCP/IP sending IPv4 packet
    IPv4 TX 10.64.105.41->10.64.100.20 len 64 proto 6 _____
    IPv4 RX 10.64.105.41<-10.64.100.20 len 64 proto 6 _____
    TCP/IP received TCP packet
    TCP 0x2bab4 RX 7302<-80           6b07ce5b 3e81ca02..3e81ca02    0 SYN ACK
    TCP 0x2bab4 using timestamps, SACK, TX window x1, RX window x512
    TCP 0x2bab4 transitioned from SYN_SENT to ESTABLISHED
    HTTP 0x2b5e4 TX GET /boot.cgi?env=ipxe HTTP/1.1
    HTTP 0x2b5e4 TX Connection: keep-alive
    HTTP 0x2b5e4 TX User-Agent: iPXE/1.0.0+ (7050)
    HTTP 0x2b5e4 TX Host: web01
    TCP 0x2bab4 TX 7302->80 6b07ce5b..6b07cec3           3e81ca03  104 PSH ACK
    TCP/IP sending IPv4 packet
    IPv4 TX 10.64.105.41->10.64.100.20 len ____ proto 6 id ____
    IPv4 RX 224.0.0.18<-10.64.104.2 len 64 proto 51 id 0004 csum 6872
    Unrecognized TCP/IP protocol 51
    IPv4 received packet rejected by stack: Error 0x440e6003 (http://ipxe.org/440e6003)
    IPv4 RX 10.64.105.41<-10.64.100.20 len 52 proto 6 id ____ csum ____
    TCP/IP received TCP packet
    TCP 0x2bab4 RX 7302<-80           6b07cec3 3e81ca03..3e81ca03    0 ACK
    ....................

And then it sits and waits forever.
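
One thing the trace does appear to show is that the GET request was fully
acknowledged before the stall.  A minimal Python sketch of the
sequence-number arithmetic, using the values from the trace above; it
assumes the first number on the RX line is the server's acknowledgement
number, and the helper name `seq_diff` is illustrative, not an iPXE
function.

```python
def seq_diff(a: int, b: int) -> int:
    """Distance from b to a in 32-bit TCP sequence space (wraps mod 2**32)."""
    return (a - b) & 0xFFFFFFFF

# TX 7302->80 6b07ce5b..6b07cec3 ... 104 PSH ACK
tx_start, tx_end = 0x6B07CE5B, 0x6B07CEC3
# RX 7302<-80 6b07cec3 ... 0 ACK  (assumed to be the acknowledgement number)
rx_ack = 0x6B07CEC3

sent = seq_diff(tx_end, tx_start)
unacked = seq_diff(tx_end, rx_ack)
print(f"sent {sent} bytes, {unacked} still unacknowledged")
```

If that reading is right, the server has acknowledged all 104 bytes of the
request, and the client is left waiting for response data that never shows
up in the debug output.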

Todd


