[ipxe-devel] Windows having problems parsing iBFT from recent iPXE versions?

Floris Bos bos at je-eigen-domein.nl
Wed Oct 29 22:15:56 UTC 2014


Hi,

On 10/29/2014 06:31 PM, Michael Brown wrote:
> On 29/10/14 17:14, Floris Bos wrote:
>> I'm not sure if it is actually the iBFT that is the problem.
>> My initial guess was that was the case because the nameserver does not
>> show up in "ipconfig", and my iSCSI disk is not there.
>> But perhaps Windows does not copy the nameserver from iBFT, but normally
>> gets that by using normal DHCP later on.
>> And the real problem is that network connectivity is just screwed up,
>> perhaps caused by iPXE leaving the network adapter in some kind of state
>> Windows is not expecting.
>> That I am seeing DHCP requests, and repeated ARP requests for the IP of
>> my SAN after Windows booted supports the theory that it does have the
>> iBFT, but that Windows is able to transmit network packets, but somehow
>> has problems receiving them.
>>
>> - Several commits I tried before "[tcp] Do not send RST for unrecognised
>> connections" all work properly
>> - Several commits I tried after, all fail
>> - It might be coincidence, but I just managed to get HEAD to work by
>> reversing both "[tcp] Do not send RST for unrecognised connections" and
>> "[tcp] Defer sending ACKs until all received packets have been
>> processed" both which do hackery in src/net/tcp.c.
>
> The problem does not seem to be related to the iBFT; I think we can 
> leave that aside for now.
>
> Interesting.  I wonder if it could be somehow related to the 
> possibility of packets arriving between the time that Windows last 
> allows iPXE control of the NIC (via an INT 13 call) and the moment 
> that the Windows native driver starts up.
>
> Unfortunately there is no way to enforce a clean handover of the NIC 
> when doing anything with iSCSI, since the INT 13 API simply does not 
> have any "shut down device" call.  The Windows driver will therefore 
> always find the NIC in a slightly unexpected state in which it is 
> already up and running and receiving packets.  It's plausible that the 
> two TCP-related changes alter the behaviour in terms of when packets 
> are transmitted (and thus responses received) sufficiently to 
> trigger/avoid a bug.
>
> You could try using the iPXE native driver instead of undionly.kkpxe. 
> This will definitely change the state of the NIC at the time that the 
> Windows driver starts up, and it may be that Windows likes this state 
> better.
>

Does seem to work with the native driver.

> You could also try using wireshark to see if there are any packets 
> present on the network which might arrive after iPXE last relinquishes 
> control (i.e. after the last packet sent by iPXE within its TCP 
> connection to the iSCSI target) but before the Windows driver has 
> started up (i.e. before Windows' initial DHCP request or anything else 
> which has obviously been sent by Windows).
>

undionly.kkpxe with the two patches reversed (does work):

- iPXE communication seems to end with an iSCSI read response, iPXE ACKs 
nicely
- then there is this long wait on Windows startup (waiting for disks?), 
and during that there are some TCP retransmissions of an iSCSI NOP 
command trying to keep the connection warm from SAN to virtualbox.
- straight after that Windows takes over, there is some DHCP/ARP traffic 
(not shown below), a LLMNR request for wpad, and a new iSCSI login.

==
No.     Time           Source                Destination           Protocol Length Info
     315 29.079661000   192.168.178.4         192.168.178.99        iSCSI    116    SCSI: Read(10) LUN: 0x00 (LBA: 0x00000000, Len: 1)
     316 29.080009000   192.168.178.99        192.168.178.4         TCP      116    [TCP segment of a reassembled PDU]
     317 29.080017000   192.168.178.99        192.168.178.4         iSCSI    580    SCSI: Data In LUN: 0x00 (Read(10) Response Data) SCSI: Response LUN: 0x00 (Read(10)) (Good)
     318 29.080127000   192.168.178.4         192.168.178.99        TCP      68     5624 > iscsi-target [ACK] Seq=929 Ack=4389 Win=262144 Len=0 TSval=1345915 TSecr=9180359
     319 29.080229000   192.168.178.4         192.168.178.99        TCP      68     5624 > iscsi-target [ACK] Seq=929 Ack=4901 Win=262144 Len=0 TSval=1345915 TSecr=9180359
     383 39.098163000   192.168.178.99        192.168.178.4         iSCSI    116    NOP In
     384 39.941994000   192.168.178.99        192.168.178.4         iSCSI    116    [TCP Retransmission] NOP In
     385 41.633880000   192.168.178.99        192.168.178.4         iSCSI    116    [TCP Retransmission] NOP In
     388 45.017528000   192.168.178.99        192.168.178.4         iSCSI    116    [TCP Retransmission] NOP In
     403 51.784762000   192.168.178.99        192.168.178.4         iSCSI    116    [TCP Retransmission] NOP In
     907 172.706543000  192.168.178.4         224.0.0.252           LLMNR    66     Standard query 0xbed4  A wpad
     908 172.706550000  192.168.178.4         224.0.0.252           LLMNR    66     Standard query 0xbed4  A wpad
     909 172.787428000  192.168.178.4         192.168.178.99        TCP      68     49154 > iscsi-target [SYN] Seq=0 Win=8192 Len=0 MSS=1460 WS=256 SACK_PERM=1
     910 172.787730000  192.168.178.99        192.168.178.4         TCP      68     iscsi-target > 49154 [SYN, ACK] Seq=0 Ack=1 Win=14600 Len=0 MSS=1460 SACK_PERM=1 WS=8
     911 172.787999000  192.168.178.4         192.168.178.99        TCP      56     49154 > iscsi-target [ACK] Seq=1 Ack=1 Win=65536 Len=0
     912 172.789389000  192.168.178.4         192.168.178.99        iSCSI    244    Login Command
[...various iSCSI commands...]
    2149 173.114694000  192.168.178.4         224.0.0.252           LLMNR    66     Standard query 0xbed4  A wpad
    2150 173.114697000  192.168.178.4         224.0.0.252           LLMNR    66     Standard query 0xbed4  A wpad
    2169 176.317161000  192.168.178.4         192.168.178.255       NBNS     112    Registration NB WORKGROUP<00>
    2170 176.317172000  192.168.178.4         192.168.178.255       NBNS     112    Registration NB WORKGROUP<00>
    2171 176.317356000  192.168.178.4         192.168.178.255       NBNS     112    Registration NB MININT-SG3NP4U<00>
==

undionly.kkpxe without patch reversion (does NOT work):

- iPXE only ACKs up to 4389; the final ACK (Ack=4901) never appears, so 
the SAN keeps retransmitting the last 512 bytes of the iSCSI read 
response. Those packets may arrive when Windows is about to take over.
- Windows does not seem to do any iSCSI communication

==
     293 30.869835000   192.168.178.4         192.168.178.99        iSCSI    116    SCSI: Read(10) LUN: 0x00 (LBA: 0x00000000, Len: 1)
     294 30.870209000   192.168.178.99        192.168.178.4         TCP      116    [TCP segment of a reassembled PDU]
     295 30.870230000   192.168.178.99        192.168.178.4         iSCSI    580    SCSI: Data In LUN: 0x00 (Read(10) Response Data) SCSI: Response LUN: 0x00 (Read(10)) (Good)
     296 30.870346000   192.168.178.4         192.168.178.99        TCP      68     28509 > iscsi-target [ACK] Seq=929 Ack=4389 Win=262144 Len=0 TSval=1363248 TSecr=9418323
     315 34.310476000   192.168.178.99        192.168.178.4         TCP      580    [TCP Retransmission] iscsi-target > 28509 [PSH, ACK] Seq=4389 Ack=929 Win=16624 Len=512 TSval=9419184 TSecr=1363248[Reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)]
     353 41.189666000   192.168.178.99        192.168.178.4         TCP      580    [TCP Retransmission] iscsi-target > 28509 [PSH, ACK] Seq=4389 Ack=929 Win=16624 Len=512 TSval=9420904 TSecr=1363248[Reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)]
     522 54.980144000   192.168.178.99        192.168.178.4         TCP      580    [TCP Retransmission] iscsi-target > 28509 [PSH, ACK] Seq=4389 Ack=929 Win=16624 Len=512 TSval=9424352 TSecr=1363248[Reassembly error, protocol TCP: New fragment overlaps old data (retransmission?)]
     549 60.879695000   192.168.178.99        192.168.178.4         iSCSI    164    NOP In, NOP In
    1333 173.270305000  192.168.178.4         192.168.178.255       NBNS     112    Registration NB MININT-GRDEK79<00>
    1334 173.270318000  192.168.178.4         192.168.178.255       NBNS     112    Registration NB MININT-GRDEK79<00>
    1335 173.270536000  192.168.178.4         192.168.178.255       NBNS     112    Registration NB WORKGROUP<00>
==
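
The window Michael describes (after iPXE's last packet, before the first 
packet that is obviously from Windows) can be picked out of the failing 
capture mechanically. A minimal sketch, assuming each packet is labelled 
by hand with who sent it; the labels, struct and function names are my 
own and do not come from Wireshark or iPXE. The timestamps are those of 
the failing capture:

```c
#include <assert.h>
#include <stddef.h>

/* Who sent each packet; labelled by hand from the capture (my assumption). */
enum sender { FROM_IPXE, FROM_TARGET, FROM_WINDOWS };

struct pkt {
	unsigned int no;	/* Wireshark packet number */
	double time;		/* seconds since start of capture */
	enum sender from;
};

/* The failing capture (undionly.kkpxe without patch reversion). */
static const struct pkt failing_capture[] = {
	{  293,  30.869835, FROM_IPXE    },	/* Read(10) */
	{  294,  30.870209, FROM_TARGET  },
	{  295,  30.870230, FROM_TARGET  },
	{  296,  30.870346, FROM_IPXE    },	/* ACK, Ack=4389 */
	{  315,  34.310476, FROM_TARGET  },	/* retransmission */
	{  353,  41.189666, FROM_TARGET  },	/* retransmission */
	{  522,  54.980144, FROM_TARGET  },	/* retransmission */
	{  549,  60.879695, FROM_TARGET  },	/* NOP In, NOP In */
	{ 1333, 173.270305, FROM_WINDOWS },	/* NBNS registration */
};
#define FAILING_CAPTURE_LEN \
	( sizeof ( failing_capture ) / sizeof ( failing_capture[0] ) )

/*
 * Count target packets sent after the last iPXE packet and before the
 * first Windows packet.  The two boundary times are returned via
 * last_ipxe and first_windows (negative if no such packet exists, in
 * which case that side of the window is left open).
 */
static size_t count_handover_window ( const struct pkt *pkts, size_t len,
				      double *last_ipxe,
				      double *first_windows ) {
	size_t i, count = 0;

	assert ( pkts != NULL );
	*last_ipxe = -1.0;
	*first_windows = -1.0;

	/* Find the boundaries of the window */
	for ( i = 0 ; i < len ; i++ ) {
		if ( pkts[i].from == FROM_IPXE && pkts[i].time > *last_ipxe )
			*last_ipxe = pkts[i].time;
		if ( pkts[i].from == FROM_WINDOWS &&
		     ( *first_windows < 0 || pkts[i].time < *first_windows ) )
			*first_windows = pkts[i].time;
	}

	/* Count target packets falling inside the window */
	for ( i = 0 ; i < len ; i++ ) {
		if ( pkts[i].from == FROM_TARGET &&
		     pkts[i].time > *last_ipxe &&
		     ( *first_windows < 0 || pkts[i].time < *first_windows ) )
			count++;
	}
	return count;
}
```

Called on failing_capture, this counts four target packets in the window 
(315, 353, 522 and 549): the three retransmissions and the NOP In.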


>>> A problem is that the SAN-booted OS is likely to clear the screen
>>> almost immediately, meaning that the warning message would not be seen
>>> in practice.
>>
>> But if I am doing a SAN installation couldn't the warning be printed the
>> moment I do the sanhook command?
>
> The iBFT is not created until you attempt to boot from the SAN target.

Thought you already had memory reserved for it in some data segment, 
before filling it in.

==
/** The boot firmware table generated by iPXE */
static union xbft_table __bss16 ( xbftab ) __attribute__ (( aligned ( 16 ) ));
==
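
To illustrate how I read that declaration, here is a minimal sketch in 
plain C (GCC/Clang attribute syntax). The union demo_table, its size and 
the helper functions are made up for illustration and are not iPXE's; 
__bss16 is left out, on the assumption that it only selects the segment 
the variable is placed in. The point is that the storage exists, zeroed, 
from the start, and the contents are only written later:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for iPXE's union xbft_table; the real layout differs. */
union demo_table {
	char signature[4];
	uint8_t raw[768];	/* arbitrary size, for illustration only */
};

/*
 * Static storage: the space is reserved at build time and is zero-filled
 * when the program starts.  Nothing has been "created" in it yet.
 */
static union demo_table demo_tab __attribute__ (( aligned ( 16 ) ));

/* Return 1 if the reserved area starts on a 16-byte boundary. */
static int demo_table_is_aligned ( void ) {
	return ( ( ( uintptr_t ) &demo_tab ) % 16 ) == 0;
}

/* Return 1 if the table has not been filled in yet (all bytes zero). */
static int demo_table_is_empty ( void ) {
	size_t i;

	for ( i = 0 ; i < sizeof ( demo_tab.raw ) ; i++ ) {
		if ( demo_tab.raw[i] )
			return 0;
	}
	return 1;
}

/* Fill in the table, as I assume happens when booting from the SAN target. */
static void demo_table_fill ( void ) {
	memcpy ( demo_tab.signature, "iBFT", 4 );
}
```

Before demo_table_fill() is called the area is reserved and aligned but 
empty; afterwards it starts with the "iBFT" signature.
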

Or am I misunderstanding what that does?
(not a low level programmer)

-- 
Yours sincerely,

Floris Bos



