[ipxe-devel] [PATCH] [tcp] Send keepalive packets to prevent TCP stalls

Thu Jun 9 14:12:10 UTC 2016

On Thu, Jun 9, 2016 at 3:11 PM, Michael Brown <mcb30 at ipxe.org> wrote:
> On 09/06/16 13:44, Ladi Prosek wrote:
>>>
>>> Do you know what prevents the usual TCP retransmission mechanism from
>>> recovering?  ARP discovery should still work even for retransmitted
>>> packets.
>>
>>
>> Just like you wrote in the ipxe-devel thread linked from the commit
>> description, from the client point of view the connection is "stable".
>> Everything the client has sent has been acked so the retransmission
>> timer is not running. The server is retransmitting for sure but its
>> packets just can't reach the client - they're routed somewhere else or
>> are blackholed altogether.
>
>
> Understood that the client (iPXE) will not be retransmitting, but that still
> doesn't explain what happens to the server's retransmitted packets.
>
>> I can get to this state easily by configuring my virtual NIC with the
>> hardcoded default MAC. There are more such hosts on the network
>> claiming the same MAC so sooner or later I find myself cut off.
>
>
> OK, but in that situation we don't expect traffic to get through anyway;
> it's a broken setup.

Yes, there definitely has to be something broken about the network
setup for this to make a difference. Unfortunately, and please excuse
my pragmatism, that's often the reality and finding/fixing the root
cause may be prohibitively hard.

> I'm trying to think of a case in which this situation could arise in a
> non-broken setup, to convince myself that this is something we should be
> adding.  The best I can think of off-hand is where iPXE is behind some kind
> of NAT, and the NATting device has lost track of the relevant state.

NAT losing state is definitely one plausible case. Another could be
some kind of a multi-path setup where failover has just happened and
the new path is unaware of the connection. Or a virtual machine that
has just been migrated to another part of the network and the
infrastructure is still learning its new location, whatever that means
:-p Hosts that have just come up and are booting could face all kinds
of network instability problems.

That's all I can offer in terms of supporting arguments. I know that
it's been tried and it works. But I also know that there's no RFC to
refer to; it's grey territory at best.

>> That sounds good. Under certain circumstances this may generate
>> otherwise unnecessary traffic so I just want to be careful. For
>> example if it's an HTTP connection and it is kept alive (as in HTTP
>> keepalive), it will look idle and will be pinging the server with
>> keepalives periodically even though it's not waiting for anything. Big
>> deal? Probably not. Worth adding a way for upper layers to signal this
>> down to the TCP implementation? Probably not either.
>
>
> I think we should keep it as simple as possible.  Always send keepalives on
> any established connection, use start_timer_fixed() with some period long
> enough to not be disruptive to real traffic (e.g. 15 seconds), reset the
> timer whenever any packet is received on that connection.
>
> It might also be desirable to use the common transmit path to send the
> keepalive packet, if that can cleanly be made to result in smaller and
> simpler code.  I don't think we need to use the (seq-1) trick since we're
> not aiming to elicit a response unless the remote end genuinely has
> something it's already trying to send.  We should be able to just send an
> unsolicited pure ACK, which the existing transmit path can already create
> via the TCP_ACK_PENDING flag.

I'm all for making it as simple as possible. Unifying it with the
regular transmit path sounds great. The seq-1 trick could make it
marginally easier to follow what's going on when looking at the
traffic with a network analyzer but it's certainly not required.
Should I prepare a simplified v2 of the patch?
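
To make sure we mean the same thing, here is a rough standalone sketch
of the simplified logic as I understand it. It is only a model, not
iPXE code: struct conn, the conn_*() helpers and the millisecond clock
are hypothetical names made up for illustration, ack_pending merely
stands in for the TCP_ACK_PENDING flag, and the 15 second period is
the example value from your mail. A real patch would use the retry
timer via start_timer_fixed() instead of comparing timestamps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Keepalive period; 15 seconds is the example value from the thread */
#define KEEPALIVE_PERIOD_MS 15000u

/* Hypothetical stand-in for per-connection state (not iPXE's structure) */
struct conn {
	bool established;       /* connection is in the established state */
	bool ack_pending;       /* models the TCP_ACK_PENDING flag */
	uint32_t keepalive_due; /* time (ms) at which the next keepalive fires */
	unsigned int acks_sent; /* pure ACKs emitted, counted for the demo */
};

/* (Re)arm the fixed-period keepalive timer */
static void conn_keepalive_restart(struct conn *c, uint32_t now) {
	c->keepalive_due = now + KEEPALIVE_PERIOD_MS;
}

/* Connection has become established: start sending keepalives */
static void conn_established(struct conn *c, uint32_t now) {
	c->established = true;
	c->ack_pending = false;
	c->acks_sent = 0;
	conn_keepalive_restart(c, now);
}

/* Any received packet shows the inbound path works: reset the timer */
static void conn_rx(struct conn *c, uint32_t now) {
	assert(c->established);
	conn_keepalive_restart(c, now);
}

/* Common transmit path: emits a pure ACK if one is pending */
static void conn_xmit(struct conn *c) {
	if (c->ack_pending) {
		c->acks_sent++;
		c->ack_pending = false;
	}
}

/* Periodic poll: on timer expiry, send an unsolicited pure ACK */
static void conn_poll(struct conn *c, uint32_t now) {
	if (!c->established)
		return;
	/* Signed difference keeps the comparison valid across wraparound */
	if ((int32_t)(now - c->keepalive_due) >= 0) {
		c->ack_pending = true;
		conn_xmit(c);
		conn_keepalive_restart(c, now);
	}
}
```

In this model there is no separate keepalive packet builder: the timer
only marks an ACK as pending and lets the ordinary transmit routine do
the rest, which is what should keep the real change small.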

Thanks,
Ladi