[ipxe-devel] intelxlvf: LAN Queue Overflow Event error on PF

Greg Edwards gedwards at ddn.com
Wed Sep 25 17:22:21 UTC 2019


I've been testing the intelxlvf driver (ipxe commit 41a9a5c7b367) on a
system with a quad-port X722 controller.  The bare metal system is
running a 5.3.1 kernel and QEMU 4.0.0.  We have a cobbler server the VMs
PXE boot from.

When I PXE boot a VM with one of the X722 VFs attached, it boots into
the cobbler chooser menu fine.  However, if I don't immediately select
an option, I see the following error on the hypervisor host for the PF:

[ 3864.490352] i40e 0000:19:00.0: ARQ LAN queue overflow event received
[ 3864.497045] i40e 0000:19:00.0: overflow Rx Queue Number = 129 QTX_CTL=0x00000000

When I do select an option from the cobbler menu, the intelxlvf driver
spits out:

INTELXL 0x1e438 unrecognised status change event 0x2:
00000000 : 02 00 00 00 f3 8f ff ff-00 7b c0 b4 ff 00 00 00 : .........{.......

It then hangs and never boots the selected entry.

Looking at the Linux i40e PF driver code, the handling of the LAN Queue
Overflow Event first notifies the VF it is about to be reset, then
resets it.  From i40e_handle_lan_overflow_event():

        /* Queue belongs to VF, find the VF and issue VF reset */
        if (((qtx_ctl & I40E_QTX_CTL_PFVF_Q_MASK)
            >> I40E_QTX_CTL_PFVF_Q_SHIFT) == I40E_QTX_CTL_VF_QUEUE) {
                vf_id = (u16)((qtx_ctl & I40E_QTX_CTL_VFVM_INDX_MASK)
                         >> I40E_QTX_CTL_VFVM_INDX_SHIFT);
                vf_id -= hw->func_caps.vf_base_id;
                vf = &pf->vf[vf_id];
                i40e_vc_notify_vf_reset(vf);
                /* Allow VF to process pending reset notification */
                msleep(20);
                i40e_reset_vf(vf, false);
        }
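
For reference, here is a minimal standalone sketch of the field extraction
that the test above performs on QTX_CTL.  The shift, mask and owner values
are assumptions recalled from the Linux i40e headers, not taken from this
report, so check them against your kernel tree.  The helper and enum names
are hypothetical.

```c
#include <stdint.h>

/* Assumed QTX_CTL layout (from memory of the Linux i40e headers):
 * bits 1:0 hold the queue owner type, bits 15:7 the VF/VM index. */
#define QTX_CTL_PFVF_Q_SHIFT    0
#define QTX_CTL_PFVF_Q_MASK     (0x3u << QTX_CTL_PFVF_Q_SHIFT)
#define QTX_CTL_VFVM_INDX_SHIFT 7
#define QTX_CTL_VFVM_INDX_MASK  (0x1FFu << QTX_CTL_VFVM_INDX_SHIFT)

/* Assumed owner encodings; hypothetical names for illustration. */
enum qtx_owner {
        QTX_OWNER_VF = 0,
        QTX_OWNER_VM = 1,
        QTX_OWNER_PF = 2
};

/* Extract the queue owner type from a raw QTX_CTL register value. */
static unsigned int qtx_ctl_owner(uint32_t qtx_ctl)
{
        return (qtx_ctl & QTX_CTL_PFVF_Q_MASK) >> QTX_CTL_PFVF_Q_SHIFT;
}

/* Extract the absolute VF/VM index, before the PF subtracts its
 * vf_base_id to get a PF-relative VF number. */
static unsigned int qtx_ctl_vfvm_index(uint32_t qtx_ctl)
{
        return (qtx_ctl & QTX_CTL_VFVM_INDX_MASK) >> QTX_CTL_VFVM_INDX_SHIFT;
}
```

Under these assumed layouts, the logged value QTX_CTL=0x00000000 decodes to
owner type 0 and index 0, which would send the PF driver down the VF reset
branch quoted above.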

i40e_vc_notify_vf_reset() sends VIRTCHNL_EVENT_RESET_IMPENDING (0x2) to
the VF, which I believe is the unrecognized 0x2 event the intelxlvf
driver complains about.
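
To illustrate, here is a minimal sketch of how a VF driver could classify
the event payload shown in the hex dump.  It assumes the first four bytes
are a little-endian virtchnl event code, which matches the leading
02 00 00 00.  The event code values follow the virtchnl interface; the
action enum and function names are hypothetical and are not iPXE's actual
code.

```c
#include <stddef.h>
#include <stdint.h>

/* Event codes as defined by the virtchnl interface. */
enum vc_event {
        VC_EVENT_UNKNOWN         = 0,
        VC_EVENT_LINK_CHANGE     = 1,
        VC_EVENT_RESET_IMPENDING = 2,
        VC_EVENT_PF_DRIVER_CLOSE = 3
};

/* Hypothetical actions a VF driver might take for each event. */
enum vf_action {
        VF_IGNORE,
        VF_UPDATE_LINK,
        VF_AWAIT_RESET,
        VF_STOP
};

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Map a raw PF event message to an action, based only on the
 * leading event code; the rest of the payload is not interpreted. */
static enum vf_action classify_pf_event(const uint8_t *buf, size_t len)
{
        if (len < 4)
                return VF_IGNORE;

        switch (read_le32(buf)) {
        case VC_EVENT_LINK_CHANGE:
                return VF_UPDATE_LINK;
        case VC_EVENT_RESET_IMPENDING:
                return VF_AWAIT_RESET;
        case VC_EVENT_PF_DRIVER_CLOSE:
                return VF_STOP;
        default:
                return VF_IGNORE;
        }
}
```

Fed the 16 bytes from the dump above, classify_pf_event() takes the
VC_EVENT_RESET_IMPENDING branch.  What the driver should then do (wait for
the reset to complete and reinitialise its queues, for example) is outside
the scope of this sketch.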

Looking at the Intel® Ethernet Controller X710/XXV710/XL710 Datasheet
(Revision 3.65, August 2019), I did come across section
"Prevention of PFC Crosstalk between PCIe Functions", which describes
one reason this scenario could happen, though I don't know if that is
the case.

If I immediately select a cobbler entry upon being presented the menu,
it all works fine.  If I take 20s or so to choose an entry, that's when
I encounter the LAN Queue Overflow Event error.

