[ipxe-devel] iPXE boot fails if multiqueue is enabled in OpenStack
Ladi Prosek
lprosek at redhat.com
Sat Dec 2 07:14:58 UTC 2017
On Wed, Nov 29, 2017 at 3:51 PM, Maxime Coquelin
<maxime.coquelin at redhat.com> wrote:
>
>
> On 11/29/2017 03:04 PM, Ladi Prosek wrote:
>>
>> On Wed, Nov 29, 2017 at 2:36 PM, Maxime Coquelin
>> <maxime.coquelin at redhat.com> wrote:
>>>
>>>
>>>
>>> On 11/29/2017 02:31 PM, Ladi Prosek wrote:
>>>>
>>>>
>>>> On Wed, Nov 29, 2017 at 2:06 PM, Maxime Coquelin
>>>> <maxime.coquelin at redhat.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 11/29/2017 11:42 AM, Ladi Prosek wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Nov 29, 2017 at 9:57 AM, Maxime Coquelin
>>>>>> <maxime.coquelin at redhat.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hi Ladi,
>>>>>>>
>>>>>>> Sorry for the late reply.
>>>>>>>
>>>>>>> On 11/27/2017 05:01 PM, Ladi Prosek wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I think I understand what's going on. DPDK simply won't consider the
>>>>>>>> interface 'ready' until after all queues have been initialized.
>>>>>>>>
>>>>>>>> http://dpdk.org/browse/dpdk/tree/lib/librte_vhost/vhost_user.c#n713
>>>>>>>>
>>>>>>>> It looks like Maxime is the right person to bug about this. One of his
>>>>>>>> recent commits appears to be somewhat related:
>>>>>>>> http://dpdk.org/browse/dpdk/commit/?id=eefac9536a
>>>>>>>>
>>>>>>>> Maxime, iPXE has a simple virtio-net driver that never negotiates the
>>>>>>>> VIRTIO_NET_F_MQ feature and never initializes more than one queue.
>>>>>>>> This makes it incompatible with vhost-user configured with mq=on, as
>>>>>>>> Rafael and Zoltan have discovered.
>>>>>>>>
>>>>>>>> Is there any chance DPDK can be made aware of the VIRTIO_NET_F_MQ
>>>>>>>> feature bit being acked by the guest driver, and successfully operate
>>>>>>>> with one queue in case it was not acked? There's some context below in
>>>>>>>> this email. I can provide instructions on how to build iPXE and launch
>>>>>>>> QEMU to test this if you're interested.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think I get your problem. I'm interested in instructions to reproduce
>>>>>>> the issue.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Here it is, let me know if you run into any issues:
>>>>>>
>>>>>> $ git clone git://git.ipxe.org/ipxe.git
>>>>>> $ cd ipxe/src
>>>>>> $ make bin/1af41000.rom DEBUG=virtio-net:2
>>>>>> $ ln -s bin/1af41000.rom efi-virtio.rom
>>>>>>
>>>>>> Then run QEMU without changing the current directory (i.e. should
>>>>>> still be .../ipxe/src):
>>>>>>
>>>>>> qemu-system-x86_64 \
>>>>>> -machine pc,accel=kvm -m 128M -boot strict=on -device cirrus-vga \
>>>>>> -monitor stdio \
>>>>>> -object memory-backend-file,id=mem,size=128M,mem-path=/dev/hugepages,share=on \
>>>>>> -numa node,memdev=mem \
>>>>>> -chardev socket,id=char1,path=/var/run/openvswitch/vhost-user0 \
>>>>>> -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce \
>>>>>> -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,bootindex=0
>>>>>>
>>>>>> You'll see a bunch of "enqueuing iobuf" debug messages on the screen,
>>>>>> followed by at least "tx complete". Maybe also "rx complete" depending
>>>>>> on what /var/run/openvswitch/vhost-user0 is connected to.
>>>>>>
>>>>>> Now if you enable multiqueue by replacing the last two lines with:
>>>>>>
>>>>>> -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce,queues=16 \
>>>>>> -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mq=on,vectors=34,bootindex=0
>>>>>>
>>>>>> you'll see only "enqueuing iobuf" without any completion, indicating
>>>>>> that the host is not processing packets placed in the tx virtqueue by
>>>>>> iPXE.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks, just tested with DPDK v16.11 & DPDK v17.11 using testpmd instead
>>>>> of OVS.
>>>>>
>>>>> In my case, the packets sent by iPXE are received correctly both with and
>>>>> without mq=on.
>>>>>
>>>>> I don't think there is an issue with the virtio_is_ready() code you
>>>>> mentioned. Indeed, nr_vrings gets incremented only when receiving
>>>>> vhost-user protocol requests for a new ring. This code has changed
>>>>> between v16.11 and v17.11 but the idea remains the same.
>>>>
>>>>
>>>>
>>>> Thank you for looking into it.
>>>>
>>>>> In the case of iPXE, it only sends requests for queues 0 & 1, so
>>>>> nr_vrings is two.
>>>>
>>>>
>>>>
>>>> Interesting, that's not what I see. Here's part of the log I recorded:
>>>>
>>>> dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
>>>> dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 0
>>>> netdev_dpdk|INFO|State of queue 0 ( tx_qid 0 ) of vhost device
>>>> '/var/run/openvswitch/vhost-user1'changed to 'enabled'
>>>> dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
>>>> dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
>>>> dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
>>>> dpdk|INFO|VHOST_CONFIG: set queue enable: 0 to qp idx: 2
>>>> netdev_dpdk|INFO|State of queue 2 ( tx_qid 1 ) of vhost device
>>>> '/var/run/openvswitch/vhost-user1'changed to 'disabled'
>>>> dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
>>>> dpdk|INFO|VHOST_CONFIG: set queue enable: 0 to qp idx: 3
>>>> dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
>>>> dpdk|INFO|VHOST_CONFIG: set queue enable: 0 to qp idx: 4
>>>> netdev_dpdk|INFO|State of queue 4 ( tx_qid 2 ) of vhost device
>>>> '/var/run/openvswitch/vhost-user1'changed to 'disabled'
>>>> dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
>>>> dpdk|INFO|VHOST_CONFIG: set queue enable: 0 to qp idx: 5
>>>> dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
>>>> ...
>>>
>>>
>>>
>>> Do you have VHOST_USER_SET_VRING_ADDR for idx > 1 in your log?
>>
>>
>> No, VHOST_USER_SET_VRING_ADDR is logged only for 0 and 1.
>
>
> Hmm, just reproduced it.
> I forgot to update -netdev, I only updated -device...
We've created https://bugzilla.redhat.com/show_bug.cgi?id=1518884
to track the issue.
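
To make the failure mode discussed in this thread concrete, here is a minimal Python sketch. It is a simplified model only, not DPDK's actual code: the helper names (rings_seen, device_ready) are hypothetical, and it assumes, as described in the quoted discussion, that every VHOST_USER_SET_VRING_ENABLE message raises the ring count while the device only becomes ready once every counted ring has been set up. The log lines are a shortened excerpt of the ones quoted below.

```python
import re

# Matches the "set queue enable" lines from the quoted OVS/DPDK log.
ENABLE_RE = re.compile(r"set queue enable: (\d+) to qp idx: (\d+)")


def rings_seen(log_text):
    """Return {ring index: enabled flag} for every 'set queue enable' line."""
    return {int(idx): bool(int(flag)) for flag, idx in ENABLE_RE.findall(log_text)}


def device_ready(nr_vrings, initialized):
    """Simplified model (assumption): ready only if every counted ring was set up."""
    return nr_vrings > 0 and all(i in initialized for i in range(nr_vrings))


LOG = """\
dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 0
dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
dpdk|INFO|VHOST_CONFIG: set queue enable: 0 to qp idx: 2
dpdk|INFO|VHOST_CONFIG: set queue enable: 0 to qp idx: 3
"""

seen = rings_seen(LOG)
nr_vrings = max(seen) + 1  # assumption: each ENABLE message bumps the ring count
addressed = {0, 1}         # VHOST_USER_SET_VRING_ADDR was logged only for 0 and 1

print(device_ready(nr_vrings, addressed))  # False: rings 2 and 3 were never set up
print(device_ready(2, addressed))          # True: only the rings iPXE uses are counted
```

In this model, a guest that only ever initializes rings 0 and 1 can never satisfy the check once the disabled rings are counted as well, which matches the missing "tx complete" messages on the iPXE side.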
>>>>
>>>>
>>>> VHOST_USER_SET_VRING_ENABLE is sent for all queues and nr_vrings reflects
>>>> that.
>>>
>>>
>>>
>>> This is surprising, what QEMU version are you using?
>>> I tried with qemu-2.10.1-1.fc27.
>>
>>
>> Latest git master. And Zoltan is on 2.5 based on the output of apt
>> list he sent earlier.
>>
>>>
>>>>
>>>>> I will try with OVS to try to reproduce your issue, what OVS version are
>>>>> you using?
>>>>
>>>>
>>>>
>>>> I had built the most recent OVS from the master branch at commit
>>>> a7ce5b8. Thanks!
>>>>
>>>>> Thanks,
>>>>> Maxime
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>> Ladi
>>>>>>
>>>>>
>>>
>