[ipxe-devel] arbel sanboot ib_srp

Lee Staples lee.staples at gmail.com
Tue May 24 08:55:00 UTC 2011


Hi Michael,

I would be grateful if you could advise whether the following known firmware
bug might be the cause:


2. DoorBell loss

DoorBells may be lost on systems with a 64-KByte page size upon heavy stress
conditions

http://www.mellanox.com/pdf/firmware/fw-25218-5_3_000-release_notes.pdf

Regards

Lee



On 19 May 2011 08:34, Michael Brown <mbrown at fensystems.co.uk> wrote:

>  On Wednesday 18 May 2011 20:28:43 Lee Staples wrote:
> > Have just reinstalled SCST and built latest ipxe commit
> > (c49659c4f26e23f3fc234c2068786872554daa69) with the arbel queue pair
> > patch from last year.
> >
> > Would be grateful if you could take a look at the error I'm getting and
> > advise, as it appears to log in successfully to the server.
> >
> >    Could not open SAN device: Input/output error (http://ipxe.org/1d714039)
> >    srp boot failedCMRC 0x23354 shutting down
> >    Arbel 0x215b4 issuing command 0021
> >    CMRC 0x23354 send error: Operation canceled (http://ipxe.org/0b1360a0)
>
> Login is handled via a pair of management datagrams.  The RC queue pair is
> not used until the first packet after login completes.  Your client.log
> shows that the RC queue pair is transitioning into an error state as soon
> as the first send WQE is posted:
>
>  QPN 0xd75405 context before doorbell:
>  Arbel 0x215b4 issuing command 0022
>  Arbel 0x215b4 QPN 0xd75405 context:
>  00000000 : 30 00 19 00 ff 3e 3f 16-9f 0a 13 00 00 00 00 01
>  ...
>  QPN 0xd75405 context after doorbell:
>  Arbel 0x215b4 issuing command 0022
>  Arbel 0x215b4 QPN 0xd75405 context:
>  00000000 : 60 00 19 00 f8 3c 28 14-9f 0a 13 00 00 00 00 01
>  ...
>
> The first nibble of this hex dump is the queue state - 3="ready to send",
> 6="error".
>
> Something must be wrong with either the queue pair context or the send WQE
> created by iPXE for Arbel.  I examined both in excruciating detail the
> last time I worked on this, and I couldn't find any problem.
>
>
> Itay: is there a debug version of the firmware available that would provide
> some indication of why the QP is transitioning to ERR?
>
> Michael
>
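
For anyone checking their own client.log, here is a rough sketch of reading
the queue state out of the first line of a QP context dump, as described in
the quoted message above. It is not part of iPXE; the function names and the
QP_STATES table are made up for illustration, and only the two state values
named in this thread (3 and 6) are mapped. Anything else is reported as
unknown rather than guessed.

```python
import re

# Only the two states named in this thread are mapped (assumption: other
# values exist but are not described here, so they are left unknown).
QP_STATES = {0x3: "ready to send", 0x6: "error"}


def first_nibble(dump_line: str) -> int:
    """Return the high nibble of the first data byte in a hex dump line."""
    # Drop the "00000000 :" offset prefix, keeping only the data bytes.
    _, _, data = dump_line.partition(":")
    # Bytes are separated by spaces, with a '-' in the middle of the row.
    tokens = re.split(r"[\s-]+", data.strip())
    return int(tokens[0], 16) >> 4


def qp_state(dump_line: str) -> str:
    """Map the first nibble of a QP context dump line to a state name."""
    nibble = first_nibble(dump_line)
    return QP_STATES.get(nibble, f"unknown (0x{nibble:x})")


# The two dump lines from the log, before and after the doorbell.
before = "00000000 : 30 00 19 00 ff 3e 3f 16-9f 0a 13 00 00 00 00 01"
after = "00000000 : 60 00 19 00 f8 3c 28 14-9f 0a 13 00 00 00 00 01"

print("before doorbell:", qp_state(before))
print("after doorbell: ", qp_state(after))
```

Run against the two lines above, this reports the transition from "ready to
send" to "error" across the doorbell.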