NetApp FAS2650 node panic – ONTAP 9.1

After recently installing a new NetApp FAS2650, we hit a bug that caused one of the nodes to panic and restart to the LOADER-A prompt. After we engaged NetApp support, the following bug was identified:

A PCI NMI error triggers from QLogic 16Gb FC or 10 GbE Converged Network Adapter
(CNA) ports on some storage systems, such as FAS8200, FAS2650, FAS2620,
AFF A300, or AFF A200. The issue might continue to reoccur several times. This
issue only occurs when the port pair is configured in CNA mode.
An example of an error message is displayed, as follows:
PANIC : PCI Error NMI from device(s):RPT(0,3,3):QLogic FC/10GbE CNA on
Controller.

http://mysupport.netapp.com/NOW/cgi-bin/bol?Type=Detail&Display=1026931
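
If you have saved console output and want to check it for this panic signature, a small script can help. This is only a rough sketch: it assumes the panic message sits on a single line in the text you feed it, and the function name `find_cna_nmi_panics` is made up for illustration.

```python
import re

# Matches the panic text quoted in the bug description and captures
# everything after "device(s):" up to the end of the line.
PANIC_RE = re.compile(r"PCI Error NMI from device\(s\):\s*(?P<devices>.+?)\s*$")


def find_cna_nmi_panics(text):
    """Return the device strings of PCI NMI panics that mention a CNA."""
    hits = []
    for line in text.splitlines():
        match = PANIC_RE.search(line)
        if match and "CNA" in match.group("devices"):
            hits.append(match.group("devices"))
    return hits


if __name__ == "__main__":
    sample = ("PANIC : PCI Error NMI from device(s):RPT(0,3,3):"
              "QLogic FC/10GbE CNA on Controller.")
    for device in find_cna_nmi_panics(sample):
        print(device)
```

A hit only tells you the signature is present; matching it to the bug above is still a job for NetApp support.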

To resolve this, the following workaround is available:

Use the service processor to perform a system power cycle on the affected controller.

Reconfigure all unused CNA port pairs to Fibre Channel (FC) mode by running the commands below from the Service Processor:

  1. ucadmin show -node "Affected Node" -adapter *
  2. ucadmin modify -node * -adapter * -mode fc -type target

Make sure every CNA adapter is changed to FC mode to resolve the issue. Example below:

Before example:

*> ucadmin show

         Current  Current    Pending  Pending    Admin
Adapter  Mode     Type       Mode     Type       Status
-------  -------  ---------  -------  ---------  -------
0c       fc       target     -        -          offline
0d       fc       target     -        -          offline
0e       cna      target     fc       -          offline
0f       cna      target     fc       -          offline
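
A listing like the one above can also be checked with a short script. The sketch below is just an illustration: it assumes the six-column layout shown in this post, the helper names are my own, and the generated command simply mirrors the `ucadmin modify -m fc -t target` form used in this post.

```python
import re

# Sample output in the six-column layout shown in this post (assumed format).
SAMPLE = """\
Adapter  Mode     Type       Mode     Type       Status
0c       fc       target     -        -          offline
0d       fc       target     -        -          offline
0e       cna      target     fc       -          offline
0f       cna      target     fc       -          offline
"""

ADAPTER_NAME = re.compile(r"^\d+[a-z]$")  # e.g. 0c, 0d, 0e, 0f
COLUMNS = ("adapter", "mode", "type", "pending_mode", "pending_type", "status")


def parse_ucadmin_show(text):
    """Turn each adapter row into a dict; header and ruler lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and ADAPTER_NAME.match(fields[0]):
            rows.append(dict(zip(COLUMNS, fields)))
    return rows


def cna_adapters(rows):
    """Adapters whose current mode is still cna."""
    return [row["adapter"] for row in rows if row["mode"] == "cna"]


def modify_command(adapter):
    """Build the modify command in the form used in this post."""
    return f"ucadmin modify -m fc -t target {adapter}"


if __name__ == "__main__":
    for adapter in cna_adapters(parse_ucadmin_show(SAMPLE)):
        print(modify_command(adapter))
```

The script only prints commands for you to review; it does not talk to the controller.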

Example 2:

*> ucadmin modify -m fc -t target 0f
ucadmin modify: Mode on adapter 0f and also adapter 0e will be changed to fc.
Do you want to continue (y/n)? y

*> ucadmin show

         Current  Current    Pending  Pending    Admin
Adapter  Mode     Type       Mode     Type       Status
-------  -------  ---------  -------  ---------  -------
0c       fc       target     -        -          offline
0d       fc       target     -        -          offline
0e       cna      target     fc       -          offline
0f       cna      target     fc       -          offline

After the modify command has been applied:

::> ucadmin show

                       Current  Current    Pending  Pending    Admin
Node          Adapter  Mode     Type       Mode     Type       Status
------------  -------  -------  ---------  -------  ---------  -----------
Node01        0c       fc       target     -        -          online
Node01        0d       fc       target     -        -          online
Node01        0e       fc       target     -        -          online
Node01        0f       fc       target     -        -          offline
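
To double-check the result across nodes, the cluster-shell listing can be scanned for any adapter that is not yet in fc mode. Again, this is a minimal sketch under assumptions: it expects the seven-column layout shown above, node names without spaces, and a single-word status; `check_cluster_output` is a hypothetical helper name.

```python
import re

# Sample cluster-shell output in the seven-column layout shown above.
SAMPLE = """\
Node          Adapter  Mode     Type       Mode     Type       Status
Node01        0c       fc       target     -        -          online
Node01        0d       fc       target     -        -          online
Node01        0e       fc       target     -        -          online
Node01        0f       fc       target     -        -          offline
"""

ADAPTER_NAME = re.compile(r"^\d+[a-z]$")


def check_cluster_output(text):
    """Return (adapters not in fc mode, adapters not online) as (node, adapter) pairs."""
    not_fc, not_online = [], []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, rulers, and anything that is not an adapter row.
        if len(fields) != 7 or not ADAPTER_NAME.match(fields[1]):
            continue
        node, adapter, mode, _type, _pending_mode, _pending_type, status = fields
        if mode != "fc":
            not_fc.append((node, adapter))
        if status != "online":
            not_online.append((node, adapter))
    return not_fc, not_online


if __name__ == "__main__":
    not_fc, not_online = check_cluster_output(SAMPLE)
    print("Adapters not in fc mode:", not_fc)
    print("Adapters not online:", not_online)
```

An empty first list means every listed adapter reports fc as its current mode; the second list is informational only, since an unused port may legitimately be offline.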

 

Caveat: the information in this how-to guide is provided as is; neither I nor my employer accept any responsibility for the outcome of following it.

 

8 thoughts on “NetApp FAS2650 node panic – ONTAP 9.1”


  1. Hi,

    Thanks for this post. I was looking for more information about this bug; I read the bug article, but it does not have enough info.

    I have posted my queries on the forum, just trying to get more info:
    https://community.netapp.com/t5/Data-ONTAP-Discussions/A-PCI-error-triggered-from-a-memory-error-on-the-DRAM-component-of-Converged/m-p/134652#M29514

    As you have experienced this first hand, can I ask you this query:
    Are you only doing FC in your current setup? If I am doing CIFS using CNA mode, then the workaround does not apply to me, does it?

    Kind regards,
    -Ashwin


    1. Hello ash,

      Yes, this workaround changed the unused CNA ports to FC to prevent the panic. If you’re using CIFS or iSCSI on your CNA ports then this workaround will not help you, unless your unused ports are still set to CNA.

      In my scenario we were using both iSCSI and CIFS, but had two unused ports set to CNA. Since these were changed to FC we’ve not had a panic.

      What version of ONTAP are you running? 9.2 is out now, have you tried upgrading?

      You can mail me at mail@carlmcdade.com if you’ve any further info

      Cheers


    1. Under Network, FC/FCoE Adapters, all of them for the AFF-A300 show as “offlined by user/system” and the ones for the FAS-2650 just show “link not connected.” I have no idea whether that means anything, or whether it is different from how they looked before the bug was triggered. Still no word from NetApp. Given what is going on for them this week, and where, I expect things are a bit slow internally for their folks.
