Monday, May 30, 2011

InfiniBand switch - relocated Subnet Manager Master to another switch

I recently posted about the warning that sm_priority is not set to the recommended value of 5 on an InfiniBand switch. Thank you, everyone, for the comments, documents, and ideas.

I ignored this warning. Why?
Exadata Document:
Exadata Database Machine Full Racks and Oracle Exadata Database Machine X2-2 Half Racks have three Sun Datacenter InfiniBand Switch 36 switches. The switch at rack unit 1 (U1) is referred to as the spine switch. The switches at rack unit 20 (U20) and rack unit 24 (U24) in Oracle Exadata Database Machine X2-2 racks, or unit 21 (U21) and rack unit 23 (U23) in Oracle Exadata Database Machine X2-8 Full Racks are referred to as leaf switches. The spine switch is the Subnet Manager Master for the InfiniBand subnet. It has priority 8.
Sun Datacenter InfiniBand Switch 36 Topic Set:
By setting a Subnet Manager to a higher priority than another Subnet Manager, it becomes the primary (or Master) Subnet Manager.
InfiniBand switch 01 is the spine switch, and the spine switch is the Subnet Manager Master, so it should have the higher priority. The Exadata document agrees: the spine switch is the Subnet Manager Master for the InfiniBand subnet, and it has priority 8.
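To make the priority rule concrete, here is a minimal sketch of "highest priority wins". It is only an illustration, not how OpenSM is implemented. The helper name pick_master and the third switch name exasw-ib3 are hypothetical, the priorities are the 8 and 5 mentioned in this post, and ties between equal priorities are not modelled.

```shell
#!/bin/sh
# Illustration only: pick the Subnet Manager with the highest priority.
# Input: one "name priority" pair per line on stdin.
# pick_master is a hypothetical helper, not a switch command.
pick_master() {
    # Sort numerically on the priority column, highest first, keep the top name.
    sort -k2,2nr | head -n 1 | cut -d' ' -f1
}

# exasw-ib1 = spine (priority 8); the leaf priorities of 5 are assumed here.
printf '%s\n' 'exasw-ib1 8' 'exasw-ib2 5' 'exasw-ib3 5' | pick_master
```

With these values the spine switch (exasw-ib1) is selected, which is what the documents above say should happen.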

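However, I found something wrong, perhaps because I had tested many things on this Exadata: the Subnet Manager Master was running on a leaf switch (exasw-ib2), not on the spine switch.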
# getmaster
20110530 11:06:00 OpenSM Master on Switch : 0x0021286ccca9a0a0 ports 36 Sun DCS 36 QDR switch exasw-ib2 enhanced port 0 lid 4 lmc 0
So I relocated the Subnet Manager Master to another switch: log in to the leaf switch (exasw-ib2), then disable and re-enable the SM.
# ssh exasw-ib2

# disablesm
Stopping IB Subnet Manager.. [ OK ]

# enablesm
Starting IB Subnet Manager. [ OK ]

# getmaster
20110530 11:08:31 OpenSM Master on Switch : 0x0021286cd635a0a0 ports 36 Sun DCS 36 QDR switch exasw-ib1 enhanced port 0 lid 1 lmc 0
The master has been relocated to InfiniBand switch 01 (exasw-ib1).
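If you want to check this from a script, the switch name can be pulled out of the getmaster line. This is a rough sketch that assumes the output always has the hostname right after "QDR switch", as it does in the two outputs above. The function name master_name and the expected variable are my own, not part of the switch software.

```shell
#!/bin/sh
# Extract the switch hostname from one line of getmaster output.
# Assumption: the hostname is the word that follows "QDR switch".
master_name() {
    printf '%s\n' "$1" | sed -n 's/.*QDR switch \([^ ]*\).*/\1/p'
}

# Sample line copied from the getmaster output in this post.
line='20110530 11:08:31 OpenSM Master on Switch : 0x0021286cd635a0a0 ports 36 Sun DCS 36 QDR switch exasw-ib1 enhanced port 0 lid 1 lmc 0'
expected='exasw-ib1'   # the spine switch in this rack

actual=$(master_name "$line")
if [ "$actual" = "$expected" ]; then
    echo "OK: master is $actual"
else
    echo "WARNING: master is $actual, expected $expected"
fi
```

On a real switch you would feed it the live output instead of the sample line, for example line=$(getmaster).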
