SNO 2U / 5U controller replacement

Robert Leong -

Best practice is to move the M2 SSD from the defective controller to the replacement controller.

SpycerNode Controller Replacement with M2-Disk exchange:

http://www.dvsus.com/gold/SNO/SpycerNode_controller_replacement_with_M.2_SSD_exchange.pdf

  • The IPMI address is not transferred; it must be reconfigured with ipmitool after the swap.
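
As a rough sketch of that reconfiguration, a static IPMI address could be reapplied with `ipmitool lan set`. The LAN channel number (1) and the example addresses are assumptions, not values from this article; check the channel with `ipmitool lan print` first. The helper names `run` and `set_ipmi_lan` are hypothetical, and the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: reapply a static IPMI LAN configuration after a controller swap.
# Assumptions (not from the article): LAN channel 1, example addresses.
# With DRY_RUN=1 (the default) commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: set_ipmi_lan <channel> <ip> <netmask> <gateway>
set_ipmi_lan() {
    channel=$1; ip=$2; netmask=$3; gateway=$4
    run ipmitool lan set "$channel" ipsrc static
    run ipmitool lan set "$channel" ipaddr "$ip"
    run ipmitool lan set "$channel" netmask "$netmask"
    run ipmitool lan set "$channel" defgw ipaddr "$gateway"
}
```

For example, `set_ipmi_lan 1 192.0.2.10 255.255.255.0 192.0.2.1` prints the four commands for review; running it again with DRY_RUN=0 applies them, and `ipmitool lan print 1` shows the result.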

Other related information.

Source Jam post:
https://rohde-schwarz.sapjam.com/articles/HBc93Bmg8bSZXVFVLvXwYA

The development team is working on a fully automated solution for controller restore.
In the meantime, there are a couple of workarounds for controller replacement:

IF M2 NVME SSD is still working (Easy/Most Cases):

The easiest way to restore a controller is to swap the M2 NVMe SSD.

https://dvsus.zendesk.com/hc/article_attachments/4406835883284/SpycerNode_controller_replacement_with_M.2_SSD_exchange.pdf

After that, boot into the system and apply the following firmware updates and settings:

  1. Set the Mellanox card to ETH mode:
    • /opt/rohde-schwarz/fbms-setup/scriptlet/set_mellanox_to_eth.sh
  2. Apply the Seagate firmware update by running:
    • /opt/seagate/uut/uut --skip "Update BIOS Settings" -v -d -y
    • /opt/seagate/sbu/sbu -s -f /opt/rohde-schwarz/SpycerNode/cfg/sno-bios-settings-v1.xml
  3. Flash the LSI SAS HBA controller (built-in component of the controller):
    • sas3flash -f /opt/seagate/3rd_party_firmware/lsi/1015164_LagunaSeca_SAS3008_phase16_01_90sec_noBIOS.fw
  4. Reconfigure the software network settings with:
    • fbms-setup.sh -c <network_interface>
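
The four steps above could be chained in a small wrapper so they always run in the same order and stop at the first failure. This is only a sketch: the function names `run` and `post_swap_steps` are hypothetical, the commands and paths are the ones listed above, and the interface name is whatever the operator passes in. By default it only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch: run the post-swap steps from the list above in order.
# DRY_RUN=1 (the default) prints each command instead of executing it;
# note that the printed form does not show the original quoting.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@" || { echo "failed: $*" >&2; return 1; }
    fi
}

# usage: post_swap_steps <network_interface>
post_swap_steps() {
    iface=$1
    [ -n "$iface" ] || { echo "usage: post_swap_steps <network_interface>" >&2; return 2; }

    # 1. Mellanox card to ETH mode
    run /opt/rohde-schwarz/fbms-setup/scriptlet/set_mellanox_to_eth.sh || return 1
    # 2. Seagate firmware update, then BIOS settings
    run /opt/seagate/uut/uut --skip "Update BIOS Settings" -v -d -y || return 1
    run /opt/seagate/sbu/sbu -s -f /opt/rohde-schwarz/SpycerNode/cfg/sno-bios-settings-v1.xml || return 1
    # 3. LSI SAS HBA firmware
    run sas3flash -f /opt/seagate/3rd_party_firmware/lsi/1015164_LagunaSeca_SAS3008_phase16_01_90sec_noBIOS.fw || return 1
    # 4. Software network settings
    run fbms-setup.sh -c "$iface" || return 1
}
```

Calling `post_swap_steps eth0` lists the five commands for review; setting DRY_RUN=0 executes them.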

Else (Complicated):

If the M2 NVMe SSD no longer works, the procedure is more involved:

  • Connect a monitor and keyboard
  • Boot from the RuS-Rescue thumb drive
  • Press Esc during boot and select the USB thumb drive
  • Leave the RuS-Rescue menu with "0"
  • Run lsblk on both controllers and identify the system disk (for example, sdw on the working controller and sdx on the failed one)
  • Identify the IP address of the working controller
  • Identify the IP address of the failed controller:
    • ifdown eth0 (repeat for eth1 through eth4)
    • ifup eth0
    • avahi-autoipd eth0
    • ifconfig eth0
    • The address shown is, for example, 169.254.1.1

  • On the replacement controller run:
    • nc -l 2222 > /dev/sdx
  • On the working controller run:
    • nc 169.254.1.1 2222 < /dev/sdw
  • Run all of the following steps on the replacement controller
  • Create a directory:
    • mkdir /target/
  • Mount the system1 partition:
    • Fsck …
    • mount /dev/sdx2 /target/ (replace the device name if necessary)
  • Delete everything:
    • rm -rf /target/*
  • Copy the root filesystem from the working controller
  • Restore the backup config:
    • rsync -aAXvPx root@169.254.9.18:/vol/config-bak/ /target/
  • umount /target/
  • reboot
  • Follow the instructions from "IF M2 NVME SSD is still working (Easy/Most Cases)" above
  • Follow the instructions in "SNO_RAID Setup and Administration Guide", Section 6.4, "Replace a failed NVR system disk partitions"
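
Because the nc transfer in the steps above overwrites the target disk, a couple of sanity checks beforehand may be worthwhile. The following is a minimal sketch with hypothetical helper names (`is_link_local`, `target_fits`); it assumes the address was assigned by avahi-autoipd (so it lies in 169.254.0.0/16) and that the disk sizes in bytes are read separately, for example with `blockdev --getsize64`, which may or may not be present on the rescue system.

```shell
#!/bin/sh
# Sketch: sanity checks before cloning the system disk over nc.
# Helper names are hypothetical; device names and addresses follow the
# examples in the steps above.

# Succeeds if the address looks like an IPv4 link-local address
# (169.254.0.0/16), the range avahi-autoipd assigns from.
# This is a loose prefix check, not a full address validation.
is_link_local() {
    case $1 in
        169.254.*.*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Succeeds if the target disk is at least as large as the source disk.
# Both sizes are in bytes, e.g. from: blockdev --getsize64 /dev/sdw
target_fits() {
    src_bytes=$1; dst_bytes=$2
    [ "$dst_bytes" -ge "$src_bytes" ]
}
```

For example, `is_link_local 169.254.1.1 && echo ok` confirms the address before it is used in the nc command, and `target_fits <source_bytes> <target_bytes>` can guard the `nc -l 2222 > /dev/sdx` step against a target disk that is too small.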
