Firmware Upgrade to version 12.12.0-0098 or 12.12.0-0111 causes virtual disk loss - IBM ServeRAID M5000 Series



Source

RETAIN tip: H206542

Symptom

If the IBM ServeRAID M5014, M5015, or M5025 SAS/SATA Controller is upgraded from firmware 12.0.1-0097, or an earlier level, to firmware version 12.12.0-0098 or 12.12.0-0111, users might see the following message at Power On Self Test (POST):

All of the disks from your previous configuration are gone. If this is an unexpected message, then please power off your system and check your cables to ensure all disks are present. Please press any key to continue, or 'C' to load the configuration utility.

If the user then enters the configuration utility, all the disks will show as 'Unconfigured Good.'

Affected configurations

The system can be any of the following IBM servers:

  • System x3100 M4, type 2582, any model
  • System x3200 M3, type 7327, any model
  • System x3200 M3, type 7328, any model
  • System x3250 M3, type 4251, any model
  • System x3250 M3, type 4252, any model
  • System x3250 M3, type 4261, any model
  • System x3400 M2, type 7836, any model
  • System x3400 M2, type 7837, any model
  • System x3500 M2, type 7839, any model
  • System x3500 M3, type 7380, any model
  • System x3550 M2, type 7946, any model
  • System x3550 M3, type 4254, any model
  • System x3550 M3, type 7944, any model
  • System x3620 M3, type 7376, any model
  • System x3630 M3, type 7377, any model
  • System x3650 M2, type 7947, any model
  • System x3650 M3, type 4255, any model
  • System x3650 M3, type 7945, any model
  • System x3690 X5, type 7147, any model
  • System x3690 X5, type 7148, any model
  • System x3690 X5, type 7149, any model
  • System x3690 X5, type 7192, any model
  • System x3850 X5, type 7143, any model
  • System x3850 X5, type 7145, any model
  • System x3850 X5, type 7146, any model
  • System x3850 X5, type 7191, any model
  • iDataPlex dx360 M3 Server, type 6391, any model

The system is configured with one or more of the following IBM options:

  • ServeRAID M5014 SAS/SATA Controller, option part number 46M0916, any replacement part number (CRU)
  • ServeRAID M5015 SAS/SATA Controller, option part number 46M0829, any replacement part number (CRU)
  • ServeRAID M5025 SAS/SATA Controller, option part number 46M0830, any replacement part number (CRU)

This tip is not software specific.

The 12.12.0-0098 and 12.12.0-0111 firmware for the ServeRAID M5000 Series SAS/SATA controller is affected.

The system has the symptom described above.

Solution

This issue is resolved in the ServeRAID M5000 Series SAS/SATA firmware update version 12.12.0-0126, or later levels.

The update is now available by selecting the appropriate Product Group, type of System, Product name, Product machine type, and Operating system on IBM Support's Fix Central web page, at the following URL:

Workaround

If users encounter the symptom and message, the configuration will need to be recreated to retrieve data. To recreate the configuration, perform the following steps:

Note: Knowledge of previous configuration is required.

  1. Press C and then Y to enter the configuration utility.
  2. Select the appropriate controller then click Start.
  3. Select Configuration Wizard.
  4. Select Add Configuration then click Next.
  5. Select Virtual Drive Configuration and then Next.
  6. Select Manual Configuration then click Next.
  7. Select the first drive that was in the previous array and choose Add to Array. Continue until all drives from the previous array have been added to the Drive Group. Make sure that drives are added in the same order as they were before. After all drives have been added, select Accept Disk Group and then click Next.
  8. Select Add to Span then click Next.
  9. Verify that the Redundant Array of Independent Disks (RAID) level and size are correct, and all other options are identical to the previous array. When verified, select Accept.
  10. If Write-Back was selected, select Yes at the warning. If Write-Back was not selected, skip this step.
  11. Select Next and then Accept, and Yes to save the configuration on the next screen.
  12. Select Cancel if presented with a Solid-State Drive (SSD) Caching screen, unless applicable.
  13. IMPORTANT: Select No when asked if users want to initialize. Selecting Yes will result in data loss.
  14. Restart the system.

After the data has been retrieved from the recovered array, the array should be recreated again. This is because the bad stripe table has been lost and must be rebuilt from scratch.

If the symptom message was not displayed after updating from version 12.0.1-0097, or an earlier level, to 12.12.0-0098 or 12.12.0-0111:

Immediately flash the firmware to level 12.12.0-0126. Only the firmware needs to be flashed; the system itself does not require an immediate power cycle. The system remains exposed to the issue while it is at level 12.12.0-0098 or 12.12.0-0111.

Note: Systems with firmware 12.0.1-0097, or earlier levels, can be flashed safely up to level 12.12.0-0126.
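
The version rules in this tip can be sketched as a small check. This is a minimal illustration, not an IBM tool: the function names and the category labels are hypothetical, and it assumes that firmware levels sort in release order when their numeric fields are compared left to right, which the tip implies but does not state.

```python
import re

# Firmware levels named in this tip.
AFFECTED = {"12.12.0-0098", "12.12.0-0111"}
FIXED = "12.12.0-0126"


def parse_level(level):
    """Turn a level such as '12.12.0-0126' into a comparable tuple of ints."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(\d+)", level.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware level: {level!r}")
    return tuple(int(part) for part in match.groups())


def classify(level):
    """Classify a level as 'affected', 'fixed', or 'pre-fix' (hypothetical labels)."""
    if level.strip() in AFFECTED:
        return "affected"   # flash to the fixed level immediately
    if parse_level(level) >= parse_level(FIXED):
        return "fixed"      # already at 12.12.0-0126 or later
    return "pre-fix"        # older level; flash directly to the fixed level


if __name__ == "__main__":
    for level in ("12.0.1-0097", "12.12.0-0111", "12.12.0-0126"):
        print(level, classify(level))
```

A level reported as "pre-fix" or "affected" leads to the same action in this tip: flash to 12.12.0-0126 or a later level rather than to 12.12.0-0098 or 12.12.0-0111.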

Additional information

The issue stems from the firmware upgrade operation where users are upgrading IBM ServeRAID M5000 Series code from a very old firmware package to the latest firmware package.

The RAID configuration was created under firmware 12.0.1-0097, or an earlier level, and the controller was later upgraded to 12.12.0-0098 or 12.12.0-0111, using the required interim 12.12.0-0085 firmware upgrade.

Firmware 12.0.1-0097 does not support CacheCade, and therefore the reserved bits that later versions of firmware will use for CacheCade are set by default to '0xFF' in the metadata. After users upgrade to 12.12.0-0098 or 12.12.0-0111, the next metadata read (such as startup after restart) recognizes the '0xFF' as valid for a non-CacheCade volume and behaves properly.

However, if a logical disk state change occurs (such as consistency check, cache properties change, learn cycle, or rebuild), then the firmware updates the metadata, but makes an improperly qualified decision also to set the CacheCade flags to 'enabled' and 'cacheable.'

At the next restart, the firmware will read and consolidate the drive metadata. As it is interpreting the CacheCade information, it now finds '0xFD' instead of the previous '0xFF.' This tells firmware that CacheCade is enabled and that the CacheCade record contains valid data.

However, since the metadata originally was created by level 12.0.1-0097, the area that is now the CacheCade record is not initialized properly, and the validity check of this information will fail. This results in the volume being lost.

The following operations are examples of events that cause the firmware to update the metadata:

 
  • Consistency Check
  • Drive pull
  • Reconstruction
  • Write Policy change
  • Full initialization of the Array
  • Any Virtual Disk property change
  • Rebuild
  • Copy-back
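
The failure sequence described above can be modelled in a few lines. This is a toy sketch only: the values 0xFF and 0xFD come from this tip, but the dictionary layout, the function names, the record signature, and the 0xFF fill used for the uninitialized record are all hypothetical stand-ins, since the real metadata format is not described here.

```python
NON_CACHECADE = 0xFF     # default written by 12.0.1-0097 (from this tip)
AFTER_UPDATE = 0xFD      # value found after a state change (from this tip)
RECORD_MAGIC = b"CCAD"   # hypothetical signature of a valid CacheCade record


def new_metadata():
    """Metadata as old firmware wrote it: flags unused, record never initialized."""
    return {"flags": NON_CACHECADE, "record": b"\xff" * 4}


def state_change(meta):
    """Model the affected firmware rewriting metadata and wrongly setting the flags."""
    meta["flags"] = AFTER_UPDATE
    return meta


def import_at_boot(meta):
    """Return True if the virtual disk is imported, False if it is lost."""
    if meta["flags"] == NON_CACHECADE:
        return True                          # treated as a non-CacheCade volume
    return meta["record"] == RECORD_MAGIC    # stand-in for the validity check


if __name__ == "__main__":
    meta = new_metadata()
    print("first restart after upgrade:", import_at_boot(meta))
    state_change(meta)   # for example, a consistency check or a rebuild
    print("restart after state change:", import_at_boot(meta))
```

In this model the first restart imports the volume, and the restart that follows any state change does not, which matches the order of events in the explanation: the upgrade itself looks harmless, and the loss appears only at a later restart.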

The ServeRAID M5000 series firmware version 12.12.0-0126 will allow users to flash from firmware 12.0.1-0097, or earlier versions, to a later version without experiencing the described symptom.



Document id:  MIGR-5091589
Last modified:  2013-04-19
Copyright © 2014 IBM Corporation