Posted by sHaggY_caT, 02-Jun-10 11:44
This is what I was talking about (from the beginning of the manual Maintenance Best Practices for Adaptec RAID Solutions):

Run regular consistency checks on the system:
Verification is designed to proactively detect hard disk media defects while the array is online and
redundant. A RAID-5 or RAID-6 array is inconsistent when the data and parity do not match. Likewise, a
RAID-1 array is inconsistent when the data and mirror do not match.
The verification process issues commands to each drive in the array to test all sectors. When a bad
sector is found, the RAID controller instructs the hard drive to reassign the bad sector, and then
reconstructs the data using the other drives. The affected hard drive then writes data to the newly
assigned good sector. These operations continue so that all sectors of each configured drive are
checked, including hot spares. As a result, bad sectors can be remapped before data loss occurs.
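To make the "data and parity do not match" part concrete, here is a minimal sketch of a consistency check over single-parity stripes, assuming plain XOR parity as in RAID-5. The names `xor_blocks` and `find_inconsistent_stripes` and the `(data_blocks, parity)` layout are made up for illustration; this is not how the Adaptec firmware is implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def find_inconsistent_stripes(stripes):
    """Return indices of stripes whose stored parity does not match the data.

    Each stripe is a (data_blocks, parity) pair; this layout is hypothetical.
    """
    return [
        index
        for index, (data_blocks, parity) in enumerate(stripes)
        if xor_blocks(data_blocks) != parity
    ]


# Two data blocks per stripe; the parity of the second stripe is deliberately wrong.
good = ([b"\x01\x02", b"\x04\x08"], b"\x05\x0a")
bad = ([b"\x01\x02", b"\x04\x08"], b"\x00\x00")
print(find_inconsistent_stripes([good, bad, good]))  # prints [1]
```

A real controller also reads every sector of every drive during verification, which is what exposes bad sectors; the sketch only covers the parity comparison.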

Summary: a consistency check run from, say, cron is the way to go.

And just above that, the same document says:

Hard drive media defects and other drive quality issues have steadily improved over time, even as drive sizes
have grown substantially. However, hard drives are not expected to be totally free of flaws. In addition, normal
wear on a drive may result in an increase in media defects, or “grown defects,” over time. The data block
containing the defect becomes unusable and must be “remapped” to another location on the drive. If a bad
block is encountered during a normal write operation, the controller marks that block as bad and the block is
added to the “grown defects list” in the drive’s NVRAM. That write operation is not complete until the data is
properly written in a remapped location. When a bad block is encountered during a normal read operation, the
controller will reconstruct the missing data from parity operations and remap the data to the new location. A
condition known as a double fault (“bad stripe”) occurs when a RAID controller encounters a bad block on a
drive in a RAID volume and then encounters an additional bad block on another hard drive in the same data
stripe. This double fault scenario can also occur while rebuilding a degraded array, leaving the controller with
insufficient parity information to reconstruct the data stripe. The end result is a rebuild failure with the loss of
any data in that stripe, assuming the stripe is in the user data area.
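The read-path reconstruction and the double fault described above can be sketched roughly like this, again assuming single XOR parity (RAID-5 style; RAID-6 keeps a second parity block and tolerates two failures). `reconstruct_stripe` and `DoubleFaultError` are hypothetical names, `None` stands for a block the drive could not read, and a stripe is assumed to hold at least two blocks.

```python
class DoubleFaultError(Exception):
    """Raised when a stripe has more unreadable blocks than parity can cover."""


def reconstruct_stripe(blocks):
    """Rebuild a single-parity stripe; None marks an unreadable block.

    `blocks` holds the data blocks plus the parity block of one stripe.
    """
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise DoubleFaultError(f"blocks {missing} unreadable in the same stripe")
    if not missing:
        return list(blocks)
    # XOR of all surviving blocks (data and parity) yields the lost block.
    survivors = [block for block in blocks if block is not None]
    rebuilt = bytes(len(survivors[0]))
    for block in survivors:
        rebuilt = bytes(x ^ y for x, y in zip(rebuilt, block))
    repaired = list(blocks)
    repaired[missing[0]] = rebuilt
    return repaired


stripe = [b"\x01\x02", b"\x04\x08", b"\x05\x0a"]  # two data blocks + parity
print(reconstruct_stripe([b"\x01\x02", None, b"\x05\x0a"]) == stripe)  # prints True
```

With a second `None` in the same stripe the function raises `DoubleFaultError`, which corresponds to the "bad stripe" case: nothing is left to rebuild from.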

In other words, the controller can both reconstruct the data and remap the bad sector.

As for the finer points of how raidz or, say, raid1e, raid5, or raid6 does this: those are just details, the pros and cons of particular implementations. They do not let you give up backups or monitoring of whether the applications are _actually working_. They only add a few points to the overall stability of the service, not a guarantee that the data is safe.

As for my behavior, I will say it again: it really was out of line.


