Computer systems are complex devices, and the more complex a system is, the higher the probability of a sudden breakdown. Servers, and RAID arrays in particular, are one example of such complex systems. The main cause of array failure is still the human factor. Data storage systems require constant monitoring and maintenance, which system administrators often neglect. Every freeze or drop in performance should be analyzed. In practice, however, administrators often hope that such problems will go away by themselves, which as a rule leads to serious breakdowns, or even to the need to replace all of the drives.
Another common cause of RAID malfunction is insufficient cooling. As a rule, ignoring this problem causes one or two hard drives to fail at the same time, and recovering them is not always possible. It is also common for several different drives to be combined into one RAID array even though some of them are not designed to work in such conditions. For example, when HDDs with different read/write speeds are used, synchronization losses are almost impossible to avoid, which slows down the entire system and produces a huge number of errors.
Array controller failures are not the most common problem, but they do occur. They can happen for a variety of reasons, ranging from physical or logical incompatibility of the equipment to failure of the controller itself. Problems with the hard drives combined into the array are especially unpleasant. Such damage is rarely noticeable at first sight, but it leads to a huge number of errors that eventually “kill” the entire system.
Causes of damage to RAID arrays
There are two main reasons why RAID arrays fail.
- The most common cause is a careless attitude among system administrators toward the first signs of failure. During RAID-5 operation, one of the disks may fail and the array will continue to function, with a noticeable drop in performance. Some administrators are then in no hurry to act, hoping that the remaining hard drives will keep working for some time. This assumption often turns out to be a mistake, because RAID-5 tolerates the loss of only one disk.
- Another common reason is several hard disks going offline at the same time. This is caused by the accumulation of bad blocks. As long as the number of bad blocks on a disk stays below a certain value, the disk keeps working, but once there are too many, the disk goes offline and the array stops running. Despite its offline status, such a disk spins up with a normal sound and is detected correctly by the controller. It is offline because the controller cannot read the necessary data from it, or because the controller has marked it as failed.
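Why a RAID-5 array survives one missing disk but not two can be shown with a small sketch. It assumes plain XOR parity over a single stripe; the function names are invented for illustration and do not correspond to any controller's firmware or API.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return the data blocks plus one parity block (one RAID-5 stripe)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def read_stripe(stripe):
    """Return the stripe with a single missing block (None) rebuilt from the rest."""
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) > 1:
        raise IOError(f"{len(missing)} blocks missing: stripe cannot be rebuilt")
    stripe = list(stripe)
    if missing:
        survivors = [block for block in stripe if block is not None]
        stripe[missing[0]] = xor_blocks(survivors)
    return stripe

stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])

# One disk lost: every read must recompute the missing block from all survivors,
# which is where the drop in performance comes from.
degraded = [stripe[0], None, stripe[2], stripe[3]]
print(read_stripe(degraded)[1])  # b'BBBB'
```

Passing a stripe with two `None` entries raises `IOError`, which is the situation an administrator risks by leaving a degraded array running.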
Malfunctions in RAID arrays are more typical of cheap controllers, but problems can also arise with expensive hardware.
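The bad-block threshold from the second cause above can be modelled roughly as follows. This is only a sketch: the limit of 8 and the names `Disk` and `raid5_state` are hypothetical, and real controllers apply their own vendor-specific rules.

```python
from dataclasses import dataclass

BAD_BLOCK_LIMIT = 8  # hypothetical threshold, not taken from any real controller

@dataclass
class Disk:
    name: str
    bad_blocks: int = 0

    @property
    def online(self) -> bool:
        # The disk stays online until its bad-block count exceeds the limit.
        return self.bad_blocks <= BAD_BLOCK_LIMIT

def raid5_state(disks):
    """Classify a RAID-5 array by how many member disks are offline."""
    offline = sum(1 for disk in disks if not disk.online)
    if offline == 0:
        return "optimal"
    if offline == 1:
        return "degraded"
    return "failed"

disks = [Disk("d0", 2), Disk("d1", 8), Disk("d2", 7), Disk("d3", 0)]
print(raid5_state(disks))  # optimal

disks[1].bad_blocks += 1   # d1 crosses the limit
print(raid5_state(disks))  # degraded

disks[2].bad_blocks += 2   # d2 crosses it as well
print(raid5_state(disks))  # failed
```

Two disks that have been collecting bad blocks at a similar rate can cross the limit almost together, which is how an array can go from working to stopped with no long degraded period in between.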
Important! If one of the disks fails, it is recommended to back up important information immediately, so that all data is recovered from the array first. Only then should you replace the failed drive and rebuild the array. A preliminary backup is necessary because the rebuild process sometimes hangs. This usually happens when a bad block is detected on a disk while data is being read or written (the controller cannot read the information from the sector).
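The reason for backing up first can be illustrated with a simplified model, assuming the process in question is the rebuild onto the replacement drive and that parity is plain XOR. The helper names are invented, and where a real controller might hang, this sketch raises an error instead.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

def rebuild_disk(surviving_disks):
    """Rebuild a replaced disk stripe by stripe; None marks an unreadable sector."""
    rebuilt = []
    for stripe_no, blocks in enumerate(zip(*surviving_disks)):
        if any(block is None for block in blocks):
            raise IOError(f"stripe {stripe_no}: unreadable sector on a surviving disk")
        rebuilt.append(xor_blocks(blocks))
    return rebuilt

disk0 = [b"\x01\x02", b"\x03\x04"]
disk1 = [b"\x10\x20", b"\x30\x40"]
parity = [xor_blocks(pair) for pair in zip(disk0, disk1)]

# disk1 has failed and been replaced: its contents come back from disk0 and parity.
assert rebuild_disk([disk0, parity]) == disk1

# The same rebuild when disk0 has an unreadable sector in stripe 1.
damaged_disk0 = [disk0[0], None]
try:
    rebuild_disk([damaged_disk0, parity])
except IOError as err:
    print(err)  # stripe 1: unreadable sector on a surviving disk
```

With one disk already gone there is no redundancy left, so a single unreadable sector on any surviving disk makes that stripe unrecoverable. A backup taken beforehand is the only other copy of the data.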
Regardless of the type of malfunction, you should not try to solve the problem yourself. It is better to contact a specialized RAID recovery service; such services are easy to find through the many advertisements along the lines of “restore the RAID array”. Remember that intervention by unqualified people can have serious consequences and cost time, nerves and money.