
What is RAID and How Does it Work?

Redundant Array of Independent Disks, commonly known as RAID, is a technology used to store data across multiple hard drives in a way that provides fault tolerance and/or increased performance. RAID is commonly used in enterprise environments, as well as by individuals who want to protect their valuable data from hard drive failures. There are several different RAID levels, each with its own unique combination of fault tolerance, performance, and storage capacity. Some RAID levels provide redundancy by mirroring data across multiple disks, while others use parity data to reconstruct lost data in the event of a disk failure.

RAID combines multiple physical disks into a single logical unit, using either dedicated hardware or software. Hardware RAID comes in several forms: some controllers are built into motherboards, while others are part of enterprise Network Attached Storage (NAS) or Storage Area Network (SAN) systems. RAID is most commonly implemented on servers but can also be used on workstations, typically for applications that demand high storage capacity and fast data transfer.

RAID works by distributing data across multiple disks so that input/output operations can overlap in a balanced way, which improves performance. Adding disks also adds more components that can fail, so storing data redundantly is what gives the array its fault tolerance. A RAID array usually appears to the operating system as a single logical drive, and it relies on techniques such as disk mirroring and disk striping.

Years ago, RAID configurations such as RAID 5 and RAID 6 were considered to carry a performance penalty because of the extra work the CPU and other resources of the NAS had to do to keep them running smoothly and safely.

However, thanks to improvements in NAS CPUs and the software running on them, RAID 5 and RAID 6 no longer suffer anywhere near the performance loss they once did. In fact, they can offer real benefits, because a RAID array spreads read and write activity across multiple disks at once (as opposed to a single drive being accessed in a one-drive setup).

Different RAID configurations offer different benefits. RAID 0, for example, is the fastest but has no redundancy or safety net at all.
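As a rough illustration of these trade-offs, the sketch below estimates usable capacity and the number of disk failures an array is guaranteed to survive for a few common levels. The function name `raid_summary` is hypothetical, and the figures assume identical disks and ignore metadata and filesystem overhead, so treat it as a back-of-the-envelope aid and not a sizing tool.

```python
def raid_summary(level: str, disks: int, disk_size_tb: float) -> tuple[float, int]:
    """Return (usable capacity in TB, disk failures always survivable).

    Assumes identical disks and ignores metadata and filesystem overhead.
    """
    # level -> (minimum disks, disks' worth of usable space, guaranteed failures tolerated)
    rules = {
        "0": (2, disks, 0),        # striping only, no redundancy
        "1": (2, 1, disks - 1),    # every disk holds a full copy
        "5": (3, disks - 1, 1),    # one disk's worth of parity
        "6": (4, disks - 2, 2),    # two disks' worth of parity
        "10": (4, disks // 2, 1),  # striped mirrored pairs
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")
    minimum, usable_disks, tolerated = rules[level]
    if disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    if level == "10" and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return usable_disks * disk_size_tb, tolerated


# Compare the levels for an array of four 4 TB disks.
for level in ("0", "1", "5", "6", "10"):
    capacity, tolerated = raid_summary(level, disks=4, disk_size_tb=4.0)
    print(f"RAID {level}: {capacity} TB usable, survives {tolerated} failure(s)")
```

RAID 10 can survive more than one failure if the failed disks belong to different mirrored pairs; the sketch reports only the guaranteed minimum.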

RAID Levels

RAID devices use different configurations, called levels. (The acronym originally stood for "redundant array of inexpensive disks.") The original work that coined the term and introduced the RAID concept defined several numbered levels, and this numbering lets IT professionals easily tell RAID configurations apart. RAID levels are now grouped into three categories:

  • Standard RAID Levels
  • Nested RAID Levels
  • Non-Standard RAID Levels

Standard RAID Levels

  • RAID 0

RAID 0 stripes data across multiple disks that are combined into a single volume. This speeds up operations because data is read from and written to several disks simultaneously. A single file can then use the speed and capacity of all the drives in the array.

The drawback of RAID 0 is that it lacks redundancy. If a single disk is lost, all data in the array is lost. RAID 0 is therefore not advisable in a server environment. Still, it can be used where speed is vital and data loss would not cause significant harm, such as for a cache.
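The striping idea can be sketched in a few lines. The example below is a toy model, not how any particular controller works: the names `stripe` and `unstripe` and the 4-byte chunk size are assumptions chosen for illustration, and the "disks" are plain Python lists.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> list[list[bytes]]:
    """Split data into chunks and deal them round-robin across the disks."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the original data by reading the disks row by row."""
    chunks = []
    depth = max(len(disk) for disk in disks)
    for row in range(depth):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


data = b"ABCDEFGHIJKLMNOPQRST"
disks = stripe(data, num_disks=3)
assert unstripe(disks) == data

# Losing any one disk leaves gaps that nothing else in the array can fill.
disks[1] = []
assert unstripe(disks) != data
```

Because every disk holds a unique slice of every large file, there is no surviving copy to fall back on when one disk fails.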

  • RAID 1

RAID 1 uses mirroring, which stores complete copies of the data instead of splitting it up as RAID 0 does. The most common use case is a pair of identical disks, each holding an identical copy of all the data in the array.

The primary objective of RAID 1 is redundancy. If a drive is lost, the remaining drive or drives keep the array running, and the faulty drive can be replaced without downtime. RAID 1 also provides better read performance, because data can be read from any of the drives in the array. The downsides are slightly higher write latency, because data must be written to every drive in the array, and reduced capacity, because only the capacity of a single drive is available.
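The mirroring behavior described above can be modeled with a small class. This is a minimal sketch under stated assumptions: the `Mirror` class and its method names are hypothetical, each "disk" is a dictionary mapping block numbers to bytes, and real implementations handle concerns (partial writes, resynchronization under load) that are ignored here.

```python
class Mirror:
    """Toy RAID 1 array: every write goes to all healthy disks."""

    def __init__(self, num_disks: int = 2) -> None:
        self.disks: list[dict[int, bytes]] = [{} for _ in range(num_disks)]
        self.failed: set[int] = set()

    def _healthy(self) -> list[int]:
        return [i for i in range(len(self.disks)) if i not in self.failed]

    def write(self, block: int, data: bytes) -> None:
        # The write cost grows with the number of copies that must be updated.
        for i in self._healthy():
            self.disks[i][block] = data

    def read(self, block: int) -> bytes:
        # Any healthy disk can serve the read; this sketch takes the first one.
        healthy = self._healthy()
        if not healthy:
            raise IOError("all mirrors have failed")
        return self.disks[healthy[0]][block]

    def fail(self, disk: int) -> None:
        self.failed.add(disk)
        self.disks[disk] = {}

    def replace(self, disk: int) -> None:
        # Rebuild the new disk by copying from a surviving mirror.
        healthy = self._healthy()
        if not healthy:
            raise IOError("no surviving mirror to rebuild from")
        self.disks[disk] = dict(self.disks[healthy[0]])
        self.failed.discard(disk)


array = Mirror()
array.write(0, b"payroll")
array.fail(0)                      # first disk dies
assert array.read(0) == b"payroll" # the mirror still serves the data
array.replace(0)                   # swap in a new disk and rebuild
array.fail(1)                      # now the other disk dies
assert array.read(0) == b"payroll"
```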

  • RAID 2

RAID 2 is rarely used in practice. It stripes data at the bit level and uses a Hamming code for error correction. The disks in RAID 2 are synchronized by the controller so that they spin at the same angular orientation and reach their index points at the same time. As a result, RAID 2 generally cannot service multiple requests simultaneously. However, depending on the rate of the Hamming code, many spindles can operate in parallel to transfer data simultaneously, so very high data transfer rates are possible.

Because hard disk drives now implement their own internal error correction, the complexity of an external Hamming code offers little advantage over simple parity. For this reason, RAID 2 has rarely been implemented, and it is the only standard RAID level that is currently unused.
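To make the Hamming-code idea concrete, the sketch below uses the classic Hamming(7,4) code: four data bits are protected by three parity bits, and a single flipped bit can be located and corrected. Treating each of the seven bit positions as one "disk" is an illustrative assumption; the code parameters used by actual RAID 2 hardware may have differed, and the function names are hypothetical.

```python
def hamming74_encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword: list[int]) -> tuple[list[int], int]:
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # 1-based index of the bad bit, 0 if none
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position


codeword = hamming74_encode([1, 0, 1, 1])
codeword[4] ^= 1  # simulate a single-bit error on the fifth "disk"
data, position = hamming74_decode(codeword)
assert data == [1, 0, 1, 1]
assert position == 5
```

Simple parity, by contrast, needs only one extra bit per stripe but can recover data only when it already knows which disk failed, which is exactly the information a failed drive provides.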

  • RAID 3

RAID 3 uses byte-level striping with a dedicated parity disk. One characteristic of RAID 3 is that it generally cannot service multiple requests simultaneously. This is because any single block of data is spread across all members of the set and resides at the same physical location on each disk. Any input/output operation therefore requires activity on every disk and usually requires synchronized spindles.

For these reasons, RAID 3 suits applications that need the highest transfer rates for long sequential reads and writes. It performs poorly for applications that make small reads and writes at random disk locations.

  • RAID 4

RAID 4 uses block-level striping with a dedicated parity disk. This layout provides good random read performance, but random write performance is low because all parity data must be written to a single disk. This can be mitigated if the filesystem is RAID-4-aware and compensates for it.

One advantage of RAID 4 is that it can be quickly extended online. No parity recomputation is required, as long as the newly added disks are completely filled with 0-bytes.
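Both RAID 4 properties follow from the way XOR parity behaves, which the sketch below tries to show. The helper names `xor_blocks` and `small_write_parity` are hypothetical, and the two-byte blocks are made-up sample data.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks; this is the parity block."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def small_write_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Update parity after one data block changes, without reading the other disks."""
    return xor_blocks([old_parity, old_data, new_data])


data_disks = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data_disks)

# Extending the array with a zero-filled disk leaves the parity unchanged,
# because XOR with zero is a no-op.
expanded = data_disks + [bytes(2)]
assert xor_blocks(expanded) == parity

# Every small write touches the one parity disk, which is why that disk
# becomes the bottleneck for random writes.
new_block = b"\xff\x00"
updated = small_write_parity(parity, data_disks[1], new_block)
assert updated == xor_blocks([data_disks[0], new_block, data_disks[2]])
```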

  • RAID 5 and 6

RAID 5 and RAID 6 use similar techniques. RAID 5 requires at least three drives, while RAID 6 requires at least four. Both levels build on the idea of RAID 0 and stripe data across multiple drives to improve performance.

However, they also add redundancy by distributing parity information across the disks. RAID 5 can lose one disk and continue operating without interruption. RAID 6 can lose two disks and still maintain operations and data. Both levels provide good read performance, while write performance depends on the RAID controller in use.
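The following sketch illustrates distributed parity in the RAID 5 style: each stripe holds one XOR parity block, the parity position rotates from stripe to stripe, and the contents of any single failed disk can be recomputed from the survivors. The function names and the specific rotation order are assumptions made for illustration; real implementations use several different layouts, and RAID 6's second parity requires additional math that is not shown here.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def build_raid5(chunks: list[bytes], num_disks: int) -> list[list[bytes]]:
    """Arrange equal-size chunks into stripes with one rotating parity block each.

    Returns a list of stripes; stripe[i] is the block stored on disk i.
    """
    per_stripe = num_disks - 1
    stripes = []
    for index, start in enumerate(range(0, len(chunks), per_stripe)):
        data = chunks[start:start + per_stripe]
        while len(data) < per_stripe:          # pad the final stripe with zeros
            data.append(bytes(len(chunks[0])))
        parity_disk = (num_disks - 1 - index) % num_disks  # assumed rotation
        stripe = list(data)
        stripe.insert(parity_disk, xor_blocks(data))
        stripes.append(stripe)
    return stripes


def rebuild_disk(stripes: list[list[bytes]], failed: int) -> list[bytes]:
    """Recompute every block of the failed disk from the surviving disks."""
    return [
        xor_blocks([block for disk, block in enumerate(stripe) if disk != failed])
        for stripe in stripes
    ]


chunks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE", b"FFFF"]
stripes = build_raid5(chunks, num_disks=3)
for failed in range(3):
    assert rebuild_disk(stripes, failed) == [stripe[failed] for stripe in stripes]
```

Rotating the parity block spreads parity writes over all disks, which avoids the single parity-disk bottleneck of RAID 4.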

Because parity must be computed and written across the disks on every write, RAID 5 and 6 have traditionally been paired with a dedicated hardware controller, although software implementations also exist. They are suitable options for file servers, standard web servers, and other systems where most transactions are reads.

Nested RAID Levels

Nested RAID levels are obtained from the combination of standard RAID levels. Some examples of nested RAID levels are:

  • RAID 10 (RAID 1+0)

Combining RAID 1 and RAID 0 produces RAID 10. RAID 10 is more expensive than RAID 1 but offers better performance. The data in RAID 10 is mirrored, and the mirrors are then striped.
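A striped set of mirrors can be sketched as a simple mapping from logical block numbers to disk pairs. The disk numbering (disks 0 and 1 form the first pair, 2 and 3 the second, and so on) and the function names are assumptions for illustration only.

```python
def raid10_locate(block: int, num_disks: int) -> tuple[int, int]:
    """Return the two disks that hold a logical block in a stripe of mirrored pairs."""
    if num_disks < 4 or num_disks % 2:
        raise ValueError("this sketch expects an even number of disks, at least four")
    pair = block % (num_disks // 2)   # RAID 0 step: choose a pair round-robin
    return 2 * pair, 2 * pair + 1     # RAID 1 step: both disks in the pair get a copy


def survives(failed: set[int], num_disks: int) -> bool:
    """True if every mirrored pair still has at least one working disk."""
    return all(
        not {2 * pair, 2 * pair + 1} <= failed
        for pair in range(num_disks // 2)
    )


assert raid10_locate(0, 4) == (0, 1)
assert raid10_locate(1, 4) == (2, 3)
assert survives({0, 2}, 4)        # one disk lost from each pair: data intact
assert not survives({0, 1}, 4)    # both disks of one pair lost: data gone
```

The last two checks show why RAID 10 is guaranteed to survive only one failure: a second failure is harmless only if it lands in a different mirrored pair.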

  • RAID 03

RAID 0+3, also known as RAID 53 or RAID 5+3, applies RAID 0-style striping to RAID 3’s virtual disk blocks. This delivers higher performance than RAID 3, but at a higher cost.

Non-Standard RAID Levels

Non-standard RAID levels differ from the standard levels and are usually developed by companies, primarily for proprietary use. Some examples of non-standard RAID levels are:

  • RAID 7

RAID 7 is a non-standard RAID level based on RAID 3 and RAID 4. It adds caching via a high-speed bus, a real-time embedded operating system that acts as the controller, and other characteristics of a standalone computer.

  • Adaptive RAID

Adaptive RAID lets the RAID controller decide how parity is stored on the disks, choosing between RAID 3 and RAID 5.

RAID and Data Backup and Recovery

RAID is often used in conjunction with data backup and recovery strategies to help protect against data loss.

RAID arrays can provide redundancy and fault tolerance, which means that if one or more hard drives fail, data can still be accessed and recovered from the remaining disks in the array. However, RAID is not a replacement for a proper backup strategy. While RAID can protect against hardware failures, it cannot protect against data loss due to software issues, human error, or other factors.

It’s important to have a separate backup strategy in place that includes regular backups of important data to an external location or cloud-based storage. This ensures that if data is lost or corrupted for any reason, it can be easily restored from the backup.
