Talk:RAID/Archive 5

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Changing Hardware Economics: Controller Cost vs Disk Cost, Etc.

In addition to performance and robustness, cost and complexity are major criteria for RAID level selection.

Most of the standard RAID levels and much of the popular wisdom regarding their application go back to a time when disks were smaller, slower, and more expensive.

Since then, disks have gotten large enough to make using RAID 5 at all a bit questionable and using RAID 6 pretty much requires an expensive (for now, anyway) hardware controller in most cases.

There are probably many cases that would have justified the cost and complexity of RAID 5 a few years ago which can now be addressed more cost-effectively by simply mirroring bigger faster disks and using more main memory.

Of course, a couple of years from now we may see amazing RAID 6 (or whatever comes next) hardware controllers so cheap that nobody will think twice about using them.

The point is that the cost/benefit ratios of these techniques are always in flux. It might be worth pointing that out somewhere. —Preceding unsigned comment added by 69.248.248.11 (talk) 17:22, 5 December 2009 (UTC)

OS X Raid Support

The operating systems section only mentions OS X Server, but support of at least RAID 1 is also built into plain old OS X.

Nick 3216 (talk) 21:04, 26 December 2009 (UTC)

Duplicated content in Organization and Standard Levels

The same information is present in both sections. I don't see a way to quickly combine them, so I'll just mark my concern and hopefully someone working more intensively on this article can consider it. It seems that both sections were done independently. The article doesn't read as an organized whole. —Długosz (talk) 17:23, 4 November 2009 (UTC)

I don't see why the entire Organization section shouldn't just be removed. Rees11 (talk) 02:04, 23 December 2009 (UTC)

Automate archiving?

Does anyone object to me setting up automatic archiving for this page using MizaBot? Unless otherwise agreed, I would set it to archive threads that have been inactive for 60 days.--Oneiros (talk) 15:46, 24 December 2009 (UTC)

Done The bot should start in the next 24 hours.--Oneiros (talk) 22:39, 26 December 2009 (UTC)

Standard nested levels terminology?

I make a motion that we decide on a standard terminology for describing nested RAID: one term should be preferred, and the other should be listed as an additional commonly used term.
For example, I'm referring to how RAID 1+0 (or 10) is referred to using 10 in some places and 1+0 in others. I would like for one term to be the preferred usage over the other, and then for all locations referencing the idea to use the preferred term while including the other only as necessary.

I vote for "x+x" to be preferred, since without the '+' they appear as a completely different number, which could be confusing. The '+' helps to show that "10" is not its own standard RAID level, as if it came after a level 9.
Caveats: AFAIK, most motherboards or RAID chips or cards use "10" rather than "1+0". Also, there's MD RAID 10, which is not nested 1+0.

Opinions? Comments? Suggestions?
-Garrett W. {☎ ✍} 06:11, 4 January 2010 (UTC)

Uncited claim

The paragraph beginning with "RAID is not a good alternative to backing up data" needs citing for this claim. Also, the term "unsafe" sounds like an opinion. All of Section 3 is uncited. --jacobmlee 10:13, 28 Oct 2009 CDT

I am not sure that this is relevant to the article. I know it is a well-meaning warning, but it is more pertinent to the subject of backup methodology than it is to a discussion of RAID technology. --Andrew L Rice 20:15, 29 Jan 2010 PST —Preceding unsigned comment added by 64.134.234.30 (talk)

I don't think this statement needs a citation, as the paragraph itself describes well why this is the case. It's a matter of common sense and of understanding what RAID does and what backup does, rather than of citing what others have written on the topic. --frank —Preceding unsigned comment added by 194.94.44.4 (talk) 10:36, 17 February 2010 (UTC)

Article inconsistent regarding backup suitability of RAID

In section RAID 0 the article states that "[...] RAID 1 mirrors the contents of the disks, making a form of 1:1 ratio realtime backup [...]" and "[...] RAID 1 offers a good backup solution. [...]", whereas in section RAID is not data backup it says that "[...] RAID is not a good alternative to backing up data [...]". That is confusing and inconsistent. IMHO the latter is correct, and stating that RAID can offer a backup solution is technically wrong and dangerous. The section RAID 1 is more about the suitability to back up data than about how RAID 0 is organized. I suggest rewriting the section completely. --194.94.44.4 (talk) 10:49, 17 February 2010 (UTC)

WP:AT

I imagine the name of this article is a touchy subject but I had to edit the tortured lead. WP:AT calls for alternate names in the first sentence, so I felt that was more important than fighting over which came first: Redundant array of inexpensive disks or Redundant array of independent disks. I've listed them both as alternate names and then moved up the description of what a RAID is as per MOS. Why is it notable. The remainder of the lead is about those Berkeley folks defining it but the marketers redefining it. Sorry, I think I'm following the rules here without losing the point that there's controversy about the "I" in the title. Hutcher (talk) 18:37, 2 March 2010 (UTC)

I spent years working as a storage engineer and readily acknowledge that there's no universal agreement about whether the "I" in RAID represents "independent" or "inexpensive" - and in many implementations, the disks in question are neither. However, I'm removing the reference to "Random Array of Insignificant Devices." This is pure whimsy, has no basis in common terminology, and miserably fails the Google test. Bonehed (talk) 01:55, 11 March 2010 (UTC)

References

Some references are out of date:

Reference 14 says it is deprecated and replaced by http://raid.wiki.kernel.org/index.php/Linux_Raid

Reference 15 no longer exists —Preceding unsigned comment added by 131.111.85.79 (talk) 13:11, 10 March 2010 (UTC)

In section 9.1 http://en.wikipedia.org/wiki/RAID#Operating_system_based There is an uncited sentence under windows raid: "RAID functionality in Windows is slower than hardware RAID, but allows a RAID array to be moved to another machine with no compatibility issues." Isn't that true for all software raids? Why is that listed only under windows? —Preceding unsigned comment added by 94.113.6.192 (talk) 16:12, 14 April 2010 (UTC)

Standard Levels - RAID 0

Any disk failure destroys the array, which has greater consequences with more disks in the array (at a minimum, catastrophic data loss is twice as severe compared to single drives without RAID).

I know little about RAID, so am not editing, but it seems to me that the sentence above should read:

Any disk failure destroys the array, and the likelihood of failure increases with more disks in the array (at a minimum, catastrophic data loss is twice as likely compared to single drives without RAID).

Apologies if I'm all wrong on this. Nonukes (talk) 16:53, 30 April 2010 (UTC)

Nah, you are right. Go ahead and change it. (Yes, I'm being lazy here :-) ) 24.203.68.10 (talk) 05:43, 13 May 2010 (UTC)

Done.
-Garrett W. {☎ ✍} 09:17, 30 May 2010 (UTC)
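The corrected wording in this thread can be checked with a little arithmetic. The following is a minimal sketch, assuming every disk fails independently with the same probability `p_disk` over some period (an idealization; the function name and the 3% figure are illustrative, not taken from the article). A RAID 0 array is lost if any member fails, so the array failure probability is 1 - (1 - p)^n, which for small p is close to, and slightly below, n times the single-disk figure.

```python
def raid0_failure_probability(p_disk: float, n_disks: int) -> float:
    """Probability that a RAID 0 array of n_disks is lost.

    Assumes independent, identically likely disk failures (an idealization).
    The array is lost as soon as at least one member disk fails.
    """
    if not 0.0 <= p_disk <= 1.0:
        raise ValueError("p_disk must be a probability between 0 and 1")
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    # P(at least one failure) = 1 - P(no disk fails)
    return 1.0 - (1.0 - p_disk) ** n_disks


if __name__ == "__main__":
    p = 0.03  # illustrative per-disk failure probability, not a measured value
    for n in (1, 2, 4, 8):
        print(n, round(raid0_failure_probability(p, n), 4))
```

With two disks the result is a little under twice the single-disk probability, so "twice as likely" is a fair approximation for small per-disk failure rates.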

Standard Levels - 1 - Mirroring

I reworked the definition of RAID 1 to include the concept of mirrored sets, and the fact that protection is at the SET level. I think the original phrasing did not emphasize this, and could lead someone to believe that their data was still protected after the failure of all but one drive IN THE SYSTEM. Paulmmn (talk) 22:34, 9 June 2010 (UTC)

RAID1 disks being plain old mirror images is the norm, isn't it?

> Additionally, Software RAID1 disks (and some hardware RAID1 disks, for example Silicon Image 5744) can be read like normal disks

I think that should be changed to "most hardware RAID1...", and singling out a particular device that exemplifies this is rather pointless. I have yet to encounter a RAID1 setup where this isn't the case: Dell controllers and enclosures all produce plain mirror images which can be pulled out and accessed separately, and I think this is the reasonable thing to expect from RAID1. Examples where this isn't the case, with how and why, would be more interesting (and alarming). The key benefit of RAID1 is that it's dead simple: if anything falls apart, you still end up with 2 or more mirror images and can attempt recovery from one or the other, so "RAID1" devices that don't have this property would be something of a "buyer beware" to point out.

If anyone knows examples of this, or that it's indeed not the norm, please provide details, otherwise I suggest changing "some" -> "most" and removing the example, as there's nothing notable about something that is typical and reasonable.

—Preceding unsigned comment added by 173.33.205.5 (talk) 06:22, 21 June 2010 (UTC)

Suggestion: Move section "Raid is not a backup" to "Problems"

I just do not think this deserves its own section. It should be listed as a criticism of using RAID as a back-up solution. —Preceding unsigned comment added by Elzair (talk • contribs) 21:36, 9 July 2010 (UTC)

Read/Write Benefits

There is no section talking of read or write benefits based on the type of raid as well as how many drives are part of that array. I have added a column for each of these with the data that I could find on RAID0, RAID1, and RAID5. I am still searching for the correct data for the other raid sets. If anyone has that data, please add it. —Preceding unsigned comment added by Just1morerifle (talk • contribs) 15:06, 11 September 2010 (UTC)

Just remember that sequential and non-sequential performance will differ for some raid types such as RAID5, as well as depending on implementation, and the number of threads accessing it, etc. So some might have multiple different benefits for read and write depending on the access types Neo-Vortex (talk) 05:25, 26 September 2010 (UTC)

Got a citation for that claim?

Sequential and nonsequential write performance in RAID is essentially identical, in that modern SANs do write-combining at the cache layer; the instant a write operation has been received from a host, the SAN simply tells the host the write has been committed, even though it may linger in write cache for quite a while before it's actually destaged. That being said, your biggest factor in read or write performance will be IOPS capability per spindle, and the degree of parallelism in the array in question. Hope that helps you kids. —Preceding unsigned comment added by 12.187.145.131 (talk) 07:19, 30 September 2010 (UTC)

HBVS, the OS-based mirroring provided by the VMS operating system, should be mentioned...

...as it is one of the oldest and best such implementations and is used quite often on VMS systems. In connection with VMS's unique clustering, redundancy and availability apply not just to disks but to computers (cluster nodes) as well.

—Preceding unsigned comment added by 193.29.77.101 (talk) 17:23, 11 October 2010 (UTC)

raid

The first sentence of the article contains a logical inconsistency: "RAID, an acronym for Redundant Array of Independent Disks (formerly Redundant Array of Inexpensive Disks), is a technology that provides increased storage reliability through redundancy, combining multiple relatively low-cost, less-reliable disk drives components into a logical unit where all drives in the array are interdependent." How can the disks be both "independent" and "interdependent"?

I further edited the article and I think this has been fixed. Nexus501 (talk) 17:40, 9 January 2011 (UTC)

Why the "!" after RAID in the intro? And why "formerly" - surely "inaccurately" is more appropriate? Bpdlr (talk) 16:58, 28 January 2011 (UTC)

SAN abbreviation

Can someone write down what the "SAN" stands for? This is an encyclopedia article; personally, I don't come to read things I already know so I would expect the page be written by knowledgeable people and read by people trying to educate themselves. So finding such acronyms dumped on the page without the first occurrence bearing a "(S... A... N...)" "explicitation", so to speak, riles me just as much as reading stupid things, way too common on Wikipedia, like "as of now", "not yet", "to this date", "recently", "currently" and any other mention of time that does not bear an indication of when that "now" or "yet" is. Amenel (talk) 20:21, 11 December 2010 (UTC)

Agreed, added the definition / link to the first ref in the article / section Playclever (talk) 21:11, 9 January 2011 (UTC)

Logical volume management vs RAID

Under "Implementations" > "Volume manager support" it states that Linux's LVM provides RAID 1 and RAID 0 functionality.

I can't speak to the raid0 functionality, but I do know that it has mirroring, not raid1. Mirroring in LVM is not synchronous. It writes to one side of the mirror, and in the background syncs to the mirror(s).

I've used it for several years on my personal systems, and have found it both stable and responsive. That said, it's not raid. If the master dies during a write operation, there can be data loss. Also, I saw none of the read benefits that one is supposed to see in raid1 (I admit, I didn't do scientific tests).

Before updating the article, can anyone speak to whether or not other logical volume management systems do something similar for mirroring, or if they are more similar to real raid1? — Preceding unsigned comment added by Kyleaschmitt (talk • contribs) 17:14, 7 February 2011 (UTC)

You're probably referring to something I added. I thought LVM's mirroring counted as RAID, but if it doesn't, please correct it. Ketil (talk) 08:17, 9 February 2011 (UTC). BTW, please also update the RAID definition with the requirement of synchronous operation, and (ideally) include a reference.

Paragraph added

I have added the following paragraph on low-end on-board RAID5:

"In low end small RAID systems, RAID5 has issues that make it of debatable utility. Low-end means using on-board RAID controllers on the motherboard, which use Windows or host OS drivers for the rebuilding. This means that if the host OS is on a RAID5 array controlled via a motherboard chip, and a drive fails, the entire RAID5 array is lost. The only remedy is to attach the degraded array to a higher-cost hardware RAID5 controller that with luck, is compatible and able to rebuild the array."

Reading the comments in this discussion, I see a marked divide between comments on high-end RAID arrays as used in SANs and the like, and comments on low-end RAID arrays that are more accessible to the masses.

Perhaps it would be worth having a section or two in the article specifically addressing issues at the high end, and issues at the low end, separately. RedTomato (talk) 22:31, 15 February 2011 (UTC)

RAID 10 versus RAID 5 in Relational Databases

-->Who on earth would suggest a RAID 5 write is faster than a RAID10 write? Someone who hasn't the first clue about what a RAID 5 write penalty is. This entire section on RAID5 now makes Wikipedia a laughably incorrect source on RAID. Read up on the RAID5 write penalty, and review benchmarks that use similar hardware comparing RAID5 vs RAID 10, and RAID 10 wins on every benchmark.

The ONLY purpose for RAID5 is for file storage or databases with accesses that are >75% reads. —Preceding unsigned comment added by 205.193.144.2 (talk) 15:17, 1 March 2011 (UTC)
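The "write penalty" argued over in this thread is often summarized with a rule of thumb: a small RAID 5 write costs four back-end I/Os (read old data, read old parity, write new data, write new parity), while a RAID 10 write costs two (one per mirror member). The sketch below applies that rule; the function name, the 8-disk array, and the 150 IOPS per spindle are illustrative assumptions, and the model deliberately ignores controller caches, full-stripe writes, and write combining, which other commenters say mask much of the penalty.

```python
RAID5_WRITE_PENALTY = 4   # read data, read parity, write data, write parity
RAID10_WRITE_PENALTY = 2  # one write to each mirror member


def effective_iops(n_disks: int, iops_per_disk: float,
                   read_fraction: float, write_penalty: int) -> float:
    """Rule-of-thumb host-visible IOPS for a given read/write mix.

    Each host read costs one back-end I/O; each host write costs
    write_penalty back-end I/Os. Caching effects are not modelled.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    raw_iops = n_disks * iops_per_disk
    write_fraction = 1.0 - read_fraction
    return raw_iops / (read_fraction + write_fraction * write_penalty)


if __name__ == "__main__":
    # Hypothetical array: 8 spindles at 150 IOPS each.
    for read_fraction in (1.0, 0.75, 0.0):
        r5 = effective_iops(8, 150, read_fraction, RAID5_WRITE_PENALTY)
        r10 = effective_iops(8, 150, read_fraction, RAID10_WRITE_PENALTY)
        print(f"reads={read_fraction:.0%}  RAID 5: {r5:.0f}  RAID 10: {r10:.0f}")
```

Under this simplified model the two levels tie on pure reads and diverge as the write share grows, which is the shape of the disagreement in the comments here; how much of the gap survives on real hardware is exactly what the thread asks citations for.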

I have removed the section titled RAID 10 versus RAID 5 in Relational Databases as there are no citations, and a considerable amount of searching could not back up any of the claims in the section, but rather indicate the inverse is true, even for benchmarks posted in the last two years. Furthermore the changes were committed by a single user who was not logged in and whose only other contribution, albeit a long time ago, was blatant vandalism.

Please do not re-add this section unless you have substantial benchmarks and citations showing otherwise. Neo-Vortex (talk) 05:01, 26 September 2010 (UTC)

I wrote the section you deleted. I've been working with various RAID implementations for the past 13 years, in every capacity from low-level engineering to enterprise-level SAN administration. I was an engineer at IBM Storage Systems Division for 5 of those years. I am my own citation. If it would help you feel better, I suppose I could have another whitepaper published or perhaps write a blog post about it and cite myself. I don't think it would matter much, however: your deletion seems to have been fuelled by disagreement over what the section said; otherwise, you would have simply added a "needs citation" tag to the section and moved on.

As for vandalism, I think Bea Arthur would have found it hilarious that her Wikipedia page referred to her as a man. She was a comedienne, after all, and Wikipedia needs all the comedy it can get. :) Or were you referring to that one politician I referred to as a douchenozzle? I found it to be an accurate depiction. I'm all about accuracy.—Preceding unsigned comment added by 12.187.145.131 (talk) 07:09, 30 September 2010 (UTC)

Seems to me like a classical situation - one person has the knowledge of how to edit wikipedia and the other the knowledge about the subject. So lets try to combine the two to improve the article. I just added a citation. 12.187.145.131: Wikipedia Articles all rely on "reliable, published sources" [1]. If something is included in an article that's "original research" (>> The term "original research" refers to material—such as facts ... not already published by reliable sources.<<) [2] then it is correct to delete it according to Wikipedia rules, so it was correct from Neo-Vortex to delete your entries to the article as long as there are not published sources saying what you wrote. It was correct to delete it, even if it is 100% correct and important to the article what you wrote. So the solution to improving the article is to cite "reliable sources" about the facts that you wrote. As long as there are no citations it is the proper way to NOT include the writing in the article until there are the citations. So please don't be offended, if your writing or part of it should get deleted again when reliable sources are missing, whoever wants something included in an article has the obligation to deliver the reliable sources. Hope this helps a bit for the understanding. --Orangwiki (talk) 19:59, 30 September 2010 (UTC)

12.187.145.131: Here is a 2:10 minute film that explains a bit more how to get the information into the article: [3] --Orangwiki (talk) 20:57, 30 September 2010 (UTC)

It's not quite the classical situation you describe: the one person who thinks he has the knowledge doesn't have it. He misses the whole reason for deprecating RAID 5 (recoverability), and some of his performance statements are in any case dubious (for example, the IBM paper cited shows an advantage to RAID 10 for multiple single-row updates, while he claims an advantage to RAID 5 for all write-intensive workloads). I've added a paragraph on recoverability and a reference to Inaltra (chosen to be contemporary with the IBM reference already there) in the hope that this will leave a reasonably balanced picture. Michealt (talk) 18:15, 3 October 2010 (UTC)

Personally, I agree. I'd rather have the facts present. Not opinions or experiences.

It's fairly obvious this guy deleted the section because he didn't agree with it, which borders on censorship. Otherwise, his response would be to simply pepper the section with [needs citation], which would prompt me to provide a more thorough explanation, which I'd gladly do. I've been working with RAID for over a decade, and I and others do it for a living on modern, enterprise-level gear. The citations you're looking for are provided by people like me. I have built and tested the appliances you use.

Granted, RAID is a contentious subject due to its relative complexity plus the diversity of opinions and experiences of people who work with it, even those who work with it on an enterprise level. The point you and others may be missing is that opinions and experiences (including my own) mean nothing in the presence of facts. I provided the math and the reasoning which backs up what I stated. Saying something like, "A RAID which must stripe a mirrored set must first write to the mirror itself" doesn't require a citation, no more than saying "1+1=2" requires a citation. It's a fact of RAID 10, and thus not open for debate. It's not up for debate that all the members of a RAID 5 must participate in the rebuild process after a spindle dies. Unless you believe in unicorns that magically arrive on a rainbow carrying missing parity data, you're dead unless all the remaining spindles participate.
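The point about every surviving RAID 5 member taking part in a rebuild follows from how single parity works: the parity block is the XOR of the data blocks in a stripe, so a lost block is the XOR of all the others. The following is a minimal sketch of that idea for one stripe, using made-up four-byte blocks; the helper name `xor_blocks` is illustrative, and real implementations rotate parity across disks and work on much larger chunks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks of one stripe
    parity = xor_blocks(data)            # parity block for that stripe

    # Suppose the disk holding data[1] dies: XOR every surviving block.
    survivors = [data[0], data[2], parity]
    print(xor_blocks(survivors) == data[1])           # True

    # Leave any one survivor out and the result is no longer the lost block.
    print(xor_blocks([data[0], parity]) == data[1])   # False
```

This only illustrates the parity arithmetic; it says nothing about how long a rebuild takes or how a given controller schedules it.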

Let's boil down the section in detail:

Blanket statements about RAID 10 being superior to RAID 5 when it comes to relational databases are false. Historically, this was a true statement, but as of about 10 years ago, it's no longer true. It's an easily demonstrated fact.

RAID 5 traditionally had a write penalty. It's since been largely masked. That's a fact too.

RAID 10 cannot write a discrete payload to greater than half of the drives in the array at a time, since the data must first be mirrored prior to being striped.....Unless your storage appliance contains magic unicorns capable of time travel, and can stripe data that those unicorns have peered into the future and seen written in the mirrored set, striping must occur after mirroring. Sure, those other drives in the stripe set can be busy writing other things, but laying out data in a RAID 10 is essentially a two-stage process. These are facts, and last I checked, the $1.2 million SAN I maintain 20 feet away from me did not have a magical time-travelling unicorn license.

The choice of RAID 5 versus 10 boils down to an examination of the required and preferred characteristics of the database and the array itself. Again, a fact, and not arguable.

It doesn't take much for the preference of RAID 10 to change to RAID 5 (and vice versa) as different circumstances develop. Again, fact.

Rather than turn this into a pissing or dicklength() war over RAID, my advice would be to defer to those who have facts, rather than those with opinions who own unicorns, or those who simply cite others with opinions which happen to agree.

—Preceding unsigned comment added by 68.227.254.182 (talk) 16:56, 4 October 2010 (UTC)

Currently in the article: "Additionally, the time required to rebuild data on a hot spare in a RAID 10 is significantly less than RAID 5, in that all the remaining spindles in a RAID 5 rebuild must participate in the process, whereas only half of all spindles need to participate in RAID 10. In modern RAID 10 implementations, all drives generally participate in the rebuilding process as well, but only half are required, allowing greater degraded-state throughput over RAID 5 and overall faster rebuild times."

Factually incorrect:

The exact reason why RAID 10 rebuilds are so fast is because 2, and only 2, disks are involved in rebuilding the contents of a failed disk, NOT half the drives in the array. That would be true of RAID 0+1. These two drives involved in the rebuild are the original mirror partner of the failed drive and the replacement drive. You reference [7] http://www.bytepile.com/raid_class.php#10 as your source for the RAID 10 rebuild information. That source mentions absolutely _nothing_ of the sort!
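The distinction drawn in the comment above (which disks have to be read to rebuild one failed member) can be written out as a small lookup. This is a sketch under assumed layouts, not any vendor's behaviour: RAID 10 is taken to be two-way mirror pairs (0,1), (2,3), and so on, and RAID 0+1 is taken to be two stripe sets, the first and second half of the disks, where losing one disk degrades its whole stripe set. The function name and level strings are hypothetical.

```python
def rebuild_sources(level: str, n_disks: int, failed: int) -> list[int]:
    """Disks that must be read to rebuild the failed disk.

    Assumed layouts (illustrative only):
      raid5  - single parity across all disks
      raid10 - stripe over two-way mirror pairs (0,1), (2,3), ...
      raid01 - mirror of two stripe sets: disks [0, n/2) and [n/2, n)
    """
    if not 0 <= failed < n_disks:
        raise ValueError("failed disk index out of range")
    if level == "raid5":
        # Every surviving member contributes data or parity.
        return [d for d in range(n_disks) if d != failed]
    if n_disks % 2 != 0:
        raise ValueError("mirrored layouts here need an even disk count")
    if level == "raid10":
        # Only the mirror partner of the failed disk is read.
        return [failed ^ 1]
    if level == "raid01":
        # The whole surviving stripe set is read.
        half = n_disks // 2
        return list(range(half, n_disks)) if failed < half else list(range(half))
    raise ValueError(f"unknown level: {level}")


if __name__ == "__main__":
    for level in ("raid5", "raid10", "raid01"):
        print(level, rebuild_sources(level, 8, 3))
```

For an 8-disk array with disk 3 failed, this gives seven source disks for RAID 5, one for RAID 10, and four for RAID 0+1, which matches the "half the drives would be true of RAID 0+1" remark.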

From your discussion comment above: "RAID 10 cannot write a discrete payload to greater than half of the drives in the array at a time, since the data must first be mirrored prior to being striped....."

Factually incorrect:

RAID 10 writes are striped, THEN mirrored. The same is true of RAID 0+1. The mechanics are identical regarding the chunk calculations and phases. The mirror operation occurs last in both RAID 10 and RAID 0+1. The only difference is the size, and the number of the mirror operations. Think of it this way for RAID 10: How can you mirror a chunk when the chunk hasn't yet been created by the striping process? You can't. In the case of RAID 0+1: The mirror operation is simply a duplicate of the stripe write with different device IDs--those of the second set of disks--the mirror stripe.
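One way to read the "striped, then mirrored" description is as an address calculation: the striping step decides which mirror pair and which row a logical chunk belongs to, and the mirror step then sends that same chunk to both members of the pair. The sketch below assumes two-way mirror pairs (0,1), (2,3), ... and a simple round-robin stripe; the function name is hypothetical, and real controllers (and Linux MD RAID 10, which a commenter above notes is not plain nested 1+0) may lay data out differently.

```python
def place_chunk(chunk_index: int, n_disks: int) -> tuple[int, int, int]:
    """Map a logical chunk to (first disk, mirror disk, row on those disks).

    Assumed layout: round-robin striping across two-way mirror pairs.
    """
    if n_disks < 2 or n_disks % 2 != 0:
        raise ValueError("need an even number of disks, at least 2")
    if chunk_index < 0:
        raise ValueError("chunk_index must be non-negative")
    pairs = n_disks // 2
    pair = chunk_index % pairs      # striping: choose the mirror pair
    row = chunk_index // pairs      # striping: choose the row within the pair
    # Mirroring: the same chunk goes to both members of the chosen pair.
    return 2 * pair, 2 * pair + 1, row


if __name__ == "__main__":
    for chunk in range(6):
        print(chunk, place_chunk(chunk, 4))
```

In this model consecutive chunks land on different pairs, so a large write can keep every pair busy at once; whether the two mirror writes are issued one after the other or concurrently is an implementation detail the sketch does not settle.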

Additionally, what makes any personnel at bytepile.com an expert source for RAID information? bytepile.com is absolutely _NOT_ acceptable as a reference source on Wikipedia. Maybe work published by IBM, SUN, HP, SGI, EMC, one of the universities or even a smaller SAN vendor, etc, but _not_ this tiny VAR/ISP in Hawaii with no track record that no one has ever heard of.

The author of (parts of this section) does not have correct knowledge or full understanding of the subject material. The article absolutely needs to be modified post haste, as Wikipedia is spreading false information on RAID technology, to people who come to Wikipedia looking for FACTS, and _need_ the facts. Lay people don't read a RAID article on Wikipedia. Technical people do. Every second the current text is left in place Wikipedia is spreading false information.

I agree with Neo-Vortex. This RAID 10 vs 5 section is rife with misinformation, pure opinion not supported by fact, and reference sources about as trustworthy as Bernard_Madoff. It needs to be removed, period. Then, when a legitimate research paper whose entire focus is RAID 5 vs 10, from the likes of IBM, a Fortune 500, reputable SAN vendor such as EMC, etc., can be found, THEN a coherent, accurate account of RAID 5 vs 10 can be added back into the article. As of now, all the current crap does is confuse people who have read elsewhere on the topic, and find the information on Wikipedia to be contrary to most everything they've read elsewhere, "elsewhere" being reputable sources. If anyone is so vested in this RAID 10 vs 5 thing that they are compelled to reinstate it, banning the sucker from this article would not only be justified, but NEEDED.

Hardwarefreak (talk) 12:44, 30 December 2010 (UTC)

Mr 68.227.254.182, I do have RAID knowledge and can read the article you used to support your opinions. It does not support your opinion that "choice of RAID 5 over RAID 10 often comes down to the preference of the storage administrator." The article shows that on less-than-full-stripe writes RAID 10 is indeed superior. I quote from the conclusion: "The performance of RAID 10 may be better for very high random write workloads. The amount of improvement will vary." As should be expected. Your argument would also be better served if you had less opinion and more data. As is, it's useless for most and dangerous for some. It should be deleted or improved.

BTW are you like 12? Because your flippant language belittles your argument. Lonerock (talk) 21:05, 4 October 2010 (UTC)

Hi. I didn't cite any article. That citation was added by someone else, but, feel free to kick and scream at me anyway. I don't mind. I do this for a living.

Let's look at the quote which puts a bee in your bonnet, in its full context:

"Again, modern SAN design largely masks any performance hit while the RAID array is in a degraded state, by virtue of selectively being able to perform rebuild operations both in-band or out-of-band with respect to existing I/O traffic. Given the rare nature of drive failures in general, and the exceedingly low probability of multiple concurrent drive failures occurring within the same RAID array, the choice of RAID 5 over RAID 10 often comes down to the preference of the storage administrator, particularly when weighed against other factors such as cost, throughput requirements, and physical spindle availability. However, for some deployments, particularly business critical database deployments, RAID 10 tends to be favored despite its larger footprint."

That entire paragraph is true. Let's break it down:

Do modern SAN designs largely mask performance degradation during rebuilds? Yes. Do modern SAN designs have the ability to perform rebuilds as an in-band and out-of-band process? Yes. Are drive failures rare? Yes. Are multiple drive failures in the same RAID array exceedingly rare? Yes. Do multiple criteria come into consideration when choosing an appropriate RAID level? Yes. Because the needle usually doesn't tip heavily to one side or the other from a performance standpoint, does it often come down to the individual preference of the administrator based on other factors? Yes. However, does RAID 10 tend to be favored for business-critical deployments? Yes.

These are facts, not opinions.

It doesn't really matter what anyone's -opinion- is, at that point, because opinions mean nothing in the presence of facts. It's like having an opinion on 1+1=2. If you want to discuss opinions, start a RAID Opinions page, and fill it with as many witty anecdotes, "well, from MY experience" stories, and pictures of magical parity unicorns as you like. This isn't the place for it.

Ah, this again. I've noticed that my headers for original research and citations, along with all my needs citation tags were silently removed by an IP user not long after they were put there, with no additional citations added. Since then, the citation header has been re-added to the section (albeit with November instead of October as the date) again by a different IP user in the most recent revision. No matter how much you claim your logic is as simple as 1+1=2, the fact of the matter is that someone else has questioned it, and it appears I am not alone in questioning it, as such, you need to provide citations. Your own experience and personal research is not acceptable. Please see WP:OR and WP:Verifiability for Wikipedia's policies on those topics. Considering this topic has been going on for almost 2 months now, with no citations, I am strongly considering removing the section again unless sufficient citations are provided to verify the claims posted in the article.

Regarding the claims of censorship, I attempted to locate sources to verify the claims listed, and upon failing to do so considered the section to quite possibly be elaborate vandalism, or accidental misinformation. I brought it up on this page just in case I was incorrect so that the matter could be discussed fully, while mitigating any impact of misinformation.

If Mr. Anonymous RAID Expert(s?) continues to revert deletion of the section once removed again (assuming sufficient citations are not provided in the next couple of days) without adding relevant citations, I suspect Article Protection might be in order, as they do not seem willing to follow WP policy regarding original research and verifiability.

If anyone has any good reason against removal of the section again, please leave a note here.

Neo-Vortex (talk) 21:27, 13 November 2010 (UTC)

I'm fine with leaving the paragraph there. The point I'm trying to make is that the statements given in it are clearly obvious to anyone who does it for a living. If you like, however, I'll add some citations, albeit largely unnecessary ones.

RAID, as a topic, tends to be clouded with opinions rather than facts. 16 year olds who figure out how to get RAID working between two IDE drives on a $70 PC motherboard and a half-million dollar enterprise SAN solution have very, very little to do with each other in the areas discussed in the paragraph. This is why they were removed. They're not only inapplicable, they're incorrect. —Preceding unsigned comment added by 204.68.32.6 (talk) 23:59, 23 November 2010 (UTC)

I have removed the section again as there have been no relevant citations to back up any of the stated claims. Please do not re-add content from this section without first citing sufficient relevant sources to back up -ALL- claimed facts, regardless of what you know, or think you know. This is Wikipedia policy and if you wish to discuss it, here is not the place. Please see WP:OR and WP:Verifiability for the relevant policies. 150.101.188.110 (talk) 09:10, 16 November 2010 (UTC)

Bah, my login expired. The removal of the section, and the comment above this one, were both done by me. Neo-Vortex (talk) 09:14, 16 November 2010 (UTC)

Citations added. Might I suggest a "RAID Myths Debunked" section, by the way? —Preceding unsigned comment added by 204.68.32.6 (talk) 00:17, 24 November 2010 (UTC)

*Sigh* Getting 2 irrelevant citations that clearly do not agree with what you are saying in the section and pasting them across the article is not what is required. I am removing the section again. You stated that modern hardware allows for RAID 5 to be a viable alternative for databases, yet the citations are from 2003 and 2003; in addition, neither clearly agrees with your statements in the section. A RAID myths debunked section might actually be a good idea; I could fill it with citations showing that RAID10 outperforms RAID5 in random read/write IOPS constrained workloads - like databases... Neo-Vortex (talk) 12:26, 25 November 2010 (UTC)

First, a disclaimer: I don't know much about RAID except just what it took to set up RAID 10 on my own machine. So I don't know who's right about what. But if you'd like to object to citations from 2003 as being too out of date and could do so by filling the article with lots of better, newer citations that disagree, then you should do it. Produce the citations. Everything else is irrelevant. Msnicki (talk) 20:45, 26 November 2010 (UTC)

2003 and 2003 aren't within the past 10 years? Ok. Allow me to refer you to the Wikipedia page on basic math. Perhaps calendars, too.

On the technical side, I agree with your statement, actually. RAID 10 does outperform RAID 5 on random read/write IOPS constrained workloads. You'll see I also noted in the same exact section, "The strengths and weaknesses of each type only become an issue in atypical deployments, or deployments on overcommitted or outdated hardware". Having an IOPS constrained workload at the enterprise layer is both atypical and overcommitted. Allowing circumstances like these would be good reason to fire your SAN admin, since that's his or her job--to stave off those sorts of things, and do so proactively.

A database would only BE bottlenecked at the SAN layer if there were 1) insufficient write cache at the SAN layer, 2) spindle count was too low, to the point where parallelism suffers, 3) heavy block-level fragmentation on the volume/RAID group which contains the LUN. Short of those cases, a typical DB admin, be it Oracle or something MUMPS based, would not know the difference. The majority of relational databases in the real world have a heavy read or write bias. Rarely will you see one that's doing about as much reading as it does writing. That's more the domain of marts and warehousing. The subject of this section addresses the majority of cases, while also acknowledging exceptions to those cases. What are you arguing about?

You have to keep in mind that there are basically two schools of RAID. People who work with it for a living, and people who think they're experts because they set up a software RAID on a Linux box. They practically only have the word "RAID" in common. A plumber is not the same as a civil sanitation engineer. A redneck with a meth lab is not a chemical engineer. A guy who sets up a RAID is not the same as a professional SAN admin. Having been in both places, I can tell you that with the former you will have about 10% of the picture compared to doing the same work on an enterprise-grade SAN solution. This section refers to the latter. There's a difference between playing with the BIOS screen on an Adaptec RAID card, and working with something like this:

http://www.youtube.com/watch?v=gDg9W1cCWKc

The things which are talked about in this section are, to a professional SAN admin, obvious by inspection--That's why you don't see actual SAN admins crying foul about the section, only people with PC-level experience in basically ghetto RAID setups. For example, I had to remove a section not too long ago from some dingbat who thought there was no such thing as write cache for RAID 5. This is what I mean.

I'm more than happy to teach you the difference (I do enjoy teaching), and answer questions about some of the finer points of the section(s) I've penned, but disagreement by virtue of ignorance isn't grounds for removing the section. —Preceding unsigned comment added by 98.225.124.14 (talk) 20:13, 26 November 2010 (UTC)

What you are saying now is that 'throw enough disks at it, and of course IO isn't your bottleneck' - sure, that applies to everything, throw enough CPU at something and an inefficient algorithm will run at an acceptable speed.

The way the current section of the article is written does not sound like that though, it sounds more like 'given <x> disks, a RAID5 setup will allow you to perform just as many transactions per second as a RAID10 setup on the same hardware'.

The first line of the article, A common myth (and one which serves to illustrate the mechanics of proper RAID implementation) is that in all deployments, RAID 10 is inherently better for relational databases than RAID 5, due to RAID 5's need to recalculate and redistribute parity data on a per-write basis. [2]. Now, tell me, is RAID 10 inherently better if you were stressing your disks? I'm sure if you're running 1 transaction per second on 10,000 disks, you'll never care what's happening in the backend, because it's so massively overprovisioned that it just doesn't matter. Does that mean RAID 10 isn't better because both perform the same in non-IO constricted workloads? No, it just means if you throw enough hardware at it, both work.

Rewrite the article to say something along the lines of "when IO is not the bottleneck, RAID5 is acceptable for backend storage of an RDBMS, however when IO is bottlenecking, RAID10 will maximise performance of a given number of disks" and I'll have no issues with it. This article is about RAID as a concept, not 'if you throw enough disks at it, anything works'. Also while you're at it, get some better sources, rather than 2 sources repeated throughout the entire section... —Preceding unsigned comment added by Neo-Vortex (talk • contribs) 06:24, 27 November 2010 (UTC)

I wanted to jump in, because it pains me to see an edit war... I don't think it's productive to delete the whole section, but I do think it could use some improvement:

The section needs to acknowledge that the performance difference is not a thing of the past or a "myth" or no longer an issue "on modern systems". The correct comparison would be enterprise (and to some extent mid-range) systems vs. low-end and hobbyist systems, or older enterprise systems. The section tries too hard to make it seem there's no longer an issue, whereas of course it depends.
The section also needs to make clear that it is the design of the system, and not any improvements to RAID-5 itself, that mitigate against lower RAID-5 write performance. The section makes reference to deficiencies in "past RAID-5 implementations", which might suggest that it's the RAID-5 code itself providing improvements, rather than improvements above and below RAID in the I/O stack, from larger cache and faster drives.
I'm not keen on the description of drive failures as "rare", as this depends on your perspective - the statement would be better if quantified. In a day, drive failures are rare, but measured over years they quickly become inevitable. It's a fair assessment that simultaneous drive failures are rare, though (ignoring nasty bugs where one failure actually causes the second).
The section is titled "RAID 10 versus RAID 5 in Relational Databases", but does not explain how the Relational Database use case differs from other applications. It would be good to either explain the relevance of relational databases to the discussion, or change the section title.
To my mind, the section would benefit from being framed as a more general discussion of deployment choices, explaining the impact of cache and I/O subsystem bottlenecks, and touching briefly on the mitigation of the RAID-5/RAID-10 performance gap in carefully designed enterprise systems.

Good luck :) Playclever (talk) 11:23, 27 November 2010 (UTC)

Hi PlayClever -- Original section author here. THANK YOU. Upon reading, I agree with each of your points. To comment:

1) Agreed. What I've written reflects strictly enterprise-level SAN gear. It's probably not abundantly clear to hobbyists that hobbyist-level RAID differs almost violently from enterprise-level implementations. I'll modify the section title to fit, feel free to tweak.

2) I thought I was clear enough, but this goes hand in hand with #1. I don't think it would be much hassle for me to fix this. Thanks for the heads up.

3) Also an easy fix.

4) I've been thinking about breaking the section out into 2 or 3 examples which would illustrate why one RAID level over another might be favored.

5) Agreed. Let's see what I can come up with. —Preceding unsigned comment added by 98.225.124.14 (talk) 18:25, 27 November 2010 (UTC)

The article says "The consideration of RAID 5's "write penalty" when deciding between RAID 10 and RAID 5, at least on modern systems, can be effectively ignored." and references http://www-03.ibm.com/systems/resources/systems_storage_disk_ess_pdf_raid5-raid10.pdf but that paper actually does not indicate that it can be effectively ignored. In fact it states "RAID 10 outperforms RAID 5 for workloads dominated by random writes." Perhaps it would be better to rephrase it as "The effect of a write penalty when using RAID 5 is mostly a concern when the workload has a high proportion of random writes (such as in a database), while in other workloads modern RAID systems can be on par with RAID10 performance." —Preceding unsigned comment added by 64.102.254.33 (talk) 23:16, 3 December 2010 (UTC)

I've added some templates (NPOV and OR) to this section, as it reads like an opinion piece on somebody's blog, not an encyclopedic article. It only seems to reference the author's own web page and makes lots of unsubstantiated claims. I'm not disputing that the author is knowledgeable or even correct, but this material would fit better on a personal blog, which could be referenced along with contrary views. There seem to be plenty of sources discouraging RAID5 for RDBMSes, and although the author may think it is appropriate to dismiss them as "myth", I don't think this belongs in WP. 82.134.28.194 (talk) 11:33, 3 February 2011 (UTC)

I don't like all these references to the Bytepile website being used to pretend to justify things that just are not mentioned on that website (the claims that in a modern RAID10 system all the drives participate in recovery of a single disc failure, or that half the discs are needed to recover from a single disc failure, for which a reference to that website was given and which that website certainly does not support). Nor do I feel that Bytepile's marketing literature, which is close to being void of technical detail and certainly displays no benchmarking results, could reasonably be used to support the claims being made even if the text on the site did in fact support them. I've changed the RAID 10 recovery nonsense to make it correct, but for now left the rest as it is. I still believe that deleting the whole section would be the best treatment for it, as it's mostly one man's opinion and that man clearly does not understand recoverability (which tends to be pretty critical for databases) and is unable to avoid showing a very clear bias: for the purpose of his read performance claims he assumes that a RAID 10 holding the same data as a RAID 5 has the same number of spindles (where in fact it has 2N spindles when the RAID 5 has N+1, giving it a 2N to N+1 read performance advantage), but he also says RAID 10 is more expensive, so clearly he knows that it uses more spindles when it suits his argument, but not when it doesn't. He (or perhaps someone else who shares his prejudices) has deleted references that disagree with his PoV, regardless of their usefulness. Some time back EMC published a performance paper indicating clearly that the only workloads on which RAID5 came close to RAID 10 (with equal capacity, not equal number of spindles) or outperformed it were those dominated by serial writes, and dominated meant that more than 80% of the workload was serial writes.
OK, we could maybe use a RAID 5 array to hold an RDBMS's transaction log if we didn't mind having that log on a virtual drive with poor recoverability but I would be very surprised if anything else in the database world had a workload that dominated by writes. Of course things have moved on since then, with writing the parity being pushed forward into slack periods, but you can't put off writing the parity for ever and of course we can defer writing mirrors too, and from what I've read the situation is still that RAID 5 can only win on seriously serial write dominated workloads. Michealt (talk) 21:13, 22 February 2011 (UTC)
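
The spindle-count point in the comment above can be made concrete with a small sketch. It assumes nothing beyond the comment's own arithmetic: for the same usable capacity of N drives, RAID 10 needs 2N drives and RAID 5 needs N+1. The function names are hypothetical, and the ratio is only a count of spindles available to serve reads, not a benchmark.

```python
from fractions import Fraction


def drives_needed(data_drives: int) -> dict:
    """Drive counts for arrays with the same usable capacity (data_drives worth of data)."""
    if data_drives < 2:
        raise ValueError("need at least 2 data drives for this comparison")
    return {
        "raid10": 2 * data_drives,  # every data drive has a mirror
        "raid5": data_drives + 1,   # one drive's worth of distributed parity
    }


def read_spindle_ratio(data_drives: int) -> Fraction:
    """Ratio of RAID 10 spindles to RAID 5 spindles at equal capacity: 2N / (N + 1)."""
    counts = drives_needed(data_drives)
    return Fraction(counts["raid10"], counts["raid5"])


if __name__ == "__main__":
    for n in (2, 4, 8):
        print(n, drives_needed(n), read_spindle_ratio(n))
```

The ratio approaches 2 as N grows, which is why comparing the two levels at an equal number of spindles and at equal capacity gives different answers.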

Independent/Inexpensive

I've been stupid. Thanks to Prof. Patterson for replying to me anyway! Bpdlr (talk) 18:21, 19 February 2011 (UTC)

It does stand for Inexpensive rather than Independent; though, from my understanding, the latter is often used, and it is no longer required to be disk-level. Here: Introduction to redundant arrays of inexpensive disks (RAID) No1unorightnow (talk) 18:38, 25 February 2011 (UTC)

RAID 1

"With appropriate operating system support, there can be increased read performance, and only a minimal write performance reduction." http://www.tomshardware.com/charts/raid-matrix-charts/Throughput-Read-Maximum,223.html as a single example, this disproves the statement about RAID 1. My personal experience supports Toms hardware. Are there any citations for this statement? Down below, the table for RAID 1 also shows its read performance as nX, showing it as always true that 2 drives will give aprox double read performance. Anonymous Coward 143.138.26.178 (talk) 15:25, 26 January 2011 (UTC)

Also, that table says that space efficiency is (1/2)n. From a layman's perspective, it seems like this should be just 1/n. As long as nobody objects, I will change it. Qbert203 (talk) 00:48, 20 March 2011 (UTC)

I would object since the space efficiency really is (1/2)n where n is an even number, which is not equal to what you suggest for any number other than 2. I have no citation for this except for just about any Grade 8 math book on fractions: (1/2)4=2.0 while 1/4=.25. Also, fault tolerance is incorrect. In a mirror, the fault tolerance is 1. Let's say in a 4 drive mirror, drives 0 and 1 are mirrored and drives 2 and 3 are mirrored. Should drives 2 and 3 fail, you will have data loss. Sure, you might get lucky: depending upon which 2 fail, there is the possibility that you will not lose data, but as far as fault tolerance goes the rating is 1 drive. 06:02, 5 April 2011 (UTC) John Obeda —Preceding unsigned comment added by 66.103.46.25 (talk)

If you are looking at a pure "RAID-1" (as opposed to a "RAID 1+0", "RAID 1+5", or other combined RAID levels), each drive is a pure/perfect mirror of the other. For space efficiency, if I have 17 1TB drives, my RAID-1 array will have 1TB of user-accessible capacity copied 17 times over. In this (exaggerated) case, my fault tolerance would be 16 or n-1; 16 drives can fail and I still have a full copy of my original data. I don't have a reputable citation for this either, but it is based on how I understand pure RAID-1 works. Jsowoc (talk) 21:14, 2 May 2011 (UTC)
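
The 17-drive example above can be written out as a short sketch. It assumes a pure RAID 1 set in which every drive holds a full copy of the same data, as that comment describes; the function name and the returned field names are made up for illustration.

```python
from fractions import Fraction


def raid1_summary(n_drives: int, drive_tb: int) -> dict:
    """Capacity figures for a pure n-way mirror of identical drives."""
    if n_drives < 2:
        raise ValueError("a mirror needs at least 2 drives")
    return {
        "usable_tb": drive_tb,                       # one drive's worth, however many copies
        "space_efficiency": Fraction(1, n_drives),   # usable / raw = 1/n
        "fault_tolerance": n_drives - 1,             # all but one copy may fail
    }


if __name__ == "__main__":
    print(raid1_summary(17, 1))
```

For the common two-drive mirror this gives 1/2 and a tolerance of one drive, which agrees with both readings of the table; the formulas only diverge for n greater than 2, or when the layout is really RAID 1+0 rather than a pure mirror.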

Various Linux distributions will refuse to work with "fake RAID"

This personification is incorrect. Better wording would be "Various hardware manufacturers refuse to release sufficient documentation to allow various Linux distributions to work with "fake RAID"" or "Various hardware manufacturers offer only Windows drivers and fail to offer Linux drivers or sufficient documentation for the open-source community to write drivers."

17:37, 29 March 2011 (UTC) ldillon —Preceding unsigned comment added by 72.175.233.250 (talk)

RAID 1 math errors?

In the Standard levels chart, it seems that there is some incorrect math for RAID 1 in two columns:

space efficiency (given as 1/n)
fault tolerance (given as n−1 disks)

Wouldn't the space efficiency be 1/2? The existing formula yields laughable results if you have 100 disks. And for fault tolerance, doesn't mirroring allow you to conceivably have up to 50% of the disks fail? The existing formula doesn't make sense.

MeekMark (talk) 21:42, 29 April 2011 (UTC)

The capacity is 1/n of the disk capacity you have overall (see example above: using 17 1TB disks in RAID-1 you have 1TB capacity which is 1/17 of your overall capacity of 17TB). —Preceding unsigned comment added by 194.11.254.132 (talk) 13:36, 3 May 2011 (UTC)

Anonymous, 15:38, 02 May 2011 (UTC) —Preceding unsigned comment added by 194.11.254.132 (talk)

RAID 2 recovery

Can't RAID 2 always recover from at least 2 disk failures?

If any disk dies, it can be recovered by assuming it contained only zeroes and using a correction algorithm to restore the ones.

Every data disk affects a unique set of parity disks (1), and every data disk affects at least 2 parity disks (2). (Because the minimum number of affected parities for a disk is two, it is not certain that 3 disk failures can be recovered.) For example, disk 0 affects parity 0 and 1, and disk 1 affects parity 1 and 2.

If only parity disks fail, the array is always recoverable. If one data disk and one parity disk fail, there is always at least one remaining parity from which the data disk can be recovered (see (2)). If two data disks fail and no parity disks fail, one of the disks is recoverable according to (1), and then the other disk is recoverable. —Preceding unsigned comment added by 130.237.49.105 (talk) 11:09, 3 May 2011 (UTC)

RAID "Read and Write Benefit"

The article gives what it calls "write benefit" and "read benefit" of RAID as a function of n and X. Presumably X is number of operations that can be performed per unit time to one of the member disk drives and n is the number of disk drives in the array. If so, then much of this is plainly wrong.

RAID 6:

For reads, all disks in a RAID 6 array can process a read and contain non-parity data. The "read benefit" is thus nX.

Write IO operations done to parity RAID arrays fall into two categories, full stripe writes and small writes. Small writes are more common. For a small write to a RAID 6 array the system must read the target data and its respective parity on each of two disks. It must then calculate both new parities and then it must write all three, target and two parity blocks, thus a RAID 6 small write is 6 disk operations. The "write benefit" of a RAID 6 array is nX/6, at least for small writes.

RAID 5:

For reads, all disks in a RAID 5 array can process a read and contain non-parity data. The "read benefit" is nX.

For a small write to a RAID 5 array the system must read the target data and its respective parity. It must then calculate the new parity, and then it must write both data and parity, thus a RAID 5 small write is 4 disk operations. The "write benefit" of a RAID 5 array is nX/4, again for small writes.

RAID 4:

For reads, all but one disk in a RAID 4 array can process a read and contain non-parity data. The "read benefit" is (n-1)X.

For a small write to a RAID 4 array the system must read the target data and its respective parity. It must then calculate the new parity, and then it must write both data and parity, thus a RAID 4 small write is 4 disk operations. This is the same as RAID 5. However the parity disk is dedicated rather than distributed so all parity writes must go to a single disk. The "write benefit" of a RAID 4 array is X, for small writes. Put another way, a RAID 4 array provides the write capability of one of its member disks regardless of the number of disks in the array. This is very inefficient. — Preceding unsigned comment added by 75.69.175.8 (talk) 02:02, 20 June 2011 (UTC)
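
The figures in the comment above can be tabulated with a short sketch. It takes the comment's numbers exactly as stated (n member drives, X operations per unit time per drive, small writes only) and is not a model of any real controller; the function and table names are hypothetical.

```python
# Disk operations per small write, as counted in the comment above:
# RAID 5 reads data + parity and writes both (4); RAID 6 does the same with two parities (6).
SMALL_WRITE_OPS = {"raid5": 4, "raid6": 6}


def read_benefit(level: str, n: int, x: float) -> float:
    """Aggregate read rate: every drive holding non-parity data can serve reads."""
    if level in ("raid5", "raid6"):
        return n * x          # parity is distributed, so all n drives hold data
    if level == "raid4":
        return (n - 1) * x    # the dedicated parity drive serves no data reads
    raise ValueError(f"unsupported level: {level}")


def small_write_benefit(level: str, n: int, x: float) -> float:
    """Aggregate small-write rate under the comment's assumptions."""
    if level == "raid4":
        return float(x)       # every write funnels through the single parity drive
    if level in SMALL_WRITE_OPS:
        return n * x / SMALL_WRITE_OPS[level]
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    for level in ("raid4", "raid5", "raid6"):
        print(level, read_benefit(level, 8, 100), small_write_benefit(level, 8, 100))
```

Full-stripe writes avoid the read-modify-write cycle, so these figures describe the small-write case only.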

Apparent mistake under "Increasing recovery time"

Re. the sentence "The re-build time is also limited if the entire array is still in operation at reduced capacity." should this not be "The re-build speed is also limited..."?

Kevin 99.244.184.166 (talk) 05:26, 20 July 2011 (UTC)

No introduction of RAID10

There is a whole discussion of RAID10 with absolutely no description or preface as to what that means. When I read this article, and hear about RAID10 vs. RAID5, I have no idea what RAID10 is, is used for, and why there is even a discussion of RAID10 vs. RAID5.

The entire first section describes the "Standard RAID Levels" but what would a non-standard RAID level represent? This should go before a discussion of the standard RAID levels begins.

Robtoth1 (talk) 03:40, 6 July 2011 (UTC)

You need to look again Robtoth1 - RAID10 is a RAID1 nested in a RAID0. But yes the entire article needs revision to lead the reader more easily through the potentially complex details. For example, the history section at the end needs to be brought up to the start (IMHO) in an edited form - I could do without the bit about Crosfield unless it can be justified and explained in more detail.

But having said all that, I knew nothing about RAID and found the whole article extremely helpful and reasonably comprehensible. Sorry I am not expert, and can not contribute.

AndyB (talk) 19:19, 25 July 2011 (UTC)

I just made some links to help people find more descriptive sections. Mfwitten (talk) 20:57, 25 July 2011 (UTC)

Definitions and clarifications

In reading the article, I note that the terms disk and drive appear to be used interchangeably. Having been in the computing field for many years and I may be dating myself here, but we used the term disk to refer to a platter in the drive. Although the entire assembly is called a hard disk drive it was and is comprised of numerous disks (platters) on a single drive spindle. Perhaps it would aid the clarity of the presentation to include a list of definitions for the various terms used and to use a single term for each separate item, e.g., use either disk or drive to refer to a hard disk drive, but not both. The Sandman Too (talk) 20:04, 5 September 2011 (UTC)

I support using 'drive' whenever possible, as it abstracts away the underlying implementation (for instance, a RAID itself could be used as a 'drive' in another RAID, as with nested RAIDs). Mfwitten (talk) 00:06, 6 September 2011 (UTC)

RAID 10 versus Raid 5 in relational Databases

The article section with that title contains a lot of statements based on largely non-technical marketing material from the website of one manufacturer which are contradicted by other manufacturers' websites and by academic studies. This led to a POV mark carrying the standard message "The neutrality of this article is disputed. Please see the discussion on the talk page. Please do not remove this message until the dispute is resolved." at the head of that section. Despite this, all references to that section were removed from the talk page some time in the last six months. Is that how it's supposed to work? I don't think so. Michealt (talk) 23:34, 16 August 2011 (UTC)

I have to say that this part is very much opinion rather than hard fact, and in my opinion some of the conclusions are even wrong. I recommend that the chapter be deleted because it is not up to Wikipedia standards. — Preceding unsigned comment added by 194.218.202.38 (talk) 07:44, 1 September 2011 (UTC)

This section doesn't belong on WP. It should be on its author's blog where a conversation on the merits of the arguments can take place in the comments. Nono1234 (talk) 14:50, 14 September 2011 (UTC)

To me, it feels like this section provides interesting reading, and I think it would probably be a mistake to throw it out wholesale. As this section states, it "serves to illustrate the dynamics of proper RAID deployment"; it is providing some practical context with which the reader might develop a more intuitive understanding of the RAID levels. Mfwitten (talk) 21:55, 14 September 2011 (UTC)

The section contains a highly questionable claim that the write back cache of the controller "near always in enterprise environment" will not hold the CPU till data are written to the disk. This is only safe if used together with battery backup or similar expensive add-ons like [4], usually sold separately from controllers and I do not think they are very widespread. Without battery, I think transaction must be written to the disk before continuing. Audriusa (talk) 10:37, 20 September 2011 (UTC)

Right or wrong, this is not Wikipedia ready. It's an opinion, and the only way anything like this should be included is if someone rewrites it to talk about general RAID issues with relational databases. 208.67.168.72 (talk) 16:41, 24 October 2011 (UTC)

Read and Write Benefit calculations.

In the table describing read and write benefit, among other items, what do the variables n and X represent? There doesn't seem to be any definition of them.

If they represent number of physical drives in the RAID set and the number of operations per second that each drive can process respectively then some of the claims are wrong.

For RAID 1, the write benefit only can be 1*X for drives that are rotationally synchronized. For non-synchronized drives, the normal case as of 2011, the write benefit will be slightly less than 1*X because the write to the RAID can not complete until all of the individual drives complete, and thus has to wait for the most rotationally distant drive to update. The degree of this additional overhead will increase with the number of disks in the RAID, tending toward an additional 1/2 revolution time of latency for large numbers of disks.
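
The "additional 1/2 revolution" limit above can be checked with a rough sketch. It assumes that each drive's rotational delay is independent and uniformly distributed over one revolution, which is an idealisation not stated in the comment. Under that assumption the expected wait for the slowest of n drives is n/(n+1) of a revolution, against 1/2 for a single drive. The function name is hypothetical.

```python
from fractions import Fraction


def expected_extra_latency(n_drives: int) -> Fraction:
    """Expected extra rotational wait, in revolutions, for an n-way unsynchronized mirror.

    Assumes independent delays, uniform on [0, 1) revolution, so the expected
    maximum of n delays is n / (n + 1); a single drive averages 1/2.
    """
    if n_drives < 1:
        raise ValueError("need at least one drive")
    expected_max = Fraction(n_drives, n_drives + 1)
    return expected_max - Fraction(1, 2)


if __name__ == "__main__":
    for n in (1, 2, 4, 16, 99):
        print(n, expected_extra_latency(n))
```

The extra wait is zero for one drive, 1/6 of a revolution for a two-way mirror, and approaches 1/2 of a revolution as the number of drives grows.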

The RAID 4 claim of write benefit of (n-1)*X actually contradicts the paragraph directly above it! That paragraph is correct as far as it goes. In RAID 4, for non-full stripe writes which are the typical case, all writes to the RAID must read and then write the *single* parity drive. Thus the rate at which writes can be done to the RAID is 1/2 the rate which they can be done to a single non RAID drive! Truly a bottleneck, as the paragraph above the table suggests.

Similarly for RAID 5, you need to do 4 operations to complete the RAID write, but parity is distributed and so no one disk is a bottleneck. Thus the write benefit is n*X/4. RAID 6 needs 6 operations for a similar write and so the write benefit is n*X/6. The read benefit for both is n*X, not (n-1)*X nor (n-2)*X as the table claims. Again this is for the typical non-full stripe write. — Preceding unsigned comment added by 75.69.175.8 (talk) 01:17, 18 October 2011 (UTC)

Silly, but easy to understand

I think we should include this. I've sent the image to sales staff, and they've instantly understood it. Yes, it's humorous, but it's a really good example. http://img264.imageshack.us/img264/659/raidbottlesqu4.jpg — Preceding unsigned comment added by 86.176.79.51 (talk) 12:19, 9 July 2011 (UTC)

Nice pic (!) but better to include a link to it at the end, for what it is as humour, rather than actually include it in the article. AndyB (talk) 19:21, 25 July 2011 (UTC)

I hear that they're particularly susceptible to ESD (RAIDs Lost in the Arc). Pedantrician (talk) 16:03, 16 September 2011 (UTC)

RAID, redundant array of independent disks (also called redundant array of inexpensive disks), is a method of logically combining two or more physical disks into one (logical) storage unit. Its advantages are faster hard disk performance and redundancy (automatic protection against data loss in case of hard disk failure). This array of drives appears as a single drive to the computer operating system. — Preceding unsigned comment added by Pcslc (talk • contribs) 12:34, 7 November 2011 (UTC)

Alternate Names (Inexpensive makes sense; Independent is redundant)

A comment inside the definition said to check the Talk page before switching the definition given with the alternate given. I looked, didn't see anything here, switched the two names and commented I would add this section.

As stated, "Inexpensive" is what the "I" stands for, although the article text implied it has changed. I saw a reference to the name either in a Shon Harris text or in the (ISC)2 CISSP exam text book. Which used this, I don't recall but do recall that it was used in only one of them. It happens to be the only reference to "Independent" I've ever seen. It doesn't make sense, so I changed it.

Disks are, by definition, independent. It's possible to configure software RAID to treat different partitions on the same disk as different disks. (At least under Linux; Windows refers to a partition as a disk -- albeit a Logical disk). But setting up RAID using different partitions circumvents the intended purpose of redundancy. Some folks used RAID to make stored data more difficult to read, which isn't RAID's purpose and isn't especially effective. The fact that Windows looks at a partition as a logical disk does provide an explanation for the use of "independent", but it doesn't seem to make sense otherwise.

In this century disks are remarkably cheap. This may be reason for some to believe "inexpensive" doesn't make sense. If the price of storage, at the time RAID was first used is considered, "inexpensive" does make sense.

Drives with higher MTBF ratings cost more than those with a lower MTBF rating, which is why RAID was developed. It allowed cheaper, less reliable disks to be used in a redundant configuration. At first, this made backups less of a hassle via mirroring. How the redundancy is used today still depends on the desired result.

Hopefully this is a suitable explanation for why I made any change, and for why I made the particular change.

Kernel.package (talk) 23:13, 3 November 2011 (UTC)

Why should the cost of disks determine the name of RAID, or even enter into it? RAID was developed as a Redundant Array of Independent Disks. — Preceding unsigned comment added by 83.104.192.21 (talk) 14:09, 14 November 2011 (UTC)

Regardless of what makes sense, a strong citation has been provided for the current language:

RAID, an acronym for Redundant Array of Independent Disks (originally Redundant Array of Inexpensive Disks),^[1] is a storage technology...

Mfwitten (talk) 15:37, 14 November 2011 (UTC)

It is possible this should not be considered a strong citation. Of the four citations listed at foldoc.org, the two web references are broken links and the 2 published works use the term "inexpensive." Further, other citations in the Wikipedia article support reading RAID as "Redundant Arrays of Inexpensive Disks." Can we be sure that foldoc.org is not simply quoting Wikipedia?

173.160.232.106 (talk) 17:14, 23 November 2011 (UTC)

That's a very good point. However, foldoc.org lists "1995-07-20" as what I assume is the date at which the entry was last updated. That's long before Wikipedia. Mfwitten (talk) 19:27, 23 November 2011 (UTC)

The original definition of RAID is given in the paper that defined it, by Patterson, Gibson and Katz. You can find it here: http://www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf The 'I' stood for inexpensive since at the time 5.25 inch Winchester disks were considered unreliable compared to their big iron counterparts from IBM and others, which the authors called single large expensive disks (SLEDs) and which were 10X as capacious. Later, when disks were commoditized and RAID went mainstream, the industry shied away from the term inexpensive as RAID arrays were made robust for enterprise class computing. It would be challenging to keep a straight face while using the word inexpensive in a conversation about current day RAID arrays like an EMC VMAX or HDS USP. — Preceding unsigned comment added by 75.69.175.8 (talk) 08:59, 29 November 2011 (UTC)

Hi, I'm an IT professional and I came across this and decided that the matter needs clearing up. RAID used to stand for Redundant Array of Inexpensive Disks; however (not sure when), the I was changed to Independent, as manufacturers and suppliers don't want to call the disks inexpensive. Yes, disks by definition are independent; this is why we use the word independent, since it is an ARRAY of these independent disks. I will be adding more sources (MCSE course material) to this article. Sincerely, He's Gone Mental 09:10, 29 November 2011 (UTC)

I have cited links to IEEE.org as well as NetApp, EMC, and Western Digital. Currently storage vendors are reporting Independent. You can see between 2002 and 2009, the IEEE changed their terminology. 64.129.112.20 (talk) 18:44, 6 December 2011 (UTC)

[1] Howe, Denis (ed.), "Redundant Arrays of Independent Disks", Free On-line Dictionary of Computing (FOLDOC), Imperial College Department of Computing, retrieved 2011-11-10
