Talk:File Allocation Table/Archive 3


Root directory entries limit

Do you have a cite for the fact that the limit can be higher than 512, and do you know if there is a higher hard limit? Plugwash 17:53, 14 October 2006 (UTC)

Provided cite for 512 limit and reverted paragraph to previous version. If someone has a cite proving that the limit can exceed 512, then please cite it and restore Tempel's version. AlistairMcMillan 18:16, 14 October 2006 (UTC)
Well, the citation is bad because it is a support document for users, not a specification for programmers.
See the MBR's spec: It does not specify a fixed limit explicitly; instead, the number of entries is specified as a variable value in the MBR. That means that any properly written software should be able to handle larger root dirs just as well - unless someone can actually quote a spec from Microsoft saying that the max allowed value in this field is 512. I cannot find such. The only reasonable hard limit would be 32767 entries, which is the limit for a signed 16-bit value as provided by the 2-byte field at offset 0x11 in the MBR.
Actually, see the MS article ref'd from our article, which is a specification for the FAT format: http://support.microsoft.com/kb/140418/ -- it says: "Root Entries: This is the total number of file name entries that can be stored in the root directory of the volume. On a typical hard drive, the value of this field is 512" - it does not say that it can't be higher than 512, though. Which tells me as a programmer that it may be higher, of course!
So, see? The limit of 512 is rather arbitrary, as I wrote this morning. Convinced? --Tempel 19:42, 14 October 2006 (UTC)
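For illustration only (this sketch is mine, not from the discussion): the field being debated can be read directly out of a boot-sector image. The function name is hypothetical; the offset 0x11 and the 16-bit little-endian encoding are as quoted above.

```python
import struct

def root_entry_count(boot_sector: bytes) -> int:
    """Return the root-entry-count field: the 16-bit little-endian
    value at offset 0x11 of a FAT boot sector."""
    (count,) = struct.unpack_from("<H", boot_sector, 0x11)
    return count

# A volume formatted by MS-DOS FORMAT would typically carry 512 here,
# but the field itself can hold any 16-bit value.
```

Reading the field says nothing about what values drivers actually tolerate, which is the crux of the dispute above.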
Nope. The document I linked may be aimed at users but I don't think it is wrong when it says "All hard disk drives use 32 sectors of 512 bytes each to store the root directory. This limits the root directory on a hard disk drive to 16K..." and "Each directory entry uses 32 bytes..." 16K/32 = 512.
If you have an actual source that says otherwise, and by that I don't mean "if you take source A and source B, munge them together, you get conclusion C" I mean a source that explicitly says "the root directory of the X implementation of the FAT16 filesystem had more than 512 entries". Do you have anything that says that?
You know, given our policies of Wikipedia:No original research and Wikipedia:Verifiability AlistairMcMillan 20:11, 14 October 2006 (UTC)
Using the "mkdosfs -r 1000 -C disk0 256" command on Linux I just created a diskette image with 1000 entries in the root directory, copied it to Windows XP, mounted it using the filedisk utility, and then could create exactly 1000 files in the root directory. That is original research. But the article could just say that "Third-party formatting tools like mkdosfs allow setting the root directory to any size", given that the man page for mkdosfs (a "primary source") says so. I don't know why the Microsoft doc Alistair referenced insists on 512 entries, or why format.exe does not allow specifying the root directory size. Maybe this is related to compatibility with older MS OSes, and maybe to avoiding having the user puzzled by too many choices. Adam Mirowski 19:55, 15 October 2006 (UTC)
Update: Let's not rejoice too soon, fans of the "FAT is totally flexible" statement. XP does not properly handle my 1000-root-entries disk at all. Files disappear after unmount, chkdsk does not help, strangely named files appear in the FOUND.000 directory after chkdsk as if one of the files was interpreted as a directory, etc. Maybe mkdosfs messed things up, or maybe Windows indeed does not support non-standard configs. Adam Mirowski 20:59, 15 October 2006 (UTC)
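One hedged observation on the experiment above (my own arithmetic, not a verified explanation of the corruption): 1000 entries do not fill a whole number of 512-byte sectors, which is at least a plausible source of trouble for software that assumes a sector-aligned root directory.

```python
ENTRY_SIZE = 32      # bytes per 8.3 directory entry
SECTOR_SIZE = 512    # bytes per sector on a typical disk

def root_dir_sectors(entries: int) -> float:
    """Sectors occupied by a root directory of the given size."""
    return entries * ENTRY_SIZE / SECTOR_SIZE

# 512 entries fill exactly 32 sectors; 1000 entries need 62.5 sectors,
# so the root directory would not end on a sector boundary.
```

A formatter that rounds the count up to a sector multiple (e.g. 1008 entries) would avoid that particular mismatch; whether mkdosfs does so here is not established by the experiment.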
I looked long and hard for a reference because I was sure I had reformatted several FAT volumes with a parameter specifying a larger-than-normal number of root directory entries. Alas, the MS-DOS format command has no such parameter, nor does Central Point's PC Tools PCFORMAT. Did I alter format.com? Maybe, but I don't remember. The traditional handling was to avoid putting files in the root directory. I recall some extremists who created a subdirectory named ROOT where they put everything.
The root entries parameter is in the BIOS parameter block (BPB) in the volume boot record (VBR)—not the MBR. Each volume on an MBR-partitioned drive could have a different cluster size and a different root directory size. The BPB is documented here and clearly shows an independent parameter. I feel there are ways to affect the value, but I don't know of them—today. — EncMstr 21:45, 15 October 2006 (UTC)
EncMstr - you wrote "I feel there are ways to affect the value, but I don't know of them". For a programmer such as me this is a nonsense remark: I write software that generates volume structures on disks, and as such I can create the means to set the root dir size to any value I want. And I've done that myself at times. That's how I came to claim that it's possible in the first place. Just because there is no tool available to most people to alter the value, that's not good enough to say it's not possible (but I know you agree with this - I am rather responding to the original criticism of my statement). -- Tempel 11:35, 20 November 2006 (UTC)
I tinkered with the VDISK.SYS source code years ago. Much of the confusion over what the various FAT-related technologies support is due to the fact that you have at least five different variables at work:
  • the limit as defined by the field sizes. Some field sizes have changed or even been replaced by different FAT versions
  • the limit as modified by the data type (signed vs unsigned fields); this sometimes changes for third-party variants.
  • the limit as provided by the support tools (such as the FORMAT utility); this is known to change with different DOS and Windows versions.
  • the limit as provided by the device driver itself.
  • the limits of various DOS/Windows functions.
In some cases, you also have multiple fields describing the size of the same data structure, sometimes differing by orders of magnitude. For example, both the MBR and the volume boot record contain fields specifying the size of a partition.
I think you'll find that, out of necessity, the block device drivers in Windows XP, 2003, and Vista will support virtually any configuration physically supported by the data structures.
Many years ago, before FAT32 and GPT and maybe before LFNs, I sat down to figure out the hard limits on each structure, and the sources of them; I finally gave up on it, since there were so many variables. I'd probably find it much easier now, by using a spreadsheet to propagate and recalculate the limits at each step.
I'd recommend modifying the infobox to note that the limits listed are the standard limits for modern Windows OSes. For example, subdirectories were not supported before DOS 2. --Scott McNay 23:27, 12 November 2006 (UTC)
When I first altered the article to state that there's no one globally fixed limit of root dir entries, I did this in response to a statement that claimed that it was fixed - that statement was plain wrong because, as now others have shown, it's not only possible from the spec but even doable with existing tools. OTOH, there's also the possibility that some software can't properly handle more than 512 root dir entries, but that's more a limitation of the particular software, not something that invalidates my claim that more than 512 are possible.
Here's a proposal:
Old text
The FAT16 format limits the number of entries in the root directory to 512 (entries being file and/or folder names in the old 8.3 format).[2] Use of long file names reduces this further. Even today, this limitation still applies to certain MP3 players that require the use of the FAT16 file system format.
New text
The FAT16 format limits the number of entries in the root directory to a fixed number, defined at time of formatting. This number is usually 512 entries (entries being file and/or folder names in the old 8.3 format), but larger numbers are possible as well, depending on the formatting software. Use of long file names reduces the number of possible items in the root directory further. Even today, this limitation still applies to certain MP3 players that require the use of the FAT16 file system format. Care should be taken when considering formatting a FAT16 disk with space for more than 512 root dir entries, as not every driver can cope properly with this unusual change - it may even lead to permanent data loss.
Agreed? Then I'll change the article accordingly soon.
-- Tempel 11:35, 20 November 2006 (UTC)
Looks good, although I'd suggest not limiting it to MP3 players; use them as an example, instead. Actually, if an MP3 device has that limit, it probably won't work correctly with LESS than 512 entries either, so I'd say "...for other than 512...". --Scott McNay 13:02, 20 November 2006 (UTC)

Here is my proposal. Find a source that says that there are working FAT volumes with more than 512 entries and then edit the article appropriately. Please either try and work within Wikipedia's rules or if you really don't like them, then change them. Don't just ignore them. AlistairMcMillan 13:53, 20 November 2006 (UTC)

Alistair, haven't the comments from others above given you enough reasons that it's possible? How can you insist on asking for a quote of something that needs no explanation? The spec for the BPB clearly provides a 16-bit value field and does not explicitly limit it to 512. Just because MS-DOS has always used 512, because some number must be inserted into this field, does not mean that it can't have other values! Why is it so hard to accept that? Even other commenters here have agreed with this reasoning. And my proposal should clearly explain that: It says that the field can have any value, but that some (badly written) software may not be able to handle anything other than 512 there. Yet, I've written software myself that can deal with other sizes, and others have as well, as the comments above suggest. -- Tempel 09:03, 21 November 2006 (UTC)
Here's a source: mozillaquest.com/aboutcomputers/FATData1.html. It contains this statement:
FAT16 partitions have a maximum of 1,024 root directory entries (512 entries is normally used with newly formatted partitions).
So, I still think that Tempel's suggested paragraph is good.
I do not know why there is a 1024-entry limit, especially when there is a full two-byte field to record this, but that's probably irrelevant here. Anything other than 512 for non-floppy media is certainly non-standard. Do you want a source for that statement, or will you accept it as safe?
--Scott McNay 05:57, 21 November 2006 (UTC)
Yes, that 1024 limit is as much nonsense as any other. Where do these people get such ideas from?! It's obvious nonsense, unless they'd give an explanation, which they don't! -- Tempel 09:03, 21 November 2006 (UTC)

My problem with this quoting business is that no one guarantees that the quoted articles are right! The only reliable reference in a tech issue like this is its specification. And such a spec I have quoted above. It quite officially describes the BPB contents, comes from Microsoft's tech department (while Alistair's first quote comes from the user support department and is not a spec), and as I have argued before, that spec does not state a limit. And any programmer should surely agree that, from the spec I refer to, any value up to at least 32767 is conceivable. If other software can't deal with such big limits - well, that's just a bug in that software then. And that's what my proposal even deals with, so what's the problem? -- Tempel 09:03, 21 November 2006 (UTC)

Alistair, what do you want us to prove here, really? You surely can't say that it's not possible to create a FAT16 volume with more than 512 root dir entries, or do you? Because logic tells us that this is possible - it needs no quote. Agreed?

So, all you can ask for is proof that some software can properly deal with such a volume, by using all available root dir entries, right? But what tells you that this is not possible, either? Where do you see a need to find a quote for this? Why can't simple intelligence tell us that it works if the software is written to handle it?

I do not have to prove to you by a citation that 4² is 16, right? Instead it follows from the knowledge that raising to the power of two means multiplying the argument by itself. And 4 multiplied by itself is 16. That's a logical deduction; it needs no individual quote to prove it. So, where's the difference here? I just don't get it.

-- Tempel 09:03, 21 November 2006 (UTC)

What do you want? Should I just copy and paste the entire contents of WP:VERIFY here? "...logical deduction..."??? How about the entire contents of WP:NOR while I'm at it?

"...I do not have to prove to you..." Actually you do. Have you even read WP:VERIFY? "512" is sourced. Unless you can come up with a credible source that proves otherwise, then that beats any "logical deduction" on your part. AlistairMcMillan 10:45, 21 November 2006 (UTC)

There is a cite, but it's citing a source that is dubious at best (an end-user help page written LONG after the format was devised, probably by a totally different person at MS). Plugwash 11:23, 21 November 2006 (UTC)

Great, love your edit. However could you cite a source to back up this statement: "However microsoft tools have always used a maximum of 512 entries with the actual value depending on the media type"? Oh and while you are at it, could you cite a source for this statement: "Some third party tools like mkdosfs allow the user to set this parameter but there are likely to be compatibility issues in use of nonstandard values."? AlistairMcMillan 12:40, 21 November 2006 (UTC)

We do have a spec, in a way: the long-obsolete Media Descriptor at offset 0x15 in the BPB. 0xF8 is the code for a fixed disk (basically anything other than a floppy). Single density media (180K) has 4 sectors, double density media (360K and 720K) has 7, high density media (1.2M and 1.44M) has 14, and hard drives up through at least 40 MB have 32. I would not say "maximum", since each type of media always had a specific number of root directory records. "Maximum" implies that anything goes, up to the limit.
My suggestion would be to simplify to something like "For historical reasons, FAT12 and FAT16 media generally use 512 root directory entries on non-floppy media, and other sizes may be incompatible with some software or devices."
Ref: "The New Peter Norton Programmer's Guide to the IBM PC & PS/2", by Peter Norton and Richard Wilton, Microsoft Press, 1985.
Media descriptor values, page 104, figure 5-3. Offset of media descriptor (0x15) and offset and size of field with number of root dir entries (0x11, 1 word), page 111, figure 5-9. Number of entries for various media, page 111, figure 5-10.
--Scott McNay 06:21, 22 November 2006 (UTC)
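The per-media figures Scott quotes can be summarized in a small table (values exactly as given above; the entry counts follow because each 512-byte sector holds sixteen 32-byte entries; the dictionary keys are my own labels):

```python
# Root-directory sectors by media type, per the Norton/Wilton figures
# quoted above.
ROOT_DIR_SECTORS = {
    "single density (180K)": 4,          # 64 entries
    "double density (360K/720K)": 7,     # 112 entries
    "high density (1.2M/1.44M)": 14,     # 224 entries
    "fixed disk (media byte 0xF8)": 32,  # 512 entries
}

ENTRIES_PER_SECTOR = 512 // 32  # sixteen 32-byte entries per 512-byte sector

def root_entries(media: str) -> int:
    """Root-directory entry count implied by the sector count."""
    return ROOT_DIR_SECTORS[media] * ENTRIES_PER_SECTOR
```

This supports Scott's point: each media type historically had one specific root-directory size, rather than a "maximum" of 512.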

2 TiB limit

The article states that a FAT32 volume should "support a total of approximately 268,435,438 (< 2^28) clusters, allowing for drive sizes in the range of 2 terabytes". The same limit is stated in the summary window. However, this site states that with a cluster size of 32 KB, you can reach a size of 8 TiB. I'm no expert on clusters and how they work, so I didn't dare to change the article.

I changed it to 8TiB, but some time after doing so I noticed that http://www.ridgecrop.demon.co.uk/index.htm?fat32format.htm says "FAT32 itself should be OK to 2TB, limited by a 32 bit sector count in the boot sector". This needs investigating in more detail. Plugwash 23:39, 8 November 2006 (UTC)
I think that the 8TiB limit is incorrect for released versions of FAT32. If you consider the volume mount process, the volume size in sectors is determined by a 32-bit field in the boot sector. This limits volumes to 2TiB (2^32 sectors × 512 bytes) for 512-byte-sectored devices like hard disks. Raymond Chen agrees. Incidentally, this limit could be removed quite easily by adding a 32-bit "volume size in clusters" field to the FSInfo sector - then you could use 8TiB volumes. Maybe it was meant to be like this originally, leading to the 8TiB limit in the MSKB article, but that change was never implemented. But the fact is the released version of the FAT32 specification specifies volume size as a 32-bit sector count, and is thus limited to 2TiB on hard disks. I edited the article, but I can't edit the info box.
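The two competing limits in this thread can be checked arithmetically (a sketch of the reasoning above, assuming 512-byte sectors, 28-bit cluster numbers, and 32 KiB clusters as discussed):

```python
TiB = 2**40

# 32-bit total-sector count in the boot sector, 512-byte sectors:
sector_count_limit = 2**32 * 512
assert sector_count_limit == 2 * TiB   # the 2 TiB limit

# 28-bit cluster numbers with the largest standard cluster size:
cluster_limit = 2**28 * 32 * 1024
assert cluster_limit == 8 * TiB        # the oft-quoted 8 TiB figure
```

On 512-byte-sector disks the smaller of the two figures, 2 TiB, is the binding one, which matches the edit described above; the 8 TiB figure only reflects the cluster-addressing limit.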

Undoing edits without a citation - not this way!

This is about the recent addition of the "Fragmentation" topic. I read it, understood it, and agreed with it technically and generally. As someone who has a lot of experience writing software for many file systems (Commodore VC-1541, Apple DOS 3.3, CP/M, DOS FAT, HFS, ISO 9660, Joliet, UDF, and others) over 20 years, I think I have a say in this.

What happened is that after someone added a long section, someone else removed it again, referring to WP:VERIFY.

I do not agree with the practice of someone simply removing such additions without a discussion here, merely by pointing out missing citations. This happened to an addition of mine before as well, so this is not the first time. And that's why I bring it up now.

Alistair - if you do not agree with technical details others have added, then please first discuss them here before you remove them. Just because no citation can be found does not mean it's wrong. Citations are only needed when something is not common knowledge or not deducible. The technical details in these kinds of articles, though, need no citations when they can be deduced from logic, common sense or experience.

If you need it verified, then please ask instead of just invalidating the addition by removing it.

What do others think? If you agree, please speak up in order to tell Alistair not to do this again this way. Or tell me if I stepped out of line.

-- Tempel 14:29, 16 November 2006 (UTC)

WP:VERIFY says that the author has the responsibility to source any changes. What I've seen in other articles is that text that other editors agree with is left alone, text that looks plausible but other editors would rather see a source to confirm their feeling is left in with either a cite tag or just a question on the talk page, text which an editor has doubts about is MOVED to the talk page until it can be sourced, and "patent nonsense" simply gets deleted outright. As you can see, WP:AGF is applied.
If an author claims to be or seems to be a subject matter expert, or at least reasonably knowledgeable, and evidence seems to back this up, my tendency is to AGF as long as stuff seems to be consistent. --Scott McNay 02:58, 17 November 2006 (UTC)
...after someone added a long section...
The comment by AlistairMcMillan from here was moved down to a new section called Fragmentation section discussion.
I'm sorry if I appear to have acted hastily in removing Adam's edit, but I do spend a lot of time tidying up other people's edits, searching for sources and adding them when other people can't be bothered (even though the words "Encyclopedic content must be verifiable." are right there between the main edit box and the edit summary box). Sometimes it is just easier to remove the whole thing and hope they'll try again with sources. Sorry for any offence caused. AlistairMcMillan 17:12, 18 November 2006 (UTC)
The comment by Scott McNay from here was moved down to a new section called Fragmentation section discussion.

Now we've reached the point where we actually discuss the contents of the added chapter. That's all I asked for. I also accept Alistair's apology for a -perhaps- too hasty reaction. Thanks. I will take the liberty and split up this section into two - one about my original topic and one about the content discussion, I hope that's OK with you both, Alistair and Scott. -- Tempel 10:33, 20 November 2006 (UTC)

Fragmentation section discussion

This is about the Fragmentation section recently added by "Adam". Some editors have raised concerns about the text.

Adam described his addition as a "discussion" which is not appropriate for an encyclopedia. Unless of course he was just mentioning a discussion which had taken place elsewhere, which without cited sources we have no way of telling.
His addition suggested ways to fix FAT, which is not appropriate for an encyclopaedia. Unless of course he was mentioning things that other people had suggested as fixes, however without cited sources again we have no way of telling.
A quick glance at his Talk page and Contribution list shows that the last time someone asked for sources to back up a contribution to this article, they were met with silence.[1]
Another thing that bugged me about the discussion is that it isn't clear which version of FAT we are talking about. Are all of them affected in the same way? Is it just some of them? Were any changes made between versions to attempt to lessen the problem?
AlistairMcMillan 17:12, 18 November 2006 (UTC):
Ok, I went and found the edit in question, and pasted it below with my comments.
Fragmentation
Because of its simplicity, the FAT filesystem does not contain mechanisms which prevent newly written files from becoming scattered across the partition. Scattering implies latency-causing disk head movements during subsequent accesses.
I'd reword this slightly, but otherwise ok.
Such mechanisms could have been the presence of a bitmap indicating used and available clusters, which could then be quickly looked up in order to find free contiguous areas (improved in exFAT). A similar mechanism could be the linkage of all free clusters into one or more lists (as is done in Unix filesystems). Instead, the FAT has to be scanned like an array in order to find free clusters, and nowadays it generally cannot fit in memory.
I'd either delete or rewrite from the viewpoint of other filesystems.
In fact, computing free disk space on FAT is one of the most expensive operations, as it requires reading the entire FAT linearly. A justification Microsoft offered for limiting the maximum size of FAT32 partitions created on Windows was the time required to perform a simple "DIR" operation, which always displays the free disk space as the last line. Displaying this line took longer and longer as the number of clusters increased.
I seem to recall that FAT32 has a free-space field to hold this. I've noticed the slowness in displaying the free space on older FATs, but this is more appropriate elsewhere in the article, not in a fragmentation section.
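The linear-scan cost described in the quoted paragraph can be sketched as follows (an illustration of the principle, not actual driver code; I'm assuming the convention that a zero FAT entry marks a free cluster):

```python
def count_free_clusters(fat: list[int]) -> int:
    """Count free clusters by scanning the whole FAT -- an O(n) pass
    that a free-cluster bitmap or free list would avoid."""
    # Entries 0 and 1 are reserved; a value of 0 marks a free cluster.
    return sum(1 for entry in fat[2:] if entry == 0)

# Example: a tiny FAT with two free clusters (entries 2 and 5).
example_fat = [0xFF8, 0xFFF, 0, 3, 0xFFF, 0]
```

On a large FAT32 volume this table can run to hundreds of megabytes, which is why computing free space this way was slow; as Scott notes, FAT32's FSInfo free-space field exists precisely to cache the result.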
Another mechanism could have been the division of the disk space into bands where multiples files opened for simultaneous write could be expanded separately.
Delete
Some of the perceived problems with fragmentation resulted from operating system and hardware limitations.
Delete
The single-tasking DOS and the traditionally single-tasking PC hard disk architecture (only 1 outstanding input/output request at a time, no DMA transfers) did not contain mechanisms which could alleviate fragmentation by asynchronously prefetching next data while the application was processing the previous chunks.
Delete
Similarly, write-behind caching was often not enabled by default with Microsoft software (if present) given the problem of data loss in case of a crash, made easier by the lack of hardware protection between applications and the system.
Delete
MS-DOS also did not offer a system call which would allow applications to make sure a particular file has been completely written to disk in the presence of deferred writes. Disk caches on MS-DOS were operating on disk block level and were not aware of higher-level structures of the filesystem. In this situation, cheating with regard to the real progress of a disk operation was most dangerous.
I seem to recall this being added by MS-DOS at some point. In any event, doesn't seem relevant to fragmentation.
Modern operating systems have introduced these optimizations for FAT partitions, but optimizations can still produce unwanted artifacts in case of a system crash. A Windows NT system will allocate space to files on FAT in advance, selecting large contiguous areas, but in case of a crash, files which were being appended will appear larger than they were ever written to, with dozens of random kilobytes at the end.
Edit
It should be noted that with the large cluster sizes, 16 or 32K, forced by larger FAT16 partitions, external fragmentation does not play a great role, and internal fragmentation, i.e. disk space waste, starts to be a problem instead.
Does the article define internal and external fragmentation?
Would it be more appropriate to have a single sentence or para, and point to a Fragmentation article?
--Scott McNay 05:50, 19 November 2006 (UTC)
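The internal-fragmentation point in the last quoted paragraph is easy to quantify (my own sketch: each file occupies whole clusters, so the unused tail of its final cluster is wasted, roughly half a cluster per file on average):

```python
def internal_waste(file_sizes, cluster_size=32 * 1024):
    """Bytes lost to internal fragmentation: each file is stored in
    whole clusters, so the tail of its last cluster is wasted."""
    return sum((-size) % cluster_size for size in file_sizes)

# A 1-byte file on a 32 KiB-cluster volume wastes 32767 bytes;
# a file of exactly one cluster wastes nothing.
```

With thousands of small files, this waste dwarfs any seek-time cost of external fragmentation, which is the trade-off the quoted sentence is pointing at.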
Followup: I see that Adam restored his section, and added some cites. The text appears to be the same, so my comments still stand. I don't have a problem with cites, as it all pretty much fits what I know; my concern is, as Alistair mentions, unencyclopedic tone. Even if every sentence were cited as being someone's quote, it still wouldn't fit.
Instead of deleting, some rewriting could be done, comparing the FAT way of doing things with other filesystems. I agree that describing how to fix it is not appropriate here, but the same discussion, written as a comparison (so that people can understand how significant the limitations are) would probably be appropriate. --Scott McNay 06:02, 19 November 2006 (UTC)
I agree with some of both Alistair's and Scott's concerns, not with others. In summary, I'd like to see the fragmentation issue "discussed" in some way to answer the question: Do FAT file systems avoid fragmentation, and how, and why (...not)? This would be a deeper analysis of the introduction in the header of the article, where it's stated that FAT volumes have rather poor performance, partly due to quick fragmentation. Would you agree? I ask because I am not sure what Alistair meant: Does he only disapprove of the form, or of the content in general?
Provided you all agree that in general such a section is worthy of being in this article, I'll comment on the contents and form now:
I would not delete as much as Scott would, but his questions about some details are valid. Some of the more vague statements rather belong in a "Fragmentation" article than here.
Perhaps we should see that we rewrite the entire section to be much more brief and just state the relevant facts such as:
  • The inherent structure of a FAT volume requires relatively (compared to other FSs) large amounts of memory and CPU power to avoid fragmentation (examples of better uses such as in HFS and other FSs should be mentioned)
  • Perhaps we can give an example of the memory consumption for a modern "full" FAT hard disk
  • I believe that this is generally relevant to FAT16 as well as to FAT32 (to answer one of Scott's questions)
  • The parts in Adam's article that mention how operating system and their drivers (write-behind cache, multi-threaded FS, a missing file pre-allocation API function, etc.) could better deal with the problem could be mentioned in a much briefer form, if at all (in my opinion, on a modern PC with lots of RAM the whole thing is of much less importance as it used to be a few years ago). After all, this is not even a FAT-specific issue.
  • The part with "Windows NT system will allocate space to files on FAT in advance," should be kept, as it explains some shortcomings many users might see (that's one reason why one wants to run scandisk after a crash). However, this is not unique to FAT, either, and should perhaps instead be mentioned in a Fragmentation article.
Adam, are you still following this? Would you be willing to rewrite the article and move the more general parts (i.e. about OS behavior) to a separate article on fragmentation?
-- Tempel 11:17, 20 November 2006 (UTC)
You said "Do FAT file systems avoid fragmentation, and how, any why (...not)?". From my viewpoint, I'd have to say that I don't think it IS possible to avoid fragmentation, although you can minimize it. And, the FAT system is sufficiently flexible that any method used by another filesystem can probably be adapted to FAT (someone with more experience with other filesystems could probably comment on this more intelligently), since much, most, or all of it is just Al Gore-ithms (ha ha) in the OS anyway (deciding where to put x, and then allocating y amount of space). I gotta get to work; I can write later on the memory consumption (which should be in the article anyway, and might already be, in the FAT section) and so forth. Much of this discussion is OS-specific (implementation-specific). --Scott McNay 13:16, 20 November 2006 (UTC)
Sure, but if the specs don't say anything about avoiding fragmentation and none of the major implementations do much to avoid it, then imo it is reasonable to say that FAT as practically implemented is fragmentation-prone. Plugwash 13:36, 21 November 2006 (UTC)