Characteristics of the FAT file system. Description of the FAT16 file system

Before the advent of the Microsoft Windows NT operating system, personal computer users rarely had to choose a file system. All users of the MS-DOS and Microsoft Windows operating systems used one of the varieties of the file system called FAT (FAT-12, FAT-16 or FAT-32).

Now the situation has changed. When installing Microsoft Windows NT/2000/XP or formatting a disk, you need to choose among three file systems: FAT-16, FAT-32 or NTFS.

In this article, we will talk about the internal structure of the listed file systems, consider their inherent disadvantages and advantages. Armed with this knowledge, you will be able to make an informed choice in favor of a particular file system for Microsoft Windows.

Briefly about the FAT file system

The FAT file system appeared at the dawn of the development of personal computers and was originally intended for storing files on floppy disks.

Information is stored on disks and floppy disks in portions, in sectors of 512 bytes. The entire space of a floppy disk was divided into regions of a fixed length, called clusters. A cluster may contain one or more sectors.

Each file occupies one or more clusters, possibly non-contiguous. File names and other information about files, such as size and date of creation, are located in the initial area of the floppy disk dedicated to the root directory.

In addition to the root directory, other directories can be created in the FAT file system. Together with the root directory, they form a tree of directories containing information about files and directories. As for the location of file clusters on the disk, this information is stored in the initial area of the diskette, called the file allocation table (FAT).

For each cluster, the FAT table has its own individual cell, which stores information about how this cluster is used. Thus, the file allocation table is an array containing information about clusters. The size of this array is determined by the total number of clusters on the disk.

The directory stores the number of the first cluster allocated to a file or subdirectory. The numbers of the remaining clusters can be found using the FAT file allocation table.

When the FAT table format was developed, the task was to save space, because a floppy disk has a very small capacity (from 180 KB to 2.88 MB). Therefore, only 12 bits were allocated to store each cluster number. As a result, the FAT table was packed so tightly that on the smallest floppy disks it occupied only one sector.
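The tight packing of 12-bit cluster numbers mentioned above can be illustrated with a short sketch. It assumes the conventional FAT12 layout, in which two 12-bit entries share three bytes; the function names `fat12_pack` and `fat12_entry` are hypothetical and chosen only for this illustration.

```python
def fat12_pack(entries):
    """Pack 12-bit values into bytes, two entries per three bytes (assumed layout)."""
    out = bytearray()
    for i in range(0, len(entries), 2):
        a = entries[i]
        b = entries[i + 1] if i + 1 < len(entries) else 0
        # Low 8 bits of a; high 4 bits of a plus low 4 bits of b; high 8 bits of b.
        out += bytes([a & 0xFF, (a >> 8) | ((b & 0x0F) << 4), b >> 4])
    return bytes(out)


def fat12_entry(fat, cluster):
    """Return the 12-bit table entry for the given cluster number."""
    offset = cluster + cluster // 2              # each entry takes 1.5 bytes
    pair = fat[offset] | (fat[offset + 1] << 8)  # little-endian 16-bit read
    if cluster % 2 == 0:
        return pair & 0x0FFF                     # even entry: low 12 bits
    return pair >> 4                             # odd entry: high 12 bits
```

With this layout, a table describing 340 clusters fits into 510 bytes, that is, into a single 512-byte sector.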

The FAT table contains critical information about the location of directories and files. If the FAT table becomes corrupted as a result of a hardware or software failure or a malware attack, access to files and directories will be lost. Therefore, as a safety net, two copies of the FAT table are usually created on the disk.

Various versions of FAT

After the advent of large-capacity hard disks (in those days, disks of 10-20 MB in size were considered large), the number of clusters increased, and 12 bits were not enough to store their numbers. A new 16-bit file allocation table format was developed, where two bytes were allocated to store the number of one cluster. The old file system designed for floppy disks became known as FAT-12, and the new one became FAT-16.

The enlarged FAT-16 table no longer fits in one sector, but with large disk volumes this drawback did not play a significant role. As before, two copies of the FAT table were stored on the disk as a precaution.

However, when disk capacity began to be measured in hundreds of MB and even in gigabytes, the FAT-16 file system again became inefficient. For cluster numbers to fit into 16 bits when formatting large disks, the cluster size has to be increased to 16 KB or even more. This caused problems when a large number of small files had to be stored on the disk. Since file storage space is allocated in whole clusters, even a very small file takes up too much disk space.
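The relationship between volume size and cluster size can be sketched as follows. This is a simplified illustration, not the algorithm of any real formatting tool: it assumes 512-byte sectors, power-of-two cluster sizes, a limit of 65,525 clusters for a 16-bit table, and it ignores the space taken by the boot sector, the FAT copies and the root directory. The name `min_cluster_size` is hypothetical.

```python
SECTOR = 512                 # bytes per sector
MAX_FAT16_CLUSTERS = 65525   # assumed cluster limit for a 16-bit table


def min_cluster_size(volume_bytes, max_clusters=MAX_FAT16_CLUSTERS):
    """Smallest power-of-two cluster size that keeps the cluster count in range."""
    size = SECTOR
    while volume_bytes // size > max_clusters:
        size *= 2            # double the cluster until the numbers fit
    return size


MIB = 1024 * 1024
for megabytes in (100, 200, 1000, 2000):
    print(megabytes, "MB ->", min_cluster_size(megabytes * MIB) // 1024, "KB")
```

Under these assumptions a 1000 MB volume already needs 16 KB clusters, and a 2000 MB volume needs 32 KB clusters.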

As a result, another, apparently the last, attempt to improve the FAT file system was made: the cell size of the file allocation table was increased to 32 bits. This made it possible to format disks of hundreds of MB and several GB using a relatively small cluster size. The new file system became known as FAT-32.

Standard 8.3

Before the advent of Microsoft Windows 95, personal computer users were forced to use the very inconvenient "8.3 standard" for naming files, in which the file name could consist of up to 8 characters plus up to 3 extension characters. This limitation was imposed not only by the programming interface of the MS-DOS operating system, but also by the directory entry structure of the FAT file system.

After modifying the structure of directory entries, the limit on the number of characters in a file name was practically removed. The filename can now be up to 255 characters long, which is obviously sufficient in most cases. However, this modified FAT file system became incompatible with the MS-DOS operating system, as well as with the Microsoft Windows version 3.1 and 3.11 shell running in its environment.

You can read more about the formats of internal FAT structures in our article "Data Recovery in FAT Partitions" published on this site.

FAT file system limitations

When deciding whether to use the FAT file system to format a drive, you should be aware of its inherent limitations. These restrictions concern, first of all, the maximum size of a FAT drive, as well as the maximum size of a file located on this drive.

The maximum size of a FAT-16 logical drive is 4 GB, which is very small by modern standards. Microsoft, however, does not recommend creating FAT-16 disks larger than 200 MB, because disk space on such disks is used very inefficiently.

Theoretically, the maximum size of a FAT-32 drive can be 8 TB, which should be enough to deploy any modern applications. This value is obtained by multiplying the maximum number of clusters (268,435,445) by the maximum cluster size allowed in FAT-32 (32 KB).
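The multiplication described above is easy to check. The sketch below uses only the two figures quoted in the text; the function name `fat32_max_volume_bytes` is hypothetical.

```python
MAX_CLUSTERS = 268_435_445      # maximum number of clusters quoted above
MAX_CLUSTER_SIZE = 32 * 1024    # 32 KB, in bytes


def fat32_max_volume_bytes():
    """Theoretical FAT-32 volume size: cluster count times cluster size."""
    return MAX_CLUSTERS * MAX_CLUSTER_SIZE


size = fat32_max_volume_bytes()
print(size, "bytes, about", round(size / 2**40, 3), "TB (binary)")
```

The result falls just short of 2^43 bytes, which is where the figure of 8 TB comes from.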

However, in practice the situation looks a little different.

Due to internal limitations, the ScanDisk utility in Microsoft Windows 95/98 is unable to work with disks larger than 127.53 GB. A year ago, such a limitation would not have caused problems, but today inexpensive 160 GB disks have already appeared on the market, and soon their capacity will be even larger.

As for the new Microsoft Windows 2000/XP operating systems, they are not able to create FAT-32 partitions larger than 32 GB. If you need partitions of this size or more, Microsoft will suggest that you use the NTFS file system.

Another significant limitation of FAT-32 concerns file size: a file cannot exceed 4 GB. This limitation matters, for example, when recording video clips to disk or when creating large database files.

A FAT-32 directory can store a maximum of 65534 files.

Disadvantages of FAT

In addition to the limitations discussed above, the FAT file system has other disadvantages. The most significant, apparently, is the complete absence of access control tools, as well as the possibility of losing information about the location of all files after the destruction of a fairly compact FAT table and its copy.

By booting the computer from a system floppy disk, an attacker can easily access any files stored on disks with the FAT file system. It will not be difficult for him to then copy these files to a ZIP device or some other external storage medium.

When using FAT on server disks, it is impossible to provide reliable and flexible differentiation of user access to directories. That is why, and also because of its low fault tolerance, FAT is not commonly used on servers.

The presence of compact FAT file allocation tables makes this file system a vulnerable target for computer viruses - it is enough to destroy the initial fragment of a FAT disk, and almost all data will be lost.

NTFS file system

The modern NTFS file system, developed by Microsoft for its Microsoft Windows NT operating system, is devoid of the limitations and disadvantages of FAT. Since its inception, the emerging NTFS file system has undergone several enhancements, the most recent of which (at the time of this writing) has been made in Microsoft Windows XP.

In the NTFS file system, all file attributes (name, size, location of file extents on disk, etc.) are stored in the hidden $MFT system file. From one to several KB of $MFT is allocated to store information about each file (and directory). With a large number of files stored on disk, the size of the $MFT file can reach tens or even hundreds of MB.

Small files (on the order of hundreds of bytes) are stored directly in $MFT, which significantly speeds up access to them.

Note, however, that the overhead of NTFS for storing system information, although it exceeds the overhead of FAT, is still not very large compared to the volume of modern disks. Due to the fact that the $MFT file is usually located closer to the middle of the disk, the destruction of the first tracks of an NTFS disk does not lead to such fatal consequences as the destruction of the initial areas of a FAT disk.

The NTFS file system has many features not found in FAT. They allow you to achieve much more flexibility, reliability and security compared to FAT.

Let's list some of the most interesting features of NTFS in modern versions.

Access control tools

NTFS access control tools are quite flexible and allow you to manage access at the level of individual files and directories, granting (or blocking) access to them for individual users or groups of users.

Although at first glance it may seem that access control tools are needed only for file servers, they will also be required if several users have access to the computer.

File encryption

The access control tools mentioned above will be useless if the physical NTFS disk falls into the hands of an attacker. Using modern utilities, the contents of such a disk can be easily read in any operating system environment - DOS, Microsoft Windows or Linux.

In order to protect user files from unauthorized access, Microsoft Windows 2000/XP operating systems provide additional encryption of files stored on NTFS partitions. And although the strength of such encryption may not be very high, it is quite sufficient in most cases.

Software RAID

Using NTFS, you can create a so-called software RAID 1 array (mirrored set). This array, composed of two physical or logical drives of the same size, allows you to duplicate (or, as they say, "mirror") files.

Such an array can save your files in the event of a physical failure of one of the disks that make up the array, so it is often used to increase the reliability of the disk system.

Volume Sets

The NTFS file system allows you to combine several partitions located on one or more physical disks into one logical volume. This may be necessary, for example, to store large database files that do not fit on one physical disk, or to create a directory with a total volume of files that exceeds the size of the physical disk.

Sets created from several partitions or physical disks are called Volume Set (in Microsoft Windows NT terminology) or Spanned Volume (in Windows 2000/XP terminology).

Packing files

To save disk space, you can use the ability of NTFS to pack (compress) files. In addition, NTFS allows you to create so-called sparse files that contain areas of null data. Such files can be large but take up little disk space because only the significant bytes of the file are actually stored.

Note that packing files will result in some slowdown. This circumstance, however, will not always matter. For example, office documents can be packed without a noticeable decrease in speed, but this cannot be said about database files accessed by a large number of users at the same time. With relatively inexpensive, high-capacity disks on the market, file packing should only be used when really needed. This, however, applies to other NTFS features as well.

Multi-stream files

If necessary, several streams of information can be stored in one file recorded on an NTFS disk. This makes it possible, in particular, to supply document files with additional information, store several versions of documents in one file (for example, in different languages), store program code and data in separate streams of one file, etc.

Hard links

Hard links allow you to assign several different names to one physical file by placing these names (i.e. links to the file) in different directories. Deleting a link does not delete the file itself. Only when all links to the file are removed is the file itself deleted.

Note that such features are typical for file systems used in Unix-like operating systems, for example, in Linux, FreeBSD, etc.
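The behaviour of hard links can be tried out with a few lines of Python. This is a minimal sketch that assumes the temporary directory resides on a file system with hard link support (NTFS or a typical Unix file system); the file names and the helper `hard_link_demo` are hypothetical.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, add a second name for it, then delete the first name."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "report.txt")
        alias = os.path.join(folder, "alias.txt")
        with open(original, "w") as f:
            f.write("data")
        os.link(original, alias)                      # second name, same file
        links_before = os.stat(original).st_nlink     # number of names now
        os.remove(original)                           # removes one name only
        with open(alias) as f:
            content = f.read()                        # data is still reachable
        return links_before, content


print(hard_link_demo())
```

The file's data remains readable through the second name after the first name has been removed.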

Reparse points

NTFS system objects called reparse points allow you to redirect access to any file or directory. For example, rarely used files or directories can actually be stored on magnetic tape and loaded to disk only when necessary.

Junction points

Using NTFS junction points, you can mount another hard drive or CD into a directory on a drive. This feature originally existed in the file systems of Unix-like operating systems.

Disk space quota

The NTFS file system used in Microsoft Windows 2000/XP allows you to set quotas, that is, to limit the disk space available to users. This feature is especially useful when creating file servers.

Change Logging

In the course of its work, the operating system performs various actions on files (creation, modification, deletion). All such changes are stored in a special journal created on the NTFS volume and can be used by backup programs, indexing systems, etc. Logging changes increases the reliability of the file system, allowing in some cases to continue working after non-critical failures of the operating system and hardware. Although, of course, most serious failures result in the need to recover data from backup or using special data recovery utilities.

NTFS limitations

Despite the abundance of features, the NTFS file system also has some limitations. However, in most cases they do not play a significant role.

The maximum size of an NTFS logical drive is approximately 18,446,744 TB, which is obviously enough for all modern applications, as well as applications that will appear in the near future. The maximum file size is even larger, so this limitation is also not significant.

There is no limit to the number of files stored in a single NTFS directory, so this also has an advantage over FAT.

Comparison of NTFS and FAT for file access speed

In terms of prospects, functionality, security, and reliability, NTFS is far ahead of FAT. However, comparing the performance of these file systems does not give an unambiguous result, since performance depends on many different factors.

Because FAT is much simpler in operation and internal structures than NTFS, FAT is likely to be faster when dealing with small directories. However, if the contents of the directory are so small that they fit entirely within one or more $MFT file entries, or vice versa, if the directory is very large, NTFS will "win".

The palm will most likely go to NTFS when searching for non-existent files or directories (because it does not require a complete scan of the contents of the directory), when accessing small files (on the order of hundreds of bytes), and also in case of severe disk fragmentation.

To increase the performance of NTFS, you can increase the cluster size, but this can lead to wasteful use of disk space when storing a large number of files whose sizes range from 1-2 KB to tens of KB. By increasing the cluster size to 64 KB, you can get the maximum performance improvement, but you will have to forgo packing files and using defragmentation utilities.

Packing files located on small disks (about 4 GB) may increase performance, while packing files on large disks may decrease it. In any case, packing will cause additional load on the CPU.

So what to choose - FAT or NTFS?

As you can see, NTFS has numerous advantages over FAT, and its limitations are negligible in most cases. If you are faced with choosing a file system, consider using NTFS first and FAT second.

What might be the barriers to replacing FAT with NTFS?

The most serious obstacle is the need to use Microsoft Windows NT/2000/XP. This OS requires at least 64 MB of random access memory and a processor with a clock frequency of at least 200-300 MHz to run properly. However, only very old computers, which are not capable of running modern versions of Microsoft Windows, fail to meet these requirements.

If your computer can run Microsoft Windows 2000/XP, and you do not have a single application designed exclusively for Microsoft Windows 95/98/ME, we recommend that you switch to the new operating system as soon as possible, replacing FAT with NTFS.

At the same time, you will also get a noticeable increase in reliability, because after installing all the necessary service packs, as well as the correct versions of peripheral device drivers, Microsoft Windows 2000/XP will work very stably.

In some cases, you have to combine several file systems within one physical disk. For example, if your computer has three operating systems Microsoft Windows ME, Microsoft Windows XP and Linux, you can create three file systems - FAT, NTFS and Ext2FS. The first of them will be "visible" when working in Microsoft Windows ME and Linux, the second - only in Microsoft Windows XP, and the third - only in Linux (note that in LINUX there is also the possibility of accessing NTFS partitions).

But if you are creating a server (file, database or Web) based on Microsoft Windows NT/2000/XP, then NTFS is the only reasonable choice. Only in this case will it be possible to achieve the necessary stability, reliability and security of the server.

There is also a generally accepted (and, in our opinion, erroneous) opinion that home computer users need neither the Microsoft Windows NT/2000/XP operating system nor the NTFS file system.

Of course, if the computer is used exclusively for gaming, then for compatibility reasons it is best to install Microsoft Windows 98/ME and format the drives in FAT. However, if you work not only in the office, but also at home, it is better to use modern, professional and reliable solutions. This will allow you, in particular, to protect your computer against intrusion via the Internet, restrict access to directories and files with critical data, and also increase the chances of successful information recovery in the event of various kinds of failures.

Introduction

2.1 FAT16 system

2.2 FAT32 system

2.3 Comparison of FAT16 and FAT32

3.1 NTFS system

3.2 Comparison of NTFS and FAT32

Conclusion

Bibliography

Introduction

Currently, several tens of thousands of files are recorded on one disk on average. How can one find a way through all this variety and address a particular file precisely? Solving this problem efficiently is the purpose of the file system.

From the user's point of view, the file system is the "space" in which files are placed. As a scientific term, it is a way of storing data and organizing access to it on a storage medium or its partition. The file system makes it possible to determine, from the name of a file, where the file is located. Since information on IBM PC-compatible computers is stored mainly on disks, the file systems used on them determine the organization of data on disks (more precisely, on logical disks). We will look at the FAT file system.

1. History of creation and general characteristics of the FAT file system

The FAT (File Allocation Table) file system was developed by Bill Gates and Marc McDonald in 1977 and was originally used in the 86-DOS operating system. In order to achieve portability of programs from the CP/M operating system to 86-DOS, it retained the earlier restrictions on file names. 86-DOS was later acquired by Microsoft and became the basis for MS-DOS 1.0, released in August 1981. FAT was designed to work with floppy disks smaller than 1 MB, and did not initially support hard disks. FAT currently supports files and partitions up to 2 GB in size.

FAT uses the following file naming conventions:

The name must start with a letter or number and can contain any ASCII character except for the space and the characters " / \ : ; | = , ^ * ?

The name is up to 8 characters long, followed by a dot and an optional extension up to 3 characters long.

The case of characters in file names is not distinguished and is not preserved.
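The naming conventions listed above can be expressed as a small check. This is a sketch of the rules exactly as stated in this text, not a reproduction of the checks performed by any particular DOS version; the names `is_valid_83` and `stored_name` are hypothetical.

```python
# Characters excluded by the rules above (the space is included).
FORBIDDEN = set('"/\\:;|=,^*? ')


def is_valid_83(name):
    """Check a file name against the 8.3 rules as listed in the text."""
    base, _, ext = name.partition(".")
    if not 1 <= len(base) <= 8 or len(ext) > 3 or "." in ext:
        return False
    if not base[0].isalnum():          # must start with a letter or number
        return False
    return all(c.isascii() and c.isprintable() and c not in FORBIDDEN
               for c in base + ext)


def stored_name(name):
    """Case is not preserved, so names are modelled here as upper case."""
    return name.upper()


print(is_valid_83("autoexec.bat"), stored_name("autoexec.bat"))
print(is_valid_83("my file.txt"), is_valid_83("longfilename.txt"))
```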

The structure of the FAT partition is shown in Table 1.1. The BIOS parameter block contains the information required by the BIOS about the physical characteristics of the hard drive. The FAT file system cannot control each sector separately, so it combines adjacent sectors into clusters. This reduces the total number of storage units that the file system has to keep track of. The cluster size in FAT is a power of two and is determined by the size of the volume when formatting the disk (Table 1.2). A cluster is the minimum space a file can occupy. This results in some disk space being wasted. The operating system includes various utilities (DoubleSpace, DriveSpace) designed to compact data on a disk.

Table 1.1 - FAT partition structure

Boot sector | BIOS Parameter Block (BPB) | FAT | FAT (copy) | Root Directory | File Area

FAT got its name from the file allocation table. The file allocation table stores information about the clusters of a logical disk. Each cluster has a separate entry in the FAT that shows whether it is free, occupied by file data, or marked as bad (damaged). If the cluster is occupied by a file, then the corresponding entry in the file allocation table holds the address of the cluster containing the next part of the file. Because of this, FAT is called a linked list file system. The original version of FAT, developed for DOS 1.00, used a 12-bit file allocation table and supported partitions up to 16 MB in size (a maximum of two FAT partitions can be created in DOS). To support hard drives larger than 32 MB, the FAT entry width was increased to 16 bits, and the cluster size to 64 sectors (32 KB). Since each cluster can be assigned a unique 16-bit number, the FAT supports a maximum of 2^16, or 65536, clusters per volume.

Table 1.2 - Cluster sizes

Partition size | Cluster size | FAT type
< 16 MB | 4 KB | FAT12
16 MB - 127 MB | 2 KB | FAT16
128 MB - 255 MB | 4 KB | FAT16
256 MB - 511 MB | 8 KB | FAT16
512 MB - 1023 MB | 16 KB | FAT16
1 GB - 2 GB | 32 KB | FAT16

Because the boot record is too small to store the system file search algorithm on disk, the system files must be in a specific location for the boot record to find them. The fixed position of system files at the beginning of the data area imposes a hard limit on the size of the root directory and the file allocation table. As a result, the total number of files and subdirectories under the root directory on a FAT drive is limited to 512.

Each file and subdirectory in the FAT has a 32-byte directory entry that contains the file name, its attributes (archive, hidden, system, and read-only), the date and time of creation (or of the last changes made to it), as well as other information (Table 1.3).

Table 1.3 - Catalog elements
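The 32-byte directory entry described above can be decoded with a short sketch. The field offsets and attribute bit values used here follow the commonly documented FAT12/FAT16 layout and are an assumption of this example rather than something taken from the text; the names `ENTRY`, `ATTRS` and `parse_entry` are hypothetical.

```python
import struct
from datetime import datetime

# Assumed layout: 8-byte name, 3-byte extension, attribute byte, 10 reserved
# bytes, time, date, first cluster number, file size (32 bytes in total).
ENTRY = struct.Struct("<8s3sB10xHHHI")
ATTRS = {0x01: "read-only", 0x02: "hidden", 0x04: "system", 0x20: "archive"}


def parse_entry(raw):
    """Decode one 32-byte directory entry into a dictionary."""
    name, ext, attr, time, date, first_cluster, size = ENTRY.unpack(raw)
    # Date: 7 bits year since 1980, 4 bits month, 5 bits day.
    # Time: 5 bits hours, 6 bits minutes, 5 bits seconds divided by two.
    modified = datetime(1980 + (date >> 9), (date >> 5) & 0x0F, date & 0x1F,
                        time >> 11, (time >> 5) & 0x3F, (time & 0x1F) * 2)
    return {
        "name": name.decode("ascii").rstrip(),
        "ext": ext.decode("ascii").rstrip(),
        "attributes": [label for bit, label in ATTRS.items() if attr & bit],
        "modified": modified,
        "first_cluster": first_cluster,
        "size": size,
    }
```

Note that the entry stores only the number of the first cluster; the remaining clusters of the file have to be found through the file allocation table.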

The FAT file system always fills free disk space sequentially from beginning to end. When creating a new file or expanding an existing one, it looks for the very first free cluster in the file allocation table. If in the course of work some files were deleted, while others changed in size, then the resulting empty clusters will be scattered across the disk. If the clusters containing the file's data are not in a row, then the file is fragmented. Heavily fragmented files significantly reduce work efficiency, since the read/write heads, when searching for the next file record, will have to move from one area of the disk to another. Operating systems that support FAT usually include special disk defragmentation utilities designed to improve the performance of file operations.

Another disadvantage of FAT is that its performance is highly dependent on the number of files stored in a single directory. With a large number of files (about a thousand), the operation of reading the list of files in the directory may take several minutes. This is due to the fact that in FAT a directory has a linear unordered structure, and the names of files in directories are in the order they were created. As a result, the more entries in the directory, the slower programs work, since when searching for a file, it is necessary to look through all the entries in the directory sequentially.

Since FAT was originally designed for the single-user DOS operating system, it does not provide for storing information such as owner information or file/directory access permissions. FAT is the most common file system, and most modern operating systems support it to one degree or another. Due to its versatility, FAT can be used on volumes that work with different operating systems.

Although there are no barriers to using any other file system when formatting floppy disks, most operating systems use FAT for compatibility. This can be partly explained by the fact that the simple FAT structure requires less space to store service data than other systems. The advantages of other file systems become noticeable only when they are used on media larger than 100 MB.

It should be noted that FAT is a simple file system that does not prevent file corruption due to abnormal computer shutdown. Operating systems that support FAT include special utilities that check the structure and correct inconsistencies in the file system.

2. Characteristics of the FAT16 and FAT32 file systems and their comparison

2.1 FAT16 system

The FAT16 file system is the main file system for the DOS, Windows 95/98/Me and Windows NT/2000/XP operating systems, and is also supported by most other systems. FAT16 is a simple file system designed for small drives and simple directory structures. The name comes from the name of the file organization method, the File Allocation Table. This table is placed at the beginning of the disk. The number 16 means that this file system is 16-bit: 16 bits are used to address clusters. The operating system uses the File Allocation Table to find a file and determine the clusters that the file occupies on the hard drive. In addition, the table contains information about free and defective clusters. To make the FAT16 file system easier to understand, imagine the table of contents of a book and how you work with it: that is exactly how the operating system works with FAT16.

To read a file, the operating system must find the folder entry by the file name and read the number of the first cluster of the file. The first cluster holds the beginning of the file. Then the operating system reads the FAT element corresponding to the first cluster of the file. If the element contains the mark of the last cluster in the chain, there is no need to look any further: the entire file fits in one cluster. If the cluster is not the last one, the table element contains the number of the next cluster, whose contents must be read after the first one. When the last cluster in the chain is reached and the file does not occupy the entire cluster, the extra bytes of the cluster are cut off according to the length of the file stored in the folder entry.
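The reading procedure described above can be sketched in a few lines. The model is deliberately simplified: the FAT is a list of integers, the data area is a dictionary that maps a cluster number to its bytes, and any value of 0xFFF8 or above is treated as the end-of-chain mark, which is the usual FAT16 convention and an assumption of this example. The name `read_file` is hypothetical.

```python
END_OF_CHAIN = 0xFFF8   # assumed FAT16 convention: values >= this end a chain


def read_file(fat, clusters, first_cluster, size):
    """Follow the cluster chain and return exactly `size` bytes of file data."""
    data = bytearray()
    cluster = first_cluster
    seen = set()                     # guards against a corrupted, looping chain
    while cluster not in seen:
        seen.add(cluster)
        data += clusters[cluster]    # read the contents of this cluster
        nxt = fat[cluster]           # the FAT element names the next cluster
        if nxt >= END_OF_CHAIN:
            break                    # last cluster in the chain
        cluster = nxt
    return bytes(data[:size])        # cut off the unused tail of the last cluster


# A 10-byte file stored in 4-byte clusters 2 -> 5 -> 3.
fat = [0] * 8
fat[2], fat[5], fat[3] = 5, 3, 0xFFFF
clusters = {2: b"ABCD", 5: b"EFGH", 3: b"IJ\x00\x00"}
print(read_file(fat, clusters, first_cluster=2, size=10))
```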

To write a file, the operating system must perform the following sequence of actions. A file description is created in a free folder element, then a free FAT element is searched for, and a link to it is placed in the folder entry. The first cluster described by the found FAT element is occupied. This FAT element is filled with the number of the next cluster or the sign of the last cluster in the chain.

The operating system tries to build chains from neighboring clusters in ascending order. Clearly, access to consecutive clusters is much faster than access to clusters randomly scattered across the disk. Clusters that are already occupied or marked in the FAT as bad are skipped.
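The allocation strategy described above can be modelled as follows. This sketch assumes the usual FAT16 marker values (0x0000 for a free cluster, 0xFFF7 for a bad one, 0xFFFF for the last cluster in a chain) and the convention that data clusters are numbered from 2; the function name `allocate_chain` is hypothetical.

```python
FREE = 0x0000   # assumed marker: cluster is free
BAD = 0xFFF7    # assumed marker: cluster is defective
LAST = 0xFFFF   # assumed marker: last cluster in a chain


def allocate_chain(fat, count):
    """Claim `count` free clusters in ascending order; return the first one."""
    if count < 1:
        raise ValueError("at least one cluster is required")
    # Occupied and bad clusters are skipped because their entries are not FREE.
    free = [i for i in range(2, len(fat)) if fat[i] == FREE][:count]
    if len(free) < count:
        raise OSError("not enough free clusters")
    for current, following in zip(free, free[1:]):
        fat[current] = following     # link each cluster to the next one
    fat[free[-1]] = LAST             # mark the end of the chain
    return free[0]                   # this number goes into the folder entry


fat = [0xFFF8, 0xFFFF, FREE, LAST, FREE, BAD, FREE, FREE]
first = allocate_chain(fat, 3)
print(first, [hex(entry) for entry in fat])
```

In this example the new file receives clusters 2, 4 and 6: cluster 3 is occupied and cluster 5 is marked as bad, so the resulting chain is already fragmented.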

In the FAT16 file system, 16 bits are reserved for the cluster number. Therefore, the maximum number of clusters is 65525 and the maximum cluster size is 128 sectors. In this case, the maximum size of partitions or disks in FAT16 is 4.2 gigabytes. When logically formatting a disk or partition, the operating system tries to use the minimum cluster size that results in no more than 65525 clusters. Obviously, the larger the partition size, the larger the cluster size should be. Many operating systems do not work correctly with a 128-sector cluster. As a result, the maximum size of a FAT16 partition is reduced to 2 gigabytes. Typically, the larger the cluster size, the greater the wasted disk space. This is because the last cluster occupied by the file is only partially filled. For example, if a 17 KB file is written to a partition with a 16 KB cluster size, the file will take up two clusters: the first cluster will be full, while the second will hold only 1 KB of data, and the remaining 15 KB of the second cluster will stay unused and cannot be used by other files. If a large number of small files are written to large disks, the loss of disk space will be significant. Table 2.1 provides information about the possible wasted disk space for different partition sizes.

Table 2.1 - Loss of disk space

Partition size | Cluster size | Disk space wasted
127 MB | 2 KB | 2%
128-255 MB | 4 KB | 4%
256-511 MB | 8 KB | 10%
512-1023 MB | 16 KB | 25%
1024-2047 MB | 32 KB | 40%
2048-4096 MB | 64 KB | 50%
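The space lost in the last cluster of a single file is easy to compute. The sketch below reproduces the 17 KB example from the text; the function names `clusters_needed` and `slack_bytes` are hypothetical.

```python
KB = 1024


def clusters_needed(file_size, cluster_size):
    """Number of whole clusters occupied by a file (ceiling division)."""
    return -(-file_size // cluster_size)


def slack_bytes(file_size, cluster_size):
    """Bytes allocated to the file but not used by its data."""
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size


# The example from the text: a 17 KB file on a partition with 16 KB clusters.
print(clusters_needed(17 * KB, 16 * KB), "clusters,",
      slack_bytes(17 * KB, 16 * KB) // KB, "KB unused")
```

For very small files the loss is close to a whole cluster: a 100-byte file on a partition with 32 KB clusters leaves 32,668 bytes unused.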

There are two ways to reduce wasted disk space. The first is partitioning disk space into small partitions with a small cluster size. The second is using the FAT32 file system.

2.2 FAT32 system

The FAT32 file system is a newer file system based on the FAT format and is supported by Windows 95 OSR2, Windows 98, and Windows Millennium Edition. FAT32 uses 32-bit cluster IDs, but reserves the upper 4 bits, so the effective cluster ID size is 28 bits. Since the maximum size of FAT32 clusters is 32 KB, FAT32 can theoretically handle 8 terabyte volumes. Windows 2000 limits the size of new FAT32 volumes to 32 GB, although it does support existing larger FAT32 volumes (created on other operating systems). The larger number of clusters supported by FAT32 allows it to manage disks more efficiently than FAT 16. FAT32 can use 512-byte clusters for volumes up to 128 MB.
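The effect of the reserved upper 4 bits can be shown with two small helpers. The sketch assumes that the reserved bits must be ignored when an entry is read and preserved when it is rewritten; the names `fat32_next_cluster` and `fat32_update_entry` are hypothetical.

```python
FAT32_MASK = 0x0FFFFFFF      # the low 28 bits carry the cluster number
RESERVED_MASK = 0xF0000000   # the upper 4 bits are reserved


def fat32_next_cluster(raw_entry):
    """Extract the 28-bit cluster number from a 32-bit table entry."""
    return raw_entry & FAT32_MASK


def fat32_update_entry(old_raw, new_value):
    """Store a new 28-bit value while leaving the reserved bits untouched."""
    return (old_raw & RESERVED_MASK) | (new_value & FAT32_MASK)


print(hex(fat32_next_cluster(0xF0000123)))
print(hex(fat32_update_entry(0xA0000005, 0x42)))
```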

The FAT32 file system is used by default in Windows 98. This operating system comes with a special program for converting disks from FAT16 to FAT32. Windows NT and Windows 2000 can also use the FAT file system, so you can boot your computer from a DOS disk and have full access to all files. However, some of the most advanced features of Windows NT and Windows 2000 are provided by their own NTFS (NT File System) file system. NTFS allows you to create partitions up to 2 TB on a disk (like FAT32), but, in addition, it has built-in file compression, security and auditing functions that are necessary when working in a network environment. Windows 2000 also supports the FAT32 file system. Installation of Windows NT starts on a FAT disk, but at the end of the installation the data on the disk can be converted to NTFS format.

You can also do this later using the Convert.exe utility that comes with the operating system. A disk partition converted to NTFS becomes inaccessible to other operating systems. To return to DOS, Windows 3.1, or Windows 9x, you need to remove the NTFS partition and create a FAT partition instead. Windows 2000 can be installed on a drive with either the FAT32 or the NTFS file system.

The capabilities of the FAT32 file system are much broader than those of FAT16. Its most important feature is that it supports disks up to 2047 GB and works with smaller clusters, thereby significantly reducing the amount of wasted disk space. For example, a 2 GB hard drive in FAT16 uses 32 KB clusters, while FAT32 uses 4 KB clusters.

To maintain compatibility with existing programs, networks, and device drivers wherever possible, FAT32 is implemented with minimal changes to the architecture, APIs, internal data structures, and disk format. But since the size of the FAT32 table elements is now four bytes, many internal and on-disk data structures, as well as APIs, had to be revised or extended. Certain APIs on FAT32 drives are blocked to prevent legacy disk utilities from corrupting the contents of FAT32 drives. Most programs will not be affected by these changes, and existing tools and drivers will work on FAT32 drives too. However, MS-DOS block device drivers (such as Aspidisk.sys) and disk utilities need to be modified to support FAT32. All disk utilities supplied by Microsoft (Format, Fdisk, Defrag, and ScanDisk for real and protected modes) have been redesigned to fully support FAT32. In addition, Microsoft is helping leading vendors of disk utilities and device drivers modify their products to support FAT32.

FAT32 is more efficient than FAT16 when working with larger disks and does not require them to be split into 2 GB partitions. Windows 98 still supports FAT16, since it is this file system that is compatible with other operating systems, including those from third-party companies. In real-mode MS-DOS and in Windows 98 safe mode, the FAT32 file system is significantly slower than FAT16. Therefore, when running programs in MS-DOS mode, it is advisable to add a command that loads Smartdrv.exe to Autoexec.bat or to the PIF file, which will speed up disk operations.

Some older programs written for FAT16 may report incorrect information about the amount of free or total disk space if it is more than 2 GB. Windows 98 provides new APIs for MS-DOS and Win32 that allow these values to be determined correctly.

2.3 Comparison of FAT16 and FAT32

Table 2.3.1 - Comparison of FAT16 and FAT32 file systems

FAT16: Implemented and used by most operating systems (MS-DOS, Windows 98, Windows NT, OS/2, UNIX).
FAT32: Currently only supported on Windows 95 OSR2 and Windows 98.

FAT16: Very effective for logical drives smaller than 256 MB.
FAT32: Does not work with disks smaller than 512 MB.

FAT16: Supports disk compression, such as the DriveSpace algorithm.
FAT32: Does not support disk compression.

FAT16: Handles a maximum of 65,525 clusters, the size of which depends on the size of the logical drive. Since the maximum cluster size is 32 KB, FAT16 can only handle logical disks up to 2 GB.
FAT32: Able to work with logical drives up to 2047 GB with a maximum cluster size of 32 KB.

The maximum possible file length in FAT32 is 4 GB minus 2 bytes. Win32 applications can open files of this length without special processing. Other applications should use interrupt Int 21h, function 716Ch (FAT32), with the EXTENDED_SIZE (1000h) open flag.

In the FAT32 file system, 4 bytes are allocated for each cluster in the file allocation table, while in FAT16 - 2 bytes, and in FAT12 - 1.5 bytes.

The upper 4 bits of the 32-bit FAT32 table entry are reserved and do not participate in the formation of the cluster number. Programs that directly read the FAT32 table must mask these bits and prevent them from being changed when new values ​​are written.
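The masking rule can be sketched as follows. This is only an illustration of the bit manipulation described above, operating on a FAT32 entry that has already been read into a 32-bit integer; the function names are hypothetical and not taken from any driver or library.

```python
FAT32_CLUSTER_MASK = 0x0FFFFFFF   # low 28 bits: cluster number
FAT32_RESERVED_MASK = 0xF0000000  # upper 4 bits: reserved

def read_fat32_entry(raw):
    """Return the cluster number, ignoring the reserved upper 4 bits."""
    return raw & FAT32_CLUSTER_MASK

def write_fat32_entry(old_raw, new_value):
    """Build the value to store: keep the old reserved bits, replace the low 28."""
    return (old_raw & FAT32_RESERVED_MASK) | (new_value & FAT32_CLUSTER_MASK)

raw = 0xA0000005                      # reserved bits happen to be set
print(read_fat32_entry(raw))          # prints 5
print(hex(write_fat32_entry(raw, 0x123)))  # prints 0xa0000123
```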

So, FAT32 has the following advantages over previous implementations of the FAT file system:

supports drives up to 2 TB;

organizes disk space more efficiently. FAT32 uses smaller clusters (4 KB for drives up to 8 GB), which saves up to 10-15% of space on large drives compared to FAT16;

the FAT32 root directory, like all other directories, is no longer limited in size: it consists of a chain of clusters and can be located anywhere on the disk;

has higher reliability: FAT32 is able to move the root directory and work with a backup copy of the FAT. In addition, the boot record on FAT32 drives has been expanded to include a backup of critical data structures, which means that FAT32 drives are less sensitive to isolated bad sectors than existing FAT volumes;

programs load 50% faster.

Table 2.3.2 - Comparison of cluster sizes

Disk size          FAT16 cluster size, KB   FAT32 cluster size, KB
256 MB - 511 MB    8                        Not supported
512 MB - 1023 MB   16                       4
1024 MB - 2 GB     32                       4
2 GB - 8 GB        Not supported            4
8 GB - 16 GB       Not supported            8
16 GB - 32 GB      Not supported            16
More than 32 GB    Not supported            32
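To see why the cluster size matters, the space a file actually occupies can be compared for the two cluster sizes quoted for a 2 GB disk (32 KB under FAT16, 4 KB under FAT32). The sketch below is only an illustration: the file sizes are made up, and the helper names are not part of any real tool.

```python
def allocated_bytes(file_size, cluster_size):
    """Space occupied on disk: whole clusters only, rounded up."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size

def wasted_bytes(file_sizes, cluster_size):
    """Total slack (allocated minus actual) for a set of files."""
    return sum(allocated_bytes(s, cluster_size) - s for s in file_sizes)

sample_files = [1, 5000, 40000]               # hypothetical file sizes, bytes
print(wasted_bytes(sample_files, 32 * 1024))  # prints 86071 (32 KB clusters)
print(wasted_bytes(sample_files, 4 * 1024))   # prints 8247  (4 KB clusters)
```

With the same three files, the slack shrinks by roughly a factor of ten when the cluster size drops from 32 KB to 4 KB.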

3. Alternative file system NTFS and its comparison with FAT32

3.1 NTFS system

NTFS (New Technology File System) is the preferred file system when working with Windows NT, as it was specially designed for this system. Windows NT includes a convert utility that converts FAT and HPFS volumes to NTFS volumes. In NTFS, the possibilities for managing access to individual files and directories are significantly expanded, a large number of attributes are introduced, fault tolerance and dynamic file compression are implemented, and the requirements of the POSIX standard are supported. NTFS allows filenames up to 255 characters long, and it uses the same short name generation algorithm as VFAT. NTFS is self-healing in the event of an OS or hardware failure, so that the disk volume remains available and the directory structure is intact.

Each file on an NTFS volume is represented by an entry in a special file, the MFT (Master File Table). NTFS reserves the first 16 table entries, about 1 MB in size, for special information. The first table entry describes the master file table itself. It is followed by a mirrored MFT entry. If the first MFT entry is corrupted, NTFS reads the second entry to find a mirrored MFT file whose first entry is identical to the first MFT entry. The location of the MFT data segments and the MFT mirror file is stored in the boot sector. A copy of the boot sector is located at the logical center of the disk. The third MFT entry contains the log file used to restore files. The seventeenth and subsequent entries in the master file table are used by the actual files and directories on the volume.

The transaction log (log file) records all operations that affect the structure of the volume, including the creation of a file and any commands that change the directory structure. The transaction log is used to recover an NTFS volume after a system failure. The entry for the root directory contains a list of files and directories stored in the root directory.

The space allocation scheme on a volume is stored in a bitmap file. The data attribute of this file contains a bitmap, each bit of which represents one volume cluster and indicates whether the given cluster is free or occupied by some file. NTFS also maintains a bad cluster file for registering bad areas on a volume and a volume file containing the volume name, the NTFS version, and a bit that is set when the volume is corrupted. Finally, there is a file containing an attribute definition table that defines the types of attributes supported on the volume and whether they can be indexed, restored by system restore, etc.

NTFS allocates space in clusters and uses 64 bits to number them, which makes it possible to have 2^64 clusters, each up to 64 KB in size. As in FAT, the cluster size can change, but does not necessarily increase in proportion to the size of the disk. Cluster sizes set by default when formatting a partition are shown in Table 3.1.

Table 3.1 - Default NTFS cluster sizes

Partition size            Cluster size
< 512 MB                  512 bytes
513 MB - 1024 MB (1 GB)   1 KB
1 GB - 2 GB               2 KB
2 GB - 4 GB               4 KB
4 GB - 8 GB               8 KB
8 GB - 16 GB              16 KB
16 GB - 32 GB             32 KB
> 32 GB                   64 KB

NTFS allows you to store files up to 16 exabytes (2^64 bytes) and has built-in real-time file compression. Compression is one of the attributes of a file or directory and, like any attribute, can be removed or set at any time (compression is possible on partitions with a cluster size of no more than 4 KB). Unlike the compression schemes used in FAT, NTFS compresses each file separately, so damage to a small area of the disk does not lead to loss of information in other files.

To reduce fragmentation, NTFS always tries to store files in contiguous blocks. This system uses a B-tree directory structure similar to that of the high-performance HPFS file system, rather than the linked list structure used by FAT. This makes searching for files in a directory faster because filenames are stored sorted in lexicographic order.

NTFS was designed as a recoverable file system that uses a transaction processing model. Each I/O operation that changes a file on an NTFS volume is treated as a transaction by the system and can be executed as an indivisible block. When a file is modified by a user, the log file service captures all the information needed to retry or roll back the transaction. If the transaction is completed successfully, the file is modified. If not, NTFS rolls back the transaction.

Despite the presence of protection against unauthorized access to data, NTFS does not provide the necessary confidentiality of stored information. To access the files, it is enough to boot the computer into DOS from a floppy disk and use some third-party NTFS driver for this system.

Beginning with Windows NT 5.0 (renamed Windows 2000), Microsoft supports the new NTFS 5.0 file system. The new version of NTFS introduced additional file attributes; along with access rights, the concept of access denial was introduced, which makes it possible, for example, to prohibit a user who inherits group rights to a file from changing its contents. The new system also allows you to:

impose restrictions (quotas) on the amount of disk space provided to users;

map any directory (on the local or a remote computer) to a subdirectory on the local drive.

An interesting feature of the new version of Windows NT is the dynamic encryption of files and directories, which increases the security of stored information. Windows NT 5.0 includes an Encrypting File System (EFS) that uses public-key encryption algorithms. If the encryption attribute is set for a file, then when a user program accesses the file for writing or reading, the file is encrypted and decrypted transparently for the program.

3.2 Comparison of NTFS and FAT32

NTFS

Advantages:

Fast access to small files;

The size of disk space today is practically unlimited;

File fragmentation does not affect the file system itself;

High reliability of saving data and the actual file structure itself;

High performance when working with large files;

Disadvantages:

Higher memory requirements compared to FAT 32;

Working with directories of medium size is difficult due to their fragmentation;

Lower operating speed compared to FAT 32.

FAT 32

Advantages:

High speed of work;

Low RAM requirement;

Efficient work with files of medium and small sizes;

Less disk wear due to less read/write head movement.

Disadvantages:

Low protection against system failures;

Inefficient work with large files;

Restriction on the maximum size of the section and file;

Reduced performance during fragmentation;

Reduced performance when working with directories containing a large number of files;

So, both file systems store data in clusters, the minimum size of which is 512 bytes. The usual cluster size is 4 KB. This is where the similarities probably end.

A few words about fragmentation. NTFS speed drops dramatically when the disk is 80-90% full. This is due to the fragmentation of service and working files. The more you work with such a loaded disk, the greater the fragmentation and the lower the performance. In FAT 32, fragmentation of the working area of the disk occurs at earlier stages; how soon depends on how often you write and erase data. As with NTFS, fragmentation greatly reduces performance.

Now about RAM. The FAT 32 table itself can take up several megabytes in RAM. But caching comes to the rescue. What is cached:

Most used directories;

Data on all currently used files;

Information about free disk space;

But what about NTFS? Caching is difficult for large directories, which can reach several tens of megabytes in size. Add to that the MFT and the information about free disk space. It should be noted, though, that NTFS still uses RAM quite economically: thanks to a well-designed storage scheme, each MFT entry is approximately 1 KB. Still, the RAM requirements are higher than for FAT 32. In short, if you have 64 MB of memory or less, FAT 32 will be faster. If you have more, the difference in speed will be small, and often there will be none at all.

Now about the hard drive itself. To use NTFS, Bus Mastering (BM) is desirable. What is this? It is a special mode of operation of the driver and controller in which data exchange occurs without the participation of the processor. The absence of BM will affect system performance. In addition, because the file system is more complex, the number of read/write head movements increases, which also affects the speed. The presence of a disk cache has an equally positive effect on both NTFS and FAT 32.

Conclusion

The advantages of FAT are low data storage overhead and total compatibility with a huge number of operating systems and hardware platforms. This file system is still used to format floppy disks, where the large partition sizes supported by other file systems play no role, and low overhead allows you to use a small disk space economically (NTFS requires more space to store its own data, which is completely unacceptable for diskettes).

The scope of FAT32 is actually much narrower - this file system is worth using if you are going to access partitions with both Windows 9x and Windows 2000/XP. But since the relevance of Windows 9x today has practically disappeared, the use of this file system is not of particular interest.


FAT file systems

FAT16

The FAT16 file system predates MS-DOS and is supported by all Microsoft operating systems for compatibility. Its name File Allocation Table (file location table) perfectly reflects the physical organization of the file system, the main characteristics of which include the fact that the maximum size of a supported volume (hard disk or partition on a hard disk) does not exceed 4095 MB. In the days of MS-DOS, 4 GB hard drives seemed like an impossible dream (20-40 MB drives were a luxury), so such a reserve was quite justified.

A volume formatted to use FAT16 is divided into clusters. The default cluster size depends on the size of the volume and can range from 512 bytes to 64 KB. Table 2 shows how the cluster size depends on the volume size. Note that the cluster size may differ from the default value, but must have one of the values specified in Table 2.

It is not recommended to use the FAT16 file system on volumes larger than 511 MB, since disk space will be used extremely inefficiently for relatively small files (a 1-byte file will take 64 KB). Regardless of the cluster size, the FAT16 file system is not supported for volumes larger than 4 GB.

FAT32

Starting with Microsoft Windows 95 OEM Service Release 2 (OSR2), Windows introduced support for 32-bit FAT. For Windows NT-based systems, this file system was first supported in Microsoft Windows 2000. While FAT16 can support volumes up to 4 GB, FAT32 can support volumes up to 2 TB. The cluster size in FAT32 can vary from 1 (512 bytes) to 64 sectors (32 KB). FAT32 cluster values ​​require 4 bytes to store (32 bits, not 16 as in FAT16). This means, in particular, that some file utilities designed for FAT16 cannot work with FAT32.

The main difference between FAT32 and FAT16 is that the size of the disk logical partition has changed. FAT32 supports volumes up to 127 GB. At the same time, if when using FAT16 with 2 GB disks, a 32 KB cluster was required, then in FAT32 a 4 KB cluster is suitable for disks from 512 MB to 8 GB (Table 4).

This accordingly means more efficient use of disk space - the smaller the cluster, the less space is required to store the file and, as a result, the disk becomes less fragmented.

When using FAT32, the maximum file size can be up to 4 GB minus 2 bytes. If when using FAT16 the maximum number of entries in the root directory was limited to 512, then FAT32 allows you to increase this number to 65,535.

FAT32 imposes a restriction on the minimum volume size: it must contain at least 65,527 clusters. At the same time, the cluster size cannot be so small that the FAT occupies more than 16 MB minus 64 KB; at 4 bytes per entry, this corresponds to about 4 million clusters.
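The "about 4 million clusters" figure can be checked with a few lines of arithmetic. This is a sketch based only on the numbers in the paragraph above (a FAT of at most 16 MB minus 64 KB, 4 bytes per entry); the names are illustrative.

```python
MB = 1024 * 1024
KB = 1024

MAX_FAT_BYTES = 16 * MB - 64 * KB   # FAT size limit quoted in the text
FAT32_ENTRY_BYTES = 4               # one 32-bit entry per cluster

def max_fat32_entries(max_fat_bytes=MAX_FAT_BYTES, entry_bytes=FAT32_ENTRY_BYTES):
    """Number of FAT entries (clusters) that fit in the largest allowed FAT."""
    return max_fat_bytes // entry_bytes

print(max_fat32_entries())  # prints 4177920
```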

When using long filenames, the data required for access from FAT16 and FAT32 does not overlap. When a file with a long name is created, Windows creates the corresponding 8.3 format name and one or more directory entries to hold the long name (13 characters of the long filename per entry). Each of these entries stores the corresponding part of the filename in Unicode format. Such entries have the attributes "volume id", "read-only", "system", and "hidden", a combination that is ignored by MS-DOS; on this operating system, a file is accessed by its "alias" in 8.3 format.
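As a rough illustration of the 13-characters-per-entry rule, the number of directory entries consumed by one file can be estimated as follows. The sketch assumes every character of the name occupies a single 16-bit unit; the function names are hypothetical.

```python
CHARS_PER_LFN_ENTRY = 13  # characters of the long name stored per entry

def long_name_entries(name):
    """Directory entries needed to hold the long name alone."""
    return -(-len(name) // CHARS_PER_LFN_ENTRY)  # ceiling division

def directory_entries_needed(name):
    """Long-name entries plus the one 8.3 (alias) entry."""
    return long_name_entries(name) + 1

print(directory_entries_needed("My long file name.txt"))  # prints 3
```

This is also why long filenames reduce the number of files that fit in the fixed 512-entry FAT16 root directory.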

NTFS file system

Microsoft Windows 2000 includes support for a new version of the NTFS file system, which, in particular, provides work with Active Directory directory services, reparse points, information security tools, access control, and a number of other features.

As with FAT, the basic unit of information in NTFS is the cluster. Table 5 shows the default cluster sizes for volumes of various capacities.

When you create an NTFS file system, the formatter creates a Master File Table (MFT) file and other areas for storing metadata. Metadata is used by NTFS to implement the file structure. The first 16 entries in the MFT are reserved by NTFS itself. The location of the metadata files $Mft and $MftMirr is recorded in the boot sector of the disk. If the first entry in the MFT is corrupted, NTFS reads the second entry to find a copy of the first. A complete copy of the boot sector is located at the end of the volume. Table 6 lists the main metadata stored in the MFT.

The remaining MFT entries contain entries for each file and directory located on the volume.

Typically, one file uses one entry in the MFT, but if the file has a large set of attributes or becomes too fragmented, additional entries may be required to store information about it. In this case, the first record about the file, called the base record, stores the location of the other records. Data about files and directories of small size (up to 1500 bytes) is completely contained in the first entry.

File attributes in NTFS

Each occupied sector on an NTFS volume belongs to a particular file. Even the file system metadata is part of the file. NTFS treats each file (or directory) as a set of file attributes. Elements such as the file name, its protection information, and even the data in it are attributes of the file. Each attribute is identified by a specific type code and, optionally, by an attribute name.

If the attributes of a file fit within a file record, they are called resident attributes. The file name and the date it was created are always resident. In cases where the information about a file is too large to fit into a single MFT record, some of the file's attributes become non-resident. Non-resident attributes are stored in one or more clusters and represent a stream of alternate data for the current volume (more on that below). To describe the location of resident and non-resident attributes, NTFS creates an Attribute List attribute.

Table 7 shows the main file attributes defined in NTFS. This list may be expanded in the future.

CDFS file system

Windows 2000 provides support for the CDFS file system, which conforms to the ISO 9660 standard describing the layout of information on a CD-ROM. Long filenames are supported according to ISO 9660 Level 2.

When creating a CD-ROM for use under Windows 2000, keep the following in mind:

  • all directory and file names must be less than 32 characters;
  • all directory and file names must contain only uppercase characters;
  • the depth of directories should not exceed 8 levels from the root;
  • the use of filename extensions is optional.

Comparison of file systems

Under Microsoft Windows 2000, FAT16, FAT32, NTFS, or combinations of these file systems can be used. The choice of file system depends on the following criteria:

  • how the computer is used;
  • hardware platform;
  • size and number of hard drives;
  • information security

FAT file systems

As you may have noticed, the numbers in the names of the file systems - FAT16 and FAT32 - indicate the number of bits required to store information about the cluster numbers used by the file. So, FAT16 uses 16-bit addressing and, accordingly, can use up to 2^16 addresses. In Windows 2000, the upper four bits of each FAT32 file allocation table entry are reserved for internal use, so FAT32 reaches 2^28 addresses.

Table 8 shows cluster sizes for FAT16 and FAT32 file systems.

In addition to significant differences in cluster size, FAT32 also allows the root directory to expand (in FAT16, the number of entries is limited to 512 and can be even lower when using long filenames).

Benefits of FAT16

Among the advantages of FAT16 are the following:

  • the file system is supported by MS-DOS, Windows 95, Windows 98, Windows NT, Windows 2000, and some UNIX operating systems;
  • there are a large number of programs that allow you to correct errors in this file system and recover data;
  • if there are problems with booting from the hard disk, the system can be booted from the floppy disk;
  • this file system is quite efficient for volumes smaller than 256 MB.
Disadvantages of FAT16

The main disadvantages of FAT16 include:

  • the root directory cannot contain more than 512 entries. Using long filenames greatly reduces the number of these elements;
  • FAT16 supports a maximum of 65,536 clusters, and since some clusters are reserved by the operating system, the number of available clusters is 65,524. Each cluster has a fixed size for a given logical drive. With the maximum number of clusters at the maximum cluster size (64 KB under Windows 2000), the largest supported volume is 4 GB. To maintain compatibility with MS-DOS, Windows 95, and Windows 98, where the maximum cluster size is 32 KB, the size of a FAT16 volume must not exceed 2 GB;
  • FAT16 does not support built-in file protection and compression;
  • on large disks, a lot of space is wasted due to the fact that the maximum cluster size is used. The space for the file is allocated based on the size of the cluster, not the file.
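The volume limits in the list above follow from multiplying the number of available clusters by the cluster size. A minimal sketch of that arithmetic, using the 65,524 figure from the text and illustrative names:

```python
KB = 1024
AVAILABLE_FAT16_CLUSTERS = 65524   # figure quoted in the text

def fat16_max_volume_bytes(cluster_bytes, clusters=AVAILABLE_FAT16_CLUSTERS):
    """Largest volume addressable with the given cluster size."""
    return clusters * cluster_bytes

print(fat16_max_volume_bytes(32 * KB))  # prints 2147090432, just under 2 GB
print(fat16_max_volume_bytes(64 * KB))  # prints 4294180864, just under 4 GB
```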
Benefits of FAT32

Among the advantages of FAT32 are the following:

  • disk space allocation is performed more efficiently, especially for large disks;
  • the root directory in FAT32 is a regular chain of clusters and can be located anywhere on the disk. Because of this, FAT32 does not impose any restrictions on the number of items in the root directory;
  • due to the use of smaller clusters (4 KB on disks up to 8 GB), the occupied disk space is usually 10-15% less than under FAT16;
  • FAT32 is the more reliable file system. In particular, it supports the ability to move the root directory and use a backup copy of the FAT. In addition, the boot record contains a backup of a number of data structures critical for the file system.
Disadvantages of FAT32

The main disadvantages of FAT32:

  • the volume size when using FAT32 under Windows 2000 is limited to 32 GB;
  • FAT32 volumes are not available from other operating systems - only from Windows 95 OSR2 and Windows 98;
  • boot sector backup is not supported;
  • FAT32 does not support built-in file protection and compression.

NTFS file system

When using Windows 2000, Microsoft recommends that you format all hard disk partitions to NTFS, except for configurations where multiple operating systems are used (except Windows 2000 and Windows NT). Using NTFS instead of FAT allows you to use the features available in NTFS. These include, in particular:

  • recoverability. This feature is "built into" the file system. NTFS guarantees the safety of data because it uses a log and information recovery algorithms. In the event of a system failure, NTFS uses the log and additional information to automatically restore the integrity of the file system;
  • information compression. For NTFS volumes, Windows 2000 supports compression of individual files. Such compressed files can be used by Windows applications without prior decompression, which occurs automatically when reading from the file. When the file is closed and saved, it is compressed again.

In addition, the following advantages of NTFS can be distinguished:

  • some operating system features require NTFS;
  • access speed is much faster - NTFS minimizes the number of disk accesses required to find a file;
  • protection of files and directories. Only on NTFS volumes is it possible to set file and folder access attributes;
  • when using NTFS, Windows 2000 supports volumes up to 2 TB;
  • the file system maintains a backup copy of the boot sector - it is located at the end of the volume;
  • NTFS supports the Encrypting File System (EFS), which provides protection against unauthorized access to the contents of files;
  • when using quotas, you can limit the amount of disk space used by users.

Disadvantages of NTFS

Speaking about the shortcomings of the NTFS file system, it should be noted that:

  • NTFS volumes are not available on MS-DOS, Windows 95, and Windows 98. In addition, a number of features that are available in NTFS under Windows 2000 are not available on Windows NT 4.0 and earlier;
  • Small volumes containing many small files may experience performance degradation compared to FAT.

File system and speed

As we have already found out, for small volumes, FAT16 or FAT32 provides faster file access compared to NTFS, because:

  • FAT has a simpler structure;
  • directories are smaller;
  • FAT does not support protecting files from unauthorized access - the system does not need to check file permissions.

NTFS minimizes the number of disk accesses and the time it takes to find a file. Also, if the directory size is small enough to fit in a single MFT entry, the entire entry is read in one go.

One entry in the FAT contains the cluster number for the first cluster in the directory. Viewing a FAT file requires searching through the entire file structure.

When comparing the speed of operations performed for directories containing short and long file names, it should be taken into account that the speed of operations for FAT depends on the operation itself and the size of the directory. If FAT looks for a file that doesn't exist, it searches the entire directory, an operation that takes longer than searching the B-tree structure used by NTFS. The average time it takes to find a file in FAT is expressed as a function of N/2, in NTFS it is expressed as log N, where N is the number of files.
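To get a feel for the difference between N/2 and log N, the two estimates can be tabulated for a few directory sizes. This is only an order-of-magnitude sketch: it uses a base-2 logarithm, whereas the real cost of a B-tree lookup depends on how many names fit in each node, and the function names are illustrative.

```python
import math

def fat_average_lookup(n):
    """Average entries examined in a linear scan for an existing file."""
    return n / 2

def ntfs_lookup_estimate(n):
    """Rough comparison count for a search in a sorted, tree-like index."""
    return math.log2(n)

for n in (64, 1024, 65536):
    print(n, fat_average_lookup(n), ntfs_lookup_estimate(n))
```

For a directory of 1024 files the linear scan examines about 512 entries on average, while the logarithmic estimate is about 10 comparisons.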

A number of the following factors affect the speed of reading and writing files under Windows 2000:

  • file fragmentation. If the file is highly fragmented, NTFS usually requires fewer disk accesses than FAT to find all the fragments;
  • cluster size. For both file systems, the default cluster size depends on the size of the volume and is always expressed as a power of 2. Addresses in FAT16 are 16-bit, in FAT32 they are 32-bit, in NTFS they are 64-bit;
  • the default cluster size in FAT is based on the fact that the file allocation table can have no more than 65,535 entries - the cluster size is a function of the volume size divided by 65,535. Thus, the default cluster size for a FAT volume is always larger than the cluster size for an NTFS volume of the same size. Note that a larger cluster size for FAT volumes means that FAT volumes can be less fragmented;
  • location of small files. When using NTFS, small files are contained in an MFT record. The size of a file that fits into a single MFT record depends on the number of attributes in that file.

Maximum size of NTFS volumes

Theoretically, NTFS supports volumes with up to 2^32 clusters. However, apart from the lack of hard drives of this size, there are other restrictions on the maximum size of the volume.

One such limitation is the partition table. Industry standards limit partition tables to 2^32 sectors. Another limitation is the sector size, which is typically 512 bytes. Although the sector size may change in the future, the current size limits a single volume to 2 TB (2^32 x 512 bytes = 2^41 bytes). Thus, 2 TB is the practical limit for NTFS physical and logical volumes.
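The 2 TB limit is simply the product of the two constraints just mentioned. A short check of the arithmetic, with illustrative names:

```python
MAX_SECTORS = 2 ** 32   # partition table limit on the number of sectors
SECTOR_BYTES = 512      # typical sector size

def max_volume_bytes(sectors=MAX_SECTORS, sector_bytes=SECTOR_BYTES):
    """Largest volume describable by the partition table."""
    return sectors * sector_bytes

limit = max_volume_bytes()
print(limit == 2 ** 41)       # prints True
print(limit // 2 ** 40, "TB")  # prints "2 TB"
```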

Table 11 shows the main limitations of NTFS.

Managing access to files and directories

When using NTFS volumes, you can set file and directory permissions. These access rights specify which users and groups have access to them and what level of access is allowed. Such access rights apply both to users working on the computer on which the files are located, and to users accessing files over the network when the file is located in a directory open for remote access.

Under NTFS, you can also set remote access permissions combined with file and directory permissions. In addition, file attributes (read-only, hidden, system) also restrict access to the file.

Under FAT16 and FAT32, it is also possible to set file attributes, but they do not provide file permissions.

The version of NTFS used in Windows 2000 introduced a new type of access permission called inherited permissions. The Security tab contains the option Allow inheritable permissions from parent to propagate to this file object, which is active by default. This option significantly reduces the time required to change the permissions for files and subdirectories. For example, to change the permissions of a tree containing hundreds of subdirectories and files, it is enough to enable this option - in Windows NT 4, you must change the attributes of each individual file and subdirectory.

Figure 5 shows the Properties dialog box and the Security tab (Advanced section) listing extended file permissions.

Recall that for FAT volumes, access can only be controlled at the volume level, and such control is possible only with remote access.

Compressing files and directories

Windows 2000 supports compression of files and directories located on NTFS volumes. Compressed files are readable and writable by any Windows application. For this, there is no need for their preliminary unpacking. The compression algorithm used is similar to that used in DoubleSpace (MS-DOS 6.0) and DriveSpace (MS-DOS 6.22), but has one significant difference - under MS-DOS, an entire primary partition or logical device is compressed, while under NTFS you can pack individual files and directories.

The compression algorithm in NTFS is designed to support clusters up to 4 KB in size. If the cluster size is larger than 4 KB, the NTFS compression features become unavailable.

Self-healing NTFS

The NTFS file system is self-healing and can maintain its integrity through the use of a log of actions taken and a number of other mechanisms.

NTFS treats every operation that modifies system files on NTFS volumes as a transaction and stores information about such a transaction in a log. A started transaction can either be completely completed (commit) or rolled back (rollback). In the latter case, the NTFS volume returns to the state prior to the start of the transaction. In order to manage transactions, NTFS writes all the operations involved in a transaction to a log file before it is written to disk. After the transaction is completed, all operations are performed. Thus, under NTFS management, there can be no pending operations. In the event of disk failures, pending operations are simply cancelled.
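The commit/rollback idea can be illustrated with a toy model. The sketch below is not how NTFS is implemented; it is a deliberately simplified, in-memory illustration of logging operations first and applying them only on commit, and the class and method names are invented for this example.

```python
class ToyVolume:
    """Toy model: operations are logged first and applied only on commit."""

    def __init__(self):
        self.files = {}   # committed state: file name -> contents
        self.log = []     # operations recorded but not yet applied

    def record(self, name, contents):
        """Write the intended change to the log, not to the volume."""
        self.log.append((name, contents))

    def commit(self):
        """Apply every logged operation, then clear the log."""
        for name, contents in self.log:
            self.files[name] = contents
        self.log.clear()

    def rollback(self):
        """Discard pending operations; the committed state is untouched."""
        self.log.clear()

vol = ToyVolume()
vol.record("a.txt", "v1")
vol.commit()
vol.record("a.txt", "v2")   # simulated failure before commit
vol.rollback()
print(vol.files)            # prints {'a.txt': 'v1'}
```

After the rollback the volume is in exactly the state it had before the second transaction started, which is the property the paragraph above describes.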

Under the control of NTFS, operations are also performed that allow you to identify bad clusters on the fly and allocate new clusters for file operations. This mechanism is called cluster remapping.

In this review, we examined the various file systems supported in Microsoft Windows 2000, discussed the design of each of them, noted their advantages and disadvantages. The most promising is the NTFS file system, which has a large set of features that are not available in other file systems. The new version of NTFS supported by Microsoft Windows 2000 has even more functionality and is therefore recommended for use when installing the Win 2000 operating system.

ComputerPress 7"2000

A file system is just a way of organizing data on a storage medium, and there is nothing complicated about this organization.

Perhaps you are thinking: "A file system is a complex and incomprehensible thing; operating systems work with it, so it simply cannot be that easy..."

You are partially right, but all the complexity lies in the file system driver, i.e. in the program that provides an API to application programs. It just does things like:

  • create a file
  • delete a file
  • rename
  • copy
  • show directory contents
  • move to another directory, etc.

The very principle of the organization of the file system is simple.

In this post, I will not consider how the driver works or how it creates and deletes files; I will explain how the FAT16 file system organizes files.

(There is a separate post about how to write a driver.)

Why FAT16?

I find it the most convenient for learning because it is easy to understand. And once you know the idea, it is not difficult to learn other file systems such as FAT32, NTFS, etc.

Why do I need to know how the file system works?

Knowing how a file system is organized, you can develop your own driver or file manager for any computing device.

Description of the FAT16 file system

The FAT16 file system divides the entire address space of the media into two areas:

  • system area
  • data area

For clarity, we will depict the entire address space as a rectangle. The small upper part of the rectangle (address space) is the system area; the large lower part is the data area.

All the data that we store on our media, i.e. all files and directories, is stored in the data area. The system area, on the other hand, stores the parameters of the medium and the characteristics of files and directories: the file name, directory name, file attributes, etc.

Let's start with something simple: a few words about the data area and how data is stored there.

About the data area...

To avoid addressing every byte (although some storage media allow byte-by-byte access), the file system uses a different minimum addressable unit, the sector. A sector is 512 bytes in size. In addition to the sector, the FAT16 file system uses the concept of a cluster. A cluster is one or more contiguous sectors.

This parameter (the number of sectors per cluster) is often adjusted when formatting storage media, because both the speed of operation and the "packing density" of the data depend on it. FAT16, like all file systems, uses the concept of a file. A file is a region of data that has a name and some attributes. Physically, in the data area, it is one or more occupied clusters, and a file always occupies a whole number of clusters. Even if it needs only a little more than two clusters, the file system treats three clusters as occupied by the file. Therefore, the smaller the cluster size, the higher the packing density and the more economically the data area is used. On the other hand, reading a file in large chunks (clusters) is faster than reading it in small ones. The choice of cluster size is therefore a compromise.

The FAT16 file system imposes limits on the cluster size, no more than 128 sectors (i.e. no more than 64 KB), and on the number of clusters, no more than 65524. If you use everything to the maximum, i.e. the maximum cluster size and the maximum number of clusters, it turns out that FAT16 cannot address more than about 4.29 billion bytes (just under 4 GiB) of information.

If we format in automatic mode (without specifying the cluster size), the smallest cluster size is chosen for which the resulting number of clusters does not exceed 65524.
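
The automatic choice described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and constants are hypothetical, cluster sizes are assumed to be powers of two, the limit of 65524 clusters is taken from the cluster-count rule for FAT16, and the space taken by the system area is ignored. A real formatter may apply additional rules.

```python
SECTOR_SIZE = 512              # bytes per sector, as in the text
MAX_CLUSTERS = 65524           # assumed FAT16 cluster-count limit
MAX_SECTORS_PER_CLUSTER = 128  # 128 * 512 bytes = 64 KB


def pick_sectors_per_cluster(data_area_bytes: int) -> int:
    """Return the smallest power-of-two cluster size (in sectors) that fits."""
    sectors_per_cluster = 1
    while sectors_per_cluster <= MAX_SECTORS_PER_CLUSTER:
        cluster_bytes = sectors_per_cluster * SECTOR_SIZE
        # Only whole clusters count; a trailing partial cluster is unusable.
        if data_area_bytes // cluster_bytes <= MAX_CLUSTERS:
            return sectors_per_cluster
        sectors_per_cluster *= 2
    raise ValueError("volume is too large for FAT16")


if __name__ == "__main__":
    for megabytes in (16, 32, 2048):
        size = megabytes * 1024 * 1024
        print(megabytes, "MB ->", pick_sectors_per_cluster(size), "sectors per cluster")
```

The loop stops at the first size that fits, which is why small volumes end up with small clusters and pack data more densely.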

About the system area...

The system area is created when the media is formatted and describes the volume. It consists of the following parts: the boot sector, the FAT tables, and the root directory.

Let's analyze each part in more detail.

1. Boot sector

The boot sector is a parameter table plus a boot loader program. The boot sector is usually 512 bytes in size, but it can be larger.

Consider the structure of the boot sector.

Do not be afraid of the large number of fields in the boot sector; it is redundant. For example, it stores information that is not relevant for flash drives: the number of sectors per track and the number of heads. So not all of the parameters will be useful to us.

If you look at the HEX dump of some media formatted as FAT16, you will see the values of these fields. As an example, I will give the HEX dump of a FAT16 image created in WinImage. To make the dump easier to navigate, I have color-coded which fragment of it belongs to which parameter.

P.S. Multi-byte values are stored least significant byte first (little-endian), so each field is read from right to left: if the dump shows 00 02h, the value is actually 0200h, i.e. 512.

P.S. The boot sector always ends with the bytes 55 AAh.

It is important to pay attention to the parameter "ReservedSectors", the number of reserved sectors, at offset 0Eh. At the very beginning, I said that the boot sector is usually 512 bytes in size but can be larger. Its size is determined by the "ReservedSectors" parameter; in our case ReservedSectors = 01h, so the boot sector occupies 1 sector, or 512 bytes.
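
To make the field offsets concrete, here is a minimal Python sketch that pulls the layout parameters out of a boot sector. The function name and dictionary keys are my own. The offsets 0Dh, 0Eh, 11h, 15h and 16h are the ones used in this post; 0Bh (bytes per sector) and 10h (number of FATs) are taken from the standard boot sector layout and are assumptions as far as this post is concerned.

```python
import struct


def parse_boot_sector(sector: bytes) -> dict:
    """Extract the FAT16 layout parameters from the first 512 bytes."""
    if len(sector) < 512 or sector[0x1FE:0x200] != b"\x55\xaa":
        raise ValueError("missing 55 AA signature at the end of the boot sector")

    def u16(offset: int) -> int:
        # "<H" reads a 2-byte little-endian value (right-to-left byte order).
        return struct.unpack_from("<H", sector, offset)[0]

    return {
        "BytesPerSector": u16(0x0B),       # assumed standard offset
        "SectorPerCluster": sector[0x0D],
        "ReservedSectors": u16(0x0E),
        "NumberOfFATs": sector[0x10],      # assumed standard offset
        "RootEntries": u16(0x11),
        "MediaDescriptor": sector[0x15],
        "SectorPerFat": u16(0x16),
    }
```

Given the first 512 bytes of an image (for example, read with `open(path, "rb").read(512)`), the returned dictionary holds everything needed for the address calculations that follow.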

2. FAT

After the boot sector, which occupies 512 * ReservedSectors bytes, comes the FAT1 table. Its size is determined by the two-byte boot sector field SectorPerFat (offset 16h). In the example above, the value of this field is 0001h, i.e. 1: one sector, or 512 bytes.

What is FAT?

First of all, it is an abbreviation: File Allocation Table. It is a table with one column and 512/2 rows (if the FAT table is 512 bytes in size, i.e. SectorPerFat is 0001h, as in our case). Each row of the FAT table occupies 2 bytes, which is why the number of rows in our case is 512/2 = 256.

The table serves as a map of the clusters: each row describes one cluster of the data area, in order. The table begins with the table descriptor F8FFh (its first byte matches the byte at offset 15h of the boot sector) and the placeholder FFFFh. Because these two entries take the places of clusters 0 and 1, the first real row belongs to cluster 2. The rows that follow can hold the following values:

  • 0000h: free cluster;
  • 0002h-FFEFh: number of the next cluster in the chain;
  • FFF0h-FFF6h: reserved;
  • FFF7h: defective cluster;
  • FFF8h-FFFFh: last cluster in the chain.
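
The value ranges listed above translate directly into a small classifier. This is a hypothetical helper written for illustration; the value 0001h does not appear in the list, so treating it as reserved is my assumption.

```python
def classify_fat_entry(value: int) -> str:
    """Map a 16-bit FAT row value to its meaning."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a FAT16 row holds a 16-bit value")
    if value == 0x0000:
        return "free"
    if 0x0002 <= value <= 0xFFEF:
        return "next"        # the value is the number of the next cluster
    if 0xFFF0 <= value <= 0xFFF6:
        return "reserved"
    if value == 0xFFF7:
        return "defective"
    if value >= 0xFFF8:
        return "last"        # end of the chain
    return "reserved"        # 0x0001: not listed in the text (assumption)
```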

Here is an example HEX dump with an explanation.

I have framed the FAT1 table in blue and the FAT2 table (a copy of FAT1) in red. The square filled in green is the table descriptor F8FFh together with the placeholder FFFFh. The unfilled squares are table rows. I did not frame every row in green, only the non-zero ones.

I will explain how the FAT is used and why it is needed a little later.

3. Root directory

After the FAT tables comes the "root directory". This is an area of memory made up of 32-byte elements. Each element describes one file or directory located in the root directory, or in other words "at the root" of the hard drive or flash drive. So the root directory describes everything that is in the root.

The size of the root directory depends on the boot sector parameter RootEntries (offset 11h). It specifies the maximum number of 32-byte elements in the root directory. So the size of the directory is RootEntries * 32; in our case it is 512 * 32 = 16384 bytes.

Each element has the following structure:

Here is an example HEX dump with an explanation.

I have framed the memory area that holds the root directory in green and the individual 32-byte root directory entries in blue. The non-empty 32-byte elements are filled in blue.

There are two non-empty 32-byte elements here, which means the root directory stores two "somethings"; they can be files or other directories. In this case, to keep the example simple, the root holds two files, "1.txt" and "test.txt".

Let's take a closer look at these two 32-byte elements; for convenience, I have color-coded each fragment of the HEX dump and the corresponding parameter of the 32-byte element in the table.

P.S. If the first byte of the file name is replaced with E5h, Windows Explorer treats the file as deleted. Such a file can be restored by replacing the E5h in the name with the original first character. I am not completely sure, but I think this is how the Recycle Bin works in Windows: when a file is placed in the bin, the operating system saves the file name somewhere and replaces the first byte of the name with E5h, and when the file is restored, it gives the file its former name back.

P.S. File names in the FAT16 system are stored in 8.3 format, i.e. 8 bytes are allocated for the name and 3 bytes for the extension. Names are encoded in ASCII, one byte per character. Therefore the name cannot be longer than 8 characters and the extension cannot be longer than 3. If the name is shorter than 8 characters, the missing bytes are filled with 20h (the space character in ASCII).
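
The 8.3 padding rule can be illustrated with a pair of hypothetical helpers. The sketch assumes that the dot itself is not stored (the 11 bytes are just name plus extension) and that short names are kept in upper case; characters that FAT forbids in names are not checked here.

```python
def encode_83(filename: str) -> bytes:
    """Pack 'name.ext' into the 11-byte 8.3 field, padding with spaces (20h)."""
    name, _, ext = filename.partition(".")
    if not 1 <= len(name) <= 8 or len(ext) > 3 or "." in ext:
        raise ValueError("name does not fit the 8.3 format")
    # ljust pads with spaces up to 8 and 3 bytes respectively.
    return (name.upper().ljust(8) + ext.upper().ljust(3)).encode("ascii")


def decode_83(raw: bytes) -> str:
    """Turn the 11 stored bytes back into a readable 'NAME.EXT' string."""
    name = raw[:8].decode("ascii").rstrip(" ")
    ext = raw[8:11].decode("ascii").rstrip(" ")
    return f"{name}.{ext}" if ext else name


if __name__ == "__main__":
    stored = encode_83("1.txt")
    print(stored, decode_83(stored))
```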

P.S. Let me remind you that multi-byte values are read from right to left (least significant byte first): if the dump shows 00 02h, the value is actually 0200h, i.e. 512 in decimal.

The most important parameter for us is located at offset 1Ah: the "low word of the first file cluster". It stores the number of the cluster in which the contents of the file begin, which means we can work with the file's data, i.e. read it, edit it, etc.

For example, "1.txt" is stored in cluster number 0x0003, or 3 in decimal. This means that if we go to cluster No. 3 in the data area (remember, the data area is just a sequence of clusters), we get to the contents of this file.
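
As a rough sketch, a 32-byte directory element can be decoded like this in Python. The function name and result keys are hypothetical. The first-cluster field at 1Ah and the E5h deletion mark come from this post; the attribute byte at 0Bh, the 4-byte size at 1Ch, and the rule that a first byte of 00h marks a never-used slot are taken from the standard entry layout and should be read as assumptions here.

```python
import struct
from typing import Optional

ATTR_DIRECTORY = 0x10  # attribute bit that marks a directory
DELETED_MARK = 0xE5    # first name byte of a deleted entry


def parse_dir_entry(entry: bytes) -> Optional[dict]:
    """Decode one 32-byte directory element; return None for an unused slot."""
    if len(entry) != 32:
        raise ValueError("a directory element is exactly 32 bytes")
    if entry[0] == 0x00:
        return None  # assumed meaning: slot was never used
    # 2-byte first cluster at 1Ah followed by the 4-byte size at 1Ch.
    first_cluster, size = struct.unpack_from("<HI", entry, 0x1A)
    return {
        "name": entry[0:8].decode("ascii", errors="replace").rstrip(" "),
        "ext": entry[8:11].decode("ascii", errors="replace").rstrip(" "),
        "deleted": entry[0] == DELETED_MARK,
        "is_directory": bool(entry[0x0B] & ATTR_DIRECTORY),
        "first_cluster": first_cluster,
        "size": size,
    }
```

For the "1.txt" element from the example, `first_cluster` would come out as 3.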

You may have a "practical" question: how do we find this third cluster? At what address is it?

How to find the cluster address knowing its number?

To do this, you need to know how much space the system area occupies and how large the clusters are (i.e. how many 512-byte sectors a cluster contains).

The following figure will help you find out the size of the system area:

Example for my case

The boot sector occupies 512 * ReservedSectors bytes, in my case 512 bytes. Next, the FAT table occupies one sector, i.e. 512 bytes (since SectorPerFat is 1). There are two tables (because NumberOfFATs is 2), so the two tables together take 512 * 2 = 1024 bytes. The root directory holds 512 32-byte elements, i.e. 512 * 32 = 16384 bytes. Adding it up:

512 (boot sector) + 1024 (two FAT tables) + 16384 (root directory) = 17920 bytes, or 4600 in hexadecimal.

As a result, in our case, the data area starts at 0x4600. Let's take a look:

We see the contents of some file, but not ours. The data of the file we are interested in (1.txt) is stored in cluster No. 3.

Now we need to find out the cluster size; the boot sector parameter SectorPerCluster (offset 0Dh, 1 byte in size) will help us with this. In our case the cluster size is 4 sectors, i.e. 512 * 4 = 2048 bytes, or 800 in hexadecimal. It is important to note that clusters are numbered from two, not from one (!).

Let's calculate the address at which cluster No. 3 starts:

0x4600 (system area) + 0x800 (size of cluster No. 2, which comes first) = 0x4E00

Let's calculate the address at which cluster No. 3 ends:

0x4E00 (beginning of cluster No. 3) + 0x800 (512 * 4, the size of one cluster in HEX) = 0x5600

As a result, cluster No. 3 occupies the address range 0x4E00-0x55FF, and the next cluster starts at 0x5600.
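
The same arithmetic, generalized to any cluster number, might look like the following Python sketch. The function names are hypothetical and a 512-byte sector is assumed throughout; the only subtlety is subtracting 2, because cluster numbering starts at two.

```python
SECTOR_SIZE = 512  # bytes per sector, as in the text


def data_area_offset(reserved_sectors: int, number_of_fats: int,
                     sectors_per_fat: int, root_entries: int) -> int:
    """Size of the system area in bytes, i.e. where the data area begins."""
    boot = reserved_sectors * SECTOR_SIZE
    fats = number_of_fats * sectors_per_fat * SECTOR_SIZE
    root = root_entries * 32  # each root directory element is 32 bytes
    return boot + fats + root


def cluster_offset(cluster: int, data_start: int, sectors_per_cluster: int) -> int:
    """Byte address of the first byte of the given cluster."""
    if cluster < 2:
        raise ValueError("clusters are numbered from 2")
    return data_start + (cluster - 2) * sectors_per_cluster * SECTOR_SIZE


if __name__ == "__main__":
    start = data_area_offset(1, 2, 1, 512)
    print(hex(start), hex(cluster_offset(3, start, 4)))
```

With the parameters of the example image (1 reserved sector, 2 FATs of 1 sector each, 512 root entries, 4 sectors per cluster) this reproduces the addresses computed by hand.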

Let's see the HEX code

I have framed the content of the 1.txt file in blue. Everything above the frame is the contents of another file. Unused areas are filled with 0x00.

So why do we need a FAT table?

If the file occupies more than one cluster (in our case, if the file is larger than 2048 bytes), the FAT table comes to the rescue. It is something like a "map" of the clusters: once we know the number of the cluster in which the file we are interested in begins, the first thing to do is look at the row with the same number in the FAT.

If the row holds a value in the range 0xFFF8-0xFFFF, this is the last cluster of the file, i.e. the file occupies just one cluster.

If the row holds a value in the range 0x0002-0xFFEF, the file continues in another cluster. The value is the number of the next cluster, which holds the continuation of the file. We must continue reading the file from the cluster with that number.

After reading the new cluster, look at the row with its number in the FAT. If the value is in the range 0xFFF8-0xFFFF, this cluster is the last one in the file. If it is in the range 0x0002-0xFFEF, it is the number of the next cluster; read on and repeat. Reading a file is thus a loop with a termination condition.
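
The loop just described can be sketched as follows. This is an illustrative helper, not code from any real driver: `fat` is assumed to be the raw bytes of one FAT copy (including the two leading service entries, so that row N sits at byte offset 2 * N), and the loop guard against corrupted tables is my own addition.

```python
import struct
from typing import List


def cluster_chain(fat: bytes, first_cluster: int) -> List[int]:
    """Return the numbers of all clusters of a file, in reading order."""
    chain: List[int] = []
    seen = set()
    cluster = first_cluster
    while True:
        if cluster < 2 or cluster * 2 + 2 > len(fat):
            raise ValueError("cluster number is outside the FAT")
        if cluster in seen:
            raise ValueError("the chain loops back on itself")
        seen.add(cluster)
        chain.append(cluster)
        # Each row is a 2-byte little-endian value.
        (value,) = struct.unpack_from("<H", fat, cluster * 2)
        if value >= 0xFFF8:
            return chain              # last cluster of the file
        if not 0x0002 <= value <= 0xFFEF:
            raise ValueError("the chain runs into a free, reserved or defective row")
        cluster = value               # continue with the next cluster
```

Combined with a function that converts a cluster number into a byte address, the returned list is all that is needed to read a fragmented file in order.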

So we figured out the files, now it's time to deal with the directories.

What is a directory?

In the FAT16 file system (and in many others), a directory is a special file whose recorded size is zero and whose contents are a list of the items it holds.

Let's say we added the directory "TEST_DIR" containing the file "in_dir.txt". A new 32-byte element then appears in the root directory; it describes the directory in the same way as a file, with slight differences.

I have marked in red the parameters specific to directories: 0x10, the directory attribute, and 0x00000000, the file size.

As you can see in the blue square, our directory is in cluster No. 5. Let's see what is there.

The contents of the "file" TEST_DIR are organized just like the root directory, i.e. as a set of 32-byte elements. I have marked each element with a green border.

The elements describe the name of the file or directory, its attributes, and the number of the cluster in which its data is located. Every folder always contains two directories named "." and "..".

The first points to cluster No. 5, i.e. to this same directory. The second, "..", points to the parent directory; here it holds cluster number 0, which stands for the root directory, i.e. it is the way back up to the root.

The description of the file "in_dir.txt" is standard, the same as in the root directory (see the root directory section). For us, the main thing is the number of the cluster in which the contents of this file are located (marked with a red square).

We look at cluster No. 6 and see the contents of the file "in_dir.txt". I have marked the beginning of the cluster with a red line.


In FAT, filenames are in 8.3 format and consist only of ASCII characters. VFAT added support for long filenames (Long File Name, LFN) of up to 255 characters, encoded in UTF-16LE; LFNs are stored alongside the 8.3 names, which are retrospectively referred to as SFNs (Short File Name). LFNs are case-insensitive when looked up; however, unlike SFNs, which are stored in upper case, LFNs retain the case specified when the file was created.

Structure of the FAT system

In the FAT file system, contiguous disk sectors are combined into units called clusters. The number of sectors in a cluster is a power of two (see below). A whole number of clusters (at least one) is allocated for storing a file's data, so, for example, if the file size is 40 bytes and the cluster size is 4 KB, only about 1% of the space allocated to the file is actually occupied by its contents. To avoid such waste, it is advisable to reduce the cluster size; conversely, to reduce the amount of address information and speed up file operations, it is advisable to increase it. In practice a compromise is chosen. Since the capacity of a disk may well not be a whole number of clusters, there are usually so-called surplus sectors at the end of the volume: a remainder smaller than one cluster, which the OS cannot allocate for storing information.
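
The space lost to whole-cluster allocation is easy to quantify. The two helpers below are hypothetical and only cover non-empty files; they show the rounding up to whole clusters and the resulting share of allocated space that the file contents actually use.

```python
def clusters_needed(file_size: int, cluster_size: int) -> int:
    """Number of whole clusters allocated for a non-empty file."""
    if file_size <= 0 or cluster_size <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division: a partly filled cluster still counts as a whole one.
    return -(-file_size // cluster_size)


def utilization(file_size: int, cluster_size: int) -> float:
    """Fraction of the allocated space that holds real file data."""
    allocated = clusters_needed(file_size, cluster_size) * cluster_size
    return file_size / allocated


if __name__ == "__main__":
    print(clusters_needed(40, 4096), f"{utilization(40, 4096):.2%}")
```

For a 40-byte file in a 4 KB cluster this gives one cluster and a utilization just under 1%, matching the figure quoted above.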

The FAT32 volume space is logically divided into three contiguous areas:

  • The reserved area. It contains service structures that belong to the partition boot record (Partition Boot Record, PBR, as distinct from the Master Boot Record, the master boot record of a disk; the PBR is also often incorrectly called the boot sector) and are used when initializing the volume;
  • The FAT table area, containing an array of index pointers ("cells") that correspond to the clusters of the data area. There are usually two copies of the FAT table on the disk for reliability;
  • The data area, where the actual contents of files are written (that is, the text of text files, the encoded image for picture files, digitized sound for audio files, etc.) along with the so-called metadata: information about file and folder names, their attributes, creation and modification times, size and location on disk.

FAT12 and FAT16 also have a dedicated area for the root directory. It has a fixed position (immediately after the last entry in the FAT table) and a fixed size in sectors.

If a cluster belongs to a file, then the cell corresponding to it contains the number of the next cluster of the same file. If the cell corresponds to the last cluster of the file, then it contains a special value (0xFFFF for FAT16). Thus, a chain of file clusters is built. Zeros in the table correspond to unused clusters. "Bad" clusters (which are excluded from processing, for example, because the corresponding area of the device is unreadable) also have a special code.

When a file is deleted, the first character of the name is replaced with the special code 0xE5 and the chain of file clusters in the allocation table is reset to zero. Since the information about the file size (which is located in the directory next to the file name) remains intact, the deleted file can be recovered if its clusters were located sequentially on the disk and have not been overwritten with new information.

Boot record

The first structure of a FAT volume is called the BPB (BIOS Parameter Block) and is located in the reserved area, in sector zero. This structure contains information identifying the type of file system and the physical characteristics of the medium (floppy disk or hard disk partition).

BIOS Parameter Block

In principle, the BPB was absent from the FAT that served MS-DOS 1.x, since at that time only two different types of volume were envisaged, one- and two-sided five-inch 360 KB floppy disks, and the volume format was determined by the first byte of the FAT area. The BPB was introduced in MS-DOS 2.x in early 1983 as a mandatory boot sector structure from which the volume format was to be determined from then on; the old scheme of detecting the format by the first FAT byte was no longer supported. MS-DOS 2.0 also introduced a hierarchy of files and folders (before that, all files were stored in the root directory).

The BPB structure in MS-DOS 2.x contained a 16-bit "total number of sectors" field, which meant that this version of FAT was fundamentally inapplicable to volumes larger than 2^16 = 65,536 sectors, that is, larger than 32 MB with a standard sector size of 512 bytes. In MS-DOS 4.0 (1988), this BPB field was extended to 32 bits, which raised the theoretical volume size to 2^32 = 4,294,967,296 sectors, i.e. up to 2 TB with a 512-byte sector.

The next modification of the BPB appeared with Windows 95 OSR2, which introduced FAT32 (in August 1996). The two-gigabyte limit on volume size was removed; a FAT32 volume can theoretically be up to 8 TB in size. However, the size of an individual file cannot exceed 4 GB. For compatibility with earlier versions of FAT, the FAT32 BIOS Parameter Block repeats the FAT16 BPB up to and including the BPB_TotSec32 field; after that the two differ.

The FAT32 "boot sector" is actually three 512-byte sectors - sectors 0, 1 and 2. Each of them contains the signature 0xAA55 at address 0x1FE, that is, in the last two bytes, if the sector size is 512 bytes. If the sector size is more than 512 bytes, then the signature is contained both at address 0x1FE and in the last two bytes of the zero sector, that is, it is duplicated.

FSInfo

The boot record of a FAT32 partition contains a structure called FSInfo, used to store the value of the number of free clusters on the volume. FSInfo, as a rule, occupies sector 1 (see the BPB_FSInfo field) and has the following structure (addresses relative to the beginning of the sector):

  • FSI_LeadSig. The 4-byte signature 0x41615252 indicates that the sector is being used for the FSInfo structure.
  • FSI_Reserved1. The interval from the 4th to the 483rd bytes of the sector, inclusive, is reset to zero.
  • FSI_StrucSig. Another signature is located at 0x1E4 and contains the value 0x61417272.
  • FSI_Free_Count. The four-byte field at address 0x1E8 contains the last number of free clusters on the volume known to the system. The value 0xFFFFFFFF means that the number of free clusters is unknown and must be calculated.
  • FSI_Nxt_Free. A four-byte field at address 0x1EC contains the cluster number from which the search for free clusters in the index pointer table should begin. Typically, this field contains the number of the last FAT cluster assigned to store the file. The value 0xFFFFFFFF means that the search for a free cluster should be carried out from the very beginning of the FAT table, that is, from the second cluster.
  • FSI_Reserved2. Reserved 12-byte field at address 0x1F0.
  • FSI_TrailSig. Signature 0xAA550000 - the last 4 bytes of the FSInfo sector.

The point of introducing FSInfo is to optimize the system performance, since in FAT32 the index pointer table can be large and its byte-by-byte lookup can take a significant amount of time. However, the values ​​of the fields FSI_Free_Count and FSI_Nxt_Free may not correspond to reality and should be checked for adequacy. In addition, they are not even updated in the FSInfo backup, which is usually located in sector 7.
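
A minimal Python sketch of reading the FSInfo fields listed above is given below. The function name and result keys are hypothetical, a 512-byte sector is assumed, and the fields are assumed to be stored little-endian like the other on-disk FAT values. Since the counters may be stale, a caller should treat them as hints only.

```python
import struct

LEAD_SIG = 0x41615252
STRUC_SIG = 0x61417272
TRAIL_SIG = 0xAA550000
UNKNOWN = 0xFFFFFFFF  # "value not known, must be computed"


def parse_fsinfo(sector: bytes) -> dict:
    """Validate the three signatures and return the two hint fields."""
    if len(sector) < 512:
        raise ValueError("expected a full 512-byte sector")
    (lead,) = struct.unpack_from("<I", sector, 0x000)
    # FSI_StrucSig, FSI_Free_Count and FSI_Nxt_Free are adjacent at 0x1E4.
    struc, free_count, next_free = struct.unpack_from("<III", sector, 0x1E4)
    (trail,) = struct.unpack_from("<I", sector, 0x1FC)
    if (lead, struc, trail) != (LEAD_SIG, STRUC_SIG, TRAIL_SIG):
        raise ValueError("not an FSInfo sector")
    return {
        "free_count": None if free_count == UNKNOWN else free_count,
        "next_free": None if next_free == UNKNOWN else next_free,
    }
```

Returning `None` for 0xFFFFFFFF makes it explicit that the caller has to scan the FAT itself in that case.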

Determining the type of FAT volume

The OS determines the type of a FAT volume (that is, chooses between FAT12, FAT16 and FAT32) from the number of clusters in the volume, which in turn is derived from the BPB fields. First, the number of sectors occupied by the root directory is calculated, rounded up to a whole sector:

RootDirSectors = ((BPB_RootEntCnt * 32) + (BPB_BytsPerSec - 1)) / BPB_BytsPerSec

Next, the number of sectors in the data area is determined, where FATSz is the size of one FAT in sectors and TotSec is the total number of sectors on the volume:

DataSec = TotSec - (BPB_ResvdSecCnt + (BPB_NumFATs * FATSz) + RootDirSectors)

Finally, the number of data area clusters is determined (the division rounds down):

CountofClusters = DataSec / BPB_SecPerClus

The number of clusters unambiguously determines the file system type:

  • CountofClusters < 4085: FAT12
  • CountofClusters from 4085 to 65524: FAT16
  • CountofClusters > 65524: FAT32

According to the official specification, this is the only valid way to determine the FAT type. Artificially creating a volume that violates these rules will cause Windows to handle it incorrectly. Nevertheless, it is recommended to avoid CountofClusters values close to the critical ones (4085 and 65525) so that drivers, which are often written incorrectly, still determine the file system type correctly.
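
The three calculations and the thresholds can be combined into one short function. This is a sketch with hypothetical parameter names that mirror the BPB fields; the root directory size is rounded up to whole sectors, which gives the same result as plain division whenever the root directory fills its sectors exactly (and is zero on FAT32, where the root entry count is 0).

```python
def fat_type(bytes_per_sector: int, root_entries: int, reserved_sectors: int,
             number_of_fats: int, fat_size_sectors: int, total_sectors: int,
             sectors_per_cluster: int) -> str:
    """Classify a volume as FAT12, FAT16 or FAT32 by its cluster count."""
    # Root directory size in sectors, rounded up.
    root_dir_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    data_sectors = total_sectors - (
        reserved_sectors + number_of_fats * fat_size_sectors + root_dir_sectors
    )
    # Integer division: surplus sectors at the end do not form a cluster.
    count_of_clusters = data_sectors // sectors_per_cluster
    if count_of_clusters < 4085:
        return "FAT12"
    if count_of_clusters <= 65524:
        return "FAT16"
    return "FAT32"


if __name__ == "__main__":
    # A 1.44 MB floppy-like layout (illustrative values).
    print(fat_type(512, 224, 1, 2, 9, 2880, 1))
```

Note that the decision depends only on the computed cluster count, never on a type string stored in the boot sector.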

Over time, FAT became widely used in various devices for compatibility between DOS, Windows, OS/2 and Linux. Microsoft showed no intention of forcing them to be licensed.

In February 2009, Microsoft sued TomTom, a maker of Linux-based in-car navigation systems, for patent infringement.

Links

  • ECMA-107 FAT standard