In brief, it can be described as follows.
The
allocation of file space within a DOS partition is recorded and maintained
within DOS's File Allocation Tables
(FATs). The FATs make up a map of
the
utilization of space on any floppy or hard disk with one entry in the FAT for
each allocatable cluster of sectors. Each entry in the FAT can indicate one of
four possible conditions for the cluster of sectors it represents: it can be
unused and available for allocation, unused and marked as bad to prevent its
use, in use and pointing to the next cluster of the file, or in use as the last
cluster of a file.
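As a concrete illustration, here is a minimal sketch in C of how those four conditions might be distinguished, assuming 16-bit FAT16-style entry values (0x0000 free, 0xFFF7 bad, 0xFFF8 and above end-of-chain); a real FAT also has reserved ranges, which this sketch ignores, and the names are ours, not DOS's.

```c
/* A minimal sketch of the four FAT entry conditions, assuming 16-bit
 * (FAT16-style) entries; reserved value ranges are ignored and the
 * names below are illustrative, not DOS's own. */
#include <stdint.h>
#include <stdio.h>

enum fat_state { FAT_FREE, FAT_BAD, FAT_NEXT, FAT_LAST };

static enum fat_state classify(uint16_t entry)
{
    if (entry == 0x0000) return FAT_FREE;   /* unused, available for allocation */
    if (entry == 0xFFF7) return FAT_BAD;    /* unused, marked bad               */
    if (entry >= 0xFFF8) return FAT_LAST;   /* in use, last cluster of a file   */
    return FAT_NEXT;                        /* in use, points to next cluster   */
}

int main(void)
{
    uint16_t samples[] = { 0x0000, 0xFFF7, 0x0005, 0xFFFF };
    for (int i = 0; i < 4; i++)
        printf("entry 0x%04X -> state %d\n", samples[i], classify(samples[i]));
    return 0;
}
```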
If each entry in the FAT points to the next, who points to the first entry? This
is the role of the file's directory entry. It contains the name of the file,
the file's exact length, the time and date of the file's last modification,
file attribute flags, and the identity of the file's first cluster. In a sense, a
file's directory entry forms the head of the file's allocation chain with each
link thereafter pointing to the next link in the chain.
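The classic FAT12/16 directory entry occupies 32 bytes on disk. The sketch below models that layout in C and follows an allocation chain from the directory entry's first-cluster field through a toy FAT; the field names and the simplified end-of-chain test are our own.

```c
/* Sketch of the classic 32-byte DOS directory entry (FAT12/16 layout)
 * and of following a file's cluster chain through a toy FAT. The
 * end-of-chain handling is simplified for illustration. */
#include <stdint.h>
#include <stdio.h>

#pragma pack(push, 1)
struct dir_entry {
    char     name[8];        /* file name, space padded      */
    char     ext[3];         /* extension                    */
    uint8_t  attr;           /* attribute flags              */
    uint8_t  reserved[10];
    uint16_t time;           /* time of last modification    */
    uint16_t date;           /* date of last modification    */
    uint16_t first_cluster;  /* head of the allocation chain */
    uint32_t size;           /* exact file length in bytes   */
};
#pragma pack(pop)

/* Walk the chain: the directory entry supplies the first link,
 * and each FAT entry thereafter points to the next. */
static void walk_chain(const uint16_t *fat, uint16_t first)
{
    for (uint16_t c = first; c < 0xFFF8; c = fat[c])
        printf("cluster %u\n", c);
}

int main(void)
{
    uint16_t fat[16] = {0};
    fat[2] = 5; fat[5] = 6; fat[6] = 0xFFFF;   /* chain: 2 -> 5 -> 6 -> end */
    struct dir_entry e = { "EXAMPLE ", "TXT", 0x20, {0}, 0, 0, 2, 1200 };
    walk_chain(fat, e.first_cluster);
    return 0;
}
```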
This
system, while quite workable and efficient, does have its dangers. These
dangers center on the fact that the FAT contains the ONLY record of disk
space utilization, and that a stubborn failure to correctly read a single sector of
the FAT could render hundreds of files unrecoverable. This danger explains the
popularity of several utility programs that create a backup copy of the File
Allocation Table and Root Directory with each system boot-up. They provide some
hope of recovery from the cataclysmic loss of the FAT's data.
The
original designers of DOS were aware of the importance of the FAT and
provided a duplicate copy immediately following the first, but its physical
proximity to the original renders it little better than no copy at all, and DOS has long
been notorious for failing to intelligently utilize this extra copy of FAT
information even in the event of a primary FAT failure. (DOS 3.3 seems to be
much smarter in this regard.)
Important
as FAT reliability is, it's not generally the prime source of DOS file
corruption, since even with perfect data retrieval, it's still possible to
scramble DOS's files like crazy. The primary causes of DOS file system trouble
are user error, program bugs, and "glitches." The advent of
"rule-breaking" TSR (terminate-and-stay-resident) multitasking-style software has further
complicated the scene.
When a
new file is created or "opened," information about it is maintained
inside DOS. The file's name, status, and first cluster are all held in internal
tables. Then, as the file grows, free clusters are "checked out" of
the File Allocation Table and allocated to the file's chain of clusters.
Now here's the crucial fact which causes so much trouble: No matter how big the
newly created file becomes, a directory entry for the file is ONLY created when
the file is finally and properly CLOSED. Until then the file exists only as a
chain of allocated clusters filled with the file's data. If anything occurs to
prevent the error-free closing of this file, we have a real problem: the
file's data is occupying a chain of "checked out" disk clusters, but
there is no anchoring directory entry to point to the first cluster in the
chain!
A chain of clusters without an anchoring directory entry is called a "lost
chain." It exists, it contains data, but there's no record of the file's
name, exact size, or purpose.
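The toy model below sketches how such a chain comes about, under assumed names: clusters are "checked out" of a miniature FAT as the file grows, but the line that would record the first cluster in a directory entry never runs.

```c
/* Toy model of the trouble spot: clusters are "checked out" of the FAT
 * as a file grows, but the anchoring directory entry is only written on
 * a successful close. All names here are illustrative. */
#include <stdint.h>
#include <stdio.h>

#define CLUSTERS 32
static uint16_t fat[CLUSTERS];      /* 0 = free, 0xFFFF = end of chain        */
static uint16_t dir_first[8];       /* first-cluster fields of dir entries    */
static int      dir_count;

static uint16_t alloc_cluster(void) /* "check out" the next free FAT entry */
{
    for (uint16_t c = 2; c < CLUSTERS; c++)
        if (fat[c] == 0) { fat[c] = 0xFFFF; return c; }
    return 0;
}

int main(void)
{
    /* The file grows: three clusters are checked out and chained together */
    uint16_t first = alloc_cluster();
    uint16_t prev = first;
    for (int i = 0; i < 2; i++) {
        uint16_t next = alloc_cluster();
        fat[prev] = next;
        prev = next;
    }

    /* ...but the system crashes here, before the close that would have run:
     *     dir_first[dir_count++] = first;
     * so no directory entry anchors the chain. */
    (void)dir_first;

    printf("%d directory entries written; clusters %u..%u hold data "
           "that nothing points to\n", dir_count, first, prev);
    return 0;
}
```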
Lost cluster chains are frequently created when programs abort abnormally, when
TSRs crash the system suddenly, when the computer user forgets to write a
TSR's files out to disk before shutting the system down, or when a task in a
multi-tasking system is not terminated. (It's easy to forget that a file was
left open in a suspended background task.) Additionally, any damage to DOS's
root directory or subdirectories can "liberate" chains of lost
clusters.
DOS
provides the CHKDSK (pronounced Check Disk) command to help its users keep an
eye on just these sorts of problems. CHKDSK provides a comprehensive
verification of DOS's filing system integrity and provides a means for
straightening things out. When the CHKDSK command is given, the parentage of
all cluster chains is checked, allocation chains are "followed" to be
sure they don't cross over other chains (creating cross-linked files), and
several other system integrity checks are performed.
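The following sketch imitates two of these checks on a toy FAT; it is not CHKDSK's actual algorithm, just one plausible way to mark every cluster reachable from a directory entry, flag a cluster claimed twice as cross-linked, and flag allocated-but-unclaimed clusters as lost.

```c
/* A rough sketch of two CHKDSK-style integrity checks on a toy FAT:
 * every chain reachable from a directory entry is marked; a cluster
 * marked twice reveals a cross-link; any allocated cluster left
 * unmarked belongs to a lost chain. Illustrative only. */
#include <stdint.h>
#include <stdio.h>

#define CLUSTERS 16

int main(void)
{
    uint16_t fat[CLUSTERS] = {0};
    uint8_t  owners[CLUSTERS] = {0};

    fat[2] = 3;  fat[3] = 0xFFFF;             /* file A: 2 -> 3         */
    fat[4] = 3;                               /* file B: 4 -> 3 (!)     */
    fat[7] = 8;  fat[8] = 0xFFFF;             /* orphaned chain: 7 -> 8 */

    uint16_t dir_first[] = { 2, 4 };          /* directory entries      */

    for (int f = 0; f < 2; f++)
        for (uint16_t c = dir_first[f]; c < 0xFFF8; c = fat[c])
            if (owners[c]++)                  /* already claimed?       */
                printf("cluster %u is cross-linked\n", c);

    for (uint16_t c = 2; c < CLUSTERS; c++)
        if (fat[c] != 0 && !owners[c])
            printf("cluster %u is part of a lost chain\n", c);
    return 0;
}
```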
In the
case of lost chains, CHKDSK will offer to convert them into files (named
FILE0000.CHK, FILE0001.CHK, and so on) anchored to the root directory. Any
suitable text editor can then be used to inspect these new files, identify
their contents, and move the data back to where it belongs.
Unfortunately
the structure of DOS filing systems lacks the fundamental redundancy required
to provide simple and error-free recovery from many forms of damage. Even the
tools and techniques available from third-party suppliers can't surmount these
problems. The best bet is to understand DOS's weak spots, make certain that all
opened files are closed successfully, run CHKDSK weekly to collect
accumulating file fragment "debris," and back up your hard disks
regularly.
"Disk
Optimizers" which promise to increase the throughput and performance of
old and well used hard disk drives number among the most popular of the general
use hard disk utilities.
We've seen how DOS's file allocation system operates. Files are composed of
clusters, which in turn are composed of sectors. And while the group of sectors
that make up a cluster is by definition contiguous, the cluster-linking
scheme which DOS employs allows a file's clusters to be scattered across the
disk's surface. Since the file's directory entry specifies the file's first
cluster, and each succeeding cluster entry in the file allocation table
specifies the next one, the file's contents could be literally anywhere on the
disk. The term "file fragmentation" refers to the condition where a
file's clusters are not consecutively numbered. Let's first examine how a
disk's files might become fragmented.
When a
file is deleted from a disk, its directory entry is flagged as unused and each
cluster the file occupies is flagged in the system's FAT as being free
for use. If the surrounding clusters are still in use by other files, this
creates a "hole" of free space on the disk.
Now suppose that a new file is copied from a floppy disk onto the hard disk. As
DOS reads the new file's data from the floppy, it must allocate space for this
file on the hard disk. So each time another cluster of sectors is needed, DOS
searches through the file allocation table to find the next available cluster.
In our example, DOS would discover the clusters freed by the file we deleted
earlier and allocate them for use by
the new file. Then, when all of the clusters in the free-space hole had been
used, DOS would be forced to continue its search deeper into the drive. When
space was found further in, the file's contents would be partially stored near
the beginning of the disk and partially nearer to the end. The file would then
consist of at least two fragments.
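In FAT terms, a fragment boundary is simply a place where the chain's next cluster isn't the cluster that physically follows. Here is a minimal sketch (FAT16-style values, names of our own choosing) that counts a file's fragments:

```c
/* A small sketch of detecting fragmentation: walk a file's chain and
 * count every place where the next cluster isn't the one that follows
 * on disk. Toy FAT16-style values; illustrative only. */
#include <stdint.h>
#include <stdio.h>

static int count_fragments(const uint16_t *fat, uint16_t first)
{
    int fragments = 1;                        /* the first run          */
    for (uint16_t c = first; fat[c] < 0xFFF8; c = fat[c])
        if (fat[c] != c + 1)                  /* chain jumps: new run   */
            fragments++;
    return fragments;
}

int main(void)
{
    uint16_t fat[32] = {0};
    /* contiguous file:  2 -> 3 -> 4 -> end */
    fat[2] = 3; fat[3] = 4; fat[4] = 0xFFFF;
    /* fragmented file:  10 -> 11 -> 25 -> 26 -> end */
    fat[10] = 11; fat[11] = 25; fat[25] = 26; fat[26] = 0xFFFF;

    printf("file 1: %d fragment(s)\n", count_fragments(fat, 2));
    printf("file 2: %d fragment(s)\n", count_fragments(fat, 10));
    return 0;
}
```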
During
the normal course of daily computer usage, files are constantly being
created, copied, extended, deleted, and replaced. When a word processor creates
an automatic backup file, the original file is typically renamed to identify it
as a backup file and a new file is created. Every new file creation is an
opportunity for fragmentation. The files that are modified most often
are most subject to extensive fragmentation, since any search by DOS for a free
file cluster is almost guaranteed to produce a new discontinuity. With
continued use, it's typical for much of the disk's file data to become
haphazardly scattered across the surface of the disk drive.
But
since DOS's cluster allocation scheme was specifically designed to manage such
scattering, what's the problem? Any time the drive's head moves, two things
occur: time is consumed, and the drive experiences some mechanical wear and
tear. If a file's data is scattered across the surface of the disk, the drive's
head is forced to move a large distance many times to read a single file. If
the file is a database whose records are being accessed at random, this
excessive head motion can degrade the overall system performance tremendously
and induce many other wear-related disk drive problems.
The extra
time wasted in cluster fragment chasing is directly proportional to the drive's
average head access time. The prior generation of 65-to-80-millisecond stepping-motor
drives loses far more performance to fragmentation than the latest sub-28
millisecond drives.
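As a rough, back-of-the-envelope illustration (the figures are assumed, not measured): a file broken into 100 fragments costs on the order of 99 extra seeks. At a 70 millisecond average access time, that is nearly 7 seconds of pure head motion for a single file, while a 28 millisecond drive would spend under 3 seconds doing the same work.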
Disk optimizers like SoftLogic Solutions' DISK OPTIMIZER, Norton's SPEEDDISK,
Central Point's COMPRESS, and Golden Bow's VOPT operate by physically
rearranging the allocation of files on the disk. They relocate file cluster
fragments while simultaneously updating the system's File Allocation Tables to
reflect the new cluster locations. When finished, every file on the disk
consists of a single contiguous run of consecutively
numbered clusters. Once the disk drive's head has been positioned to the beginning
of the file, the entire file can be read or randomly accessed with an absolute
minimum of head motion. Besides improving the system's overall performance,
file defragmentation minimizes the mechanical wear and tear placed upon the
drive's hardware. If some disaster should befall your system's Root Directory
or File Allocation Table, contiguous
files are also much easier to find and recover than files with severe
fragmentation.
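The toy sketch below captures the essential move under heavy simplification: each file's clusters are copied, in chain order, into a consecutive run, and the FAT and the directory's first-cluster field are updated to match. Real optimizers work in place, handle many files at once, and guard against power failure; this sketch does none of that.

```c
/* A greatly simplified sketch of what a disk optimizer does: copy one
 * file's clusters, in chain order, into the next free run of a fresh
 * disk image, then update the FAT and the directory's first-cluster
 * field to match. Illustrative only. */
#include <stdint.h>
#include <stdio.h>

#define N 16
static uint16_t fat[N]      = {0};
static char     data[N]     = {0};
static uint16_t new_fat[N]  = {0};
static char     new_data[N] = {0};

int main(void)
{
    /* one fragmented file, "HELLO": 2 -> 9 -> 4 -> 12 -> 5 -> end */
    uint16_t chain[] = { 2, 9, 4, 12, 5 };
    const char *msg = "HELLO";
    for (int i = 0; i < 5; i++) {
        data[chain[i]] = msg[i];
        fat[chain[i]] = (i == 4) ? 0xFFFF : chain[i + 1];
    }
    uint16_t dir_first = 2;

    /* relocate: walk the old chain, pack the clusters consecutively */
    uint16_t next_free = 2;
    uint16_t new_first = next_free;
    for (uint16_t c = dir_first; c < 0xFFF8; c = fat[c]) {
        new_data[next_free] = data[c];
        new_fat[next_free] = next_free + 1;   /* provisional link */
        next_free++;
    }
    new_fat[next_free - 1] = 0xFFFF;          /* terminate chain  */
    dir_first = new_first;                    /* update directory */

    printf("first cluster %u, contents now contiguous: ", dir_first);
    for (uint16_t c = dir_first; c < 0xFFF8; c = new_fat[c])
        putchar(new_data[c]);
    putchar('\n');
    return 0;
}
```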
Since
file fragmentation is a continually recurring fact of life with DOS, periodic
defragmentation, like hard disk backup, should become part of every serious DOS
user's regimen.
c) File allocation table
Figure 5: File Allocation Table
- Allocation of disk blocks to files recorded in a file allocation
table (e.g. MS-DOS, OS/2)
- Sequential Access
- Hold all or part of file allocation table in store to reduce number of
disk accesses
- Maximum of two disk accesses per file block access if the blocks for the
file are referenced in one file map block (i.e. ideally file blocks should
be clustered on the disk); see the sketch below
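A minimal sketch of the two-access bound, with assumed names and a stubbed-out disk read: translating the n-th logical block costs no I/O when the file map is held in store (at worst, one extra read to fetch the relevant file map block), leaving a single access for the data block itself.

```c
/* Illustrative sketch of the two-access bound: with the file map held
 * in store, the chain walk costs no disk I/O (or at most one read to
 * fetch the relevant file map block, not modeled here), plus one read
 * for the data block itself. Names and the I/O stub are assumed. */
#include <stdint.h>
#include <stdio.h>

static uint16_t fat_in_store[64];     /* file allocation table, cached */
static int disk_reads;

static void read_block(uint16_t blk)  /* stub standing in for one disk access */
{
    disk_reads++;
    printf("disk read: block %u\n", blk);
}

/* Read the n-th logical block of a file whose chain starts at `first`. */
static void read_file_block(uint16_t first, int n)
{
    uint16_t blk = first;
    while (n-- > 0)
        blk = fat_in_store[blk];      /* chain walk: memory only */
    read_block(blk);                  /* the single data access  */
}

int main(void)
{
    fat_in_store[2] = 7; fat_in_store[7] = 9; fat_in_store[9] = 0xFFFF;
    read_file_block(2, 2);            /* third logical block -> block 9 */
    printf("total disk accesses: %d\n", disk_reads);
    return 0;
}
```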