Basically, files fragment because whenever a file is edited and data is added, either directly by the user or indirectly by the system, the file may no longer fit in the space where it was originally saved. In the case of the Windows filing system, the file will be saved elsewhere, or the extra cluster(s) needed to hold the changes will be dropped into whatever free space is available.
This is how the filing system works in the simplest way I can put it.
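To make that concrete, here is a rough toy model of the idea, nothing more. The disk is just a Python list of cluster slots, and the names allocate and fragments are made up for this illustration. A real file system is far more sophisticated about where it places data, so treat this as a sketch of the principle only.

```python
# Toy model of a disk: a list of cluster slots, where None means "free".
# All names are illustrative; this is not how NTFS or FAT is implemented.

def allocate(disk, name, count):
    """Place `count` clusters of file `name` into the first free slots found."""
    placed = []
    for index, slot in enumerate(disk):
        if len(placed) == count:
            break
        if slot is None:
            disk[index] = name
            placed.append(index)
    if len(placed) < count:
        raise RuntimeError("disk full")
    return placed


def fragments(disk, name):
    """Count the separate contiguous runs of clusters that belong to `name`."""
    runs = 0
    previous = None
    for slot in disk:
        if slot == name and previous != name:
            runs += 1
        previous = slot
    return runs


disk = [None] * 12
allocate(disk, "A", 3)  # file A is saved first
allocate(disk, "B", 3)  # file B lands directly behind it
allocate(disk, "A", 2)  # A is edited and grows by two clusters

print(disk)
print("fragments of A:", fragments(disk, "A"))
```

Because B is sitting directly behind A, the two extra clusters for A have to go into the next free space, and A ends up in two separate pieces.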
Remember the Windows 98 defragmenter? The little blue squares? Each square represented one cluster, and what you saw was a representation of clusters being put back into proper order. When a hard drive is formatted, whether as FAT32 for Windows 9x or as NTFS for XP, Vista and Windows 7, it is divided into clusters. On NTFS, each cluster is sized at 4k by default.
Have you ever wondered what a cluster contains? I will tell you. It contains a complete file, assuming that the file is no larger than 4k, or it contains part of a file which is larger than 4k. The cluster holds only the file’s data. The name you gave the file when you saved it and its overall size are recorded separately by the filing system (in the Master File Table on NTFS, or in the directory entry on FAT), along with a record of which clusters belong to the file. Without this information, Windows Explorer would not be able to find anything meaningful.
So now you basically know what is contained in each cluster, but there is another limitation, and it is the reason why a defragmenter can’t free up space. All it can do is consolidate clusters into contiguous groups of file clusters and contiguous groups of free clusters.
A cluster can only hold data from ONE saved file. A cluster can’t be shared by two files with different saved names. So, if you create a 1k file, it will be saved on a cluster, and nothing else can occupy the remaining 3k of that 4k cluster. In other words, if you saved two 1k files, the hard drive space taken up by them would be 8k (two clusters), even though the actual combined file size is only 2k. A defragmenter does not defragment files per se. It defragments cluster groups, regardless of what is saved on each cluster.
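The arithmetic is easy to check for yourself. The snippet below is only a sketch: size_on_disk is a name invented for this example, and it ignores any special cases a real filing system may apply. It simply rounds each file up to a whole number of 4k clusters.

```python
CLUSTER_SIZE = 4 * 1024  # 4k, the default cluster size mentioned in the text


def size_on_disk(file_size, cluster_size=CLUSTER_SIZE):
    """Bytes occupied on disk when a file can only use whole clusters."""
    clusters = (file_size + cluster_size - 1) // cluster_size  # round up
    return clusters * cluster_size


files = [1024, 1024]  # two 1k files
actual = sum(files)
on_disk = sum(size_on_disk(size) for size in files)

print(actual // 1024, "k of data occupies", on_disk // 1024, "k on disk")
```

Each 1k file is rounded up to one full cluster, so 2k of data ends up occupying 8k of disk space.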
As you can see, a 4k cluster size is not very efficient with regard to space when saving very small files. Fortunately, most files found on a computer are far larger than 1k to 4k. With files over 4k, only the final cluster can be partly empty, so the space wasted per file is always just under 4k at most. The downside is that there are millions of clusters on the average hard drive, and each one has the potential to be displaced. A 5MB PowerPoint show consumes a great many clusters and can in theory be scattered far and wide.
An obvious way to reduce the scope for so many clusters to be displaced is to increase the cluster size beyond 4k. If the cluster size were raised to 8k or 16k, there would be far fewer clusters which could get displaced. Files saved on them would be in larger pieces, and fragmentation would be less of a problem because the hard drive heads would not have to put in so much work locating and bringing together the smaller number of parts. If only life were so simple. With 8k and 16k clusters, the worst-case wasted space would increase to just under 8k and 16k respectively for every file saved.
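To get a feel for the trade-off, the sketch below counts clusters and wasted space for a handful of files at the three cluster sizes. The file sizes are made up purely for illustration, and clusters_needed and compare are hypothetical helper names, not part of any Windows tool.

```python
def clusters_needed(file_size, cluster_size):
    """Whole clusters required to hold a file of `file_size` bytes."""
    return -(-file_size // cluster_size)  # integer ceiling division


def compare(file_sizes, cluster_sizes=(4096, 8192, 16384)):
    """Return (cluster_size, total_clusters, wasted_bytes) per cluster size."""
    rows = []
    for cluster_size in cluster_sizes:
        clusters = sum(clusters_needed(size, cluster_size) for size in file_sizes)
        wasted = clusters * cluster_size - sum(file_sizes)
        rows.append((cluster_size, clusters, wasted))
    return rows


# Hypothetical sample: a 5MB presentation plus 1k, 9k and 20k files.
sample = [5 * 1024 * 1024, 1024, 9 * 1024, 20 * 1024]

for cluster_size, clusters, wasted in compare(sample):
    print(f"{cluster_size // 1024:>2}k clusters: "
          f"{clusters} clusters used, {wasted // 1024}k wasted")
```

Larger clusters mean fewer pieces that can be scattered, but more space left empty at the end of each file, which is exactly the compromise at stake here.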
The people who decided on the cluster size for NTFS had to make a compromise between the amount of fragmentation that would occur and the amount of space that would be wasted, and 4k seemed like a good idea at the time.
You are probably wondering why we still have to defragment in 2009. It is the same reason that has dogged Windows development for years, namely backwards compatibility. When Windows first appeared, it was a graphical cover slipped over the top of DOS, a low-end, single-user desktop operating system. Windows NT took it up a level, but whatever has carried the name ‘Windows’ has at the very least had to be compatible with the previous version or immediate relative. Microsoft Windows has come a long way despite the compatibility issues, and it is still the cheapest route to a highly polished and capable computer operating system. We should not forget that.
Some Linux distros may be free, and they do have the advantage of being developed from what was always a multi-user system, but they lack the finesse and versatility of Microsoft Windows. If the occasional defrag is the price to pay, then pay up willingly. And for those who are quick to point out that Windows is in the sights of every hacker and script kiddie, with all due respect, I think that if Linux had 180 million users worldwide and Windows was the underdog, it wouldn’t be us Windows users desperately looking for ways to keep our systems clean.
So next time you defrag your system, and you do have to defrag occasionally, do it with pride. You are a member of the largest computer based family on the planet Earth.