A history of S_IFMT
S_IFMT is a mask identifying the bits of an inode's mode that indicate the file's type, i.e. whether it is a directory, a symbolic link, a socket, and so on. It is conventionally 0170000, which corresponds to the top 4 bits of a 16-bit mode.
I saw someone asking the other day why 4 bits are used when POSIX only defines 7 types, which would fit just as well in 3 bits. The straightforward answer is that it allows room for expansion, and indeed many Unixes define several more. Solaris, for example, has an additional 3 types: doors, event ports, and ACL shadows (though the last of these is not exposed in userspace).
But that's not the whole story. The question I'm going to answer in this post is not why 4 bits are used, but why they're used the way they are. If you have a look at the standard file types, their values seem pretty arbitrary, when you might expect a simple count upwards.
I saw some patterns in there, but I couldn't work it out, so I had a look at some historical manuals and header files.
1st Edition UNIX
1st Edition UNIX (1971) had no type field as such. The top 4 bits of the mode had the following layout. A dot (.) means that the bit's value doesn't matter.
|1...|Inode is allocated|
|.1..|Directory|
|..1.|Has been modified|
|...1|Large file storage|
We can see the origin of S_IFDIR here, but the other bits had completely different meanings. In fact, 1st Edition had a very different layout for the mode in general. For one thing, groups had yet to be introduced. The bottom 6 bits were used, from higher to lower, to mean: setuid, executable, owner-read, owner-write, other-read, and other-write. And so 1st Edition ls might write --xrwr- to mean something like
Bit 020000 was apparently always set to 1, and so was likely just ignored by the time of the 1st Edition. Bit 100000 was also always set to 1 for allocated inodes, but this allowed the file system to distinguish between an unallocated inode and a regular file with no permissions (
4th Edition UNIX
The mode layout changed in 4th Edition UNIX (1973), coinciding with the addition of groups and a switch to the modern -rwxrwxrwx layout for the file permissions. This was the first Unix to have a mask for these inode types, though it was only 2 bits wide, taking the place of the directory bit and modification bit.
The allocation bit (IALLOC) and large file bit (ILARG) were still used as in the 1st Edition.
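To make the 2-bit layout concrete, here is a rough sketch in C. The V4_ prefix and the function name are my own invention. IALLOC and ILARG are the names used above, while IFMT, IFDIR, IFCHR and IFBLK and their values follow the old Research UNIX inode headers as far as I know, so treat them as assumptions rather than a quotation of the 4th Edition source.

```c
/* Assumed constants for the 4th Edition layout (octal).
 * The V4_ prefix is hypothetical, added to avoid clashing
 * with anything in modern headers. */
enum {
    V4_IALLOC = 0100000,  /* inode is allocated */
    V4_IFMT   = 0060000,  /* 2-bit type mask */
    V4_IFDIR  = 0040000,  /* directory (the old directory bit) */
    V4_IFCHR  = 0020000,  /* character special (the old modification bit) */
    V4_IFBLK  = 0060000,  /* block special (both type bits set) */
    V4_ILARG  = 0010000   /* large file storage */
};

/* Classify a mode under the assumed 2-bit type field. */
const char *v4_type_name(unsigned mode)
{
    if (!(mode & V4_IALLOC))
        return "unallocated";
    switch (mode & V4_IFMT) {
    case V4_IFDIR: return "directory";
    case V4_IFCHR: return "character special";
    case V4_IFBLK: return "block special";
    default:       return "regular file";  /* both type bits clear */
    }
}
```

Note that ILARG sits just below the mask and IALLOC just above it, so neither takes part in the type comparison. That is the detail that matters for the next step in the story.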
7th Edition UNIX
The next change happened in 7th Edition UNIX (1979), when the mask was widened to the present 4 bits by adding a single bit in each direction, displacing ILARG. Yet each bit retained its absolute position in the mode, which is why the earliest types are not counted from 1. In addition, regular files kept their highest bit set (as it would have been when IALLOC was in use), so as to distinguish between an unallocated inode (stored with a fully zeroed mode), and a regular file with no permissions (
Also added were two types no longer in use, multiplexed special files, which had the same codes as their uniplexed counterparts, but with their lowest bit set. These types did not however last long.
|030000|Multiplexed character special|
|070000|Multiplexed block special|
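The relationship between the multiplexed types and their uniplexed counterparts can be sketched in C as follows. The V7_ names, V7_MPX_BIT and both functions are hypothetical. The octal values are the conventional ones for character and block special files, with the lowest bit of the mask added for the multiplexed variants as described above.

```c
/* Assumed 7th Edition style constants (octal); the V7_ prefix
 * and V7_MPX_BIT are names made up for this sketch. */
enum {
    V7_IFMT    = 0170000,  /* 4-bit type mask */
    V7_IFCHR   = 0020000,  /* character special */
    V7_IFMPC   = 0030000,  /* multiplexed character special */
    V7_IFBLK   = 0060000,  /* block special */
    V7_IFMPB   = 0070000,  /* multiplexed block special */
    V7_MPX_BIT = 0010000   /* lowest bit of the mask, where ILARG used to be */
};

/* True if the mode describes one of the two multiplexed types. */
int v7_is_multiplexed(unsigned mode)
{
    unsigned fmt = mode & V7_IFMT;
    return fmt == V7_IFMPC || fmt == V7_IFMPB;
}

/* Strip the lowest mask bit to recover the uniplexed type code. */
unsigned v7_uniplexed_type(unsigned mode)
{
    return (mode & V7_IFMT) & ~(unsigned)V7_MPX_BIT;
}
```

Stripping the bit only makes sense for the two multiplexed types. Applied to a type such as a named pipe, which later took 010000 for itself, it would yield 0.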
System III (1982) added named pipes, starting at the lowest value now possible.
4.2BSD (1983) added symbolic links and sockets, also counting up but only using the top 3 bits, 160000, presumably so as not to step on AT&T's toes.
Something interesting (to me) about how this layout has come about is that, if you twiddle the bits a little, you can end up with a reasonably chronological numbering of the types. Specifically, in code:
    fmt = mode >> 12;                        // drop file permissions, leaving IFMT
    if (fmt == 010) return 0;                // if only IALLOC bit is set, clear it
    return ((fmt >> 1) | (fmt << 2)) & 07;   // fold rightmost bit onto leftmost bit
And this gives us:
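For anyone who wants to check the mapping themselves, here is the same fold wrapped in a function. The name chrono_index and the T_ constants are hypothetical. The values are the conventional octal ones, hard-coded rather than taken from <sys/stat.h>, since POSIX does not fix them.

```c
/* Conventional type values (octal); T_ names are made up here. */
enum {
    T_IFIFO  = 0010000,
    T_IFCHR  = 0020000,
    T_IFDIR  = 0040000,
    T_IFBLK  = 0060000,
    T_IFREG  = 0100000,
    T_IFLNK  = 0120000,
    T_IFSOCK = 0140000
};

/* Fold the 4-bit type field into a roughly chronological index. */
int chrono_index(unsigned mode)
{
    unsigned fmt = mode >> 12;               /* drop file permissions, leaving IFMT */
    if (fmt == 010)
        return 0;                            /* only the old IALLOC bit is set */
    return ((fmt >> 1) | (fmt << 2)) & 07;   /* fold rightmost bit onto leftmost bit */
}
```

Working through the seven standard types by hand, regular files map to 0, character special to 1, directories to 2, block special to 3, FIFOs to 4, symbolic links to 5, and sockets to 6.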
So anyway, those are the reasons for the unusual