The du (i.e., disk usage) command reports the sizes of directory trees inclusive of all of their contents and the sizes of individual files. This makes it useful for tracking down space hogs, i.e., directories and files that consume large or excessive amounts of space on a hard disk drive (HDD) or other storage media.
A directory tree is a hierarchy of directories that consists of a single directory, called the parent directory or top level directory, and all levels of its subdirectories (i.e., directories within a directory). Any directory can be regarded as being the start of its own directory tree, at least if it contains subdirectories. Thus, a typical computer contains a large number of directory trees.
du is commonly employed by system administrators as a supplement to automated monitoring and notification programs that help prevent key directories and partitions (i.e., logically independent sections of an HDD) from becoming full. Full, or even nearly full, directories and partitions can cause a system to slow down, prevent users from logging in and even result in a system crash. Although identifying heavy consumers of disk space by inspection can be practical if there are relatively few users on a system, it is clearly not efficient for large systems with hundreds or thousands of users.
A minor limitation of du is that the sizes it reports are estimates of the disk space allocated to directories and files rather than exact counts of the bytes they contain. Because space is allocated in whole blocks, there is frequently a small discrepancy between these sizes and the sizes reported by other commands. However, this rarely detracts from its usefulness.
Also, du can only be used to estimate space consumption for directories and files for which the user has read permission. Thus, an ordinary user would generally not be able to use du to determine space consumption for files or directories belonging to other users, including those belonging to the root account (i.e., the system administrator). However, as du is used mainly by system administrators, this is usually not a problem.
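The size discrepancy mentioned above can be seen by comparing a file's length in bytes with the space that du reports as allocated to it. The following is a minimal sketch for a POSIX shell; tiny.txt is a hypothetical name, and the allocated figure depends on the filesystem in use.

```shell
# Work in a throwaway directory so that nothing real is touched.
demo=$(mktemp -d)

# Create a file that holds exactly one byte.
printf 'x' > "$demo/tiny.txt"

# Length of the contents in bytes (tr strips padding that some wc versions add).
bytes=$(wc -c < "$demo/tiny.txt" | tr -d ' ')

# Space allocated to the file, in units of 1024 bytes (first field of du's output).
allocated=$(du -k "$demo/tiny.txt" | cut -f1)

echo "contents: $bytes byte(s); allocated: $allocated KiB"

rm -rf "$demo"
```

On many systems the allocated figure is a whole block (often 4 KiB) even though the file contains a single byte.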
The basic syntax for du is:
    du [options] [directories and/or files]
The items in the square brackets are optional. When used with no options or arguments (i.e., names of directories or files), du lists the names and space consumption of each of the directories (including all levels of subdirectories) in the directory tree that begins with the current directory (i.e., the directory in which the user is currently working). The space consumption of any directory consists of the space occupied by all of the files in it and all of its subdirectories at all levels inclusive of all of the files in them. A final line at the end of the report gives the total space consumption for the directory tree.
du can provide information about any directory trees or files on the system whose names are given as arguments. For example, the following will report the names and sizes for each directory in the directory tree that begins with a directory named directory2 that resides in a directory named directory1, which, in turn, is located in the current directory:
    du directory1/directory2
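To try this without touching real data, the command can be run against a throwaway tree. The following is a minimal sketch for a POSIX shell; the subdirectory sub and the file file.txt are hypothetical, and the sizes printed will vary with the filesystem.

```shell
# Build a small tree: directory1/directory2/sub containing one file.
demo=$(mktemp -d)
mkdir -p "$demo/directory1/directory2/sub"
echo "sample" > "$demo/directory1/directory2/sub/file.txt"

# Run du from the top of the tree, naming the directory as an argument.
report=$(cd "$demo" && du directory1/directory2)
echo "$report"

rm -rf "$demo"
```

The report contains one line for the subdirectory and a final line for directory1/directory2 itself.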
du can accept any number of arguments, and they can be any combination of files and directories. When there are multiple arguments, no grand total is provided by default, although a total is still provided for each argument.
As is the case with most commands on Linux and other Unix-like operating systems, du has a number of options, a few of which are commonly used. The options can vary somewhat according to the particular operating system and the version of du.
One of the most useful options is -h (i.e., human readable), which can make the output easier to read by displaying it in kilobytes (K), megabytes (M) and gigabytes (G) rather than just in the default kilobytes. Thus, the following command can be used to show the sizes of all the subdirectories in the current directory as well as the total size of the current directory, all formatted with the appropriate K, M or G:
    du -h
The -s (for suppress or summarize) option tells du to report only the total disk space occupied by a directory tree and to suppress individual reports for its subdirectories. Thus, for example, the following would provide the total disk space occupied by the current directory in an easy-to-read format:
    du -sh
The output is the same as the last line of a report issued by du with only the -h option.
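This equivalence is easy to check on a throwaway tree. The following is a minimal sketch, assuming a du that supports -h (as GNU and BSD versions do); the directory and file names are hypothetical.

```shell
# Build a small tree with two subdirectories and one file.
demo=$(mktemp -d)
mkdir -p "$demo/docs/drafts" "$demo/images"
echo "notes" > "$demo/docs/drafts/notes.txt"

# Full report: one line per directory, ending with the total for "."
full=$(cd "$demo" && du -h)

# Summary report: the total for "." only.
summary=$(cd "$demo" && du -sh)

# Keep the last line of the full report for comparison.
last_line=$(echo "$full" | tail -n 1)

echo "$full"
echo "---"
echo "$summary"

rm -rf "$demo"
```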
The -a (i.e., all) option tells du to report not just the total disk usage for each directory at every level in a directory tree but also to report the space consumption for each individual file anywhere within the tree. Thus, for example, the following would list the name and size of every directory and file in the /etc directory (which contains system configuration files) for which the user has read permission:
    du -ah /etc
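The effect of -a can be compared with the default report on a small throwaway tree rather than on /etc. The following is a minimal sketch for a POSIX shell; conf, sub and the two .conf files are hypothetical names.

```shell
# Build a tree with one file at the top level and one in a subdirectory.
demo=$(mktemp -d)
mkdir -p "$demo/conf/sub"
echo "a=1" > "$demo/conf/app.conf"
echo "b=2" > "$demo/conf/sub/extra.conf"

# Default report: directories only.
dirs_only=$(cd "$demo" && du conf)

# With -a: every file is listed as well as every directory.
everything=$(cd "$demo" && du -a conf)

echo "$dirs_only"
echo "---"
echo "$everything"

rm -rf "$demo"
```

The default report has two lines (conf/sub and conf), whereas the -a report has four because each file gets its own line.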
A somewhat similar report is provided by using the star ( * ) wildcard, which will match any character or characters. For example, the following command would list the sizes of all directories that are in the tree that begins with the current directory:
    du *
However, the only files listed are those in the parent directory, not those in its subdirectories. Also, no total for the directory tree as a whole is provided.
The use of the -s option and the star wildcard together would cause du to report the names and sizes of only the files and directories contained directly in the top level directory itself (and to not list the names of any of its subdirectories and the files in them). The size of each listed directory is, of course, inclusive of all of its files and subdirectories (including all of the files in them). For example, such a report about the directory tree beginning with the current directory would be provided by the following:
    du -s *
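The combination of -s and the star wildcard can be tried on a throwaway tree as follows. This is a minimal sketch for a POSIX shell; src, lib, docs, util.c and README are hypothetical names.

```shell
# Build a tree with two top-level directories, a nested directory and a top-level file.
demo=$(mktemp -d)
mkdir -p "$demo/src/lib" "$demo/docs"
echo "code" > "$demo/src/lib/util.c"
echo "readme" > "$demo/README"

# The shell expands * to README, docs and src; -s prints one line for each.
top_level=$(cd "$demo" && du -s *)
echo "$top_level"

rm -rf "$demo"
```

The nested directory src/lib does not get a line of its own, although its contents are counted in the figure for src.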
The wildcard can also be used to filter the output to list only those items whose names begin with, contain or end with certain characters or sequences of characters. For example, the following would report the names and sizes of all of the directories and files in the current directory whose names begin with the letter s as well as the names and sizes of all levels of subdirectories of those directories regardless of what their names begin with:
    du s*
The -c option can be added to provide a grand total for all of the files and directories that are listed. In the case of the above example, this would be
    du -c s*
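The grand total appears as an extra final line labeled total. The following is a minimal sketch for a POSIX shell; scripts, old, tmp and summary.txt are hypothetical names, and LC_ALL=C is set only so that the label is printed in English.

```shell
# Build a tree in which two top-level names begin with s and one does not.
demo=$(mktemp -d)
mkdir -p "$demo/scripts/old" "$demo/tmp"
echo "x" > "$demo/summary.txt"

# s* expands to scripts and summary.txt; -c appends a grand total line.
with_total=$(cd "$demo" && LC_ALL=C du -c s*)
echo "$with_total"

rm -rf "$demo"
```

The tmp directory is absent from the report because its name does not match the pattern.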
As another example of the use of the wildcard, the following command would report the name and size of each GIF (a popular image format) file in the current directory as well as a total for all of the GIFs:
    du -c *.gif
Another useful option is --max-depth=, which instructs du to list its subdirectories and their sizes to any desired level of depth (i.e., to any level of subdirectories) in a directory tree. For example, the following would cause du to list only the first tier (i.e., layer) of directories in the current directory and their sizes (inclusive of all of their contents, including those of their subdirectories):
    du --max-depth=1
The total space consumption for the current directory tree will also be reported, and it will, of course, be the same regardless of the depth of the files listed.
Setting --max-depth= to zero tells du to not list any of the subdirectories within the selected directory, i.e., to report only the size of the selected directory itself. The result is the same as using the -s option.
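Both depth settings can be compared on a throwaway tree. The following is a minimal sketch that assumes GNU du, since --max-depth is not available in every version of du; the directory and file names are hypothetical.

```shell
# Build a tree that is three levels deep on one side.
demo=$(mktemp -d)
mkdir -p "$demo/a/deep/deeper" "$demo/b"
echo "data" > "$demo/a/deep/deeper/file"

# Depth 1: only ./a, ./b and the total for "." are listed.
depth_one=$(cd "$demo" && du --max-depth=1)

# Depth 0: only the total for "." is listed.
depth_zero=$(cd "$demo" && du --max-depth=0)

# -s: the summary, for comparison with depth 0.
summarized=$(cd "$demo" && du -s)

echo "$depth_one"
echo "---"
echo "$depth_zero"
echo "---"
echo "$summarized"

rm -rf "$demo"
```

The last line of the depth 1 report, the depth 0 report and the -s report should all be identical.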
As is the case with other commands on Unix-like operating systems, du can be linked with pipes to filters to create powerful pipelines of commands. A filter is a (usually) small and specialized program that transforms data in some meaningful way.
For example, to arrange the output items according to size, du can be piped to the sort command, whose -n option tells it to list the output in numeric order with the smallest files first, as follows:
    du | sort -n
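The ordering can be checked on a throwaway tree that holds one small and one large file. The following is a minimal sketch for a POSIX shell; the names are hypothetical, and -k is added only to make the unit (1024 bytes) explicit.

```shell
# Build a tree with a tiny file in one directory and about 200 KB in another.
demo=$(mktemp -d)
mkdir -p "$demo/small" "$demo/large"
echo "tiny" > "$demo/small/a.txt"
dd if=/dev/urandom of="$demo/large/b.bin" bs=1024 count=200 2>/dev/null

# Sort du's report numerically on the leading size field, smallest first.
sorted=$(cd "$demo" && du -k | sort -n)
echo "$sorted"

rm -rf "$demo"
```

The largest consumers end up at the bottom of the list, next to the shell prompt, which is convenient when the report is long.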
As du will often generate more output than can fit on the monitor screen at one time, the output will fly by at high speed and be virtually unreadable. Fortunately, it is easy to display the output one screenful at a time by piping it to the less filter, for example,
    du -h | less
The output of less can be advanced one screenful at a time by pressing the space bar, and it can be moved backward one screenful at a time by pressing the b key.
The output of du can likewise be piped to less after it has been passed through one or more other filters, for example,
    du | sort -n | less
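One caution when combining sort with the -h option: sort -n compares only the leading number and ignores the K, M or G suffix. GNU sort offers a -h option that understands those suffixes. The following is a minimal sketch assuming GNU sort; the sample lines are made up to stand in for du -h output so that the result does not depend on the filesystem.

```shell
# Made-up lines in the format that du -h produces (size, tab, name).
sample=$(printf '1.5M\t./videos\n900K\t./photos\n2.0G\t.\n12K\t./notes\n')

# Plain numeric sort looks only at the leading number, so 1.5M sorts before 12K.
by_number=$(printf '%s\n' "$sample" | LC_ALL=C sort -n)

# Human-numeric sort takes the K, M and G suffixes into account.
by_human=$(printf '%s\n' "$sample" | LC_ALL=C sort -h)

echo "$by_number"
echo "---"
echo "$by_human"
```

With a sort that supports it, du -h | sort -h | less gives a readable report in true size order.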
The grep filter can be used to search through du's output for any desired string (i.e., sequence of characters). Thus, for example, the following will provide a list of the names and sizes of directories and files in the current directory that contain the word linux:
    du -ah | grep linux
One way in which du can be used to produce a list of (mostly) directories and files in a directory tree that are consuming large amounts of disk space is to use grep to search for all the lines that contain the upper case letter M (i.e., for megabytes) or G (for gigabytes), such as
    du -ah | grep '[MG]'
The only problem with this approach is that it will also select directories and files that contain an upper case M or G in their names even if the file size is not measured in megabytes or gigabytes. (However, this problem could be overcome through the use of regular expressions, an advanced pattern matching technique.)
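One possible regular expression anchors the match to the size field at the start of each line, so that letters in names are ignored. The following is a minimal sketch, assuming a du that supports -h and a grep that supports -E; Makefile and big.bin are hypothetical names.

```shell
# Build a tree with a small file whose name contains M and a file of about 2 MB.
demo=$(mktemp -d)
echo "all: build" > "$demo/Makefile"
dd if=/dev/urandom of="$demo/big.bin" bs=1024 count=2048 2>/dev/null

# Plain search: also matches the M in the name Makefile.
naive=$(cd "$demo" && du -ah | grep '[MG]')

# Anchored search: digits at the start of the line, then M or G, then whitespace.
anchored=$(cd "$demo" && du -ah | grep -E '^[0-9.,]+[MG][[:space:]]')

echo "$naive"
echo "---"
echo "$anchored"

rm -rf "$demo"
```

Only the lines whose size is expressed in megabytes or gigabytes survive the anchored search.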
There are several other ways of monitoring disk space consumption and reporting file sizes. Although very useful tools, they are generally not good substitutes for du.
Among them is the df command, which is likewise used by system administrators to monitor disk usage. However, unlike du, it can only show the space consumption on entire partitions, and it lacks du's fine-grained ability to track the space usage of individual directories and files.
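For comparison, df can be given a directory name, in which case it reports on the whole partition that holds that directory. The following is a minimal sketch, assuming a df that accepts -h (as GNU and BSD versions do); LC_ALL=C is set only so that the column headings are printed in English.

```shell
# Report the size, used space and free space of the partition holding the current directory.
usage=$(LC_ALL=C df -h .)
echo "$usage"
```

The report consists of a heading line followed by the figures for the partition as a whole, with nothing about the individual directories on it.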
du is not designed to show the space consumption of partitions. The closest that it can come is to show the sizes of the first tier of directories in the root directory (i.e., the directory which contains all other directories and which is represented by a forward slash), several of which may be on their own partitions (depending on how the system has been set up). This is accomplished by becoming the root user and issuing the following command:
    du --max-depth=1 /
The ls (i.e., list) command can provide the sizes of individual files by using its -s option, and its -h option (which is similar to du's -h option) can be added to make the output easier to read. For example, the following would list the names and sizes of the files in the current directory:
    ls -sh
Although the names of the first tier of directories within the current directory are also listed, the size data accompanying them does not represent their actual disk space consumption (i.e., inclusive of their contents). Nor does ls report the contents of any lower tiers of directories, unless such directories are specifically listed as arguments.
A convenient alternative for finding the sizes of files and directory trees when using a GUI (graphical user interface) is to click with the right mouse button on the icon (i.e., a small picture or symbol) for that item and then select Properties from the menu that appears. Although this is frequently sufficient, it does not provide the detailed control and reporting that du provides.
Created August 21, 2004. Last updated April 18, 2007.
Copyright © 2004 - 2007 The Linux Information Project. All Rights Reserved.