Before we move on to look at practical filesystem operations, we are going to look at a more theoretical overview of how filesystems on UNIX-like systems work. Slackware Linux supports many different filesystems, but all these filesystems use virtually the same semantics. These semantics are provided through the Virtual Filesystem (VFS) layer, giving a generic layer for disk and network filesystems.
The filesystem consists of two types of elements: data and metadata. The metadata describes the actual data blocks that are on the disk. Most filesystems use information nodes (inodes) to store metadata, and most of them keep the following fields in their inodes:
Table 8.1. Common inode fields
| Field | Description |
|---|---|
| mode | The file permissions. |
| uid | The user ID of the owner of the file. |
| gid | The group ID of the group of the file. |
| size | Size of the file in bytes. |
| ctime | Time of the last change to the inode (status change time). |
| mtime | Time of the last file modification. |
| links_count | The number of links pointing to this inode. |
| i_block | Pointers to data blocks. |
If you are not a UNIX or Linux aficionado, these names will probably sound bogus to you, but we will clear them up in the following sections. At any rate, you can probably deduce the relation between inodes and data from this table, and specifically from the i_block field: every inode has pointers to the data blocks that the inode provides information for. Together, the inode and data blocks are the actual file on the filesystem.
You may wonder by now where the names of files (and directories) reside, since there is no file name field in the inode. Actually, the names of the files are separated from the inode and data blocks, which allows you to do groovy stuff, like giving the same file more than one name. The filenames are stored in so-called directory entries. These entries specify a filename and the inode of the file. Since directories are also represented by inodes, a directory structure can also be constructed in this manner.
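We can show how this all works by illustrating what the kernel does when we execute the command cat /home/daniel/note.txt:
The system reads the inode of the / directory and looks up the entry for home in its data blocks.
The system reads the inode of the home directory and looks up the entry for daniel.
The system reads the inode of the daniel directory and looks up the entry for note.txt.
The system reads the inode of the note.txt file and returns its data blocks to cat.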
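The lookup steps above can be followed from the shell. The sketch below builds a small tree in a temporary directory (the names home, daniel and note.txt merely mirror the example; nothing under the real /home is touched) and uses ls -di to print the inode number that each path component resolves to.

```shell
# Build a throwaway tree that mirrors /home/daniel/note.txt.
root=$(mktemp -d)
mkdir -p "$root/home/daniel"
echo "hello" > "$root/home/daniel/note.txt"

# Walk the path one component at a time, printing the inode number
# that each directory entry points to.
path=$root
for component in home daniel note.txt; do
    path=$path/$component
    ls -di "$path"
done

# The last component is the file itself; cat reads its data blocks.
content=$(cat "$root/home/daniel/note.txt")
echo "$content"
```

The inode numbers that are printed differ from system to system; only the order of the lookups matters here.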
As we have described earlier, Linux is a multi-user system. This means that each user has his/her own files (that are usually located in the home directory). Besides that, users can be members of a group, which may give the user additional privileges.
As you have seen in the inode field table, every file has an owner and a group. Traditional UNIX access control gives read, write, or execute permissions to the file owner, file group, and other users. These permissions are stored in the mode field of the inode. The mode field represents the file permissions as a four-digit octal number. The first digit represents some special options, the second digit stores the owner permissions, the third the group permissions, and the fourth the permissions for other users. Each digit is formed by using or adding the numbers in Table 8.2, “Meaning of numbers in the mode octet”.
Now, suppose that a file has mode 0644. This means that the file is readable and writable by the owner (6), and readable by the file group (4) and others (4).
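As a small worked example, each digit of the mode can be composed from read (4), write (2), and execute (1) with shell arithmetic. The variable names and the scratch file are only illustrative.

```shell
# Permission values that are added up per digit.
r=4; w=2; x=1

owner=$((r + w))   # read + write
group=$r           # read only
other=$r           # read only

# The leading 0 is the digit for the special options (none set here).
mode="0$owner$group$other"
echo "$mode"

# Apply the mode to a scratch file and show the alphabetic form.
f=$(mktemp)
chmod "$mode" "$f"
perms=$(ls -l "$f" | cut -c1-10)
echo "$perms"
```

A digit of 7 would be r + w + x, which is what you typically see for the owner of a directory or an executable.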
Most users do not want to deal with octal numbers, which is why many utilities can also deal with an alphabetic representation of file permissions. The letters that are listed in Table 8.2, “Meaning of numbers in the mode octet” between parentheses are used in this notation. In the following example, information about a file with 0644 permissions is printed. The numbers are replaced by three rwx triplets (the first character can list special mode options).
$ ls -l note.txt
-rw-r--r-- 1 daniel daniel 5 Aug 28 19:39 note.txt
Over the years these traditional UNIX permissions have proven not to be sufficient in some cases. The POSIX 1003.1e specification aimed to extend the UNIX access control model with Access Control Lists (ACLs). Unfortunately this effort stalled, though some systems (like GNU/Linux) have implemented ACLs[4]. Access control lists follow the same semantics as normal file permissions, but give you the opportunity to add rwx triplets for additional users and groups.
The following example shows the access control list of a file. As you can see, the permissions look like normal UNIX permissions (the access rights for the user, group, and others are specified). But there is also an additional entry for the user joe.
user::rwx
user:joe:r--
group::---
mask::r--
other::---
To make matters even more complex (and sophisticated), some GNU/Linux systems add more fine-grained access control through Mandatory Access Control Frameworks (MAC) like SELinux and AppArmor. But these access control frameworks are beyond the scope of this book.
A directory entry that points to an inode is named a hard link. Most files are only linked once, but nothing prevents you from linking a file twice. This will increase the links_count field of the inode. This is a nice way for the system to see which inodes and data blocks are free to use. If links_count is set to zero, the inode is not referred to anymore, and can be reclaimed.
Hard links have two limitations. First of all, hard links cannot cross filesystem boundaries, since they point to inodes, and every filesystem has its own inodes and corresponding inode numbers. Besides that, most filesystems do not allow you to create hard links to directories. Allowing creation of hard links to directories could produce directory loops, potentially leading to deadlocks and filesystem inconsistencies. In addition to that, most implementations of rm and rmdir do not know how to deal with such extra directory hard links.
Symbolic links do not have these limitations, because they point to file names, rather than inodes. When a symbolic link is used, the operating system follows the path that is stored in the link. A symbolic link can also refer to a file that does not exist, since it just contains a name. Such links are called dangling links.
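The difference between the two kinds of links is easy to observe in a scratch directory. The following sketch uses made-up file names; it creates one hard link and one symbolic link to the same file and then removes the original name.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
echo "data" > original.txt

# A hard link is a second directory entry for the same inode.
ln original.txt hardlink.txt
# A symbolic link stores a path name instead.
ln -s original.txt symlink.txt

# Both hard-linked names show the same inode number.
ino_a=$(ls -i original.txt | awk '{print $1}')
ino_b=$(ls -i hardlink.txt | awk '{print $1}')
echo "$ino_a $ino_b"

# Removing the original name leaves the data reachable through the
# hard link, but the symbolic link is now dangling.
rm original.txt
via_hard=$(cat hardlink.txt)
if [ -L symlink.txt ] && [ ! -e symlink.txt ]; then
    dangling=yes
else
    dangling=no
fi
echo "$via_hard $dangling"
```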
Note
If you ever get into system administration, it is good to be aware of the security implications of hard links. A hard link keeps an inode in use, so a user who has linked to a file on the same filesystem retains access to that file even after its original name has been removed or replaced. For this reason it is a good idea to put any directories that users can write to on different filesystems.
Before going to some more adventurous venues, we will start with some file and directory usage basics.
One of the most common things that you will want to do is to list all or certain files. The ls command serves this purpose very well. Using ls without any arguments will show the contents of the current directory:
$ ls
dns.txt network-hosts.txt papers
If you use a GNU/Linux distribution, you may also see some
fancy coloring based on the type of file. The standard output
is handy to skim through the contents of a directory, but if
you want more information, you can use the -l parameter. This provides a
so-called long listing for each file:
$ ls -l
total 36
-rw-rw-r-- 1 daniel daniel 12235 Sep 4 15:56 dns.txt
-rw-rw-r-- 1 daniel daniel 7295 Sep 4 15:56 network-hosts.txt
drwxrwxr-x 2 daniel daniel 4096 Sep 4 15:55 papers
This gives a lot more information about the three directory
entries that we have found with ls. The
first column shows the file permissions. The line for
papers starts with a d, which means that this entry is a directory.
Files that start with a period (.) will not be shown by most
applications, including ls. You can list
these files too, by adding the -a option to
ls:
$ ls -la
total 60
drwxrwxr-x 3 daniel daniel 4096 Sep 11 10:01 .
drwx------ 88 daniel daniel 4096 Sep 11 10:01 ..
-rw-rw-r-- 1 daniel daniel 12235 Sep 4 15:56 dns.txt
-rw-rw-r-- 1 daniel daniel 7295 Sep 4 15:56 network-hosts.txt
drwxrwxr-x 2 daniel daniel 4096 Sep 4 15:55 papers
-rw-rw-r-- 1 daniel daniel 5 Sep 11 10:01 .settings
As you can see, three more entries have appeared. First of
all, the .settings...
Earlier in this chapter (Section 8.1.1, “inodes, directories and data”) we talked
about inodes. The inode number that a directory entry points
to can be shown with the -i parameter. Suppose that I have
created a hard link that points to the same inode
as dns.txt:
$ ls -i dns*
3162388 dns-newhardlink.txt
3162388 dns.txt
Sometimes you will need some help to determine the type of a
file. This is where the file utility
comes in handy. Suppose that I find a file named
HelloWorld.class:
$ file HelloWorld.class
HelloWorld.class: compiled Java class data, version 49.0
That is definitely Java bytecode. file is quite smart, and handles most things you throw at it. For instance, you could ask it to provide information about a device node:
$ file /dev/zero
/dev/zero: character special (1/5)
Or a symbolic link:
$ file /usr/X11R6/bin/X
/usr/X11R6/bin/X: symbolic link to `Xorg'
If you are rather interested in the file that
/usr/X11R6/bin/X links to, you can use the -L option of
file:
$ file -L /usr/X11R6/bin/X
/usr/X11R6/bin/X: setuid writable, executable, regular file, no read permission
You may wonder how file can determine the file type relatively easily. Most files start off with a so-called magic number: a characteristic number that tells programs that read the file what kind of file it is. The file program uses a file which describes many file types and their magic numbers. For instance, the magic file on my system contains the following lines for Java compiled class files:
# Java ByteCode
# From Larry Schwimmer (schwim@cs.stanford.edu)
0 belong 0xcafebabe compiled Java class data,
>6 beshort x version %d.
>4 beshort x \b%d
This entry says that if a file starts with a long (32-bit) hexadecimal magic number 0xcafebabe[5], it is a file that holds “compiled Java class data”. The short that follows determines the class file format version.
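To make the idea of a magic number concrete, the following sketch writes the four bytes 0xca 0xfe 0xba 0xbe to a scratch file and reads them back with od. It only mimics the first test of the magic entry above; the variable names are illustrative, and the real file program does considerably more.

```shell
f=$(mktemp)
# Write the four magic bytes 0xca 0xfe 0xba 0xbe (given here as octal escapes).
printf '\312\376\272\276' > "$f"

# Dump the first four bytes as hexadecimal, without offsets or spaces.
magic=$(od -An -tx1 -N4 "$f" | tr -d ' \n')
echo "$magic"

# Compare against the magic number for Java class files.
if [ "$magic" = "cafebabe" ]; then
    kind="compiled Java class data"
else
    kind="unknown"
fi
echo "$kind"
```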
While we will look at more advanced file integrity checking later, we will have a short look at the cksum utility. cksum can calculate a cyclic redundancy check (CRC) for an input file. This is a mathematically sound method for calculating a number that almost certainly changes when the contents of a file change, although two different files can in principle yield the same CRC. You can use this number to check whether a file is unchanged (for example, after downloading a file from a server). You can specify the file to calculate a CRC for as a parameter to cksum, and cksum will print the CRC, the file size in bytes, and the file name:
$ cksum myfile
1817811752 22638 myfile
Slackware Linux also provides utilities for calculating checksums based on one-way hashes (for instance MD5 or SHA-1).
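A typical use of cksum is to compare the checksum of a file at two points in time. The sketch below does this with a scratch file; when cksum reads from standard input it prints only the CRC and the size, which makes the output easy to compare.

```shell
f=$(mktemp)
echo "first version" > "$f"

# Two runs over unchanged contents give the same result.
before=$(cksum < "$f")
after_same=$(cksum < "$f")

# Changing the contents changes the output.
echo "changed version" > "$f"
after_change=$(cksum < "$f")

echo "$before"
echo "$after_change"
```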
Since most files on UNIX systems are usually text files, they
are easy to view from a character-based terminal or terminal
emulator. The most primitive way of looking at the contents of
a file is by using cat.
cat reads the files that were specified as
parameters line by line, and writes the lines to the
standard output. So, you can write the contents of the file
note.txt to the terminal with cat note.txt. Longer files scroll off the screen, so it is often more convenient to use a paginator like less. You can pipe the output of cat to less:
$ cat note.txt | less
or let less read the file directly:
$ less note.txt
The less paginator lets you scroll forward and backward through a file. Table 8.3, “less command keys” provides an overview of the most important keys that are used to control less.
Table 8.3. less command keys
| Key | Description |
|---|---|
| j | Scroll forward one line. |
| k | Scroll backwards one line. |
| f | Scroll forward one screen full of text. |
| b | Scroll backwards one screen full of text. |
| q | Quit less. |
| g | Jump to the beginning of the file. |
| G | Jump to the end of the file. |
| /pattern | Search for the regular expression pattern. |
| n | Search for the next match of the previously specified regular expression. |
| mletter | Mark the current position in the file with letter. |
| 'letter | Jump to the mark letter |
Command keys that accept a count can be prefixed by a number. For instance, 11j scrolls forward eleven lines, and 3n jumps to the third match of the previously specified regular expression.
Slackware Linux also provides an alternative to less, the older “more” command. We will not go into more here, because less is more comfortable, and also more popular these days.
The ls -l output that we have seen earlier provides information about the size of a file. While this usually provides enough information about the size of files, you might want to gather information about collections of files or directories. This is where the du command comes in. By default, du prints the disk usage per directory. For example:
$ du ~/qconcord
72 /home/daniel/qconcord/src
24 /home/daniel/qconcord/ui
132 /home/daniel/qconcord
By default, du represents the size in 1024
byte units. You can explicitly specify that
du should use 1024 byte units by adding the
-k flag. This is useful
for writing scripts, because some other systems default to
using 512-byte blocks. For example:
$ du -k ~/qconcord
72 /home/daniel/qconcord/src
24 /home/daniel/qconcord/ui
132 /home/daniel/qconcord
If you would also like to see per-file disk usage, you can add
the -a flag:
$ du -k -a ~/qconcord
8 /home/daniel/qconcord/ChangeLog
8 /home/daniel/qconcord/src/concordanceform.h
8 /home/daniel/qconcord/src/textfile.cpp
12 /home/daniel/qconcord/src/concordancemainwindow.cpp
12 /home/daniel/qconcord/src/concordanceform.cpp
8 /home/daniel/qconcord/src/concordancemainwindow.h
8 /home/daniel/qconcord/src/main.cpp
8 /home/daniel/qconcord/src/textfile.h
72 /home/daniel/qconcord/src
12 /home/daniel/qconcord/Makefile
16 /home/daniel/qconcord/ui/concordanceformbase.ui
24 /home/daniel/qconcord/ui
8 /home/daniel/qconcord/qconcord.pro
132 /home/daniel/qconcord
You can also use the name of a file or a wildcard as a
parameter. But this will not print the sizes of files in
subdirectories, unless -a is used:
$ du -k -a ~/qconcord/*
8 /home/daniel/qconcord/ChangeLog
12 /home/daniel/qconcord/Makefile
8 /home/daniel/qconcord/qconcord.pro
8 /home/daniel/qconcord/src/concordanceform.h
8 /home/daniel/qconcord/src/textfile.cpp
12 /home/daniel/qconcord/src/concordancemainwindow.cpp
12 /home/daniel/qconcord/src/concordanceform.cpp
8 /home/daniel/qconcord/src/concordancemainwindow.h
8 /home/daniel/qconcord/src/main.cpp
8 /home/daniel/qconcord/src/textfile.h
72 /home/daniel/qconcord/src
16 /home/daniel/qconcord/ui/concordanceformbase.ui
24 /home/daniel/qconcord/ui
If you want to see the total sum of the disk usage of the
files and subdirectories that a directory holds, use the
-s flag:
$ du -k -s ~/qconcord
132 /home/daniel/qconcord
After having a bird's eye view of directories in Section 8.1.1, “inodes, directories and data”, we will have a look at some directory-related commands.
The ls command that we have looked at in Section 8.2.1, “Listing files” can also be used to list directories in various ways. As we have seen, the default ls output includes directories, and directories can be identified using the first output column of a long listing:
$ ls -l
total 36
-rw-rw-r-- 1 daniel daniel 12235 Sep 4 15:56 dns.txt
-rw-rw-r-- 1 daniel daniel 7295 Sep 4 15:56 network-hosts.txt
drwxrwxr-x 2 daniel daniel 4096 Sep 4 15:55 papers
If a directory name or a wildcard is specified,
ls will list the contents of the directory,
or of the directories that match the wildcard, respectively. For example,
if there is a directory papers, ls paper* will list the contents of that
directory. The -d parameter prevents this, and lists the matching
directory entries themselves:
$ ls -ld paper*
drwxrwxr-x 2 daniel daniel 4096 Sep 4 15:55 papers
You can also recursively list the contents of a directory, and
its subdirectories with the -R parameter:
$ ls -R
.:
dns.txt network-hosts.txt papers
./papers:
cs phil
./papers/cs:
entr.pdf
./papers/phil:
logics.pdf
UNIX provides the mkdir command to create directories. If a relative path is specified, the directory is created relative to the current working directory. The basic syntax is very simple: mkdir <name>, for example:
$ mkdir mydir
By default, mkdir only creates one
directory level. So, if you use mkdir to
create mydir/mysubdir, mkdir will fail if mydir does not exist yet. To create both directories at once, use the -p parameter:
$ mkdir -p mydir/mysubdir
rmdir removes a directory. Its behavior is
comparable to mkdir. rmdir
mydir/mysubdir removes only mydir/mysubdir, while rmdir -p mydir/mysubdir removes mydir/mysubdir and then mydir.
If a subdirectory that we want to remove contains directory entries, rmdir will fail. If you would like to remove a directory, including all its contents, use the rm command instead.
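The behavior of mkdir and rmdir with and without -p (an option that POSIX defines for both commands) can be checked in a scratch directory. The result variables in this sketch are only there to make the outcome visible.

```shell
base=$(mktemp -d)
cd "$base" || exit 1

# Without -p this fails, because mydir does not exist yet.
if mkdir mydir/mysubdir 2>/dev/null; then
    plain=created
else
    plain=failed
fi

# With -p the missing parent directory is created as well.
mkdir -p mydir/mysubdir
[ -d mydir/mysubdir ] && with_p=created

# rmdir -p removes mysubdir and then the (now empty) mydir.
rmdir -p mydir/mysubdir
[ -e mydir ] || removed=yes

echo "$plain $with_p $removed"
```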
Files and directories can be copied with the
cp command. In its most basic syntax the
source and the target file are specified. The following
example will make a copy of file1 named file2:
$ cp file1 file2
It is not surprising that relative and absolute paths do also work:
$ cp file1 somedir/file2
$ cp file1 /home/joe/design_documents/file2
You can also specify a directory as the last parameter. If this is the case, cp will make a copy of the file in that directory, giving it the same file name as the original file. If there are more than two parameters, the last parameter will be used as the target directory. For instance:
$ cp file1 file2 somedir
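will copy both file1 and file2 to the directory somedir. You cannot copy multiple files into a single file this way; use cat for that instead: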
$ cat file1 file2 > combined_file
You can also use cp to copy directories, by
adding the -R parameter. This
will recursively copy a directory and all its
subdirectories. If the target directory exists, the source
directory or directories will be placed under the target
directory. If the target directory does not exist, it will be
created if there is only one source directory.
$ cp -R mytree tree_copy
$ mkdir trees
$ cp -R mytree trees
After executing these commands, there are two copies of the directory
mytree: tree_copy and trees/mytree. Trying to copy two directories to a target directory that does not exist will fail:
$ cp -R mytree mytree2 newdir
usage: cp [-R [-H | -L | -P]] [-f | -i] [-pv] src target
cp [-R [-H | -L | -P]] [-f | -i] [-pv] src1 ... srcN directory
When you are copying files recursively and want to use cp
in portable scripts, it is a good idea to specify explicitly
what cp should do when it encounters a
symbolic link. The
Single UNIX Specification version 3 does not specify how they
should be handled by default. If -P is used, symbolic links will
not be followed, effectively copying the link itself. If
-H is used, symbolic
links specified as a parameter to cp may be
followed, depending on the type and content of the file. If
-L is used, symbolic
links that were specified as a parameter to
cp and symbolic links that were encountered
while copying recursively may be followed, depending on the
content of the file.
If you want to preserve the ownership, SGID/SUID bits, and the
modification and access times of a file, you can use the
-p flag. This will try to preserve
these properties in the file or directory copy. Good
implementations of cp provide some
additional protection as well - if the target file already
exists, it may not be overwritten if the relevant metadata
could not be preserved.
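The effect of the symbolic link options is easiest to see by copying a small tree twice. The sketch below is a minimal illustration with made-up names (copy_P, copy_L); it copies a directory that contains one regular file and one symbolic link, once with -P and once with -L.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
mkdir mytree
echo "data" > mytree/file
ln -s file mytree/link

# -P copies symbolic links as links; -L copies what they point to.
cp -R -P mytree copy_P
cp -R -L mytree copy_L

# In copy_P the entry is still a symbolic link.
[ -L copy_P/link ] && kept_link=yes
# In copy_L it has become a regular file with the same contents.
[ -L copy_L/link ] || followed=yes

echo "$kept_link $followed"
```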
The UNIX command for moving files, mv, can move or rename files or directories. What actually happens depends on the location of the files or directories. If the source and destination files or directories are on the same filesystem, mv usually just creates new hard links, effectively renaming the files or directories. If both are on different filesystems, the files are actually copied, and the source files or directories are unlinked.
The syntax of mv is comparable to
cp. The most basic syntax renames
file1 to file2:
$ mv file1 file2
The same syntax can be used for two directories as well, which will rename the directory given as the first parameter to the second parameter.
When the last parameter is an existing directory, the file or directory that is specified as the first parameter is moved to that directory. In this case you can specify multiple files or directories as well. For instance:
$ mkdir targetdir
$ mv file1 directory1 targetdir
This creates the directory targetdir, and moves file1 and directory1 into it.
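That a move within one filesystem is only a change of directory entries can be verified by comparing inode numbers before and after the move. This is a small sketch in a scratch directory; the variable names are illustrative.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
echo "text" > file1

# Record the inode number, rename the file, and look again.
before=$(ls -i file1 | awk '{print $1}')
mv file1 file2
after=$(ls -i file2 | awk '{print $1}')

echo "$before $after"
```

Both numbers are the same, because only the name changed. When the target is on another filesystem, the file gets a new inode there.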
Files and directories can be removed with the rm(1) command. This command unlinks files and directories. If there are no other
links to a file, its inode and disk blocks can be reclaimed for new files. Files can be
removed by providing the files that should be removed as parameters to rm(1). If a file is not writable, rm(1) will ask for confirmation. For instance, to remove
file1 and file2:
$ rm file1 file2
If you have to remove a large number of files that require a confirmation before they
can be deleted, or if you want to use rm(1) to remove files from a script that will not be run on a terminal, add the
-f parameter to override the use of prompts. Files
that are not writable are then deleted without confirmation, provided that you are allowed to remove them.
Directories can be removed recursively as well with the -r parameter.
rm(1) will traverse the directory structure, unlinking and removing directories as
they are encountered. The same semantics are used as when normal files are removed, as far
as the -f flag is concerned. To give a short example,
you can recursively remove all files and directories in the notes directory with:
$ rm -r notes
Since the rm(1) command uses the unlink(2) function, data blocks are not rewritten to an uninitialized state. The information in data blocks is only overwritten when they are reallocated and used at a later time. To remove files including their data blocks securely, some systems provide a shred(1) command that overwrites data blocks with random data. But this is not effective on many modern (journaling) filesystems, because they do not necessarily write data in place.
The unlink(1) command provides a one-to-one interface to the unlink(2) function. It is of relatively little use, because it cannot remove directories.
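The following sketch shows unlinking at work: the link count of an inode drops when one of its names is removed, and -f removes a read-only file without asking. The file and directory names are made up, and everything happens in a scratch directory.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
echo "text" > file1
ln file1 file2

# The second column of ls -l is the number of links to the inode.
links_before=$(ls -l file2 | awk '{print $2}')

# Make the file read-only; -f removes the name without asking.
chmod 0444 file1
rm -f file1
links_after=$(ls -l file2 | awk '{print $2}')
echo "$links_before $links_after"

# Remove a directory tree recursively.
mkdir -p notes/old
echo "memo" > notes/old/memo.txt
rm -r notes
[ -e notes ] || notes_gone=yes
echo "$notes_gone"
```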
We touched on the subject of file and directory permissions in Section 8.1.2, “File permissions”. In this section, we will look at the chown(1) and chmod(1) commands, which are used to set the file ownership and permissions respectively. After that, we are going to look at a modern extension to permissions named Access Control Lists (ACLs).
As we have seen earlier, every file has an owner (user) ID and a group ID stored in the
inode. The chown(1) command can be used to set these fields. This can be done by the numeric
IDs, or their names. For instance, to change the owner of the file
note.txt to john, and its group to staff:
$ chown john:staff note.txt
You can also omit either component, to set only one of the two fields. If you only want to set the user name, you can also omit the colon. So, the command above can be split up in two steps:
$ chown john note.txt
$ chown :staff note.txt
If you want to change the owner of a directory, and all the files or directories it
holds, you can add the -R parameter to chown(1):
$ chown -R john:staff notes
If user and group names were specified, rather than IDs, the names are converted by chown(1). This conversion usually relies on the system-wide password database. If you
are operating on a filesystem that uses another password database (e.g. if you mount a root
filesystem from another system for recovery), it is often useful to change file ownership by
the user or group ID. In this manner, you can keep the relevant user/group name to ID
mappings intact. So, changing the ownership of note.txt to UID 1000 and GID 1000 is done in the following manner:
$ chown 1000:1000 note.txt
After reading the introduction to filesystem permissions in Section 8.1.2, “File permissions”, changing the permission bits that are stored in the inode is fairly easy with the chmod(1) command. chmod(1) accepts both numeric and symbolic representations of permissions. Representing the permissions of a file numerically is very handy, because it allows setting all relevant permissions tersely. For instance:
$ chmod 0644 note.txt
This makes note.txt readable and writable by the owner of the file, and readable by the file group and others.
Symbolic permissions work with addition or subtraction of rights, and allow for relative changes of file permissions. The syntax for symbolic permissions is:
[ugo][-+][rwxst]
The first component specifies the user classes to which the permission change applies (user, group or other). Multiple characters of this component can be combined. The second component takes away rights (-), or adds rights (+). The third component is the access specifier (read, write, execute, set UID/GID on execution, sticky). Multiple components can be specified for this component too. Let's look at some examples to clear this up:
ug+rw # Give read/write rights to the file user and group.
go-x # Take away execute rights from the file group and others.
ugo-wx # Disallow all user classes to write to the file and to
# execute the file.
These commands can be used in the following manner with chmod:
$ chmod ug+rw note.txt
$ chmod go-x script1.sh
$ chmod ugo-wx script2.sh
Permissions of files and directories can be changed recursively with the -R parameter. The following command makes the directory
notes and everything in it readable for all user classes:
$ chmod -R ugo+r notes
Extra care should be taken with directories, because the x flag has a special meaning in a directory context. Users that have execute rights on a directory can access that directory. Users that do not have execute rights on a directory cannot. Because of this particular behavior, it is often easier to change the permissions of a directory structure and its files with the help of the find(1) command.
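One common way to do this, sketched below, is to let find select directories and regular files separately and give each kind its own mode. The modes 0755 and 0644 and the tree itself are assumptions chosen for the example, not a recommendation for every situation.

```shell
tree=$(mktemp -d)
mkdir -p "$tree/notes/sub"
echo "a" > "$tree/notes/a.txt"
echo "b" > "$tree/notes/sub/b.txt"

# Directories need the x flag to be entered; regular files usually do not.
find "$tree/notes" -type d -exec chmod 0755 {} +
find "$tree/notes" -type f -exec chmod 0644 {} +

dir_perms=$(ls -ld "$tree/notes/sub" | cut -c1-10)
file_perms=$(ls -l "$tree/notes/sub/b.txt" | cut -c1-10)
echo "$dir_perms $file_perms"
```

A plain chmod -R with a single numeric mode would either make every file executable or make the directories inaccessible, which is why the two kinds are handled separately here.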
There are a few extra permission bits that can be set that have a special meaning. The SUID and SGID are the most interesting bits of these extra bits. These bits change the active user ID or group ID to that of the owner or group of the file when the file is executed. The su(1) command is a good example of a file that usually has the SUID bit set:
$ ls -l /bin/su
-rwsr-xr-x 1 root root 60772 Aug 13 12:26 /bin/su
This means that the su command runs as the user
root when it is executed. The SUID bit can be set with the
s modifier. For instance, if the SUID bit was not set on
/bin/su, this could be done with:
$ chmod u+s /bin/su
Note
Please be aware that the SUID and SGID bits have security implications. If a program with these bits set contains a bug, it may be exploited to get the privileges of the file owner or group. For this reason it is good practice to keep the number of files with the SUID and SGID bits set to an absolute minimum.
The sticky bit is also interesting when it comes to
directories. It disallows users to rename or unlink files that
they do not own, in directories that they do have write access
to. This is usually used on world-writable directories, like
the temporary directory (/tmp). The sticky bit can be set with the t modifier:
$ chmod +t /tmp
The question that remains is what initial permissions are used when a file is created. This depends on two factors: the mode flag that was passed to the open(2) system call that is used to create the file, and the active file creation mask. The file creation mask can be represented as an octal number. The effective permissions for creating the file are determined as mode & ~mask. Or, if represented in an octal fashion, you can subtract the digits of the mask from the digits of the mode, as long as every bit that is set in the mask is also set in the mode. For instance, if a file is created with permissions 0666 (readable and writable by the file user, file group, and others), and the effective file creation mask is 0022, the effective file permission will be 0644. Let's look at another example. Suppose that files are still created with 0666 permissions, and you are more paranoid, and want to take away all read and write permissions for the file group and others. This means you have to set the file creation mask to 0066, because subtracting 0066 from 0666 yields 0600.
The effective file creation mask can be queried and set with the umask command, which is normally a built-in shell command. The effective mask can be printed by running umask without any parameters:
$ umask
0002
The mask can be set by giving the octal mask number as a parameter. For instance:
$ umask 0066
We can verify that this works by creating an empty file:
$ touch test
$ ls -l test
-rw------- 1 daniel daniel 0 Oct 24 00:10 test
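The mode & ~mask rule can also be worked out with shell arithmetic, which treats numbers with a leading 0 as octal. The sketch below computes both examples from the text and then checks the second one against a file that is really created under that mask. The variable names are illustrative, and the umask is changed in a subshell only, so the calling shell keeps its own mask.

```shell
mode=0666

# 0666 with mask 0022 and with mask 0066, printed as four octal digits.
default_result=$(printf '%04o' $(( $mode & ~0022 )))
strict_result=$(printf '%04o' $(( $mode & ~0066 )))
echo "$default_result $strict_result"

# Verify the second case with a real file, created under umask 0066.
dir=$(mktemp -d)
perms=$( umask 0066; touch "$dir/test"; ls -l "$dir/test" | cut -c1-10 )
echo "$perms"
```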
Access control lists (ACLs) are an extension to traditional UNIX file permissions that allows for more fine-grained access control. Most systems that support filesystem ACLs implement them as they were specified in the POSIX.1e and POSIX.2c draft specifications. Notable UNIX and UNIX-like systems that implement ACLs according to this draft are FreeBSD, Solaris, and Linux.
As we have seen in Section 8.1.2, “File permissions”, access control lists allow you to use read, write and execute triplets for additional users or groups. In contrast to the traditional file permissions, additional access control list entries are not stored directly in the inode, but in extended attributes that are associated with files. Two things to be aware of when you use access control lists are that not all systems support them, and not all programs support them.
On most systems that support ACLs, ls uses a visual indicator to show that there are ACLs associated with a file. For example:
$ ls -l index.html
-rw-r-----+ 1 daniel daniel 3254 2006-10-31 17:11 index.html
As you can see, the permissions column shows an additional plus (+) sign. The permission bits do not quite behave as you might expect. We will get to that in a minute.
The ACLs for a file can be queried with the getfacl command:
$ getfacl index.html
# file: index.html
# owner: daniel
# group: daniel
user::rw-
group::---
group:www-data:r--
mask::r--
other::---
Most lines can be interpreted very easily: the file user has read/write permissions, the file group no permissions, users of the group www-data have read permissions, and other users have no permissions. But why does the group entry list no permissions for the file group, while ls does? The secret is that if there is a mask entry, ls displays the value of the mask, rather than the file group permissions.
The mask entry is used to restrict all list entries with the exception of that of the file user, and that for other users. It is best to memorize the following rules for interpreting ACLs:
The user:: entry permissions correspond with the permissions of the file owner.
The group:: entry permissions correspond with the permissions of the file group, unless there is a mask:: entry. If there is a mask:: entry, the permissions of the group correspond to the group entry with the mask entry as the maximum of allowed permissions (meaning that the group permissions can be more restrictive, but not more permissive).
The permissions of other users and groups correspond to their user: and group: entries, with the value of mask:: as their maximum permissions.
The second and third rules can clearly be observed if there is a user or group that has more rights than the mask for the file:
$ getfacl links.html
# file: links.html
# owner: daniel
# group: daniel
user::rw-
group::rw- #effective:r--
group:www-data:rw- #effective:r--
mask::r--
other::---
Although read and write permissions are specified for the file and www-data groups, both groups will effectively only have read permission, because this is the maximal permission that the mask allows.
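The way the mask limits an entry can be modelled as a position-by-position comparison of two rwx triplets. The helper below, named effective here purely for illustration, is not part of getfacl or any ACL tool; it is a small sketch of the rule that a permission is only effective when both the entry and the mask grant it.

```shell
# effective ENTRY MASK: keep a permission only if both triplets grant it.
effective() {
    entry=$1 mask=$2 out=""
    for i in 1 2 3; do
        e=$(printf '%s' "$entry" | cut -c"$i")
        m=$(printf '%s' "$mask" | cut -c"$i")
        if [ "$e" != "-" ] && [ "$e" = "$m" ]; then
            out="$out$e"
        else
            out="$out-"
        fi
    done
    printf '%s\n' "$out"
}

# The two group entries of links.html, limited by the mask r--.
eff_group=$(effective rw- r--)
eff_www=$(effective rw- r--)
# An entry that asks for less than the mask allows is left as it is.
eff_small=$(effective r-- rw-)
echo "$eff_group $eff_www $eff_small"
```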
Another aspect to pay attention to is the handling of ACLs on directories. Access control lists can be added to directories to govern access, but directories can also have default ACLs which specify the initial ACLs for files and directories created under that directory.
Suppose that the directory reports has the following ACL:
$ getfacl reports
# file: reports
# owner: daniel
# group: daniel
user::rwx
group::r-x
group:www-data:r-x
mask::r-x
other::---
default:user::rwx
default:group::r-x
default:group:www-data:r-x
default:mask::r-x
default:other::---
New files that are created in the
reports directory get an ACL based on the entries that have the default prefix. For instance:
$ touch reports/test
$ getfacl reports/test
# file: reports/test
# owner: daniel
# group: daniel
user::rw-
group::r-x #effective:r--
group:www-data:r-x #effective:r--
mask::r--
other::---
As you can see, the default ACL was copied. The execute bit is removed from the mask, because the new file was not created with execute permissions.
The ACL for a file or directory can be changed with the
setfacl program. Unfortunately, the
usage of this program highly depends on the system that
is being used. To add to that confusion, at least one
important flag (-d)
has a different meaning on different systems. One can
only hope that this command will get standardized.
Table 8.4. System-specific setfacl flags
| Operation | Linux |
|---|---|
| Set entries, removing all old entries | --set |
| Modify entries | -m |
| Modify default ACL entries | -d |
| Delete entry | -x |
| Remove all ACL entries (except for the three required entries) | -b |
| Recalculate mask | Always recalculated, unless -n is used, or a mask entry is explicitly specified. |
| Use ACL specification from a file | -M (modify), -X (delete), or --restore |
| Recursive modification of ACLs | -R |
As we have seen in the previous section, entries can be specified for users and groups, by using the following syntax: user/group:name:permissions. Permissions can be specified as a triplet by using the letters r (read), w (write), or x (execute). A dash (-) should be used for permissions that you do not want to give to the user or group, since Solaris requires this. If you want to disallow access completely, you can use the --- triplet.
The specification for other users, and the mask follows this format: other:r-x. The following slightly more predictable format can also be used: other::r-x.
The simplest operation is to modify an ACL entry. This will create a new entry if the entry does not exist yet. Entries can be modified with the -m option. For instance, suppose that we want to give the group friends read and write access to the file report.txt:
$ setfacl -m group:friends:rw- report.txt
The mask entry will be recalculated, setting it to the union of all group entries, and additional user entries:
$ getfacl report.txt
# file: report.txt
# owner: daniel
# group: daniel
user::rw-
group::r--
group:friends:rw-
mask::rw-
other::r--
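The recalculation of the mask can be imitated with a small shell function. This is only a sketch of the union rule described above; the name perm_union is made up for this example and setfacl itself does not provide it:

```shell
# perm_union PERMS... (hypothetical helper, for illustration only)
# Prints the union of the given permission triplets, which is how the
# mask is derived from the group entries and the named user entries.
perm_union() {
    r=-; w=-; x=-
    for p in "$@"; do
        case $p in r??) r=r ;; esac
        case $p in ?w?) w=w ;; esac
        case $p in ??x) x=x ;; esac
    done
    printf '%s%s%s\n' "$r" "$w" "$x"
}

# group::r-- and group:friends:rw- from the listing above
perm_union r-- rw-
```

With the two group entries of report.txt as input, the result is the rw- triplet that getfacl shows for the mask.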
You can combine multiple ACL entries by separating them with a comma character. For instance:
$ setfacl -m group:friends:rw-,group:foes:--- report.txt
An entry can be removed with the -x option:
$ setfacl -x group:friends: report.txt
The trailing colon can optionally be omitted.
The --set option is provided to create a new access control list for a file, clearing all existing entries, except for the three required entries. It is required that the file user, group and other entries are also specified. For example:
$ setfacl --set user::rw-,group::r--,other:---,group:friends:rwx report.txt
If you do not want to clean the user, group, and other
permissions, but do want to clear all other ACL entries,
you can use the -b
option. The following example uses this in combination
with the -m option
to clear all ACL entries (except for user, group, and other),
and to add an entry for the friends
group:
$ setfacl -b -m group:friends:rw- report.txt
As we have seen in Section 8.5.4, “Access Control Lists”, directories
can have default ACL entries that specify what permissions
should be used for files and directories that are created
below that directory. The -d
option is used to operate on default entries:
$ setfacl -d -m group:friends:rwx reports
$ getfacl reports
# file: reports
# owner: daniel
# group: daniel
user::rwx
group::r-x
other::r-x
default:user::rwx
default:group::r-x
default:group:friends:rwx
default:mask::rwx
default:other::r-x
You can also use an ACL specification from a file, rather than specifying it on the command line. An input file follows the same syntax as specifying entries as a parameter to setfacl, but the entries are separated by newlines, rather than by commas. This is very useful, because you can use the ACL for an existing file as a reference:
$ getfacl report.txt > ref
The -M option is provided to modify the ACL for a file by reading the entries from a file. So, if we have a file named report2.txt, we can apply the ACL entries from ref to it with:
$ setfacl -M ref report2.txt
If you would like to start with a clean ACL, and add the entries from ref, you can add the -b flag that we encountered earlier:
$ setfacl -b -M ref report2.txt
Of course, it is not necessary to use this interim file. We can directly pipe the output from getfacl to setfacl, by using the symbolic name for the standard input (-), rather than the name of a file:
$ getfacl report.txt | setfacl -b -M - report2.txt
The -X option removes the ACL entries defined in a file. This follows the same syntax as the -x flag, with commas replaced by newlines.
The find command is without doubt the most comprehensive utility to find files on UNIX systems. It also works in a simple and predictable way: find traverses the directory tree or trees that are specified as parameters. In addition, a user can specify an expression that will be evaluated for each file and directory. The name of a file or directory will be printed if the expression evaluates to true. The first argument that starts with a dash (-), an exclamation mark (!), or an opening parenthesis (() signifies the start of the expression. The expression can consist of various operands. To wrap it up, the syntax of find is: find paths expression.
The simplest use of find is to use no expression. Since this matches every directory and subdirectory entry, all files and directories will be printed. For instance:
$ find .
.
./economic
./economic/report.txt
./economic/report2.txt
./technical
./technical/report2.txt
./technical/report.txt
You can also specify multiple directories:
$ find economic technical
economic
economic/report.txt
economic/report2.txt
technical
technical/report2.txt
technical/report.txt
One common scenario for finding files or directories is to look them up by name. The -name operand can be used to match objects that have a certain name, or match a particular wildcard. For instance, using the operand -name 'report.txt' will only be true for files or directories with the name report.txt:
$ find economic technical -name 'report.txt'
economic/report.txt
technical/report.txt
The same thing holds for wildcards:
$ find economic technical -name '*2.txt'
economic/report2.txt
technical/report2.txt
Note: When using find you will want to pass the wildcard to find, rather than letting the shell expand it. So, make sure that patterns are either quoted, or that wildcards are escaped.
It is also possible to evaluate the type of the object with the -type c operand, where c specifies the type to be matched. Table 8.5, “Parameters for the '-type' operand” lists the various object types that can be used.
Table 8.5. Parameters for the '-type' operand
| Parameter | Meaning |
|---|---|
| b | Block device file |
| c | Character device file |
| d | Directory |
| f | Regular file |
| l | Symbolic link |
| p | FIFO |
| s | Socket |
So, for instance, if you would like to match directories, you could use the d parameter to the -type operand:
$ find . -type d
.
./economic
./technical
We will look at forming a complex expression at the end of this section about find, but at this moment it is handy to know that you can make a boolean 'and' expression by specifying multiple operands. For instance operand1 operand2 is true if both operand1 and operand2 are true for the object that is being evaluated. So, you could combine the -name and -type operands to find all directories that start with eco:
$ find . -name 'eco*' -type d
./economic
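To try the implicit 'and' yourself without touching real data, the following sketch builds a throwaway tree that mirrors the examples in this section. The temporary directory, the extra file ecology.txt, and the variable names are assumptions made for this example; mktemp -d is assumed to be available, as it is on most GNU/Linux systems:

```shell
# Build a throwaway tree that mirrors the examples in this section.
tmp=$(mktemp -d)
mkdir "$tmp/economic" "$tmp/technical"
touch "$tmp/economic/report.txt" "$tmp/economic/report2.txt" \
      "$tmp/technical/report.txt" "$tmp/technical/report2.txt"
touch "$tmp/ecology.txt"    # a regular file whose name also starts with eco

# Quoting the pattern keeps the shell from expanding it before find sees it.
by_name=$(cd "$tmp" && find . -name 'eco*' | sort)

# Adding -type d narrows the match to directories only.
by_name_and_type=$(cd "$tmp" && find . -name 'eco*' -type d | sort)

printf 'name only:\n%s\n' "$by_name"
printf 'name and type:\n%s\n' "$by_name_and_type"

rm -rf "$tmp"
```

The first query lists both the file and the directory, while the second one only lists the directory, because both operands have to be true.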
Besides matching objects by their name or type, you can also match them by their active permissions or the object ownership. This is often useful to find files that have incorrect permissions or ownership.
The owner (user) or group of an object can be matched with the -user username and -group groupname operands, respectively. The name of a user or group will be interpreted as a user ID or group ID if the name is decimal and could not be found on the system with getpwnam(3) or getgrnam(3). So, if you would like to match all objects of which joe is the owner, you can use -user joe as an operand:
$ find . -user joe
./secret/report.txt
Or to find all objects with the group friends as the file group:
$ find . -group friends
./secret/report.txt
The operand for checking file permissions, -perm, is less trivial. Like the chmod command, this operand can work with octal and symbolic permission notations. We will start by looking at the octal notation. If an octal number is specified as a parameter to the -perm operand, it will match all objects that have exactly those permissions. For instance, -perm 0600 will match all objects that are only readable and writable by the user, and have no additional flags set:
$ find . -perm 0600
./secret/report.txt
If a dash is added as a prefix to a number, it will match every object that has at least the bits set that are specified in the octal number. A useful example is to find all files which have at least writable bits set for other users with -perm -0002. This can help you to find device nodes or other objects with insecure permissions.
$ find /dev -perm -0002
/dev/null
/dev/zero
/dev/ctty
/dev/random
/dev/fd/0
/dev/fd/1
/dev/fd/2
/dev/psm0
/dev/bpsm0
/dev/ptyp0
Note: Some device nodes have to be world-writable for a UNIX system to function correctly. For instance, the [...]
The symbolic notation of -perm parameters uses the same notation as the chmod command. Symbolic permissions are built with a file mode where all bits are cleared, so it is never necessary to use a dash to take away rights. This also prevents ambiguity that could arise with the dash prefix. Like the octal syntax, prefixing the permission with a dash will match objects that have at least the specified permission bits set. The use of symbolic names is quite predictable - the following two commands repeat the previous examples with symbolic permissions:
$ find . -perm u+rw
./secret/report.txt
$ find /dev -perm -o+w
/dev/null
/dev/zero
/dev/ctty
/dev/random
/dev/fd/0
/dev/fd/1
/dev/fd/2
/dev/psm0
/dev/bpsm0
/dev/ptyp0
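The difference between an exact match and an "at least" match is easy to verify on a few scratch files. In this sketch the file names and the temporary directory are assumptions made for the example; the permissions are set explicitly with chmod, so the result does not depend on your umask:

```shell
tmp=$(mktemp -d)
touch "$tmp/private" "$tmp/shared" "$tmp/open"
chmod 0600 "$tmp/private"
chmod 0644 "$tmp/shared"
chmod 0666 "$tmp/open"

# Exact match: only the file whose mode is exactly 0600.
exact=$(cd "$tmp" && find . -type f -perm 0600)

# "At least" match: every file with the write bit for others set,
# once in octal notation and once in symbolic notation.
octal=$(cd "$tmp" && find . -type f -perm -0002)
symbolic=$(cd "$tmp" && find . -type f -perm -o+w)

echo "exact 0600:  $exact"
echo "-perm -0002: $octal"
echo "-perm -o+w:  $symbolic"

rm -rf "$tmp"
```

Both notations with the dash prefix select the same file, which is the only one that is writable by other users.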
There are three operands that operate on time intervals. The syntax of each operand is operand n, where n is the time in days. All three operands calculate a time delta in seconds that is divided by the number of seconds in a day (86400), discarding the remainder. So, if the delta is one day, operand 1 will match for the object. The three operands are:
-atime n - this operand evaluates to true if the initialization time of find minus the last access time of the object equals n.
-ctime n - this operand evaluates to true if the initialization time of find minus the time of the latest change in the file status information equals n.
-mtime n - this operand evaluates to true if the initialization time of find minus the latest file modification time equals n.
So, these operands match if the latest access, change, or modification, respectively, was n days ago. To give an example, the following command shows all objects in /etc that were modified one day ago:
$ find /etc -mtime 1
/etc
/etc/group
/etc/master.passwd
/etc/spwd.db
/etc/passwd
/etc/pwd.db
The plus or minus sign can be used as modifiers for the meaning of n. +n means more than n days, -n means less than n days. So, to find all files in /etc that were modified less than two days ago:
$ find /etc -mtime -2
/etc
/etc/network/run
/etc/network/run/ifstate
/etc/resolv.conf
/etc/default
/etc/default/locale
[...]
Another useful time-based operand is the -newer reffile operand. This matches all files that were modified later than the file with filename reffile. The following example lists all files that were modified later than economic/report2.txt:
$ find . -newer economic/report2.txt
.
./technical
./technical/report2.txt
./technical/report.txt
./secret
./secret/report.txt
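The time-based operands can be tried out safely by giving a scratch file an old modification time with touch -t. The file names and the temporary directory in this sketch are assumptions made for the example:

```shell
tmp=$(mktemp -d)
touch -t 200001010000 "$tmp/old.txt"   # modification time: 1 January 2000
touch "$tmp/new.txt"                   # modification time: now

# The day count is an integer division, so a delta of 90000 seconds
# (25 hours) counts as 1 day.
echo $(( 90000 / 86400 ))

older=$(cd "$tmp" && find . -type f -mtime +1)
recent=$(cd "$tmp" && find . -type f -mtime -1)
newer=$(cd "$tmp" && find . -type f -newer old.txt)

echo "more than one day old: $older"
echo "less than one day old: $recent"
echo "newer than old.txt:    $newer"

rm -rf "$tmp"
```

The -type f operand is only there to keep the temporary directory itself out of the results.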
Some operands affect the manner in which the find command traverses the tree. The first of these operands is the -xdev operand. -xdev prevents find from descending into directories that have a different device ID, effectively avoiding traversal of other filesystems. The directory on which the filesystem is mounted is printed, because this operand always returns true. A nice example is a system where /usr is on a separate filesystem. Searching for directories named bin then gives:
$ find / -name 'bin' -type d
/usr/bin
/bin
But if we add -xdev, /usr/bin is not found, because it is on a different filesystem:
$ find / -name 'bin' -type d -xdev
/bin
The -depth operand modifies the order in which directories are evaluated. With -depth the contents of a directory are evaluated first, and then the directory itself. This can be witnessed in the following example:
$ find . -depth
./economic/report.txt
./economic/report2.txt
./economic
./technical/report2.txt
./technical/report.txt
./technical
.
As you can see in the output, the ./economic directory is evaluated before ., and ./economic/report.txt before ./economic.
Finally, the -prune operand causes find not to descend into a directory that is being evaluated. -prune is discarded if the -depth operand is also used. -depth always evaluates to true.
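A common way to use -prune is to skip one directory while printing everything else. The following sketch assumes a scratch tree with a directory named secret; it relies on the -o operator, which is described with the other operators later in this section:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/economic" "$tmp/secret"
touch "$tmp/economic/report.txt" "$tmp/secret/report.txt"

# When the name matches, -prune stops the descent and the -o branch is
# skipped; everything else falls through to -type f -print.
visible=$(cd "$tmp" && find . -name secret -prune -o -type f -print)

echo "$visible"

rm -rf "$tmp"
```

Only the report in the economic directory is printed, because find never enters the secret directory.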
find becomes a very powerful tool when it is combined with external utilities. This can be done with the -exec operand. There are two syntaxes for the -exec operand. The first syntax is -exec utility arguments ;. The command utility will be executed with the arguments that were specified for each object that is being evaluated. If any of the arguments is {}, these braces will be replaced by the file being evaluated. This is very handy, especially when we consider that, if we use no additional expression syntax, operands will be evaluated from left to right. Let's look at an example:
$ find . -perm 0666 -exec chmod 0644 {} \;
The first operand returns true for files that have their permissions set to 0666. The second operand executes chmod 0644 filename for each file that is being evaluated. If you were wondering why this command is not executed for every file, that is a good question. Like many other interpreters of expressions, find uses “short-circuiting”. Because no other operator was specified, the logical and operator is automatically assumed between both operands. If the first operand evaluates to false, it makes no sense to evaluate any further operands, because the complete expression will always evaluate to false. So, the -exec operand will only be evaluated if the first operand is true. Another particularity is that the semicolon that closes the -exec is escaped, to prevent the shell from parsing it.
A nice thing about the -exec operand is that it evaluates to true if the command terminated successfully. So, you could also use the -exec operand to add additional conditions that are not represented by find operands. For instance, the following command prints all objects ending with .txt that contain the string gross income:
$ find . -name '*.txt' -exec grep -q 'gross income' {} \; -print
./economic/report2.txt
The grep command will be covered later on. But for the moment, it is enough to know that it can be used to match text patterns. The -print operand prints the current object path. It is always used implicitly, except when the -exec or -ok operands are used.
The second syntax of the -exec operand is -exec utility arguments {} +. This gathers a set of all matched objects for which the expression is true, and provides this set of files as arguments to the utility that was specified. The first example of the -exec operand can also be written as:
$ find . -perm 0666 -exec chmod 0644 {} +
This will execute the chmod command only once, with all files for which the expression is true as its arguments. This operand always returns true.
If a command executed by find returns a non-zero value (meaning that the execution of the command was not successful), find should also return a non-zero value.
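The difference between the two -exec syntaxes becomes visible when the executed utility reports how many arguments it received. This is a sketch on three scratch files; the file names, their contents, and the small sh -c helper are assumptions made for the example:

```shell
tmp=$(mktemp -d)
printf 'gross income: 100\n' > "$tmp/a.txt"
printf 'net income: 80\n'    > "$tmp/b.txt"
printf 'gross income: 50\n'  > "$tmp/c.txt"

# -exec as a condition: -print is only reached when grep exits successfully.
matches=$(cd "$tmp" && find . -name '*.txt' -exec grep -q 'gross income' {} \; -print | sort)

# With ';' the utility runs once per file; sh reports one argument each time.
per_file=$(cd "$tmp" && find . -name '*.txt' -exec sh -c 'echo "$#"' sh {} \;)

# With '+' the files are gathered; here a single sh receives all three.
batched=$(cd "$tmp" && find . -name '*.txt' -exec sh -c 'echo "$#"' sh {} +)

echo "$matches"
echo "$per_file"
echo "$batched"

rm -rf "$tmp"
```

For a handful of files the + form runs the utility once. With a very large number of files, find may split them over several invocations to stay within the system's argument length limit.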
find provides some operators that can be combined to make more complex expressions:
Operators
( expr ) - Evaluates to true if expr evaluates to true.
expr1 [-a] expr2 - Evaluates to true if both expr1 and expr2 are true. If -a is omitted, this operator is implicitly assumed. find will use short-circuiting when this operator is evaluated: expr2 will not be evaluated when expr1 evaluates to false.
expr1 -o expr2 - Evaluates to true if either or both expr1 and expr2 are true. find will use short-circuiting when this operator is evaluated: expr2 will not be evaluated when expr1 evaluates to true.
! expr - Negates expr. So, if expr evaluates to true, this expression will evaluate to false and vice versa.
Since both the parentheses and exclamation mark characters are interpreted by most shells, they should usually be escaped.
The following example shows some operators in action. This command executes chmod for all files that either have their permissions set to 0666 or 0664.
$ find . \( -perm 0666 -o -perm 0664 \) -exec chmod 0644 {} \;
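The grouping and negation operators can be checked on scratch files before they are combined with something as intrusive as chmod. The file names and modes below are assumptions made for this sketch:

```shell
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"
chmod 0666 "$tmp/a"
chmod 0664 "$tmp/b"
chmod 0600 "$tmp/c"

# Either mode matches: the escaped parentheses group the -o expression.
either=$(cd "$tmp" && find . -type f \( -perm 0666 -o -perm 0664 \) | sort)

# Negation: regular files that match neither of the two modes.
neither=$(cd "$tmp" && find . -type f ! \( -perm 0666 -o -perm 0664 \))

echo "either:  $either"
echo "neither: $neither"

rm -rf "$tmp"
```

Without the parentheses, the implicit 'and' would bind more tightly than -o, so -type f would only apply to the first -perm operand.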
The which command is not part of the Single UNIX Specification version 3, but it is provided by most systems. which locates a command that is in the user's path (as set by the PATH environment variable), printing its full path. Providing the name of a command as its parameter will show the full path:
$ which ls
/bin/ls
You can also query the paths of multiple commands:
$ which ls cat
/bin/ls
/bin/cat
which returns a non-zero return value if the command could not be found.
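The return value makes this kind of lookup convenient in scripts. Because which is not standardized, the sketch below uses the shell's own command -v instead, which is a different mechanism but is specified by POSIX and sets its return value in the same spirit. The function name have_cmd and the deliberately bogus command name are assumptions made for this example:

```shell
# have_cmd NAME (hypothetical helper): succeeds if NAME resolves to
# something the shell can run.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

for c in ls no-such-command-xyz; do
    if have_cmd "$c"; then
        echo "$c: $(command -v "$c")"
    else
        echo "$c: not found"
    fi
done
```

Unlike which, command -v also reports shell builtins, functions, and aliases, so its output is not always a full path.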
The whereis command searches for the binaries, manual pages and sources of a command in some predefined places. For instance, the following command shows the path of ls and the ls(1) manual page:
$ whereis ls
ls: /bin/ls /usr/share/man/man1/ls.1.gz
Slackware Linux also provides the locate command that searches through a file database that can be generated periodically with the updatedb command. Since it uses a prebuilt database of the filesystem, it is a lot faster than find, especially when directory entry information has not been cached yet. However, the locate/updatedb combo has some downsides:
New files are not part of the database until the next updatedb invocation.
locate has no conception of permissions, so users may locate files that are normally hidden to them.
A newer implementation, named slocate deals with permissions, but requires elevated privileges. This is the locate variation that is included with Slackware Linux.
With filesystems becoming faster, and by applying common sense when formulating find queries, locate does not really seem worth the hassle. Of course, your mileage may vary. That said, the basic usage of locate is locate filename. For example:
$ locate locate
/usr/bin/locate
/usr/lib/locate
/usr/lib/locate/bigram
/usr/lib/locate/code
/usr/lib/locate/frcode
[...]
Sooner or later a GNU/Linux user will encounter tar archives. tar is the standard format for archiving files on GNU/Linux. It is often used in conjunction with gzip or bzip2. Both commands can compress files and archives. Table 8.6, “Archive file extensions” lists frequently used archive extensions, and what they mean.
Table 8.6. Archive file extensions
| Extension | Meaning |
|---|---|
| .tar | An uncompressed tar archive |
| .tar.gz | A tar archive compressed with gzip |
| .tgz | A tar archive compressed with gzip |
| .tar.bz2 | A tar archive compressed with bzip2 |
| .tbz | A tar archive compressed with bzip2 |
The difference between bzip2 and gzip is that bzip2 can find repeating information in larger blocks, resulting in better compression. But bzip2 is also a lot slower, because it does more data analysis.
Since much software and data in the GNU/Linux world is archived with tar, it is important to get used to extracting tar archives. The first thing you will often want to do when you receive a tar archive is to list its contents. This can be achieved by using the t parameter. However, if we just execute tar with this parameter and the name of the archive, it will just sit and wait until you enter something on the standard input:
$ tar t test.tar
This happens because tar reads data from its standard input. If you forgot how redirection works, it is a good idea to reread Section 7.7, “Redirections and pipes”. Let's see what happens if we redirect our tar archive to tar:
$ tar t < test.tar
test/
test/test2
test/test1
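That looks more like the output you probably expected. This archive seems to contain a directory test, which contains the files test1 and test2. Redirecting the archive every time is not very convenient, so tar can also read the archive from a file that is specified with the f parameter: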
$ tar tf test.tar
test/
test/test2
test/test1
This looks like an archive that contains useful files ;). We
can now go ahead, and extract this archive by using the
x parameter:
$ tar xf test.tar
We can now verify that tar really extracted the archive by listing the contents of the directory with ls:
$ ls test/
test1 test2
Extracting or listing files from a gzipped or bzipped archive is not much more difficult. This can be done by adding a z or j for archives compressed with gzip or bzip2, respectively. For example, we can list the contents of a gzipped archive with:
$ tar ztf archive2.tar.gz
And a bzipped archive can be extracted with:
$ tar jxf archive3.tar.bz2
You can create archives with the c parameter. Suppose that we have the directory test from the previous example. We can create an archive of the test directory and the files in it with:
$ tar cf important-files.tar test
This will create the important-files.tar archive (which is specified with the f parameter). We can now verify the archive:
$ tar tf important-files.tar
test/
test/test2
test/test1
Creating a gzipped or bzipped archive goes along the same lines as extracting compressed archives: add a z for gzipping an archive, or j for bzipping an archive. Suppose that we wanted to create a gzip compressed version of the archive created above. We can do this with:
$ tar zcf important-files.tar.gz test
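The create, list, and extract steps can be combined into a small round trip that is safe to run anywhere. This sketch works in a temporary directory; the file contents and the out directory are assumptions made for the example, and tar is assumed to have gzip support, as GNU tar does:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/test"
echo one > "$tmp/test/test1"
echo two > "$tmp/test/test2"

# Create a gzip-compressed archive of the test directory, then list it.
(cd "$tmp" && tar zcf important-files.tar.gz test)
listing=$(cd "$tmp" && tar ztf important-files.tar.gz | sort)

# Extract into a separate directory and read one of the files back.
mkdir "$tmp/out"
(cd "$tmp/out" && tar zxf ../important-files.tar.gz)
content=$(cat "$tmp/out/test/test1")

echo "$listing"
echo "$content"

rm -rf "$tmp"
```

Extracting into a separate directory is a good habit with archives from other people, because tar overwrites existing files with the same name.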
Like most Unices, Linux uses a technique named “mounting” to access filesystems. Mounting means that a filesystem is connected to a directory in the root filesystem. One could for example mount a CD-ROM drive to the /mnt/cdrom directory.
The mount command is used to mount filesystems. The basic syntax is: “mount /dev/devname /mountpoint”. The device name can be any block device, like hard disks or CD-ROM drives. The mount point can be an arbitrary point in the root filesystem. Let's look at an example:
# mount /dev/cdrom /mnt/cdrom
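This mounts /dev/cdrom on /mnt/cdrom. The /dev/cdrom device name is normally a link to the real CD-ROM device (for example, /dev/hdc). Sometimes it is necessary to specify the type of the filesystem that you are trying to mount. This can be done with the -t parameter: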
# mount -t vfat /dev/sda1 /mnt/flash
This mounts the vfat filesystem on /dev/sda1 to /mnt/flash.
The umount command is used to unmount filesystems. umount accepts two kinds of parameters, mount points or devices. For example:
# umount /mnt/cdrom
# umount /dev/sda1
The first command unmounts the filesystem that was mounted on /mnt/cdrom, the second command unmounts the filesystem on /dev/sda1.
The GNU/Linux system has a special file, /etc/fstab, that describes which filesystems should be mounted and where. Let's look at an example:
/dev/hda10 swap swap defaults 0 0
/dev/hda5 / xfs defaults 1 1
/dev/hda6 /var xfs defaults 1 2
/dev/hda7 /tmp xfs defaults 1 2
/dev/hda8 /home xfs defaults 1 2
/dev/hda9 /usr xfs defaults 1 2
/dev/cdrom /mnt/cdrom iso9660 noauto,owner,ro 0 0
/dev/fd0 /mnt/floppy auto noauto,owner 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
proc /proc proc defaults 0 0
As you can see, each entry in the fstab file consists of six fields: fs_spec, fs_file, fs_vfstype, fs_mntops, fs_freq, and fs_passno.
The fs_spec option specifies the block device, or remote
filesystem that should be mounted. As you can see in the
example several /dev/hda partitions are specified, as well
as the CD-ROM drive and floppy drive. When NFS volumes are
mounted an IP address and directory can be specified, for
example: 192.168.1.10:/exports/data
fs_file specifies the mount point. This can be an arbitrary directory in the filesystem.
The fs_vfstype option specifies what kind of filesystem the entry represents. For example this can be: ext2, ext3, reiserfs, xfs, nfs, vfat, or ntfs.
The fs_mntops option specifies which parameters should be used for mounting the filesystem. The mount manual page has an extensive description of the available options. These are the most interesting options:
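noauto: filesystems that are listed in /etc/fstab are normally mounted automatically at boot. With the “noauto” option, a filesystem is only mounted when a mount command is issued for it.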
user: adding the “user” option will allow normal users to mount the filesystem (normally only the superuser is allowed to mount filesystems).
owner: the “owner” option will allow the owner of the specified device to mount the specified device. You can see the owner of a device using ls, e.g. ls -l /dev/cdrom.
noexec: with this option enabled users can not run files from the mounted filesystem. This can be used to provide more security.
nosuid: this option is comparable to the “noexec” option. With “nosuid” enabled, SUID bits on files on the filesystem will not be honored. SUID is used for certain binaries to allow a normal user to do something privileged. This is certainly a security threat, so this option should really be used for removable media, etc. A normal user mount will force the nosuid option, but a mount by the superuser will not!
unhide: this option is only relevant for normal CD-ROMs with the ISO9660 filesystem. If “unhide” is specified hidden files will also be visible.
If the “fs_freq” is set to 1 or higher, it specifies after how many days a filesystem dump (backup) has to be made. This option is only used when dump is installed, and set up correctly to handle this.
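Because the fields are separated by whitespace, fstab entries are easy to inspect with awk. The following sketch lists the mount points of all entries that carry the noauto option. It works on a few sample lines copied from the example above rather than on the real /etc/fstab, and the variable names are assumptions made for this example:

```shell
# Sample entries taken from the example fstab shown earlier.
fstab_sample='/dev/hda5 / xfs defaults 1 1
/dev/hda6 /var xfs defaults 1 2
/dev/cdrom /mnt/cdrom iso9660 noauto,owner,ro 0 0
/dev/fd0 /mnt/floppy auto noauto,owner 0 0'

# Field 2 is the mount point (fs_file), field 4 the options (fs_mntops).
# The options are split on commas, so that only the exact word noauto counts.
noauto_mounts=$(printf '%s\n' "$fstab_sample" | awk '
    $1 !~ /^#/ {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "noauto")
                print $2
    }')

echo "$noauto_mounts"
```

To run the same check on a real system, the awk program can read /etc/fstab directly instead of the sample text.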
There are two security mechanisms for securing files: signing files and encrypting files. Signing a file means that a special digital signature is generated for the file. You, or other persons, can use the signature to verify the integrity of the file. File encryption encodes a file in such a way that only the person for whom the file was intended can read it.
This system relies on two keys: the private key and the public key. Public keys are used to encrypt files, and files can only be decrypted with the private key. This means that one can send one's public key out to other persons. Others can use this key to send encrypted files that only the person with the private key can decode. Of course, this means that the security of this system depends on how well the private key is kept secret.
Slackware Linux provides an excellent tool for signing and encrypting files, named GnuPG. GnuPG can be installed from the “n” disk set.