CAPABILITIES(7)            Linux Programmer's Manual           CAPABILITIES(7)
NAME
       capabilities - overview of Linux capabilities

DESCRIPTION
       For  the  purpose  of  performing  permission  checks, traditional Unix
       implementations distinguish two  categories  of  processes:  privileged
       processes  (whose  effective  user ID is 0, referred to as superuser or
       root), and unprivileged processes (whose effective  UID  is  non-zero).
       Privileged processes bypass all kernel permission checks, while unpriv-
       ileged processes are subject to full permission checking based  on  the
       process's  credentials (usually: effective UID, effective GID, and sup-
       plementary group list).

       Starting with kernel 2.2, Linux divides  the  privileges  traditionally
       associated  with  superuser into distinct units, known as capabilities,
       which can be independently enabled and disabled.   Capabilities  are  a
       per-thread attribute.

   Capabilities List
       As at Linux 2.6.14, the following capabilities are implemented:

       CAP_AUDIT_CONTROL (since Linux 2.6.11)
              Enable  and  disable  kernel  auditing;  change  auditing filter
              rules; retrieve auditing status and filtering rules.

       CAP_AUDIT_WRITE (since Linux 2.6.11)
              Allow records to be written to kernel auditing log.

       CAP_CHOWN
              Allow arbitrary changes to file UIDs and GIDs (see chown(2)).

       CAP_DAC_OVERRIDE
              Bypass file read, write, and execute permission checks.  (DAC  =
              "discretionary access control".)

       CAP_DAC_READ_SEARCH
              Bypass  file  read permission checks and directory read and exe-
              cute permission checks.

       CAP_FOWNER
              Bypass permission checks on operations that normally require the
              file  system  UID  of  the  process to match the UID of the file
              (e.g., chmod(2), utime(2)), excluding those  operations  covered
              by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH; set extended file
              attributes (see chattr(1)) on arbitrary files; set Access
              Control Lists (ACLs) on arbitrary files; ignore directory sticky
              bit on file deletion; specify O_NOATIME for arbitrary  files  in
              open(2) and fcntl(2).

       CAP_FSETID
              Don't  clear  set-user-ID  and  set-group-ID bits when a file is
              modified; permit setting of the  set-group-ID  bit  for  a  file
              whose GID does not match the file system GID or any of the
              supplementary GIDs of the calling process.

       CAP_IPC_LOCK
              Permit memory  locking  (mlock(2),  mlockall(2),  mmap(2),  shm-
              ctl(2)).

       CAP_IPC_OWNER
              Bypass permission checks for operations on System V IPC objects.

       CAP_KILL
              Bypass permission checks  for  sending  signals  (see  kill(2)).
              This includes use of the KDSIGACCEPT ioctl.

       CAP_LEASE
              (Linux  2.4  onwards)   Allow  file  leases to be established on
              arbitrary files (see fcntl(2)).

       CAP_LINUX_IMMUTABLE
              Allow  setting  of  the  EXT2_APPEND_FL  and   EXT2_IMMUTABLE_FL
              extended file attributes (see chattr(1)).

       CAP_MKNOD
              (Linux  2.4  onwards)  Allow  creation  of  special  files using
              mknod(2).

       CAP_NET_ADMIN
              Allow various network-related operations (e.g.,  setting  privi-
              leged  socket options, enabling multicasting, interface configu-
              ration, modifying routing tables).

       CAP_NET_BIND_SERVICE
              Allow binding to Internet domain  reserved  socket  ports  (port
              numbers less than 1024).

       CAP_NET_BROADCAST
              (Unused)  Allow socket broadcasting, and listening to multicasts.

       CAP_NET_RAW
              Permit use of RAW and PACKET sockets.

       CAP_SETGID
              Allow  arbitrary manipulations of process GIDs and supplementary
              GID list; allow forged GID when passing socket  credentials  via
              Unix domain sockets.

       CAP_SETPCAP
              Grant  or  remove any capability in the caller's permitted capa-
              bility set to or from any other process.

       CAP_SETUID
              Allow  arbitrary  manipulations  of  process  UIDs   (setuid(2),
              setreuid(2),  setresuid(2),  setfsuid(2)); allow forged UID when
              passing socket credentials via Unix domain sockets.

       CAP_SYS_ADMIN
              Permit a range of system  administration  operations  including:
              quotactl(2),   mount(2),   umount(2),   swapon(2),   swapoff(2),
              sethostname(2), setdomainname(2), IPC_SET  and  IPC_RMID  opera-
              tions  on  arbitrary System V IPC objects; perform operations on
              trusted and security Extended  Attributes  (see  attr(5));  call
              lookup_dcookie(2);  use  ioprio_set(2) to assign IOPRIO_CLASS_RT
              and IOPRIO_CLASS_IDLE I/O scheduling classes; perform  keyctl(2)
              KEYCTL_CHOWN  and  KEYCTL_SETPERM  operations; allow forged UID
              when passing socket credentials;  exceed  /proc/sys/fs/file-max,
              the  system-wide  limit  on  the number of open files, in system
              calls that open  files  (e.g.,  accept(2),  execve(2),  open(2),
              pipe(2);  without  this  capability these system calls will fail
              with the error ENFILE if  this  limit  is  encountered);  employ
              the CLONE_NEWNS flag with clone(2) and unshare(2).

       CAP_SYS_BOOT
              Permit calls to reboot(2) and kexec_load(2).

       CAP_SYS_CHROOT
              Permit calls to chroot(2).

       CAP_SYS_MODULE
              Allow loading and unloading of kernel modules;  allow  modifica-
              tions   to  capability  bounding  set  (see  init_module(2)  and
              delete_module(2)).

       CAP_SYS_NICE
              Allow raising process nice value (nice(2),  setpriority(2))  and
              changing  of  the nice value for arbitrary processes; allow set-
              ting of real-time scheduling policies for calling  process,  and
              setting  scheduling  policies  and priorities for arbitrary pro-
              cesses  (sched_setscheduler(2),  sched_setparam(2));   set   CPU
              affinity for arbitrary processes (sched_setaffinity(2)); set I/O
              scheduling  class   and   priority   for   arbitrary   processes
              (ioprio_set(2));  allow  migrate_pages(2) to be applied to arbi-
              trary processes and allow processes to be migrated to  arbitrary
              nodes; allow move_pages(2) to be applied to arbitrary processes;
              use the MPOL_MF_MOVE_ALL flag with mbind(2) and move_pages(2).

       CAP_SYS_PACCT
              Permit calls to acct(2).

       CAP_SYS_PTRACE
              Allow arbitrary processes to be traced using ptrace(2).

       CAP_SYS_RAWIO
              Permit I/O  port  operations  (iopl(2)  and  ioperm(2));  access

       CAP_SYS_RESOURCE
              Permit:  use  of  reserved  space on ext2 file systems; ioctl(2)
              calls controlling ext3 journaling; disk quota limits to be over-
              ridden;  resource  limits  to  be  increased (see setrlimit(2));
              RLIMIT_NPROC resource limit to be overridden;  msg_qbytes  limit
              for   a   message   queue  to  be  raised  above  the  limit  in
              /proc/sys/kernel/msgmnb (see msgop(2) and msgctl(2)).

       CAP_SYS_TIME
              Allow modification of system clock  (settimeofday(2),  stime(2),
              adjtimex(2)); allow modification of real-time (hardware) clock.

       CAP_SYS_TTY_CONFIG
              Permit calls to vhangup(2).
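
       Within the kernel, each capability corresponds to one bit of a 32-bit
       mask.  The following Python sketch converts between capability names
       and such a mask.  The bit numbers are those assigned in
       include/linux/capability.h; the helper names cap_mask() and
       cap_names() are hypothetical and exist only for this illustration.

```python
# Bit numbers assigned to each capability in include/linux/capability.h
# (the 31 capabilities implemented as of Linux 2.6.14).
CAP_NUMBERS = {
    "CAP_CHOWN": 0, "CAP_DAC_OVERRIDE": 1, "CAP_DAC_READ_SEARCH": 2,
    "CAP_FOWNER": 3, "CAP_FSETID": 4, "CAP_KILL": 5, "CAP_SETGID": 6,
    "CAP_SETUID": 7, "CAP_SETPCAP": 8, "CAP_LINUX_IMMUTABLE": 9,
    "CAP_NET_BIND_SERVICE": 10, "CAP_NET_BROADCAST": 11,
    "CAP_NET_ADMIN": 12, "CAP_NET_RAW": 13, "CAP_IPC_LOCK": 14,
    "CAP_IPC_OWNER": 15, "CAP_SYS_MODULE": 16, "CAP_SYS_RAWIO": 17,
    "CAP_SYS_CHROOT": 18, "CAP_SYS_PTRACE": 19, "CAP_SYS_PACCT": 20,
    "CAP_SYS_ADMIN": 21, "CAP_SYS_BOOT": 22, "CAP_SYS_NICE": 23,
    "CAP_SYS_RESOURCE": 24, "CAP_SYS_TIME": 25, "CAP_SYS_TTY_CONFIG": 26,
    "CAP_MKNOD": 27, "CAP_LEASE": 28, "CAP_AUDIT_WRITE": 29,
    "CAP_AUDIT_CONTROL": 30,
}


def cap_mask(*names):
    """Return a mask with one bit set for each named capability."""
    mask = 0
    for name in names:
        mask |= 1 << CAP_NUMBERS[name]
    return mask


def cap_names(mask):
    """Return the names of the capabilities set in mask, lowest bit first."""
    ordered = sorted(CAP_NUMBERS.items(), key=lambda item: item[1])
    return [name for name, bit in ordered if mask & (1 << bit)]


if __name__ == "__main__":
    mask = cap_mask("CAP_NET_BIND_SERVICE", "CAP_NET_RAW")
    print(hex(mask))        # 0x2400
    print(cap_names(mask))  # ['CAP_NET_BIND_SERVICE', 'CAP_NET_RAW']
```

       Modeling a set as a plain integer keeps the later set operations
       (AND, OR) a direct transcription of the rules given on this page.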

   Capability Sets
       Each  thread  has  three capability sets containing zero or more of the
       above capabilities:

       Effective:
              the capabilities used by the kernel to perform permission checks
              for the thread.

       Permitted:
              the  capabilities  that  the thread may assume (i.e., a limiting
              superset for the effective and inheritable sets).  If  a  thread
              drops  a  capability  from  its  permitted set, it can never re-
              acquire that capability (unless it  exec()s  a  set-user-ID-root
              program).

       Inheritable:
              the capabilities preserved across an execve(2).

       A  child created via fork(2) inherits copies of its parent's capability
       sets.  See below for a discussion of the treatment of capabilities dur-
       ing exec().

       Using  capset(2),  a thread may manipulate its own capability sets, or,
       if it has the CAP_SETPCAP capability, those of a thread in another pro-
       cess.
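
       One way to inspect these sets without calling capget(2) is sketched
       below in Python.  It relies on an assumption that is not described on
       this page: that /proc/[pid]/status contains the fields CapInh, CapPrm,
       and CapEff, each a hexadecimal mask.  The function name
       parse_cap_sets() is hypothetical.

```python
def parse_cap_sets(status_text):
    """Extract the capability sets from the text of /proc/[pid]/status.

    Assumes the fields CapInh, CapPrm and CapEff hold hexadecimal masks.
    Returns a dict mapping set name to an integer mask.
    """
    fields = {
        "CapInh": "inheritable",
        "CapPrm": "permitted",
        "CapEff": "effective",
    }
    sets = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in fields:
            sets[fields[key]] = int(value.strip(), 16)
    return sets


if __name__ == "__main__":
    try:
        # /proc/self/status describes the main thread of this process.
        with open("/proc/self/status") as f:
            for name, mask in parse_cap_sets(f.read()).items():
                print("%-12s %#x" % (name, mask))
    except OSError as err:
        print("cannot read /proc/self/status:", err)
```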

   Capability bounding set
       When  a program is execed, the permitted and effective capabilities are
       ANDed with the current value of the so-called capability bounding  set,
       defined  in the file /proc/sys/kernel/cap-bound.  This parameter can be
       used to place a system-wide limit on the capabilities  granted  to  all
       subsequently  executed programs.  (Confusingly, this bit mask parameter
       is expressed as a signed decimal number in /proc/sys/kernel/cap-bound.)

       Only  the  init  process  may  set bits in the capability bounding set;
       other than that, the superuser may only clear bits in this set.

       On a standard system the capability bounding set always masks  out  the
       CAP_SETPCAP  capability.  To remove this restriction (dangerous!), mod-
       ify the definition of  CAP_INIT_EFF_SET  in  include/linux/capability.h
       and rebuild the kernel.
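
       Because the file shows the bit mask as a signed decimal number, a
       small conversion helps when reading or writing it.  The Python sketch
       below assumes a 32-bit mask and uses the value -257, which is what a
       mask with every bit set except CAP_SETPCAP (bit 8) looks like as a
       signed 32-bit integer.  The function names are hypothetical.

```python
CAP_SETPCAP = 8  # bit number from include/linux/capability.h


def bound_to_mask(text):
    """Convert the signed decimal text of cap-bound to a 32-bit mask."""
    return int(text) & 0xFFFFFFFF


def mask_to_bound(mask):
    """Convert a 32-bit mask to the signed decimal form shown in cap-bound."""
    mask &= 0xFFFFFFFF
    # Reinterpret the top bit as the sign bit of a 32-bit integer.
    return mask - (1 << 32) if mask & (1 << 31) else mask


if __name__ == "__main__":
    mask = bound_to_mask("-257")
    print(hex(mask))                          # 0xfffffeff
    print(bool(mask & (1 << CAP_SETPCAP)))    # False: CAP_SETPCAP masked out
    print(mask_to_bound(mask))                # -257
```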

       The  capability  bounding  set feature was added to Linux starting with
       kernel version 2.2.11.

   Current and Future Implementation
       A full implementation of capabilities requires:

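       1.  that for all privileged operations, the kernel check whether the
           thread has the required capability in its effective set.

       2.  that the kernel provide system calls allowing a thread's capability
           sets to be changed and retrieved.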

       3.  file  system  support  for  attaching capabilities to an executable
           file, so that a process gains those capabilities when the  file  is
           execed.

       As at Linux 2.6.14, only the first two of these requirements are met.

       Eventually,  it  should  be possible to associate three capability sets
       with an executable file, which, in conjunction with the capability sets
       of  the  thread,  will  determine the capabilities of a thread after an
       exec():

       Inheritable (formerly known as allowed):
              this set is ANDed with the thread's inheritable set to determine
              which inheritable capabilities are permitted to the thread after
              the exec().

       Permitted (formerly known as forced):
              the capabilities automatically permitted to the thread,  regard-
              less of the thread's inheritable capabilities.

       Effective:
              those capabilities in the thread's new permitted set are also to
              be set in the new effective set.  (F(effective)  would  normally
              be either all zeroes or all ones.)

       In the meantime, since the current implementation does not support file
       capability sets, during an exec():

       1.  All three file capability sets are initially assumed to be cleared.

       2.  If  a set-user-ID-root program is being execed, or the real user ID
           of the process is 0 (root) then the file inheritable and  permitted
           sets are defined to be all ones (i.e., all capabilities enabled).

       3.  If  a  set-user-ID-root  program  is  being executed, then the file
           effective set is defined to be all ones.
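
       The three steps above can be sketched as a small Python function.
       This is an illustration only, following the steps exactly as listed:
       file_cap_sets() and ALL_CAPS are hypothetical names, and each
       capability set is modeled as a 32-bit integer.

```python
ALL_CAPS = 0xFFFFFFFF  # "all ones" for a 32-bit capability set


def file_cap_sets(is_setuid_root, real_uid):
    """Return the (inheritable, permitted, effective) file sets for exec().

    is_setuid_root: True if the program being execed is set-user-ID-root.
    real_uid: real user ID of the process calling exec().
    """
    # Step 1: all three file sets start out cleared.
    inheritable = permitted = effective = 0
    # Step 2: set-user-ID-root program, or real UID 0.
    if is_setuid_root or real_uid == 0:
        inheritable = permitted = ALL_CAPS
    # Step 3: set-user-ID-root program.
    if is_setuid_root:
        effective = ALL_CAPS
    return inheritable, permitted, effective


if __name__ == "__main__":
    for setuid_root, ruid in [(True, 1000), (False, 0), (False, 1000)]:
        sets = file_cap_sets(setuid_root, ruid)
        print(setuid_root, ruid, [hex(s) for s in sets])
```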

   Transformation of Capabilities During exec()
       During an exec(), the kernel calculates the  new  capabilities  of  the
       process using the following algorithm:

           P'(permitted) = (P(inheritable) & F(inheritable)) |
                           (F(permitted) & cap_bset)

           P'(effective) = P'(permitted) & F(effective)

           P'(inheritable) = P(inheritable)    [i.e., unchanged]


       where:

       P         denotes  the  value  of  a  thread  capability set before the
                 exec()

       P'        denotes the value of a capability set after the exec()

       F         denotes a file capability set

       cap_bset  is the value of the capability bounding set.
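
       The algorithm translates directly into bit operations.  A minimal
       Python sketch follows; exec_transform() is a hypothetical name,
       capability sets are modeled as integers, and the bounding set value
       0xfffffeff used in the example assumes that every bit except
       CAP_SETPCAP (bit 8) is set.

```python
def exec_transform(p_inheritable, f_inheritable, f_permitted,
                   f_effective, cap_bset):
    """Compute the thread's capability sets after an exec().

    Returns (new_permitted, new_effective, new_inheritable).
    """
    new_permitted = ((p_inheritable & f_inheritable) |
                     (f_permitted & cap_bset))
    new_effective = new_permitted & f_effective
    new_inheritable = p_inheritable  # unchanged by exec()
    return new_permitted, new_effective, new_inheritable


if __name__ == "__main__":
    ALL = 0xFFFFFFFF
    BSET = 0xFFFFFEFF  # assumed bounding set: all bits except CAP_SETPCAP

    # A thread with an empty inheritable set execs a program whose file
    # sets are all ones (the set-user-ID-root case).
    permitted, effective, inheritable = exec_transform(0, ALL, ALL, ALL, BSET)
    print(hex(permitted), hex(effective), hex(inheritable))
    # 0xfffffeff 0xfffffeff 0x0
```

       Note that the bounding set masks only the F(permitted) term: a
       capability already present in P(inheritable) passes through when the
       file's inheritable set allows it.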

       In the current implementation, the upshot of  this  algorithm  is  that
       when  a  process  exec()s a set-user-ID-root program, or when a process
       with an effective UID of 0 exec()s a program, it gains all capabilities
       in its permitted and effective capability sets, except those masked out
       by the capability bounding  set  (i.e.,  CAP_SETPCAP).   This  provides
       semantics  that are the same as those provided by traditional Unix sys-
       tems.

   Effect of User ID Changes on Capabilities
       To preserve the traditional semantics for  transitions  between  0  and
       non-zero user IDs, the kernel makes the following changes to a thread's
       capability sets on changes to the thread's real, effective, saved  set,
       and file system user IDs (using setuid(2), setresuid(2), or similar):

       1.  If  one  or  more  of the real, effective or saved set user IDs was
           previously 0, and as a result of the UID changes all of  these  IDs
           have  a  non-zero value, then all capabilities are cleared from the
           permitted and effective capability sets.

       2.  If the effective user ID is changed from 0 to  non-zero,  then  all
           capabilities are cleared from the effective set.

       3.  If  the  effective  user ID is changed from non-zero to 0, then the
           permitted set is copied to the effective set.

       4.  If the file system user ID is changed from 0 to non-zero (see setf-
           suid(2))  then  the  following  capabilities  are  cleared from the
           effective set:  CAP_CHOWN,  CAP_DAC_OVERRIDE,  CAP_DAC_READ_SEARCH,
           CAP_FOWNER, and CAP_FSETID.  If the file system UID is changed from
           non-zero to 0, then any of these capabilities that are  enabled  in
           the permitted set are enabled in the effective set.

       If a thread that has a 0 value for one or more of its user IDs wants to
       prevent its permitted capability set being cleared when it  resets  all
       of  its  user  IDs  to  non-zero values, it can do so using the prctl()
       PR_SET_KEEPCAPS operation.
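
       The rules above can be modeled in a few lines of Python.  This is a
       sketch under stated assumptions, not a description of the kernel
       code: the function names are hypothetical, capability sets are plain
       integers, and the keepcaps flag is assumed to skip rule 1 entirely
       (the text above promises only that the permitted set survives).

```python
# Bits 0-4: CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_DAC_READ_SEARCH,
# CAP_FOWNER and CAP_FSETID (numbers from include/linux/capability.h).
FS_CAPS = 0x1F


def on_uid_change(old_ids, new_ids, permitted, effective, keepcaps=False):
    """Apply rules 1-3; old_ids and new_ids are (real, effective, saved)."""
    old_euid, new_euid = old_ids[1], new_ids[1]
    # Rule 1: at least one ID was 0 and now all are non-zero.
    # Assumption: with keepcaps set, this rule is skipped.
    if 0 in old_ids and 0 not in new_ids and not keepcaps:
        permitted = effective = 0
    # Rule 2: effective UID goes from 0 to non-zero.
    if old_euid == 0 and new_euid != 0:
        effective = 0
    # Rule 3: effective UID goes from non-zero to 0.
    if old_euid != 0 and new_euid == 0:
        effective = permitted
    return permitted, effective


def on_fsuid_change(old_fsuid, new_fsuid, permitted, effective):
    """Apply rule 4 and return the new effective set."""
    if old_fsuid == 0 and new_fsuid != 0:
        effective &= ~FS_CAPS
    elif old_fsuid != 0 and new_fsuid == 0:
        effective |= permitted & FS_CAPS
    return effective


if __name__ == "__main__":
    caps = 0xFFFFFEFF
    root, user = (0, 0, 0), (1000, 1000, 1000)
    print(on_uid_change(root, user, caps, caps))                 # (0, 0)
    print(on_uid_change(root, user, caps, caps, keepcaps=True))  # permitted kept
    print(hex(on_fsuid_change(0, 1000, caps, caps)))             # 0xfffffee0
```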

NOTES
       The libcap package provides a suite of routines for setting and getting
       capabilities  that  is  more comfortable and less likely to change than
       the interface provided by capset(2) and capget(2).

CONFORMING TO
       No standards govern capabilities, but the Linux capability  implementa-
       tion is based on the withdrawn POSIX.1e draft standard.

BUGS
       There  is  as  yet  no  file system support allowing capabilities to be
       associated with executable files.

SEE ALSO
       capget(2), prctl(2), setfsuid(2), pthreads(7)

Linux 2.6.18                      2006-07-31                   CAPABILITIES(7)