Theory Unix

theory Unix
imports Nested_Environment Sublist
(*  Title:      HOL/Unix/Unix.thy
    Author:     Markus Wenzel, TU Muenchen

section ‹Unix file-systems \label{sec:unix-file-system}›

theory Unix

text ‹
  We give a simple mathematical model of the basic structures underlying the
  Unix file-system, together with a few fundamental operations that could be
  imagined to be performed internally by the Unix kernel. This forms the basis
  for the set of Unix system-calls to be introduced later (see
  \secref{sec:unix-trans}), which are the actual interface offered to
  processes running in user-space.

  Basically, any Unix file is either a ∗‹plain file› or a ∗‹directory›,
  consisting of some ∗‹content› plus ∗‹attributes›. The content of a plain
  file is plain text. The content of a directory is a mapping from names to
  further files.⁋‹In fact, this is the only way that names get associated with
  files. In Unix files do ∗‹not› have a name in itself. Even more, any number
  of names may be associated with the very same file due to ∗‹hard links›
  (although this is excluded from our model).› Attributes include information
  to control various ways to access the file (read, write etc.).

  Our model will be quite liberal in omitting excessive detail that is easily
  seen to be ``irrelevant'' for the aspects of Unix file-systems to be
  discussed here. First of all, we ignore character and block special files,
  pipes, sockets, hard links, symbolic links, and mount points.

subsection ‹Names›

text ‹
  User ids and file name components shall be represented by natural numbers
  (without loss of generality). We do not bother about encoding of actual
  names (e.g.\ strings), nor a mapping between user names and user ids as
  would be present in a reality.

type_synonym uid = nat
type_synonym name = nat
type_synonym path = "name list"

subsection ‹Attributes›

text ‹
  Unix file attributes mainly consist of ∗‹owner› information and a number of
  ∗‹permission› bits which control access for ``user'', ``group'', and
  ``others'' (see the Unix man pages ‹chmod(2)› and ‹stat(2)› for more

  Our model of file permissions only considers the ``others'' part. The
  ``user'' field may be omitted without loss of overall generality, since the
  owner is usually able to change it anyway by performing ‹chmod›.⁋‹The
  inclined Unix expert may try to figure out some exotic arrangements of a
  real-world Unix file-system such that the owner of a file is unable to apply
  the ‹chmod› system call.› We omit ``group'' permissions as a genuine
  simplification as we just do not intend to discuss a model of multiple
  groups and group membership, but pretend that everyone is member of a single
  global group.⁋‹A general HOL model of user group structures and related
  issues is given in @{cite "Naraschewski:2001"}.›

datatype perm =
  | Writable
  | Executable     "(ignored)"

type_synonym perms = "perm set"

record att =
  owner :: uid
  others :: perms

text ‹
  For plain files @{term Readable} and @{term Writable} specify read and write
  access to the actual content, i.e.\ the string of text stored here. For
  directories @{term Readable} determines if the set of entry names may be
  accessed, and @{term Writable} controls the ability to create or delete any
  entries (both plain files or sub-directories).

  As another simplification, we ignore the @{term Executable} permission
  altogether. In reality it would indicate executable plain files (also known
  as ``binaries''), or control actual lookup of directory entries (recall that
  mere directory browsing is controlled via @{term Readable}). Note that the
  latter means that in order to perform any file-system operation whatsoever,
  all directories encountered on the path would have to grant @{term
  Executable}. We ignore this detail and pretend that all directories give
  @{term Executable} permission to anybody.

subsection ‹Files›

text ‹
  In order to model the general tree structure of a Unix file-system we use
  the arbitrarily branching datatype @{typ "('a, 'b, 'c) env"} from the
  standard library of Isabelle/HOL @{cite "Bauer-et-al:2002:HOL-Library"}.
  This type provides constructors @{term Val} and @{term Env} as follows:

  @{term [source] "Val :: 'a ⇒ ('a, 'b, 'c) env"} \\
  @{term [source] "Env :: 'b ⇒ ('c ⇒ ('a, 'b, 'c) env option) ⇒ ('a, 'b, 'c) env"} \\

  Here the parameter @{typ 'a} refers to plain values occurring at leaf
  positions, parameter @{typ 'b} to information kept with inner branch nodes,
  and parameter @{typ 'c} to the branching type of the tree structure. For our
  purpose we use the type instance with @{typ "att × string"} (representing
  plain files), @{typ att} (for attributes of directory nodes), and @{typ
  name} (for the index type of directory nodes).

type_synonym "file" = "(att × string, att, name) env"

text ‹
  The HOL library also provides @{term lookup} and @{term update} operations
  for general tree structures with the subsequent primitive recursive

  @{term [source] "lookup :: ('a, 'b, 'c) env ⇒ 'c list ⇒ ('a, 'b, 'c) env option"} \\
  @{term [source]
  "update :: 'c list ⇒ ('a, 'b, 'c) env option ⇒ ('a, 'b, 'c) env ⇒ ('a, 'b, 'c) env"} \\

  @{thm [display, indent = 2, eta_contract = false] lookup_eq [no_vars]}
  @{thm [display, indent = 2, eta_contract = false] update_eq [no_vars]}

  Several further properties of these operations are proven in @{cite
  "Bauer-et-al:2002:HOL-Library"}. These will be routinely used later on
  without further notice.

  Apparently, the elements of type @{typ "file"} contain an @{typ att}
  component in either case. We now define a few auxiliary operations to
  manipulate this field uniformly, following the conventions for record types
  in Isabelle/HOL @{cite "Nipkow-et-al:2000:HOL"}.

  "attributes file =
    (case file of
      Val (att, text) ⇒ att
    | Env att dir ⇒ att)"

  "map_attributes f file =
    (case file of
      Val (att, text) ⇒ Val (f att, text)
    | Env att dir ⇒ Env (f att) dir)"

lemma [simp]: "attributes (Val (att, text)) = att"
  by (simp add: attributes_def)

lemma [simp]: "attributes (Env att dir) = att"
  by (simp add: attributes_def)

lemma [simp]: "attributes (map_attributes f file) = f (attributes file)"
  by (cases "file") (simp_all add: attributes_def map_attributes_def

lemma [simp]: "map_attributes f (Val (att, text)) = Val (f att, text)"
  by (simp add: map_attributes_def)

lemma [simp]: "map_attributes f (Env att dir) = Env (f att) dir"
  by (simp add: map_attributes_def)

subsection ‹Initial file-systems›

text ‹
  Given a set of ∗‹known users› a file-system shall be initialized by
  providing an empty home directory for each user, with read-only access for
  everyone else. (Note that we may directly use the user id as home directory
  name, since both types have been identified.) Certainly, the very root
  directory is owned by the super user (who has user id 0).

  "init users =
    Env ⦇owner = 0, others = {Readable}⦈
      (λu. if u ∈ users then Some (Env ⦇owner = u, others = {Readable}⦈ empty)
        else None)"

subsection ‹Accessing file-systems›

text ‹
  The main internal file-system operation is access of a file by a user,
  requesting a certain set of permissions. The resulting @{typ "file option"}
  indicates if a file had been present at the corresponding @{typ path} and if
  access was granted according to the permissions recorded within the

  Note that by the rules of Unix file-system security (e.g.\ @{cite
  "Tanenbaum:1992"}) both the super-user and owner may always access a file
  unconditionally (in our simplified model).

  "access root path uid perms =
    (case lookup root path of
      None ⇒ None
    | Some file ⇒
        if uid = 0
          ∨ uid = owner (attributes file)
          ∨ perms ⊆ others (attributes file)
        then Some file
        else None)"

text ‹
  Successful access to a certain file is the main prerequisite for
  system-calls to be applicable (cf.\ \secref{sec:unix-trans}). Any
  modification of the file-system is then performed using the basic @{term
  update} operation.

  We see that @{term access} is just a wrapper for the basic @{term lookup}
  function, with additional checking of attributes. Subsequently we establish
  a few auxiliary facts that stem from the primitive @{term lookup} used
  within @{term access}.

lemma access_empty_lookup: "access root path uid {} = lookup root path"
  by (simp add: access_def split: option.splits)

lemma access_some_lookup:
  "access root path uid perms = Some file ⟹
    lookup root path = Some file"
  by (simp add: access_def split: option.splits if_splits)

lemma access_update_other:
  assumes parallel: "path' ∥ path"
  shows "access (update path' opt root) path uid perms = access root path uid perms"
proof -
  from parallel obtain y z xs ys zs where
      "y ≠ z" and "path' = xs @ y # ys" and "path = xs @ z # zs"
    by (blast dest: parallel_decomp)
  then have "lookup (update path' opt root) path = lookup root path"
    by (blast intro: lookup_update_other)
  then show ?thesis by (simp only: access_def)

section ‹File-system transitions \label{sec:unix-trans}›

subsection ‹Unix system calls \label{sec:unix-syscall}›

text ‹
  According to established operating system design (cf.\ @{cite
  "Tanenbaum:1992"}) user space processes may only initiate system operations
  by a fixed set of ∗‹system-calls›. This enables the kernel to enforce
  certain security policies in the first place.⁋‹Incidently, this is the very
  same principle employed by any ``LCF-style'' theorem proving system
  according to Milner's principle of ``correctness by construction'', such as
  Isabelle/HOL itself.›

  In our model of Unix we give a fixed datatype ‹operation› for the syntax of
  system-calls, together with an inductive definition of file-system state
  transitions of the form ‹root ─x→ root'› for the operational semantics.

datatype operation =
    Read uid string path
  | Write uid string path
  | Chmod uid perms path
  | Creat uid perms path
  | Unlink uid path
  | Mkdir uid perms path
  | Rmdir uid path
  | Readdir uid "name set" path

text ‹
  The @{typ uid} field of an operation corresponds to the ∗‹effective user id›
  of the underlying process, although our model never mentions processes
  explicitly. The other parameters are provided as arguments by the caller;
  the @{term path} one is common to all kinds of system-calls.

primrec uid_of :: "operation ⇒ uid"
    "uid_of (Read uid text path) = uid"
  | "uid_of (Write uid text path) = uid"
  | "uid_of (Chmod uid perms path) = uid"
  | "uid_of (Creat uid perms path) = uid"
  | "uid_of (Unlink uid path) = uid"
  | "uid_of (Mkdir uid path perms) = uid"
  | "uid_of (Rmdir uid path) = uid"
  | "uid_of (Readdir uid names path) = uid"

primrec path_of :: "operation ⇒ path"
    "path_of (Read uid text path) = path"
  | "path_of (Write uid text path) = path"
  | "path_of (Chmod uid perms path) = path"
  | "path_of (Creat uid perms path) = path"
  | "path_of (Unlink uid path) = path"
  | "path_of (Mkdir uid perms path) = path"
  | "path_of (Rmdir uid path) = path"
  | "path_of (Readdir uid names path) = path"

text ‹
  Note that we have omitted explicit ‹Open› and ‹Close› operations, pretending
  that @{term Read} and @{term Write} would already take care of this behind
  the scenes. Thus we have basically treated actual sequences of real
  system-calls ‹open›--‹read›/‹write›--‹close› as atomic.

  In principle, this could make big a difference in a model with explicit
  concurrent processes. On the other hand, even on a real Unix system the
  exact scheduling of concurrent ‹open› and ‹close› operations does ∗‹not›
  directly affect the success of corresponding ‹read› or ‹write›. Unix allows
  several processes to have files opened at the same time, even for writing!
  Certainly, the result from reading the contents later may be hard to
  predict, but the system-calls involved here will succeed in any case.

  The operational semantics of system calls is now specified via transitions
  of the file-system configuration. This is expressed as an inductive relation
  (although there is no actual recursion involved here).

inductive transition :: "file ⇒ operation ⇒ file ⇒ bool"
  ("_ ─_→ _" [90, 1000, 90] 100)
      "root ─(Read uid text path)→ root"
      if "access root path uid {Readable} = Some (Val (att, text))"
  | "write":
      "root ─(Write uid text path)→ update path (Some (Val (att, text))) root"
      if "access root path uid {Writable} = Some (Val (att, text'))"
  | chmod:
      "root ─(Chmod uid perms path)→
        update path (Some (map_attributes (others_update (λ_. perms)) file)) root"
      if "access root path uid {} = Some file" and "uid = 0 ∨ uid = owner (attributes file)"
  | creat:
      "root ─(Creat uid perms path)→
        update path (Some (Val (⦇owner = uid, others = perms⦈, []))) root"
      if "path = parent_path @ [name]"
      and "access root parent_path uid {Writable} = Some (Env att parent)"
      and "access root path uid {} = None"
  | unlink:
      "root ─(Unlink uid path)→ update path None root"
      if "path = parent_path @ [name]"
      and "access root parent_path uid {Writable} = Some (Env att parent)"
      and "access root path uid {} = Some (Val plain)"
  | mkdir:
      "root ─(Mkdir uid perms path)→
        update path (Some (Env ⦇owner = uid, others = perms⦈ empty)) root"
      if "path = parent_path @ [name]"
      and "access root parent_path uid {Writable} = Some (Env att parent)"
      and "access root path uid {} = None"
  | rmdir:
      "root ─(Rmdir uid path)→ update path None root"
      if "path = parent_path @ [name]"
      and "access root parent_path uid {Writable} = Some (Env att parent)"
      and "access root path uid {} = Some (Env att' empty)"
  | readdir:
      "root ─(Readdir uid names path)→ root"
      if "access root path uid {Readable} = Some (Env att dir)"
      and "names = dom dir"

text ‹
  Certainly, the above specification is central to the whole formal
  development. Any of the results to be established later on are only
  meaningful to the outside world if this transition system provides an
  adequate model of real Unix systems. This kind of ``reality-check'' of a
  formal model is the well-known problem of ∗‹validation›.

  If in doubt, one may consider to compare our definition with the informal
  specifications given the corresponding Unix man pages, or even peek at an
  actual implementation such as @{cite "Torvalds-et-al:Linux"}. Another common
  way to gain confidence into the formal model is to run simple simulations
  (see \secref{sec:unix-examples}), and check the results with that of
  experiments performed on a real Unix system.

subsection ‹Basic properties of single transitions \label{sec:unix-single-trans}›

text ‹
  The transition system ‹root ─x→ root'› defined above determines a unique
  result @{term root'} from given @{term root} and @{term x} (this is holds
  rather trivially, since there is even only one clause for each operation).
  This uniqueness statement will simplify our subsequent development to some
  extent, since we only have to reason about a partial function rather than a
  general relation.

theorem transition_uniq:
  assumes root': "root ─x→ root'"
    and root'': "root ─x→ root''"
  shows "root' = root''"
  using root''
proof cases
  case read
  with root' show ?thesis by cases auto
  case "write"
  with root' show ?thesis by cases auto
  case chmod
  with root' show ?thesis by cases auto
  case creat
  with root' show ?thesis by cases auto
  case unlink
  with root' show ?thesis by cases auto
  case mkdir
  with root' show ?thesis by cases auto
  case rmdir
  with root' show ?thesis by cases auto
  case readdir
  with root' show ?thesis by cases blast+

text ‹
  Apparently, file-system transitions are ∗‹type-safe› in the sense that the
  result of transforming an actual directory yields again a directory.

theorem transition_type_safe:
  assumes tr: "root ─x→ root'"
    and inv: "∃att dir. root = Env att dir"
  shows "∃att dir. root' = Env att dir"
proof (cases "path_of x")
  case Nil
  with tr inv show ?thesis
    by cases (auto simp add: access_def split: if_splits)
  case Cons
  from tr obtain opt where
      "root' = root ∨ root' = update (path_of x) opt root"
    by cases auto
  with inv Cons show ?thesis
    by (auto simp add: update_eq split: list.splits)

text ‹
  The previous result may be seen as the most basic invariant on the
  file-system state that is enforced by any proper kernel implementation. So
  user processes --- being bound to the system-call interface --- may never
  mess up a file-system such that the root becomes a plain file instead of a
  directory, which would be a strange situation indeed.

subsection ‹Iterated transitions›

text ‹
  Iterated system transitions via finite sequences of system operations are
  modeled inductively as follows. In a sense, this relation describes the
  cumulative effect of the sequence of system-calls issued by a number of
  running processes over a finite amount of time.

inductive transitions :: "file ⇒ operation list ⇒ file ⇒ bool"  ("_ ═_⇒ _" [90, 1000, 90] 100)
    nil: "root ═[]⇒ root"
  | cons: "root ═(x # xs)⇒ root''" if "root ─x→ root'" and "root' ═xs⇒ root''"

text ‹
  We establish a few basic facts relating iterated transitions with single
  ones, according to the recursive structure of lists.

lemma transitions_nil_eq: "root ═[]⇒ root' ⟷ root = root'"
  assume "root ═[]⇒ root'"
  then show "root = root'" by cases simp_all
  assume "root = root'"
  then show "root ═[]⇒ root'" by (simp only: transitions.nil)

lemma transitions_cons_eq:
  "root ═(x # xs)⇒ root'' ⟷ (∃root'. root ─x→ root' ∧ root' ═xs⇒ root'')"
  assume "root ═(x # xs)⇒ root''"
  then show "∃root'. root ─x→ root' ∧ root' ═xs⇒ root''"
    by cases auto
  assume "∃root'. root ─x→ root' ∧ root' ═xs⇒ root''"
  then show "root ═(x # xs)⇒ root''"
    by (blast intro: transitions.cons)

text ‹
  The next two rules show how to ``destruct'' known transition sequences. Note
  that the second one actually relies on the uniqueness property of the basic
  transition system (see \secref{sec:unix-single-trans}).

lemma transitions_nilD: "root ═[]⇒ root' ⟹ root' = root"
  by (simp add: transitions_nil_eq)

lemma transitions_consD:
  assumes list: "root ═(x # xs)⇒ root''"
    and hd: "root ─x→ root'"
  shows "root' ═xs⇒ root''"
proof -
  from list obtain r' where r': "root ─x→ r'" and root'': "r' ═xs⇒ root''"
    by cases simp_all
  from r' hd have "r' = root'" by (rule transition_uniq)
  with root'' show "root' ═xs⇒ root''" by simp

text ‹
  The following fact shows how an invariant @{term Q} of single transitions
  with property @{term P} may be transferred to iterated transitions. The
  proof is rather obvious by rule induction over the definition of @{term
  "root ═xs⇒ root'"}.

lemma transitions_invariant:
  assumes r: "⋀r x r'. r ─x→ r' ⟹ Q r ⟹ P x ⟹ Q r'"
    and trans: "root ═xs⇒ root'"
  shows "Q root ⟹ ∀x ∈ set xs. P x ⟹ Q root'"
  using trans
proof induct
  case nil
  show ?case by (rule nil.prems)
  case (cons root x root' xs root'')
  note P = ‹∀x ∈ set (x # xs). P x›
  then have "P x" by simp
  with ‹root ─x→ root'› and ‹Q root› have Q': "Q root'" by (rule r)
  from P have "∀x ∈ set xs. P x" by simp
  with Q' show "Q root''" by (rule cons.hyps)

text ‹
  As an example of applying the previous result, we transfer the basic
  type-safety property (see \secref{sec:unix-single-trans}) from single
  transitions to iterated ones, which is a rather obvious result indeed.

theorem transitions_type_safe:
  assumes "root ═xs⇒ root'"
    and "∃att dir. root = Env att dir"
  shows "∃att dir. root' = Env att dir"
  using transition_type_safe and assms
proof (rule transitions_invariant)
  show "∀x ∈ set xs. True" by blast

section ‹Executable sequences›

text ‹
  An inductively defined relation such as the one of ‹root ─x→ root'› (see
  \secref{sec:unix-syscall}) has two main aspects. First of all, the resulting
  system admits a certain set of transition rules (introductions) as given in
  the specification. Furthermore, there is an explicit least-fixed-point
  construction involved, which results in induction (and case-analysis) rules
  to eliminate known transitions in an exhaustive manner.

  Subsequently, we explore our transition system in an experimental style,
  mainly using the introduction rules with basic algebraic properties of the
  underlying structures. The technique closely resembles that of Prolog
  combined with functional evaluation in a very simple manner.

  Just as the ``closed-world assumption'' is left implicit in Prolog, we do
  not refer to induction over the whole transition system here. So this is
  still purely positive reasoning about possible executions; exhaustive
  reasoning will be employed only later on (see \secref{sec:unix-bogosity}),
  when we shall demonstrate that certain behavior is ∗‹not› possible.

subsection ‹Possible transitions›

text ‹
  Rather obviously, a list of system operations can be executed within a
  certain state if there is a result state reached by an iterated transition.

definition "can_exec root xs ⟷ (∃root'. root ═xs⇒ root')"

lemma can_exec_nil: "can_exec root []"
  unfolding can_exec_def by (blast intro: transitions.intros)

lemma can_exec_cons:
    "root ─x→ root' ⟹ can_exec root' xs ⟹ can_exec root (x # xs)"
  unfolding can_exec_def by (blast intro: transitions.intros)

text ‹
  In case that we already know that a certain sequence can be executed we may
  destruct it backwardly into individual transitions.

lemma can_exec_snocD: "can_exec root (xs @ [y])
    ⟹ ∃root' root''. root ═xs⇒ root' ∧ root' ─y→ root''"
proof (induct xs arbitrary: root)
  case Nil
  then show ?case
    by (simp add: can_exec_def transitions_nil_eq transitions_cons_eq)
  case (Cons x xs)
  from ‹can_exec root ((x # xs) @ [y])› obtain r root'' where
      x: "root ─x→ r" and
      xs_y: "r ═(xs @ [y])⇒ root''"
    by (auto simp add: can_exec_def transitions_nil_eq transitions_cons_eq)
  from xs_y Cons.hyps obtain root' r'
    where xs: "r ═xs⇒ root'" and y: "root' ─y→ r'"
    unfolding can_exec_def by blast
  from x xs have "root ═(x # xs)⇒ root'"
    by (rule transitions.cons)
  with y show ?case by blast

subsection ‹Example executions \label{sec:unix-examples}›

text ‹
  We are now ready to perform a few experiments within our formal model of
  Unix system-calls. The common technique is to alternate introduction rules
  of the transition system (see \secref{sec:unix-trans}), and steps to solve
  any emerging side conditions using algebraic properties of the underlying
  file-system structures (see \secref{sec:unix-file-system}).

lemmas eval = access_def init_def

theorem "u ∈ users ⟹ can_exec (init users)
    [Mkdir u perms [u, name]]"
  apply (rule can_exec_cons)
     ‹back-chain @{term can_exec} (of @{term [source] Cons})›
  apply (rule mkdir)
     ‹back-chain @{term Mkdir}›
  apply (force simp add: eval)+
     ‹solve preconditions of @{term Mkdir}›
  apply (simp add: eval)
     ‹peek at resulting dir (optional)›
  txt ‹@{subgoals [display]}›
  apply (rule can_exec_nil)
     ‹back-chain @{term can_exec} (of @{term [source] Nil})›

text ‹
  By inspecting the result shown just before the final step above, we may gain
  confidence that our specification of Unix system-calls actually makes sense.
  Further common errors are usually exhibited when preconditions of transition
  rules fail unexpectedly.

  Here are a few further experiments, using the same techniques as before.

theorem "u ∈ users ⟹ can_exec (init users)
   [Creat u perms [u, name],
    Unlink u [u, name]]"
  apply (rule can_exec_cons)
  apply (rule creat)
  apply (force simp add: eval)+
  apply (simp add: eval)
  apply (rule can_exec_cons)
  apply (rule unlink)
  apply (force simp add: eval)+
  apply (simp add: eval)
  txt ‹peek at result: @{subgoals [display]}›
  apply (rule can_exec_nil)

theorem "u ∈ users ⟹ Writable ∈ perms1 ⟹
  Readable ∈ perms2 ⟹ name3 ≠ name4 ⟹
  can_exec (init users)
   [Mkdir u perms1 [u, name1],
    Mkdir u' perms2 [u, name1, name2],
    Creat u' perms3 [u, name1, name2, name3],
    Creat u' perms3 [u, name1, name2, name4],
    Readdir u {name3, name4} [u, name1, name2]]"
  apply (rule can_exec_cons, rule transition.intros,
    (force simp add: eval)+, (simp add: eval)?)+
  txt ‹peek at result: @{subgoals [display]}›
  apply (rule can_exec_nil)

theorem "u ∈ users ⟹ Writable ∈ perms1 ⟹ Readable ∈ perms3 ⟹
  can_exec (init users)
   [Mkdir u perms1 [u, name1],
    Mkdir u' perms2 [u, name1, name2],
    Creat u' perms3 [u, name1, name2, name3],
    Write u' ''foo'' [u, name1, name2, name3],
    Read u ''foo'' [u, name1, name2, name3]]"
  apply (rule can_exec_cons, rule transition.intros,
    (force simp add: eval)+, (simp add: eval)?)+
  txt ‹peek at result: @{subgoals [display]}›
  apply (rule can_exec_nil)

section ‹Odd effects --- treated formally \label{sec:unix-bogosity}›

text ‹
  We are now ready to give a completely formal treatment of the slightly odd
  situation discussed in the introduction (see \secref{sec:unix-intro}): the
  file-system can easily reach a state where a user is unable to remove his
  very own directory, because it is still populated by items placed there by
  another user in an uncouth manner.

subsection ‹The general procedure \label{sec:unix-inv-procedure}›

text ‹
  The following theorem expresses the general procedure we are following to
  achieve the main result.

theorem general_procedure:
  assumes cannot_y: "⋀r r'. Q r ⟹ r ─y→ r' ⟹ False"
    and init_inv: "⋀root. init users ═bs⇒ root ⟹ Q root"
    and preserve_inv: "⋀r x r'. r ─x→ r' ⟹ Q r ⟹ P x ⟹ Q r'"
    and init_result: "init users ═bs⇒ root"
  shows "¬ (∃xs. (∀x ∈ set xs. P x) ∧ can_exec root (xs @ [y]))"
proof -
    fix xs
    assume Ps: "∀x ∈ set xs. P x"
    assume can_exec: "can_exec root (xs @ [y])"
    then obtain root' root'' where xs: "root ═xs⇒ root'" and y: "root' ─y→ root''"
      by (blast dest: can_exec_snocD)
    from init_result have "Q root" by (rule init_inv)
    from preserve_inv xs this Ps have "Q root'"
      by (rule transitions_invariant)
    from this y have False by (rule cannot_y)
  then show ?thesis by blast

text ‹
  Here @{prop "P x"} refers to the restriction on file-system operations that
  are admitted after having reached the critical configuration; according to
  the problem specification this will become @{prop "uid_of x = user1"} later
  on. Furthermore, @{term y} refers to the operations we claim to be
  impossible to perform afterwards, we will take @{term Rmdir} later. Moreover
  @{term Q} is a suitable (auxiliary) invariant over the file-system; choosing
  @{term Q} adequately is very important to make the proof work (see

subsection ‹The particular situation›

text ‹
  We introduce a few global declarations and axioms to describe our particular
  situation considered here. Thus we avoid excessive use of local parameters
  in the subsequent development.

  fixes users :: "uid set"
    and user1 :: uid
    and user2 :: uid
    and name1 :: name
    and name2 :: name
    and name3 :: name
    and perms1 :: perms
    and perms2 :: perms
  assumes user1_known: "user1 ∈ users"
    and user1_not_root: "user1 ≠ 0"
    and users_neq: "user1 ≠ user2"
    and perms1_writable: "Writable ∈ perms1"
    and perms2_not_writable: "Writable ∉ perms2"
  notes facts = user1_known user1_not_root users_neq
    perms1_writable perms2_not_writable

  "bogus =
     [Mkdir user1 perms1 [user1, name1],
      Mkdir user2 perms2 [user1, name1, name2],
      Creat user2 perms2 [user1, name1, name2, name3]]"

definition "bogus_path = [user1, name1, name2]"

text ‹
  The @{term bogus} operations are the ones that lead into the uncouth
  situation; @{term bogus_path} is the key position within the file-system
  where things go awry.

subsection ‹Invariance lemmas \label{sec:unix-inv-lemmas}›

text ‹
  The following invariant over the root file-system describes the bogus
  situation in an abstract manner: located at a certain @{term path} within
  the file-system is a non-empty directory that is neither owned nor writable
  by @{term user1}.

  "invariant root path ⟷
    (∃att dir.
      access root path user1 {} = Some (Env att dir) ∧ dir ≠ empty ∧
      user1 ≠ owner att ∧
      access root path user1 {Writable} = None)"

text ‹
  Following the general procedure (see \secref{sec:unix-inv-procedure}) we
  will now establish the three key lemmas required to yield the final result.

    ▸ The invariant is sufficiently strong to entail the pathological case
    that @{term user1} is unable to remove the (owned) directory at @{term
    "[user1, name1]"}.

    ▸ The invariant does hold after having executed the @{term bogus} list of
    operations (starting with an initial file-system configuration).

    ▸ The invariant is preserved by any file-system operation performed by
    @{term user1} alone, without any help by other users.

  As the invariant appears both as assumptions and conclusions in the course
  of proof, its formulation is rather critical for the whole development to
  work out properly. In particular, the third step is very sensitive to the
  invariant being either too strong or too weak. Moreover the invariant has to
  be sufficiently abstract, lest the proof become cluttered by confusing

  The first two lemmas are technically straight forward --- we just have to
  inspect rather special cases.

lemma cannot_rmdir:
  assumes inv: "invariant root bogus_path"
    and rmdir: "root ─(Rmdir user1 [user1, name1])→ root'"
  shows False
proof -
  from inv obtain "file" where "access root bogus_path user1 {} = Some file"
    unfolding invariant_def by blast
  from rmdir obtain att where "access root [user1, name1] user1 {} = Some (Env att empty)"
    by cases auto
  then have "access root ([user1, name1] @ [name2]) user1 {} = empty name2"
    by (simp only: access_empty_lookup lookup_append_some) simp
  ultimately show False by (simp add: bogus_path_def)

  assumes init: "init users ═bogus⇒ root"
  shows init_invariant: "invariant root bogus_path"
  supply eval = facts access_def init_def
  using init
  apply (unfold bogus_def bogus_path_def)
  apply (drule transitions_consD, rule transition.intros,
      (force simp add: eval)+, (simp add: eval)?)+
     "evaluate all operations"
  apply (drule transitions_nilD)
     "reach final result"
  apply (simp add: invariant_def eval)
     "check the invariant"

text ‹
  At last we are left with the main effort to show that the ``bogosity''
  invariant is preserved by any file-system operation performed by @{term
  user1} alone. Note that this holds for any @{term path} given, the
  particular @{term bogus_path} is not required here.

lemma preserve_invariant:
  assumes tr: "root ─x→ root'"
    and inv: "invariant root path"
    and uid: "uid_of x = user1"
  shows "invariant root' path"
proof -
  from inv obtain att dir where
      inv1: "access root path user1 {} = Some (Env att dir)" and
      inv2: "dir ≠ empty" and
      inv3: "user1 ≠ owner att" and
      inv4: "access root path user1 {Writable} = None"
    by (auto simp add: invariant_def)

  from inv1 have lookup: "lookup root path = Some (Env att dir)"
    by (simp only: access_empty_lookup)

  from inv1 inv3 inv4 and user1_not_root
  have not_writable: "Writable ∉ others att"
    by (auto simp add: access_def split: option.splits)

  show ?thesis
  proof cases
    assume "root' = root"
    with inv show "invariant root' path" by (simp only:)
    assume changed: "root' ≠ root"
    with tr obtain opt where root': "root' = update (path_of x) opt root"
      by cases auto
    show ?thesis
    proof (rule prefix_cases)
      assume "path_of x ∥ path"
      with inv root'
      have "⋀perms. access root' path user1 perms = access root path user1 perms"
        by (simp only: access_update_other)
      with inv show "invariant root' path"
        by (simp only: invariant_def)
      assume "prefix (path_of x) path"
      then obtain ys where path: "path = path_of x @ ys" ..

      show ?thesis
      proof (cases ys)
        assume "ys = []"
        with tr uid inv2 inv3 lookup changed path and user1_not_root
        have False
          by cases (auto simp add: access_empty_lookup dest: access_some_lookup)
        then show ?thesis ..
        fix z zs assume ys: "ys = z # zs"
        have "lookup root' path = lookup root path"
        proof -
          from inv2 lookup path ys
          have look: "lookup root (path_of x @ z # zs) = Some (Env att dir)"
            by (simp only:)
          then obtain att' dir' file' where
              look': "lookup root (path_of x) = Some (Env att' dir')" and
              dir': "dir' z = Some file'" and
              file': "lookup file' zs = Some (Env att dir)"
            by (blast dest: lookup_some_upper)

          from tr uid changed look' dir' obtain att'' where
              look'': "lookup root' (path_of x) = Some (Env att'' dir')"
            by cases (auto simp add: access_empty_lookup lookup_update_some
              dest: access_some_lookup)
          with dir' file' have "lookup root' (path_of x @ z # zs) =
              Some (Env att dir)"
            by (simp add: lookup_append_some)
          with look path ys show ?thesis
            by simp
        with inv show "invariant root' path"
          by (simp only: invariant_def access_def)
      assume "strict_prefix path (path_of x)"
      then obtain y ys where path: "path_of x = path @ y # ys" ..

      obtain dir' where
        lookup': "lookup root' path = Some (Env att dir')" and
        inv2': "dir' ≠ empty"
      proof (cases ys)
        assume "ys = []"
        with path have parent: "path_of x = path @ [y]" by simp
        with tr uid inv4 changed obtain "file" where
            "root' = update (path_of x) (Some file) root"
          by cases auto
        with lookup parent have "lookup root' path = Some (Env att (dir(y↦file)))"
          by (simp only: update_append_some update_cons_nil_env)
        moreover have "dir(y↦file) ≠ empty" by simp
        ultimately show ?thesis ..
        fix z zs assume ys: "ys = z # zs"
        with lookup root' path
        have "lookup root' path = Some (update (y # ys) opt (Env att dir))"
          by (simp only: update_append_some)
        also obtain file' where
          "update (y # ys) opt (Env att dir) = Env att (dir(y↦file'))"
        proof -
          have "dir y ≠ None"
          proof -
            have "dir y = lookup (Env att dir) [y]"
              by (simp split: option.splits)
            also from lookup have "… = lookup root (path @ [y])"
              by (simp only: lookup_append_some)
            also have "… ≠ None"
            proof -
              from ys obtain us u where rev_ys: "ys = us @ [u]"
                by (cases ys rule: rev_cases) simp
              with tr path
              have "lookup root ((path @ [y]) @ (us @ [u])) ≠ None ∨
                  lookup root ((path @ [y]) @ us) ≠ None"
                by cases (auto dest: access_some_lookup)
              then show ?thesis
                by (fastforce dest!: lookup_some_append)
            finally show ?thesis .
          with ys show ?thesis using that by auto
        also have "dir(y↦file') ≠ empty" by simp
        ultimately show ?thesis ..

      from lookup' have inv1': "access root' path user1 {} = Some (Env att dir')"
        by (simp only: access_empty_lookup)

      from inv3 lookup' and not_writable user1_not_root
      have "access root' path user1 {Writable} = None"
        by (simp add: access_def)
      with inv1' inv2' inv3 show ?thesis unfolding invariant_def by blast

subsection ‹Putting it all together \label{sec:unix-main-result}›

text ‹
  The main result is now imminent, just by composing the three invariance
  lemmas (see \secref{sec:unix-inv-lemmas}) according the the overall
  procedure (see \secref{sec:unix-inv-procedure}).

  assumes bogus: "init users ═bogus⇒ root"
  shows "¬ (∃xs. (∀x ∈ set xs. uid_of x = user1) ∧
    can_exec root (xs @ [Rmdir user1 [user1, name1]]))"
proof -
  from cannot_rmdir init_invariant preserve_invariant
    and bogus show ?thesis by (rule general_procedure)