Theory Datatypes

(*  Title:      Doc/Datatypes/Datatypes.thy
    Author:     Julian Biendarra, TU Muenchen
    Author:     Jasmin Blanchette, TU Muenchen
    Author:     Martin Desharnais, Ecole de technologie superieure
    Author:     Lorenz Panny, TU Muenchen
    Author:     Andrei Popescu, TU Muenchen
    Author:     Dmitriy Traytel, TU Muenchen

Tutorial for (co)datatype definitions.
*)

theory Datatypes
imports
  Setup
  "HOL-Library.BNF_Axiomatization"
  "HOL-Library.Countable"
  "HOL-Library.FSet"
  "HOL-Library.Simps_Case_Conv"
begin


(*<*)
unbundle cardinal_syntax
(*>*)

section ‹Introduction
  \label{sec:introduction}›

text ‹
The 2013 edition of Isabelle introduced a definitional package for freely
generated datatypes and codatatypes. This package replaces the earlier
implementation due to Berghofer and Wenzel cite"Berghofer-Wenzel:1999:TPHOL".
Perhaps the main advantage of the new package is that it supports recursion
through a large class of non-datatypes, such as finite sets:
›

    datatype 'a treefs = Nodefs (lblfs: 'a) (subfs: "'a treefs fset")

text ‹
\noindent
Another strong point is the support for local definitions:
›

    context linorder
    begin
    datatype flag = Less | Eq | Greater
    end

text ‹
\noindent
Furthermore, the package provides a lot of convenience, including automatically
generated discriminators, selectors, and relators as well as a wealth of
properties about them.

In addition to inductive datatypes, the package supports coinductive
datatypes, or \emph{codatatypes}, which allow infinite values. For example, the
following command introduces the type of lazy lists, which comprises both finite
and infinite values:
›

(*<*)
    locale early
    locale late
(*>*)
    codatatype (*<*)(in early) (*>*)'a llist = LNil | LCons 'a "'a llist"

text ‹
\noindent
Mixed inductive--coinductive recursion is possible via nesting. Compare the
following four Rose tree examples:
›

    datatype (*<*)(in early) (*>*)'a treeff = Nodeff 'a "'a treeff list"
    datatype (*<*)(in early) (*>*)'a treefi = Nodefi 'a "'a treefi llist"
    codatatype (*<*)(in early) (*>*)'a treeif = Nodeif 'a "'a treeif list"
    codatatype (*<*)(in early) (*>*)'a treeii = Nodeii 'a "'a treeii llist"

text ‹
The first two tree types allow only paths of finite length, whereas the last two
allow infinite paths. Orthogonally, the nodes in the first and third types have
finitely many direct subtrees, whereas those of the second and fourth may have
infinite branching.

The package is part of theoryMain. Additional functionality is provided by
the theory 🗏‹~~/src/HOL/Library/BNF_Axiomatization.thy›.

The package, like its predecessor, fully adheres to the LCF philosophy
cite"mgordon79": The characteristic theorems associated with the specified
(co)datatypes are derived rather than introduced axiomatically.%
\footnote{However, some of the
internal constructions and most of the internal proof obligations are omitted
if the quick_and_dirty› option is enabled.}
The package is described in a number of scientific papers
cite"traytel-et-al-2012" and "blanchette-et-al-2014-impl" and
"panny-et-al-2014" and "blanchette-et-al-2015-wit".
The central notion is that of a \emph{bounded natural functor} (BNF)---a
well-behaved type constructor for which nested (co)recursion is supported.

This tutorial is organized as follows:

\begin{itemize}
\setlength{\itemsep}{0pt}

\item Section \ref{sec:defining-datatypes}, ``Defining Datatypes,''
describes how to specify datatypes using the @{command datatype} command.

\item Section \ref{sec:defining-primitively-recursive-functions}, ``Defining
Primitively Recursive Functions,'' describes how to specify functions
using @{command primrec}. (A separate tutorial cite"isabelle-function"
describes the more powerful \keyw{fun} and \keyw{function} commands.)

\item Section \ref{sec:defining-codatatypes}, ``Defining Codatatypes,''
describes how to specify codatatypes using the @{command codatatype} command.

\item Section \ref{sec:defining-primitively-corecursive-functions},
``Defining Primitively Corecursive Functions,'' describes how to specify
functions using the @{command primcorec} and
@{command primcorecursive} commands. (A separate tutorial
cite"isabelle-corec" describes the more powerful \keyw{corec} and
\keyw{corecursive} commands.)

\item Section \ref{sec:registering-bounded-natural-functors}, ``Registering
Bounded Natural Functors,'' explains how to use the @{command bnf} command
to register arbitrary type constructors as BNFs.

\item Section
\ref{sec:deriving-destructors-and-constructor-theorems},
``Deriving Destructors and Constructor Theorems,'' explains how to
use the command @{command free_constructors} to derive destructor constants
and theorems for freely generated types, as performed internally by
@{command datatype} and @{command codatatype}.

%\item Section \ref{sec:using-the-standard-ml-interface}, ``Using the Standard
%ML Interface,'' describes the package's programmatic interface.

\item Section \ref{sec:selecting-plugins}, ``Selecting Plugins,'' is concerned
with the package's interoperability with other Isabelle packages and tools, such
as the code generator, Transfer, Lifting, and Quickcheck.

\item Section \ref{sec:known-bugs-and-limitations}, ``Known Bugs and
Limitations,'' concludes with known open issues.
\end{itemize}

\newbox\boxA
\setbox\boxA=\hbox{\texttt{NOSPAM}}

\newcommand\authoremaili{\texttt{jasmin.blan{\color{white}NOSPAM}\kern-\wd\boxA{}chette@\allowbreak
gmail.\allowbreak com}}

Comments and bug reports concerning either the package or this tutorial should
be directed to the second author at \authoremaili{} or to the
\texttt{cl-isabelle-users} mailing list.
›


section ‹Defining Datatypes
  \label{sec:defining-datatypes}›

text ‹
Datatypes can be specified using the @{command datatype} command.
›


subsection ‹Introductory Examples
  \label{ssec:datatype-introductory-examples}›

text ‹
Datatypes are illustrated through concrete examples featuring different flavors
of recursion. More examples can be found in the directory
🗀‹~~/src/HOL/Datatype_Examples›.
›


subsubsection ‹Nonrecursive Types
  \label{sssec:datatype-nonrecursive-types}›

text ‹
Datatypes are introduced by specifying the desired names and argument types for
their constructors. \emph{Enumeration} types are the simplest form of datatype.
All their constructors are nullary:
›

    datatype trool = Truue | Faalse | Perhaaps

text ‹
\noindent
constTruue, constFaalse, and constPerhaaps have the type typtrool.

Polymorphic types are possible, such as the following option type, modeled after
its homologue from the theoryHOL.Option theory:
›

(*<*)
    hide_const None Some map_option
    hide_type option
(*>*)
    datatype 'a option = None | Some 'a

text ‹
\noindent
The constructors are None :: 'a option› and
Some :: 'a ⇒ 'a option›.

The next example has three type parameters:
›

    datatype ('a, 'b, 'c) triple = Triple 'a 'b 'c

text ‹
\noindent
The constructor is
Triple :: 'a ⇒ 'b ⇒ 'c ⇒ ('a, 'b, 'c) triple›.
Unlike in Standard ML, curried constructors are supported. The uncurried variant
is also possible:
›

    datatype ('a, 'b, 'c) tripleu = Tripleu "'a * 'b * 'c"

text ‹
\noindent
Occurrences of nonatomic types on the right-hand side of the equal sign must be
enclosed in double quotes, as is customary in Isabelle.
›


subsubsection ‹Simple Recursion
  \label{sssec:datatype-simple-recursion}›

text ‹
Natural numbers are the simplest example of a recursive type:
›

    datatype nat = Zero | Succ nat

text ‹
\noindent
Lists were shown in the introduction. Terminated lists are a variant that
stores a value of type typ'b at the very end:
›

    datatype (*<*)(in early) (*>*)('a, 'b) tlist = TNil 'b | TCons 'a "('a, 'b) tlist"


subsubsection ‹Mutual Recursion
  \label{sssec:datatype-mutual-recursion}›

text ‹
\emph{Mutually recursive} types are introduced simultaneously and may refer to
each other. The example below introduces a pair of types for even and odd
natural numbers:
›

    datatype even_nat = Even_Zero | Even_Succ odd_nat
    and odd_nat = Odd_Succ even_nat

text ‹
\noindent
Arithmetic expressions are defined via terms, terms via factors, and factors via
expressions:
›

    datatype ('a, 'b) exp =
      Term "('a, 'b) trm" | Sum "('a, 'b) trm" "('a, 'b) exp"
    and ('a, 'b) trm =
      Factor "('a, 'b) fct" | Prod "('a, 'b) fct" "('a, 'b) trm"
    and ('a, 'b) fct =
      Const 'a | Var 'b | Expr "('a, 'b) exp"


subsubsection ‹Nested Recursion
  \label{sssec:datatype-nested-recursion}›

text ‹
\emph{Nested recursion} occurs when recursive occurrences of a type appear under
a type constructor. The introduction showed some examples of trees with nesting
through lists. A more complex example, which reuses our typeoption type,
follows:
›

    datatype 'a btree =
      BNode 'a "'a btree option" "'a btree option"

text ‹
\noindent
Not all nestings are admissible. For example, this command will fail:
›

    datatype 'a wrong = W1 | W2 (*<*)'a
    typ (*>*)"'a wrong  'a"

text ‹
\noindent
The issue is that the function arrow ⇒› allows recursion
only through its right-hand side. This issue is inherited by polymorphic
datatypes defined in terms of~⇒›:
›

    datatype ('a, 'b) fun_copy = Fun "'a ⇒ 'b"
    datatype 'a also_wrong = W1 | W2 (*<*)'a
    typ (*>*)"('a also_wrong, 'a) fun_copy"

text ‹
\noindent
The following definition of typ'a-branching trees is legal:
›

    datatype 'a ftree = FTLeaf 'a | FTNode "'a ⇒ 'a ftree"

text ‹
\noindent
And so is the definition of hereditarily finite sets:
›

    datatype hfset = HFSet "hfset fset"

text ‹
\noindent
In general, type constructors ('a1, …, 'am) t›
allow recursion on a subset of their type arguments 'a1, \ldots,
'am. These type arguments are called \emph{live}; the remaining
type arguments are called \emph{dead}. In typ'a ⇒ 'b and
typ('a, 'b) fun_copy, the type variable typ'a is dead and
typ'b is live.

Type constructors must be registered as BNFs to have live arguments. This is
done automatically for datatypes and codatatypes introduced by the
@{command datatype} and @{command codatatype} commands.
Section~\ref{sec:registering-bounded-natural-functors} explains how to
register arbitrary type constructors as BNFs.

Here is another example that fails:
›

    datatype 'a pow_list = PNil 'a (*<*)'a
    datatype 'a pow_list' = PNil' 'a (*>*)| PCons "('a * 'a) pow_list"

text ‹
\noindent
This attempted definition features a different flavor of nesting, where the
recursive call in the type specification occurs around (rather than inside)
another type constructor.
›


subsubsection ‹Auxiliary Constants
  \label{sssec:datatype-auxiliary-constants}›

text ‹
The @{command datatype} command introduces various constants in addition to
the constructors. With each datatype are associated set functions, a map
function, a predicator, a relator, discriminators, and selectors, all of which can be given
custom names. In the example below, the familiar names null›, hd›,
tl›, set›, map›, and list_all2› override the
default names is_Nil›, un_Cons1›, un_Cons2›,
set_list›, map_list›, and rel_list›:
›

(*<*)
    no_translations
      "[x, xs]" == "x # [xs]"
      "[x]" == "x # []"

    no_notation
      Nil ("[]") and
      Cons (infixr "#" 65)

    hide_type list
    hide_const Nil Cons case_list hd tl set map list_all2 rec_list size_list list_all

    context early
    begin
(*>*)
    datatype (set: 'a) list =
      null: Nil
    | Cons (hd: 'a) (tl: "'a list")
    for
      map: map
      rel: list_all2
      pred: list_all
    where
      "tl Nil = Nil"

text ‹
\noindent
The types of the constants that appear in the specification are listed below.

\medskip

\begin{tabular}{@ {}ll@ {}}
Constructors: &
  Nil :: 'a list› \\
&
  Cons :: 'a ⇒ 'a list ⇒ 'a list› \\
Discriminator: &
  null :: 'a list ⇒ bool› \\
Selectors: &
  hd :: 'a list ⇒ 'a› \\
&
  tl :: 'a list ⇒ 'a list› \\
Set function: &
  set :: 'a list ⇒ 'a set› \\
Map function: &
  map :: ('a ⇒ 'b) ⇒ 'a list ⇒ 'b list› \\
Relator: &
  list_all2 :: ('a ⇒ 'b ⇒ bool) ⇒ 'a list ⇒ 'b list ⇒ bool›
\end{tabular}

\medskip

The discriminator constnull and the selectors consthd and consttl
are characterized by the following conditional equations:
%
\[@{thm list.collapse(1)[of xs, no_vars]}
  \qquad @{thm list.collapse(2)[of xs, no_vars]}\]
%
For two-constructor datatypes, a single discriminator constant is sufficient.
The discriminator associated with constCons is simply
termλxs. ¬ null xs.

The \keyw{where} clause at the end of the command specifies a default value for
selectors applied to constructors on which they are not a priori specified.
In the example, it is used to ensure that the tail of the empty list is itself
(instead of being left unspecified).
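Thus, the following property (our own quick check, relying on the generated
selector equations being simplification rules) holds:
›

    lemma "tl Nil = Nil"
      by simp

text ‹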

Because constNil is nullary, it is also possible to use
termλxs. xs = Nil as a discriminator. This is the default behavior
if we omit the identifier constnull and the associated colon. Some users
argue against this, because the mixture of constructors and selectors in the
characteristic theorems can lead Isabelle's automation to switch between the
constructor and the destructor view in surprising ways.

The usual mixfix syntax annotations are available for both types and
constructors. For example:
›

(*<*)
    end
(*>*)
    datatype ('a, 'b) prod (infixr "*" 20) = Pair 'a 'b

text ‹\blankline›

    datatype (set: 'a) list =
      null: Nil ("[]")
    | Cons (hd: 'a) (tl: "'a list") (infixr "#" 65)
    for
      map: map
      rel: list_all2
      pred: list_all

text ‹
\noindent
Incidentally, this is how the traditional syntax can be set up:
›

    syntax "_list" :: "args  'a list" ("[(_)]")

text ‹\blankline›

    translations
      "[x, xs]" == "x # [xs]"
      "[x]" == "x # []"


subsection ‹Command Syntax
  \label{ssec:datatype-command-syntax}›

subsubsection ‹\keyw{datatype}
  \label{sssec:datatype}›

text ‹
\begin{matharray}{rcl}
  @{command_def "datatype"} & : & local_theory → local_theory›
\end{matharray}

rail@@{command datatype} target? @{syntax dt_options}? @{syntax dt_spec}
  ;
  @{syntax_def dt_options}: '(' ((@{syntax plugins} | 'discs_sels') + ',') ')'
  ;
  @{syntax_def plugins}: 'plugins' ('only' | 'del') ':' (name +)
  ;
  @{syntax_def dt_spec}: (@{syntax dt_name} '=' (@{syntax dt_ctor} + '|') 
     @{syntax map_rel_pred}? (@'where' (prop + '|'))? + @'and')
  ;
  @{syntax_def map_rel_pred}: @'for' ((('map' | 'rel' | 'pred') ':' name) +)
›

\medskip

\noindent
The @{command datatype} command introduces a set of mutually recursive
datatypes specified by their constructors.

The syntactic entity \synt{target} can be used to specify a local
context (e.g., (in linorder)› cite"isabelle-isar-ref"),
and \synt{prop} denotes a HOL proposition.

The optional target may be followed by a combination of the following
options:

\begin{itemize}
\setlength{\itemsep}{0pt}

\item
The plugins› option indicates which plugins should be enabled
(only›) or disabled (del›). By default, all plugins are enabled.

\item
The discs_sels› option indicates that discriminators and selectors
should be generated. The option is implicitly enabled if names are specified for
discriminators or selectors.
\end{itemize}

The optional \keyw{where} clause specifies default values for selectors.
Each proposition must be an equation of the form
un_D (C …) = …›, where C› is a constructor and
un_D› is a selector.

The left-hand sides of the datatype equations specify the name of the type to
define, its type parameters, and additional information:

rail@{syntax_def dt_name}: @{syntax tyargs}? name mixfix?
  ;
  @{syntax_def tyargs}: typefree | '(' (('dead' | name ':')? typefree + ',') ')'

\medskip

\noindent
The syntactic entity \synt{name} denotes an identifier, \synt{mixfix} denotes
the usual parenthesized mixfix notation, and \synt{typefree} denotes a fixed type
variable (typ'a, typ'b, \ldots) cite"isabelle-isar-ref".

The optional names preceding the type variables make it possible to override the default
names of the set functions (set1_t›, \ldots, setm_t›). Type
arguments can be marked as dead by entering dead› in front of the
type variable (e.g., (dead 'a)›); otherwise, they are live or dead
(and a set function is generated or not) depending on where they occur in the
right-hand sides of the definition. Declaring a type argument as dead can speed
up the type definition but will prevent any later (co)recursion through that
type argument.

Inside a mutually recursive specification, all defined datatypes must
mention exactly the same type variables in the same order.

rail@{syntax_def dt_ctor}: (name ':')? name (@{syntax dt_ctor_arg} * ) mixfix?

\medskip

\noindent
The main constituents of a constructor specification are the name of the
constructor and the list of its argument types. An optional discriminator name
can be supplied at the front. If discriminators are enabled (cf.\ the
discs_sels› option) but no name is supplied, the default is
λx. x = Cj for nullary constructors and
t.is_Cj otherwise.

rail@{syntax_def dt_ctor_arg}: type | '(' name ':' type ')'

\medskip

\noindent
The syntactic entity \synt{type} denotes a HOL type cite"isabelle-isar-ref".

In addition to the type of a constructor argument, it is possible to specify a
name for the corresponding selector. The same selector name can be reused for
arguments to several constructors as long as the arguments share the same type.
If selectors are enabled (cf.\ the discs_sels› option) but no name is
supplied, the default name is un_Cji›.
›
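
text ‹
\noindent
For example, the following specification (a made-up example with freely chosen
names) reuses a single selector name for arguments of two different
constructors, which is legal because both arguments have the same type:
›

    datatype 'a demo = DemoA (payload: 'a) | DemoB (payload: 'a) (extra: 'a)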


subsubsection ‹\keyw{datatype_compat}
  \label{sssec:datatype-compat}›

text ‹
\begin{matharray}{rcl}
  @{command_def "datatype_compat"} & : & local_theory → local_theory›
\end{matharray}

rail@@{command datatype_compat} (name +)
›

\medskip

\noindent
The @{command datatype_compat} command registers new-style datatypes as
old-style datatypes and invokes the old-style plugins. For example:
›

    datatype_compat even_nat odd_nat

text ‹\blankline›

    ML Old_Datatype_Data.get_info theory type_nameeven_nat

text ‹
The syntactic entity \synt{name} denotes an identifier cite"isabelle-isar-ref".

The command is sometimes useful when migrating from the old datatype package to
the new one.

A few remarks concern nested recursive datatypes:

\begin{itemize}
\setlength{\itemsep}{0pt}

\item The old-style, nested-as-mutual induction rule and recursor theorems are
generated under their usual names but with ``compat_›'' prefixed
(e.g., compat_tree.induct›, compat_tree.inducts›, and
compat_tree.rec›). These theorems should be identical to the ones
generated by the old datatype package, \emph{up to the order of the
premises}---meaning that the subgoals generated by the induct› or
induction› method may be in a different order than before.

\item All types through which recursion takes place must be new-style datatypes
or the function type.
\end{itemize}
›


subsection ‹Generated Constants
  \label{ssec:datatype-generated-constants}›

text ‹
Given a datatype ('a1, …, 'am) t› with $m$ live type variables
and $n$ constructors t.C1, \ldots, t.Cn, the following
auxiliary constants are introduced:

\medskip

\begin{tabular}{@ {}ll@ {}}
Case combinator: &
  t.case_t› (rendered using the familiar case›--of› syntax) \\
Discriminators: &
  t.is_C1$, \ldots, $t.is_Cn \\
Selectors: &
  t.un_C11›$, \ldots, $t.un_C1k1 \\
& \quad\vdots \\
& t.un_Cn1›$, \ldots, $t.un_Cnkn \\
Set functions: &
  t.set1_t›, \ldots, t.setm_t› \\
Map function: &
  t.map_t› \\
Relator: &
  t.rel_t› \\
Recursor: &
  t.rec_t›
\end{tabular}

\medskip

\noindent
The discriminators and selectors are generated only if the discs_sels›
option is enabled or if names are specified for discriminators or selectors.
The set functions, map function, predicator, and relator are generated only if $m > 0$.

In addition, some of the plugins introduce their own constants
(Section~\ref{sec:selecting-plugins}). The case combinator, discriminators,
and selectors are collectively called \emph{destructors}. The prefix
``t.›'' is an optional component of the names and is normally hidden.
›
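
text ‹
\noindent
For example, for the list type defined above, the case combinator can be
written using the case›--of› syntax; the concrete term below is ours:
›

    term "case xs of Nil ⇒ True | Cons y ys ⇒ False"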


subsection ‹Generated Theorems
  \label{ssec:datatype-generated-theorems}›

text ‹
The characteristic theorems generated by @{command datatype} are grouped in
three broad categories:

\begin{itemize}
\setlength{\itemsep}{0pt}

\item The \emph{free constructor theorems}
(Section~\ref{sssec:free-constructor-theorems}) are properties of the
constructors and destructors that can be derived for any freely generated type.
Internally, the derivation is performed by @{command free_constructors}.

\item The \emph{functorial theorems} (Section~\ref{sssec:functorial-theorems})
are properties of datatypes related to their BNF nature.

\item The \emph{inductive theorems} (Section~\ref{sssec:inductive-theorems})
are properties of datatypes related to their inductive nature.

\end{itemize}

\noindent
The full list of named theorems can be obtained by issuing the command
\keyw{print_theorems} immediately after the datatype definition. This list
includes theorems produced by plugins (Section~\ref{sec:selecting-plugins}),
but normally excludes low-level theorems that reveal internal constructions.
To make these accessible, add the line
›

    declare [[bnf_internals]]
(*<*)
    declare [[bnf_internals = false]]
(*>*)


subsubsection ‹Free Constructor Theorems
  \label{sssec:free-constructor-theorems}›

(*<*)
    consts nonnull :: 'a
(*>*)

text ‹
The free constructor theorems are partitioned into three subgroups. The first
subgroup of properties is concerned with the constructors. They are listed below
for typ'a list:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{inject} [iff, induct_simp]›\rm:] ~ \\
@{thm list.inject[no_vars]}

\item[t.›\hthm{distinct} [simp, induct_simp]›\rm:] ~ \\
@{thm list.distinct(1)[no_vars]} \\
@{thm list.distinct(2)[no_vars]}

\item[t.›\hthm{exhaust} [cases t, case_names C1 … Cn]›\rm:] ~ \\
@{thm list.exhaust[no_vars]}

\item[t.›\hthm{nchotomy}\rm:] ~ \\
@{thm list.nchotomy[no_vars]}

\end{description}
\end{indentblock}

\noindent
In addition, these nameless theorems are registered as safe elimination rules:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{distinct {\upshape[}THEN notE}, elim!›\hthm{\upshape]}\rm:] ~ \\
@{thm list.distinct(1)[THEN notE, elim!, no_vars]} \\
@{thm list.distinct(2)[THEN notE, elim!, no_vars]}

\end{description}
\end{indentblock}

\noindent
The next subgroup is concerned with the case combinator:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{case} [simp, code]›\rm:] ~ \\
@{thm list.case(1)[no_vars]} \\
@{thm list.case(2)[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

\item[t.›\hthm{case_cong} [fundef_cong]›\rm:] ~ \\
@{thm list.case_cong[no_vars]}

\item[t.›\hthm{case_cong_weak} [cong]›\rm:] ~ \\
@{thm list.case_cong_weak[no_vars]}

\item[t.›\hthm{case_distrib}\rm:] ~ \\
@{thm list.case_distrib[no_vars]}

\item[t.›\hthm{split}\rm:] ~ \\
@{thm list.split[no_vars]}

\item[t.›\hthm{split_asm}\rm:] ~ \\
@{thm list.split_asm[no_vars]}

\item[t.›\hthm{splits} = split split_asm›]

\end{description}
\end{indentblock}

\noindent
The third subgroup revolves around discriminators and selectors:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{disc} [simp]›\rm:] ~ \\
@{thm list.disc(1)[no_vars]} \\
@{thm list.disc(2)[no_vars]}

\item[t.›\hthm{discI}\rm:] ~ \\
@{thm list.discI(1)[no_vars]} \\
@{thm list.discI(2)[no_vars]}

\item[t.›\hthm{sel} [simp, code]›\rm:] ~ \\
@{thm list.sel(1)[no_vars]} \\
@{thm list.sel(2)[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

\item[t.›\hthm{collapse} [simp]›\rm:] ~ \\
@{thm list.collapse(1)[no_vars]} \\
@{thm list.collapse(2)[no_vars]} \\
The [simp]› attribute is exceptionally omitted for datatypes equipped
with a single nullary constructor, because a property of the form
propx = C is not suitable as a simplification rule.

\item[t.›\hthm{distinct_disc} [dest]›\rm:] ~ \\
These properties are missing for typ'a list because there is only one
proper discriminator. If the datatype had been introduced with a second
discriminator called constnonnull, they would have read as follows: \\[\jot]
propnull list ⟶ ¬ nonnull list \\
propnonnull list ⟶ ¬ null list

\item[t.›\hthm{exhaust_disc} [case_names C1 … Cn]›\rm:] ~ \\
@{thm list.exhaust_disc[no_vars]}

\item[t.›\hthm{exhaust_sel} [case_names C1 … Cn]›\rm:] ~ \\
@{thm list.exhaust_sel[no_vars]}

\item[t.›\hthm{expand}\rm:] ~ \\
@{thm list.expand[no_vars]}

\item[t.›\hthm{split_sel}\rm:] ~ \\
@{thm list.split_sel[no_vars]}

\item[t.›\hthm{split_sel_asm}\rm:] ~ \\
@{thm list.split_sel_asm[no_vars]}

\item[t.›\hthm{split_sels} = split_sel split_sel_asm›]

\item[t.›\hthm{case_eq_if}\rm:] ~ \\
@{thm list.case_eq_if[no_vars]}

\item[t.›\hthm{disc_eq_case}\rm:] ~ \\
@{thm list.disc_eq_case(1)[no_vars]} \\
@{thm list.disc_eq_case(2)[no_vars]}

\end{description}
\end{indentblock}

\noindent
In addition, equational versions of t.disc› are registered with the
[code]› attribute. The [code]› attribute is set by the
code› plugin (Section~\ref{ssec:code-generator}).
›
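
text ‹
\noindent
For example, the exhaustion and discriminator properties above suffice for
simple case analyses. The following lemma and its proof are ours, not generated:
›

    lemma "null xs ∨ (∃y ys. xs = Cons y ys)"
      by (cases xs) auto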


subsubsection ‹Functorial Theorems
  \label{sssec:functorial-theorems}›

text ‹
The functorial theorems are generated for type constructors with at least
one live type argument (e.g., typ'a list). They are partitioned into two
subgroups. The first subgroup consists of properties involving the
constructors or the destructors and either a set function, the map function,
the predicator, or the relator:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{case_transfer} [transfer_rule]›\rm:] ~ \\
@{thm list.case_transfer[no_vars]} \\
This property is generated by the transfer› plugin
(Section~\ref{ssec:transfer}).
%The [transfer_rule]› attribute is set by the transfer› plugin
%(Section~\ref{ssec:transfer}).

\item[t.›\hthm{sel_transfer} [transfer_rule]›\rm:] ~ \\
This property is missing for typ'a list because there is no common
selector to all constructors. \\
The [transfer_rule]› attribute is set by the transfer› plugin
(Section~\ref{ssec:transfer}).

\item[t.›\hthm{ctr_transfer} [transfer_rule]›\rm:] ~ \\
@{thm list.ctr_transfer(1)[no_vars]} \\
@{thm list.ctr_transfer(2)[no_vars]} \\
The [transfer_rule]› attribute is set by the transfer› plugin
(Section~\ref{ssec:transfer}).

\item[t.›\hthm{disc_transfer} [transfer_rule]›\rm:] ~ \\
@{thm list.disc_transfer(1)[no_vars]} \\
@{thm list.disc_transfer(2)[no_vars]} \\
The [transfer_rule]› attribute is set by the transfer› plugin
(Section~\ref{ssec:transfer}).

\item[t.›\hthm{set} [simp, code]›\rm:] ~ \\
@{thm list.set(1)[no_vars]} \\
@{thm list.set(2)[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

\item[t.›\hthm{set_cases} [consumes 1, cases set: seti_t]›\rm:] ~ \\
@{thm list.set_cases[no_vars]}

\item[t.›\hthm{set_intros}\rm:] ~ \\
@{thm list.set_intros(1)[no_vars]} \\
@{thm list.set_intros(2)[no_vars]}

\item[t.›\hthm{set_sel}\rm:] ~ \\
@{thm list.set_sel[no_vars]}

\item[t.›\hthm{map} [simp, code]›\rm:] ~ \\
@{thm list.map(1)[no_vars]} \\
@{thm list.map(2)[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

\item[t.›\hthm{map_disc_iff} [simp]›\rm:] ~ \\
@{thm list.map_disc_iff[no_vars]}

\item[t.›\hthm{map_sel}\rm:] ~ \\
@{thm list.map_sel[no_vars]}

\item[t.›\hthm{pred_inject} [simp]›\rm:] ~ \\
@{thm list.pred_inject(1)[no_vars]} \\
@{thm list.pred_inject(2)[no_vars]}

\item[t.›\hthm{rel_inject} [simp]›\rm:] ~ \\
@{thm list.rel_inject(1)[no_vars]} \\
@{thm list.rel_inject(2)[no_vars]}

\item[t.›\hthm{rel_distinct} [simp]›\rm:] ~ \\
@{thm list.rel_distinct(1)[no_vars]} \\
@{thm list.rel_distinct(2)[no_vars]}

\item[t.›\hthm{rel_intros}\rm:] ~ \\
@{thm list.rel_intros(1)[no_vars]} \\
@{thm list.rel_intros(2)[no_vars]}

\item[t.›\hthm{rel_cases} [consumes 1, case_names t1 … tm, cases pred]›\rm:] ~ \\
@{thm list.rel_cases[no_vars]}

\item[t.›\hthm{rel_sel}\rm:] ~ \\
@{thm list.rel_sel[no_vars]}

\end{description}
\end{indentblock}

\noindent
In addition, equational versions of t.rel_inject› and rel_distinct› are registered with the [code]› attribute. The
[code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

The second subgroup consists of more abstract properties of the set functions,
the map function, the predicator, and the relator:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{inj_map}\rm:] ~ \\
@{thm list.inj_map[no_vars]}

\item[t.›\hthm{inj_map_strong}\rm:] ~ \\
@{thm list.inj_map_strong[no_vars]}

\item[t.›\hthm{map_comp}\rm:] ~ \\
@{thm list.map_comp[no_vars]}

\item[t.›\hthm{map_cong0}\rm:] ~ \\
@{thm list.map_cong0[no_vars]}

\item[t.›\hthm{map_cong} [fundef_cong]›\rm:] ~ \\
@{thm list.map_cong[no_vars]}

\item[t.›\hthm{map_cong_pred}\rm:] ~ \\
@{thm list.map_cong_pred[no_vars]}

\item[t.›\hthm{map_cong_simp}\rm:] ~ \\
@{thm list.map_cong_simp[no_vars]}

\item[t.›\hthm{map_id0}\rm:] ~ \\
@{thm list.map_id0[no_vars]}

\item[t.›\hthm{map_id}\rm:] ~ \\
@{thm list.map_id[no_vars]}

\item[t.›\hthm{map_ident}\rm:] ~ \\
@{thm list.map_ident[no_vars]}

\item[t.›\hthm{map_ident_strong}\rm:] ~ \\
@{thm list.map_ident_strong[no_vars]}

\item[t.›\hthm{map_transfer} [transfer_rule]›\rm:] ~ \\
@{thm list.map_transfer[no_vars]} \\
The [transfer_rule]› attribute is set by the transfer› plugin
(Section~\ref{ssec:transfer}) for type constructors with no dead type arguments.

\item[t.›\hthm{pred_cong} [fundef_cong]›\rm:] ~ \\
@{thm list.pred_cong[no_vars]}

\item[t.›\hthm{pred_cong_simp}\rm:] ~ \\
@{thm list.pred_cong_simp[no_vars]}

\item[t.›\hthm{pred_map}\rm:] ~ \\
@{thm list.pred_map[no_vars]}

\item[t.›\hthm{pred_mono} [mono]›\rm:] ~ \\
@{thm list.pred_mono[no_vars]}

\item[t.›\hthm{pred_mono_strong}\rm:] ~ \\
@{thm list.pred_mono_strong[no_vars]}

\item[t.›\hthm{pred_rel}\rm:] ~ \\
@{thm list.pred_rel[no_vars]}

\item[t.›\hthm{pred_set}\rm:] ~ \\
@{thm list.pred_set[no_vars]}

\item[t.›\hthm{pred_transfer} [transfer_rule]›\rm:] ~ \\
@{thm list.pred_transfer[no_vars]} \\
The [transfer_rule]› attribute is set by the transfer› plugin
(Section~\ref{ssec:transfer}) for type constructors with no dead type arguments.

\item[t.›\hthm{pred_True}\rm:] ~ \\
@{thm list.pred_True[no_vars]}

\item[t.›\hthm{set_map}\rm:] ~ \\
@{thm list.set_map[no_vars]}

\item[t.›\hthm{set_transfer} [transfer_rule]›\rm:] ~ \\
@{thm list.set_transfer[no_vars]} \\
The [transfer_rule]› attribute is set by the transfer› plugin
(Section~\ref{ssec:transfer}) for type constructors with no dead type arguments.

\item[t.›\hthm{rel_compp} [relator_distr]›\rm:] ~ \\
@{thm list.rel_compp[no_vars]} \\
The [relator_distr]› attribute is set by the lifting› plugin
(Section~\ref{ssec:lifting}).

\item[t.›\hthm{rel_conversep}\rm:] ~ \\
@{thm list.rel_conversep[no_vars]}

\item[t.›\hthm{rel_eq}\rm:] ~ \\
@{thm list.rel_eq[no_vars]}

\item[t.›\hthm{rel_eq_onp}\rm:] ~ \\
@{thm list.rel_eq_onp[no_vars]}

\item[t.›\hthm{rel_flip}\rm:] ~ \\
@{thm list.rel_flip[no_vars]}

\item[t.›\hthm{rel_map}\rm:] ~ \\
@{thm list.rel_map(1)[no_vars]} \\
@{thm list.rel_map(2)[no_vars]}

\item[t.›\hthm{rel_mono} [mono, relator_mono]›\rm:] ~ \\
@{thm list.rel_mono[no_vars]} \\
The [relator_mono]› attribute is set by the lifting› plugin
(Section~\ref{ssec:lifting}).

\item[t.›\hthm{rel_mono_strong}\rm:] ~ \\
@{thm list.rel_mono_strong[no_vars]}

\item[t.›\hthm{rel_cong} [fundef_cong]›\rm:] ~ \\
@{thm list.rel_cong[no_vars]}

\item[t.›\hthm{rel_cong_simp}\rm:] ~ \\
@{thm list.rel_cong_simp[no_vars]}

\item[t.›\hthm{rel_refl}\rm:] ~ \\
@{thm list.rel_refl[no_vars]}

\item[t.›\hthm{rel_refl_strong}\rm:] ~ \\
@{thm list.rel_refl_strong[no_vars]}

\item[t.›\hthm{rel_reflp}\rm:] ~ \\
@{thm list.rel_reflp[no_vars]}

\item[t.›\hthm{rel_symp}\rm:] ~ \\
@{thm list.rel_symp[no_vars]}

\item[t.›\hthm{rel_transp}\rm:] ~ \\
@{thm list.rel_transp[no_vars]}

\item[t.›\hthm{rel_transfer} [transfer_rule]›\rm:] ~ \\
@{thm list.rel_transfer[no_vars]} \\
The [transfer_rule]› attribute is set by the transfer› plugin
(Section~\ref{ssec:transfer}) for type constructors with no dead type arguments.

\end{description}
\end{indentblock}
›
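
text ‹
\noindent
For example, the @{thm [source] list.set_map} property can be used directly to
relate the set and map functions; the statement below is our own restatement of
it:
›

    lemma "set (map f xs) = f ` set xs"
      by (rule list.set_map)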


subsubsection ‹Inductive Theorems
  \label{sssec:inductive-theorems}›

text ‹
The inductive theorems are as follows:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{induct} [case_names C1 … Cn, induct t]›\rm:] ~ \\
@{thm list.induct[no_vars]}

\item[t.›\hthm{rel_induct} [case_names C1 … Cn, induct pred]›\rm:] ~ \\
@{thm list.rel_induct[no_vars]}

\item[\begin{tabular}{@ {}l@ {}}
  t1_…_tm.›\hthm{induct} [case_names C1 … Cn]›\rm: \\
  t1_…_tm.›\hthm{rel_induct} [case_names C1 … Cn]›\rm: \\
\end{tabular}] ~ \\
Given $m > 1$ mutually recursive datatypes, this induction rule can be used to
prove $m$ properties simultaneously, as illustrated at the end of this
subsubsection.

\item[t.›\hthm{rec} [simp, code]›\rm:] ~ \\
@{thm list.rec(1)[no_vars]} \\
@{thm list.rec(2)[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

\item[t.›\hthm{rec_o_map}\rm:] ~ \\
@{thm list.rec_o_map[no_vars]}

\item[t.›\hthm{rec_transfer} [transfer_rule]›\rm:] ~ \\
@{thm list.rec_transfer[no_vars]} \\
The [transfer_rule]› attribute is set by the transfer› plugin
(Section~\ref{ssec:transfer}) for type constructors with no dead type arguments.

\end{description}
\end{indentblock}

\noindent
For convenience, @{command datatype} also provides the following collection:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{simps}] = t.inject› t.distinct› t.case› t.rec› t.map› t.rel_inject› \\
t.rel_distinct› t.set›

\end{description}
\end{indentblock}
›
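
text ‹
\noindent
For example, the mutual rule @{thm [source] even_nat_odd_nat.induct} can be
used to prove two properties of the types from
Section~\ref{sssec:datatype-mutual-recursion} simultaneously. The lemma below
is our own illustration:
›

    lemma
      "m ≠ Even_Succ (Odd_Succ m)" and
      "n ≠ Odd_Succ (Even_Succ n)"
      by (induct m and n rule: even_nat_odd_nat.induct) auto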


subsection ‹Proof Method
  \label{ssec:datatype-proof-method}›

subsubsection ‹\textit{countable_datatype}
  \label{sssec:countable-datatype}›

text ‹
The theory 🗏‹~~/src/HOL/Library/Countable.thy› provides a
proof method called @{method countable_datatype} that can be used to prove the
countability of many datatypes, building on the countability of the types
appearing in their definitions and of any type arguments. For example:
›

    instance list :: (countable) countable
      by countable_datatype


subsection ‹Antiquotation
  \label{ssec:datatype-antiquotation}›

subsubsection ‹\textit{datatype}
  \label{sssec:datatype-datatype}›

text ‹
The \textit{datatype} antiquotation, written
\texttt{\char`\\\char`<\char`^}\verb|datatype>|\guilsinglleft\textit{t}\guilsinglright{}
or \texttt{@}\verb|{datatype| \textit{t}\verb|}|, where \textit{t} is a type
name, expands to \LaTeX{} code for the definition of the datatype, with each
constructor listed with its argument types. For example, if \textit{t} is
@{type option}:

\begin{quote}
datatypeoption
\end{quote}
›


subsection ‹Compatibility Issues
  \label{ssec:datatype-compatibility-issues}›

text ‹
The command @{command datatype} has been designed to be highly compatible with
the old, pre-Isabelle2015 command, to ease migration. There are nonetheless a
few incompatibilities that may arise when porting:

\begin{itemize}
\setlength{\itemsep}{0pt}

\item \emph{The Standard ML interfaces are different.} Tools and extensions
written to call the old ML interfaces will need to be adapted to the new
interfaces. The BNF_LFP_Compat› structure provides convenience functions
that simulate the old interfaces in terms of the new ones.

\item \emph{The recursor rec_t› has a different signature for nested
recursive datatypes.} In the old package, nested recursion through non-functions
was internally reduced to mutual recursion. This reduction was visible in the
type of the recursor, used by \keyw{primrec}. Recursion through functions was
handled specially. In the new package, nested recursion (for functions and
non-functions) is handled in a more modular fashion. The old-style recursor can
be generated on demand using @{command primrec} if the recursion is via
new-style datatypes, as explained in
Section~\ref{sssec:primrec-nested-as-mutual-recursion}, or using
@{command datatype_compat}.

\item \emph{Accordingly, the induction rule is different for nested recursive
datatypes.} Again, the old-style induction rule can be generated on demand
using @{command primrec} if the recursion is via new-style datatypes, as
explained in Section~\ref{sssec:primrec-nested-as-mutual-recursion}, or using
@{command datatype_compat}. For recursion through functions, the old-style
induction rule can be obtained by applying the [unfolded
all_mem_range]› attribute on t.induct›.

\item \emph{The constsize function has a slightly different definition.}
The new function returns 1› instead of 0› for some nonrecursive
constructors. This departure from the old behavior made it possible to implement
constsize in terms of the generic function t.size_t›. Moreover,
the new function considers nested occurrences of a value, in the nested
recursive case. The old behavior can be obtained by disabling the size›
plugin (Section~\ref{sec:selecting-plugins}) and instantiating the
size› type class manually.

\item \emph{The internal constructions are completely different.} Proof texts
that unfold the definition of constants introduced by the old command will
be difficult to port.

\item \emph{Some constants and theorems have different names.}
For non-mutually recursive datatypes,
the alias t.inducts› for t.induct› is no longer generated.
For $m > 1$ mutually recursive datatypes,
rec_t1_…_tm_i› has been renamed
rec_ti for each i ∈ {1, …, m}›,
t1_…_tm.inducts(i)› has been renamed
ti.induct› for each i ∈ {1, …, m}›, and the
collection t1_…_tm.size› (generated by the
size› plugin, Section~\ref{ssec:size}) has been divided into
t1.size›, \ldots, tm.size›.

\item \emph{The t.simps› collection has been extended.}
Theorems that were available previously keep the same indices as before.

\item \emph{Variables in generated properties have different names.} This is
rarely an issue, except in proof texts that refer to variable names in the
[where …]› attribute. The solution is to use the more robust
[of …]› syntax.
\end{itemize}
›


section ‹Defining Primitively Recursive Functions
  \label{sec:defining-primitively-recursive-functions}›

text ‹
Recursive functions over datatypes can be specified using the @{command primrec}
command, which supports primitive recursion, or using the \keyw{fun},
\keyw{function}, and \keyw{partial_function} commands. In this tutorial, the
focus is on @{command primrec}; \keyw{fun} and \keyw{function} are described in
a separate tutorial cite"isabelle-function".

Because it is restricted to primitive recursion, @{command primrec} is less
powerful than \keyw{fun} and \keyw{function}. However, there are primitively
recursive specifications (e.g., based on infinitely branching or mutually
recursive datatypes) for which \keyw{fun}'s termination check fails. It is also
good style to use the simpler @{command primrec} mechanism when it works, both
as an optimization and as documentation.
›


subsection ‹Introductory Examples
  \label{ssec:primrec-introductory-examples}›

text ‹
Primitive recursion is illustrated through concrete examples based on the
datatypes defined in Section~\ref{ssec:datatype-introductory-examples}. More
examples can be found in the directory 🗀‹~~/src/HOL/Datatype_Examples›.
›


subsubsection ‹Nonrecursive Types
  \label{sssec:primrec-nonrecursive-types}›

text ‹
Primitive recursion removes one layer of constructors on the left-hand side in
each equation. For example:
›

    primrec (nonexhaustive) bool_of_trool :: "trool ⇒ bool" where
      "bool_of_trool Faalse ⟷ False"
    | "bool_of_trool Truue ⟷ True"

text ‹\blankline›

    primrec the_list :: "'a option ⇒ 'a list" where
      "the_list None = []"
    | "the_list (Some a) = [a]"

text ‹\blankline›

    primrec the_default :: "'a ⇒ 'a option ⇒ 'a" where
      "the_default d None = d"
    | "the_default _ (Some a) = a"

text ‹\blankline›

    primrec mirrror :: "('a, 'b, 'c) triple ⇒ ('c, 'b, 'a) triple" where
      "mirrror (Triple a b c) = Triple c b a"

text ‹
\noindent
The equations can be specified in any order, and it is acceptable to leave out
some cases, which are then unspecified. Pattern matching on the left-hand side
is restricted to a single datatype, which must correspond to the same argument
in all equations.
›


subsubsection ‹Simple Recursion
  \label{sssec:primrec-simple-recursion}›

text ‹
For simple recursive types, recursive calls on a constructor argument are
allowed on the right-hand side:
›

    primrec replicate :: "nat ⇒ 'a ⇒ 'a list" where
      "replicate Zero _ = []"
    | "replicate (Succ n) x = x # replicate n x"

text ‹\blankline›

    primrec (nonexhaustive) at :: "'a list ⇒ nat ⇒ 'a" where
      "at (x # xs) j =
         (case j of
            Zero ⇒ x
          | Succ j' ⇒ at xs j')"

text ‹\blankline›

    primrec (*<*)(in early) (transfer) (*>*)tfold :: "('a ⇒ 'b ⇒ 'b) ⇒ ('a, 'b) tlist ⇒ 'b" where
      "tfold _ (TNil y) = y"
    | "tfold f (TCons x xs) = f x (tfold f xs)"

text ‹
\noindent
Pattern matching is only available for the argument on which the recursion takes
place. Fortunately, it is easy to generate pattern-matching equations using the
@{command simps_of_case} command provided by the theory
🗏‹~~/src/HOL/Library/Simps_Case_Conv.thy›.
›

    simps_of_case at_simps_alt: at.simps

text ‹
This generates the lemma collection @{thm [source] at_simps_alt}:
%
\[@{thm at_simps_alt(1)[no_vars]}
  \qquad @{thm at_simps_alt(2)[no_vars]}\]
%
The next example is defined using \keyw{fun} to escape the syntactic
restrictions imposed on primitively recursive functions:
›

    fun at_least_two :: "nat ⇒ bool" where
      "at_least_two (Succ (Succ _)) ⟷ True"
    | "at_least_two _ ⟷ False"


subsubsection ‹Mutual Recursion
  \label{sssec:primrec-mutual-recursion}›

text ‹
The syntax for mutually recursive functions over mutually recursive datatypes
is straightforward:
›

    primrec
      nat_of_even_nat :: "even_nat ⇒ nat" and
      nat_of_odd_nat :: "odd_nat ⇒ nat"
    where
      "nat_of_even_nat Even_Zero = Zero"
    | "nat_of_even_nat (Even_Succ n) = Succ (nat_of_odd_nat n)"
    | "nat_of_odd_nat (Odd_Succ n) = Succ (nat_of_even_nat n)"

text ‹\blankline›

    primrec
      evale :: "('a ⇒ int) ⇒ ('b ⇒ int) ⇒ ('a, 'b) exp ⇒ int" and
      evalt :: "('a ⇒ int) ⇒ ('b ⇒ int) ⇒ ('a, 'b) trm ⇒ int" and
      evalf :: "('a ⇒ int) ⇒ ('b ⇒ int) ⇒ ('a, 'b) fct ⇒ int"
    where
      "evale γ ξ (Term t) = evalt γ ξ t"
    | "evale γ ξ (Sum t e) = evalt γ ξ t + evale γ ξ e"
    | "evalt γ ξ (Factor f) = evalf γ ξ f"
    | "evalt γ ξ (Prod f t) = evalf γ ξ f + evalt γ ξ t"
    | "evalf γ _ (Const a) = γ a"
    | "evalf _ ξ (Var b) = ξ b"
    | "evalf γ ξ (Expr e) = evale γ ξ e"

text ‹
\noindent
Mutual recursion is possible within a single type, using \keyw{fun}:
›

    fun
      even :: "nat  bool" and
      odd :: "nat  bool"
    where
      "even Zero = True"
    | "even (Succ n) = odd n"
    | "odd Zero = False"
    | "odd (Succ n) = even n"


subsubsection ‹Nested Recursion
  \label{sssec:primrec-nested-recursion}›

text ‹
In a departure from the old datatype package, nested recursion is normally
handled via the map functions of the nesting type constructors. For example,
recursive calls are lifted to lists using constmap:
›

(*<*)
    datatype 'a treeff = Nodeff (lblff: 'a) (subff: "'a treeff list")
(*>*)
    primrec atff :: "'a treeff ⇒ nat list ⇒ 'a" where
      "atff (Nodeff a ts) js =
         (case js of
            [] ⇒ a
          | j # js' ⇒ at (map (λt. atff t js') ts) j)"

text ‹
\noindent
The next example features recursion through the option› type. Although
option› is not a new-style datatype, it is registered as a BNF with the
map function constmap_option:
›

    primrec (*<*)(in early) (*>*)sum_btree :: "('a::{zero,plus}) btree ⇒ 'a" where
      "sum_btree (BNode a lt rt) =
         a + the_default 0 (map_option sum_btree lt) +
           the_default 0 (map_option sum_btree rt)"

text ‹
\noindent
The same principle applies for arbitrary type constructors through which
recursion is possible. Notably, the map function for the function type
(⇒›) is simply composition ((∘)›):
›

    primrec (*<*)(in early) (*>*)relabel_ft :: "('a ⇒ 'a) ⇒ 'a ftree ⇒ 'a ftree" where
      "relabel_ft f (FTLeaf x) = FTLeaf (f x)"
    | "relabel_ft f (FTNode g) = FTNode (relabel_ft f ∘ g)"

text ‹
\noindent
For convenience, recursion through functions can also be expressed using
$\lambda$-abstractions and function application rather than through composition.
For example:
›

    primrec relabel_ft :: "('a ⇒ 'a) ⇒ 'a ftree ⇒ 'a ftree" where
      "relabel_ft f (FTLeaf x) = FTLeaf (f x)"
    | "relabel_ft f (FTNode g) = FTNode (λx. relabel_ft f (g x))"

text ‹\blankline›

    primrec (nonexhaustive) subtree_ft :: "'a ⇒ 'a ftree ⇒ 'a ftree" where
      "subtree_ft x (FTNode g) = g x"

text ‹
\noindent
For recursion through curried $n$-ary functions, $n$ applications of
term(∘) are necessary. The examples below illustrate the case where
$n = 2$:
›

    datatype 'a ftree2 = FTLeaf2 'a | FTNode2 "'a ⇒ 'a ⇒ 'a ftree2"

text ‹\blankline›

    primrec (*<*)(in early) (*>*)relabel_ft2 :: "('a ⇒ 'a) ⇒ 'a ftree2 ⇒ 'a ftree2" where
      "relabel_ft2 f (FTLeaf2 x) = FTLeaf2 (f x)"
    | "relabel_ft2 f (FTNode2 g) = FTNode2 ((∘) ((∘) (relabel_ft2 f)) g)"

text ‹\blankline›

    primrec relabel_ft2 :: "('a ⇒ 'a) ⇒ 'a ftree2 ⇒ 'a ftree2" where
      "relabel_ft2 f (FTLeaf2 x) = FTLeaf2 (f x)"
    | "relabel_ft2 f (FTNode2 g) = FTNode2 (λx y. relabel_ft2 f (g x y))"

text ‹\blankline›

    primrec (nonexhaustive) subtree_ft2 :: "'a ⇒ 'a ⇒ 'a ftree2 ⇒ 'a ftree2" where
      "subtree_ft2 x y (FTNode2 g) = g x y"

text ‹
For any datatype featuring nesting, the predicator can be used instead of the
map function, typically when defining predicates. For example:
›

    primrec increasing_tree :: "int ⇒ int treeff ⇒ bool" where
      "increasing_tree m (Nodeff n ts) ⟷
       n ≤ m ∧ list_all (increasing_tree (n + 1)) ts"


subsubsection ‹Nested-as-Mutual Recursion
  \label{sssec:primrec-nested-as-mutual-recursion}›

(*<*)
    locale n2m
    begin
(*>*)

text ‹
For compatibility with the old package, but also because it is sometimes
convenient in its own right, it is possible to treat nested recursive datatypes
as mutually recursive ones if the recursion takes place through new-style
datatypes. For example:
›

    primrec (nonexhaustive)
      atff :: "'a treeff  nat list  'a" and
      atsff :: "'a treeff list  nat  nat list  'a"
    where
      "atff (Nodeff a ts) js =
         (case js of
            []  a
          | j # js'  atsff ts j js')"
    | "atsff (t # ts) j =
         (case j of
            Zero  atff t
          | Succ j'  atsff ts j')"

text ‹
\noindent
Appropriate induction rules are generated as
@{thm [source] atff.induct},
@{thm [source] atsff.induct}, and
@{thm [source] atff_atsff.induct}. The
induction rules and the underlying recursors are generated dynamically
and are kept in a cache to speed up subsequent definitions.

Here is a second example:
›

    primrec
      sum_btree :: "('a::{zero,plus}) btree  'a" and
      sum_btree_option :: "'a btree option  'a"
    where
      "sum_btree (BNode a lt rt) =
         a + sum_btree_option lt + sum_btree_option rt"
    | "sum_btree_option None = 0"
    | "sum_btree_option (Some t) = sum_btree t"

text ‹
%  * can pretend a nested type is mutually recursive (if purely inductive)
%  * avoids the higher-order map
%  * e.g.

%  * this can always be avoided;
%     * e.g. in our previous example, we first mapped the recursive
%       calls, then we used a generic at function to retrieve the result
%
%  * there's no hard-and-fast rule of when to use one or the other,
%    just like there's no rule when to use fold and when to use
%    primrec
%
%  * higher-order approach, considering nesting as nesting, is more
%    compositional -- e.g. we saw how we could reuse an existing polymorphic
%    at or the_default, whereas constatsff is much more specific
%
%  * but:
%     * is perhaps less intuitive, because it requires higher-order thinking
%     * may seem inefficient, and indeed with the code generator the
%       mutually recursive version might be nicer
%     * is somewhat indirect -- must apply a map first, then compute a result
%       (cannot mix)
%     * the auxiliary functions like constatsff are sometimes useful in own right
%
%  * impact on automation unclear
%
›
(*<*)
    end
(*>*)


subsection ‹Command Syntax
  \label{ssec:primrec-command-syntax}›

subsubsection ‹\keyw{primrec}
  \label{sssec:primrec}›

text ‹
\begin{matharray}{rcl}
  @{command_def "primrec"} & : & local_theory → local_theory›
\end{matharray}

rail@@{command primrec} target? @{syntax pr_options}? fixes 
  @'where' (@{syntax pr_equation} + '|')
  ;
  @{syntax_def pr_options}: '(' ((@{syntax plugins} | 'nonexhaustive' | 'transfer') + ',') ')'
  ;
  @{syntax_def pr_equation}: thmdecl? prop
›

\medskip

\noindent
The @{command primrec} command introduces a set of mutually recursive functions
over datatypes.

The syntactic entity \synt{target} can be used to specify a local context,
\synt{fixes} denotes a list of names with optional type signatures,
\synt{thmdecl} denotes an optional name for the formula that follows, and
\synt{prop} denotes a HOL proposition cite"isabelle-isar-ref".

The optional target may be followed by a combination of the following
options:

\begin{itemize}
\setlength{\itemsep}{0pt}

\item
The plugins› option indicates which plugins should be enabled
(only›) or disabled (del›). By default, all plugins are enabled.

\item
The nonexhaustive› option indicates that the functions are not
necessarily specified for all constructors. It can be used to suppress the
warning that is normally emitted when some constructors are missing.

\item
The transfer› option indicates that an unconditional transfer rule
should be generated and proved by transfer_prover›. The
[transfer_rule]› attribute is set on the generated theorem.
\end{itemize}

%%% TODO: elaborate on format of the equations
%%% TODO: elaborate on mutual and nested-to-mutual
›


subsection ‹Generated Theorems
  \label{ssec:primrec-generated-theorems}›

(*<*)
    context early
    begin
(*>*)

text ‹
The @{command primrec} command generates the following properties (listed
for consttfold):

\begin{indentblock}
\begin{description}

\item[f.›\hthm{simps} [simp, code]›\rm:] ~ \\
@{thm tfold.simps(1)[no_vars]} \\
@{thm tfold.simps(2)[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

\item[f.›\hthm{transfer} [transfer_rule]›\rm:] ~ \\
@{thm tfold.transfer[no_vars]} \\
This theorem is generated by the transfer› plugin
(Section~\ref{ssec:transfer}) for functions declared with the transfer›
option enabled.

\item[f.›\hthm{induct} [case_names C1 … Cn]›\rm:] ~ \\
This induction rule is generated for nested-as-mutual recursive functions
(Section~\ref{sssec:primrec-nested-as-mutual-recursion}).

\item[f1_…_fm.›\hthm{induct} [case_names C1 … Cn]›\rm:] ~ \\
This induction rule is generated for nested-as-mutual recursive functions
(Section~\ref{sssec:primrec-nested-as-mutual-recursion}). Given $m > 1$ mutually
recursive functions, this rule can be used to prove $m$ properties
simultaneously.

\end{description}
\end{indentblock}
›
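
text ‹
\noindent
For example, the equations listed above evaluate concrete applications by
simplification. The following check is ours:
›

    lemma "tfold f (TCons x (TNil y)) = f x y"
      by simp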

(*<*)
    end
(*>*)


subsection ‹Recursive Default Values for Selectors
  \label{ssec:primrec-recursive-default-values-for-selectors}›

text ‹
A datatype selector un_D› can have a default value for each constructor
on which it is not otherwise specified. Occasionally, it is useful to have the
default value be defined recursively. This leads to a chicken-and-egg
situation, because the datatype is not introduced yet at the moment when the
selectors are introduced. Of course, we can always define the selectors
manually afterward, but we then have to state and prove all the characteristic
theorems ourselves instead of letting the package do it.

Fortunately, there is a workaround that relies on overloading to relieve us
from the tedium of manual derivations:

\begin{enumerate}
\setlength{\itemsep}{0pt}

\item
Introduce a fully unspecified constant un_D0 :: 'a› using
@{command consts}.

\item
Define the datatype, specifying un_D0 as the selector's default
value.

\item
Define the behavior of un_D0 on values of the newly introduced
datatype using the \keyw{overloading} command.

\item
Derive the desired equation on un_D› from the characteristic equations
for un_D0.
\end{enumerate}

\noindent
The following example illustrates this procedure:
›

(*<*)
    hide_const termi
(*>*)
    consts termi0 :: 'a

text ‹\blankline›

    datatype ('a, 'b) tlist =
      TNil (termi: 'b)
    | TCons (thd: 'a) (ttl: "('a, 'b) tlist")
    where
      "ttl (TNil y) = TNil y"
    | "termi (TCons _ xs) = termi0 xs"

text ‹\blankline›

    overloading
      termi0  "termi0 :: ('a, 'b) tlist  'b"
    begin
    primrec termi0 :: "('a, 'b) tlist  'b" where
      "termi0 (TNil y) = y"
    | "termi0 (TCons x xs) = termi0 xs"
    end

text ‹\blankline›

    lemma termi_TCons[simp]: "termi (TCons x xs) = termi xs"
      by (cases xs) auto


subsection ‹Compatibility Issues
  \label{ssec:primrec-compatibility-issues}›

text ‹
The command @{command primrec}'s behavior on new-style datatypes has been
designed to be highly compatible with that for old, pre-Isabelle2015 datatypes,
to ease migration. There is nonetheless at least one incompatibility that may
arise when porting to the new package:

\begin{itemize}
\setlength{\itemsep}{0pt}

\item \emph{Some theorems have different names.}
For $m > 1$ mutually recursive functions,
f1_…_fm.simps› has been broken down into separate
subcollections fi.simps›.
\end{itemize}
›


section ‹Defining Codatatypes
  \label{sec:defining-codatatypes}›

text ‹
Codatatypes can be specified using the @{command codatatype} command. The
command is first illustrated through concrete examples featuring different
flavors of corecursion. More examples can be found in the directory
🗀‹~~/src/HOL/Datatype_Examples›. The \emph{Archive of Formal Proofs} also
includes some useful codatatypes, notably for lazy lists cite"lochbihler-2010".
›


subsection ‹Introductory Examples
  \label{ssec:codatatype-introductory-examples}›

subsubsection ‹Simple Corecursion
  \label{sssec:codatatype-simple-corecursion}›

text ‹
Non-corecursive codatatypes coincide with the corresponding datatypes, so they
are rarely used in practice. \emph{Corecursive codatatypes} have the same syntax
as recursive datatypes, except for the command name. For example, here is the
definition of lazy lists:
›

    codatatype (lset: 'a) llist =
      lnull: LNil
    | LCons (lhd: 'a) (ltl: "'a llist")
    for
      map: lmap
      rel: llist_all2
      pred: llist_all
    where
      "ltl LNil = LNil"

text ‹
\noindent
Lazy lists can be infinite, such as LCons 0 (LCons 0 (…))› and
LCons 0 (LCons 1 (LCons 2 (…)))›. Here is a related type, that of
infinite streams:
›

    codatatype (sset: 'a) stream =
      SCons (shd: 'a) (stl: "'a stream")
    for
      map: smap
      rel: stream_all2

text ‹
\noindent
Another interesting type that can
be defined as a codatatype is that of the extended natural numbers:
›

    codatatype enat = EZero | ESucc enat

text ‹
\noindent
This type has exactly one infinite element, ESucc (ESucc (ESucc (…)))›,
that represents $\infty$. In addition, it has finite values of the form
ESucc (… (ESucc EZero)…)›.
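
As a brief preview of
Section~\ref{sec:defining-primitively-corecursive-functions}, the infinite
element can be introduced by primitive corecursion; the name and definition
below are our own illustration:
›

    primcorec einfty :: enat where
      "einfty = ESucc einfty"

text ‹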

Here is an example with many constructors:
›

    codatatype 'a process =
      Fail
    | Skip (cont: "'a process")
    | Action (prefix: 'a) (cont: "'a process")
    | Choice (left: "'a process") (right: "'a process")

text ‹
\noindent
Notice that the constcont selector is associated with both constSkip
and constAction.
›


subsubsection ‹Mutual Corecursion
  \label{sssec:codatatype-mutual-corecursion}›

text ‹
\noindent
The example below introduces a pair of \emph{mutually corecursive} types:
›

    codatatype even_enat = Even_EZero | Even_ESucc odd_enat
    and odd_enat = Odd_ESucc even_enat


subsubsection ‹Nested Corecursion
  \label{sssec:codatatype-nested-corecursion}›

text ‹
\noindent
The next examples feature \emph{nested corecursion}:
›

    codatatype 'a treeii = Nodeii (lblii: 'a) (subii: "'a treeii llist")

text ‹\blankline›

    codatatype 'a treeis = Nodeis (lblis: 'a) (subis: "'a treeis fset")

text ‹\blankline›

    codatatype 'a sm = SM (accept: bool) (trans: "'a ⇒ 'a sm")


subsection ‹Command Syntax
  \label{ssec:codatatype-command-syntax}›

subsubsection ‹\keyw{codatatype}
  \label{sssec:codatatype}›

text ‹
\begin{matharray}{rcl}
  @{command_def "codatatype"} & : & local_theory → local_theory›
\end{matharray}

rail@@{command codatatype} target? @{syntax dt_options}? @{syntax dt_spec}

\medskip

\noindent
Definitions of codatatypes have almost exactly the same syntax as for datatypes
(Section~\ref{ssec:datatype-command-syntax}). The discs_sels› option
is superfluous because discriminators and selectors are always generated for
codatatypes.
›


subsection ‹Generated Constants
  \label{ssec:codatatype-generated-constants}›

text ‹
Given a codatatype ('a1, …, 'am) t›
with $m > 0$ live type variables and $n$ constructors t.C1,
\ldots, t.Cn, the same auxiliary constants are generated as for
datatypes (Section~\ref{ssec:datatype-generated-constants}), except that the
recursor is replaced by a dual concept:

\medskip

\begin{tabular}{@ {}ll@ {}}
Corecursor: &
  t.corec_t›
\end{tabular}
›


subsection ‹Generated Theorems
  \label{ssec:codatatype-generated-theorems}›

text ‹
The characteristic theorems generated by @{command codatatype} are grouped in
three broad categories:

\begin{itemize}
\setlength{\itemsep}{0pt}

\item The \emph{free constructor theorems}
(Section~\ref{sssec:free-constructor-theorems}) are properties of the
constructors and destructors that can be derived for any freely generated type.

\item The \emph{functorial theorems}
(Section~\ref{sssec:functorial-theorems}) are properties of codatatypes related
to their BNF nature.

\item The \emph{coinductive theorems} (Section~\ref{sssec:coinductive-theorems})
are properties of codatatypes related to their coinductive nature.
\end{itemize}

\noindent
The first two categories are exactly as for datatypes.
›


subsubsection ‹Coinductive Theorems
  \label{sssec:coinductive-theorems}›

text ‹
The coinductive theorems are listed below for typ'a llist:

\begin{indentblock}
\begin{description}

\item[\begin{tabular}{@ {}l@ {}}
  t.›\hthm{coinduct} [consumes m, case_names t1 … tm,› \\
  \phantom{t.›\hthm{coinduct} [›}case_conclusion D1 …
  Dn, coinduct t]›\rm:
\end{tabular}] ~ \\
@{thm llist.coinduct[no_vars]}

\item[\begin{tabular}{@ {}l@ {}}
  t.›\hthm{coinduct_strong} [consumes m, case_names t1 … tm,› \\
  \phantom{t.›\hthm{coinduct_strong} [›}case_conclusion D1 … Dn]›\rm:
\end{tabular}] ~ \\
@{thm llist.coinduct_strong[no_vars]}

\item[\begin{tabular}{@ {}l@ {}}
  t.›\hthm{rel_coinduct} [consumes m, case_names t1 … tm,› \\
  \phantom{t.›\hthm{rel_coinduct} [›}case_conclusion D1 …
  Dn, coinduct pred]›\rm:
\end{tabular}] ~ \\
@{thm llist.rel_coinduct[no_vars]}

\item[\begin{tabular}{@ {}l@ {}}
  t1_…_tm.›\hthm{coinduct} [case_names t1 … tm, case_conclusion D1 … Dn]› \\
  t1_…_tm.›\hthm{coinduct_strong} [case_names t1 … tm,› \\
  \phantom{t1_…_tm.›\hthm{coinduct_strong} [›}case_conclusion D1 … Dn]›\rm: \\
  t1_…_tm.›\hthm{rel_coinduct} [case_names t1 … tm,› \\
  \phantom{t1_…_tm.›\hthm{rel_coinduct} [›}case_conclusion D1 … Dn]›\rm: \\
\end{tabular}] ~ \\
Given $m > 1$ mutually corecursive codatatypes, these coinduction rules can be
used to prove $m$ properties simultaneously.

\item[\begin{tabular}{@ {}l@ {}}
  t1_…_tm.›\hthm{set_induct} [case_names C1 … Cn,› \\
  \phantom{t1_…_tm.›\hthm{set_induct} [›}induct set: setj_t1, …, induct set: setj_tm]›\rm: \\
\end{tabular}] ~ \\
@{thm llist.set_induct[no_vars]} \\
If $m = 1$, the attribute [consumes 1]› is generated as well.

\item[t.›\hthm{corec}\rm:] ~ \\
@{thm llist.corec(1)[no_vars]} \\
@{thm llist.corec(2)[no_vars]}

\item[t.›\hthm{corec_code} [code]›\rm:] ~ \\
@{thm llist.corec_code[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

\item[t.›\hthm{corec_disc}\rm:] ~ \\
@{thm llist.corec_disc(1)[no_vars]} \\
@{thm llist.corec_disc(2)[no_vars]}

\item[t.›\hthm{corec_disc_iff} [simp]›\rm:] ~ \\
@{thm llist.corec_disc_iff(1)[no_vars]} \\
@{thm llist.corec_disc_iff(2)[no_vars]}

\item[t.›\hthm{corec_sel} [simp]›\rm:] ~ \\
@{thm llist.corec_sel(1)[no_vars]} \\
@{thm llist.corec_sel(2)[no_vars]}

\item[t.›\hthm{map_o_corec}\rm:] ~ \\
@{thm llist.map_o_corec[no_vars]}

\item[t.›\hthm{corec_transfer} [transfer_rule]›\rm:] ~ \\
@{thm llist.corec_transfer[no_vars]} \\
The [transfer_rule]› attribute is set by the transfer› plugin
(Section~\ref{ssec:transfer}) for type constructors with no dead type arguments.

\end{description}
\end{indentblock}

\noindent
For convenience, @{command codatatype} also provides the following collection:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{simps}] = t.inject› t.distinct› t.case› t.corec_disc_iff› t.corec_sel› \\
t.map› t.rel_inject› t.rel_distinct› t.set›

\end{description}
\end{indentblock}
›


subsection ‹Antiquotation
  \label{ssec:codatatype-antiquotation}›

subsubsection ‹\textit{codatatype}
  \label{sssec:codatatype-codatatype}›

text ‹
The \textit{codatatype} antiquotation, written
\texttt{\char`\\\char`<\char`^}\verb|codatatype>|\guilsinglleft\textit{t}\guilsinglright{}
or \texttt{@}\verb|{codatatype| \textit{t}\verb|}|, where \textit{t} is a type
name, expands to \LaTeX{} code for the definition of the codatatype, with each
constructor listed with its argument types. For example, if \textit{t} is
@{type llist}:

\begin{quote}
codatatypellist
\end{quote}
›


section ‹Defining Primitively Corecursive Functions
  \label{sec:defining-primitively-corecursive-functions}›

text ‹
Corecursive functions can be specified using the @{command primcorec} and
\keyw{prim\-corec\-ursive} commands, which support primitive corecursion.
Other approaches include the more general \keyw{partial_function} command, the
\keyw{corec} and \keyw{corecursive} commands, and techniques based on domains
and topologies cite"lochbihler-hoelzl-2014". In this tutorial, the focus is
on @{command primcorec} and @{command primcorecursive}; \keyw{corec} and
\keyw{corecursive} are described in a separate tutorial
cite"isabelle-corec". More examples can be found in the directories
🗀‹~~/src/HOL/Datatype_Examples› and 🗀‹~~/src/HOL/Corec_Examples›.

Whereas recursive functions consume datatypes one constructor at a time,
corecursive functions construct codatatypes one constructor at a time.
Partly reflecting a lack of agreement among proponents of coalgebraic methods,
Isabelle supports three competing syntaxes for specifying a function $f$:

\begin{itemize}
\setlength{\itemsep}{0pt}

\abovedisplayskip=.5\abovedisplayskip
\belowdisplayskip=.5\belowdisplayskip

\item The \emph{destructor view} specifies $f$ by implications of the form
\[… ⟹ is_Cj (f x1 … xn)›\] and
equations of the form
\[un_Cji (f x1 … xn) = …›\]
This style is popular in the coalgebraic literature.

\item The \emph{constructor view} specifies $f$ by equations of the form
\[… ⟹ f x1 … xn = Cj …›\]
This style is often more concise than the previous one.

\item The \emph{code view} specifies $f$ by a single equation of the form
\[f x1 … xn = …›\]
with restrictions on the format of the right-hand side. Lazy functional
programming languages such as Haskell support a generalized version of this
style.
\end{itemize}

All three styles are available as input syntax. Whichever syntax is chosen,
characteristic theorems for all three styles are generated.

%%% TODO: partial_function? E.g. for defining tail recursive function on lazy
%%% lists (cf. terminal0 in TLList.thy)
›


subsection ‹Introductory Examples
  \label{ssec:primcorec-introductory-examples}›

text ‹
Primitive corecursion is illustrated through concrete examples based on the
codatatypes defined in Section~\ref{ssec:codatatype-introductory-examples}. More
examples can be found in the directory 🗀‹~~/src/HOL/Datatype_Examples›.
The code view is favored in the examples below. Sections
\ref{ssec:primrec-constructor-view} and \ref{ssec:primrec-destructor-view}
present the same examples expressed using the constructor and destructor views.
›


subsubsection ‹Simple Corecursion
  \label{sssec:primcorec-simple-corecursion}›

text ‹
Following the code view, corecursive calls are allowed on the right-hand side as
long as they occur under a constructor, which itself appears either directly to
the right of the equal sign or in a conditional expression:
›

    primcorec (*<*)(transfer) (*>*)literate :: "('a  'a)  'a  'a llist" where
      "literate g x = LCons x (literate g (g x))"

text ‹\blankline›

    primcorec siterate :: "('a  'a)  'a  'a stream" where
      "siterate g x = SCons x (siterate g (g x))"

text ‹
\noindent
The constructor ensures that progress is made---i.e., the function is
\emph{productive}. The above functions compute the infinite lazy list or stream
[x, g x, g (g x), …]›. Productivity guarantees that prefixes
[x, g x, g (g x), …, (g ^^ k) x]› of arbitrary finite length
k› can be computed by unfolding the code equation a finite number of
times.

Corecursive functions construct codatatype values, but nothing prevents them
from also consuming such values. The following function drops every second
element in a stream:
›

    primcorec every_snd :: "'a stream  'a stream" where
      "every_snd s = SCons (shd s) (stl (stl s))"

text ‹
\noindent
Constructs such as let›--in›, if›--then›--else›, and case›--of› may
appear around constructors that guard corecursive calls:
›

    primcorec lapp :: "'a llist  'a llist  'a llist" where
      "lapp xs ys =
         (case xs of
            LNil  ys
          | LCons x xs'  LCons x (lapp xs' ys))"

text ‹
\noindent
For technical reasons, case›--of› is only supported for
case distinctions on (co)datatypes that provide discriminators and selectors.

Pattern matching is not supported by @{command primcorec}. Fortunately, it is
easy to generate pattern-matching equations using the @{command simps_of_case}
command provided by the theory 🗏‹~~/src/HOL/Library/Simps_Case_Conv.thy›.
›

    simps_of_case lapp_simps: lapp.code

text ‹
This generates the lemma collection @{thm [source] lapp_simps}:
%
\begin{gather*}
  @{thm lapp_simps(1)[no_vars]} \\
  @{thm lapp_simps(2)[no_vars]}
\end{gather*}
%
Corecursion is useful to specify not only functions but also infinite objects:
›

    primcorec infty :: enat where
      "infty = ESucc infty"

text ‹
\noindent
The example below constructs a pseudorandom process value. It takes a stream of
actions (s›), a pseudorandom function generator (f›), and a
pseudorandom seed (n›):
›

(*<*)
    context early
    begin
(*>*)
    primcorec
      random_process :: "'a stream  (int  int)  int  'a process"
    where
      "random_process s f n =
         (if n mod 4 = 0 then
            Fail
          else if n mod 4 = 1 then
            Skip (random_process s f (f n))
          else if n mod 4 = 2 then
            Action (shd s) (random_process (stl s) f (f n))
          else
            Choice (random_process (every_snd s) (f  f) (f n))
              (random_process (every_snd (stl s)) (f  f) (f (f n))))"
(*<*)
    end
(*>*)

text ‹
\noindent
The main disadvantage of the code view is that the conditions are tested
sequentially. This is visible in the generated theorems. The constructor and
destructor views offer nonsequential alternatives.
›


subsubsection ‹Mutual Corecursion
  \label{sssec:primcorec-mutual-corecursion}›

text ‹
The syntax for mutually corecursive functions over mutually corecursive
datatypes is unsurprising:
›

    primcorec
      even_infty :: even_enat and
      odd_infty :: odd_enat
    where
      "even_infty = Even_ESucc odd_infty"
    | "odd_infty = Odd_ESucc even_infty"


subsubsection ‹Nested Corecursion
  \label{sssec:primcorec-nested-corecursion}›

text ‹
The next pair of examples generalize the constliterate and constsiterate
functions (Section~\ref{sssec:primcorec-simple-corecursion}) to possibly
infinite trees in which subnodes are organized either as a lazy list (treeii) or as a finite set (treeis). They rely on the map functions of
the nesting type constructors to lift the corecursive calls:
›

    primcorec iterateii :: "('a  'a llist)  'a  'a treeii" where
      "iterateii g x = Nodeii x (lmap (iterateii g) (g x))"

text ‹\blankline›

    primcorec iterateis :: "('a  'a fset)  'a  'a treeis" where
      "iterateis g x = Nodeis x (fimage (iterateis g) (g x))"

text ‹
\noindent
Both examples follow the usual format for constructor arguments associated
with nested recursive occurrences of the datatype. Consider
constiterateii. The term termg x constructs an typ'a llist
value, which is turned into an typ'a treeii llist value using
constlmap.

This format may sometimes feel artificial. The following function constructs
a tree with a single, infinite branch from a stream:
›

    primcorec treeii_of_stream :: "'a stream  'a treeii" where
      "treeii_of_stream s =
         Nodeii (shd s) (lmap treeii_of_stream (LCons (stl s) LNil))"

text ‹
\noindent
A more natural syntax, also supported by Isabelle, is to move corecursive calls
under constructors:
›

    primcorec (*<*)(in late) (*>*)treeii_of_stream :: "'a stream  'a treeii" where
      "treeii_of_stream s =
         Nodeii (shd s) (LCons (treeii_of_stream (stl s)) LNil)"

text ‹
The next example illustrates corecursion through functions, which is a bit
special. Deterministic finite automata (DFAs) are traditionally defined as
5-tuples (Q, Σ, δ, q0, F)›, where Q› is a finite set of states,
Σ› is a finite alphabet, δ› is a transition function, q0
is an initial state, and F› is a set of final states. The following
function translates a DFA into a state machine:
›

    primcorec (*<*)(in early) (*>*)sm_of_dfa :: "('q  'a  'q)  'q set  'q  'a sm" where
      "sm_of_dfa δ F q = SM (q  F) (sm_of_dfa δ F  δ q)"

text ‹
\noindent
The map function for the function type (⇒›) is composition
((∘)›). For convenience, corecursion through functions can
also be expressed using $\lambda$-abstractions and function application rather
than through composition. For example:
›

    primcorec sm_of_dfa :: "('q  'a  'q)  'q set  'q  'a sm" where
      "sm_of_dfa δ F q = SM (q  F) (λa. sm_of_dfa δ F (δ q a))"

text ‹\blankline›

    primcorec empty_sm :: "'a sm" where
      "empty_sm = SM False (λ_. empty_sm)"

text ‹\blankline›

    primcorec not_sm :: "'a sm  'a sm" where
      "not_sm M = SM (¬ accept M) (λa. not_sm (trans M a))"

text ‹\blankline›

    primcorec or_sm :: "'a sm  'a sm  'a sm" where
      "or_sm M N =
         SM (accept M  accept N) (λa. or_sm (trans M a) (trans N a))"

text ‹
\noindent
For corecursion through curried $n$-ary functions, $n$ applications of
term(∘) are necessary. The examples below illustrate the case where
$n = 2$:
›

    codatatype ('a, 'b) sm2 =
      SM2 (accept2: bool) (trans2: "'a  'b  ('a, 'b) sm2")

text ‹\blankline›

    primcorec
      (*<*)(in early) (*>*)sm2_of_dfa :: "('q  'a  'b  'q)  'q set  'q  ('a, 'b) sm2"
    where
      "sm2_of_dfa δ F q = SM2 (q  F) ((∘) ((∘) (sm2_of_dfa δ F)) (δ q))"

text ‹\blankline›

    primcorec
      sm2_of_dfa :: "('q  'a  'b  'q)  'q set  'q  ('a, 'b) sm2"
    where
      "sm2_of_dfa δ F q = SM2 (q  F) (λa b. sm2_of_dfa δ F (δ q a b))"


subsubsection ‹Nested-as-Mutual Corecursion
  \label{sssec:primcorec-nested-as-mutual-corecursion}›

text ‹
Just as it is possible to recurse over nested recursive datatypes as if they
were mutually recursive
(Section~\ref{sssec:primrec-nested-as-mutual-recursion}), it is possible to
pretend that nested codatatypes are mutually corecursive. For example:
›

(*<*)
    context late
    begin
(*>*)
    primcorec
      iterateii :: "('a  'a llist)  'a  'a treeii" and
      iteratesii :: "('a  'a llist)  'a llist  'a treeii llist"
    where
      "iterateii g x = Nodeii x (iteratesii g (g x))"
    | "iteratesii g xs =
         (case xs of
            LNil  LNil
          | LCons x xs'  LCons (iterateii g x) (iteratesii g xs'))"

text ‹
\noindent
Coinduction rules are generated as
@{thm [source] iterateii.coinduct},
@{thm [source] iteratesii.coinduct}, and
@{thm [source] iterateii_iteratesii.coinduct}
and analogously for coinduct_strong›. These rules and the
underlying corecursors are generated dynamically and are kept in a cache
to speed up subsequent definitions.
›

(*<*)
    end
(*>*)


subsubsection ‹Constructor View
  \label{ssec:primrec-constructor-view}›

(*<*)
    locale ctr_view
    begin
(*>*)

text ‹
The constructor view is similar to the code view, but there is one separate
conditional equation per constructor rather than a single unconditional
equation. Examples that rely on a single constructor, such as constliterate
and constsiterate, are identical in both styles.

Here is an example where there is a difference:
›

    primcorec lapp :: "'a llist  'a llist  'a llist" where
      "lnull xs  lnull ys  lapp xs ys = LNil"
    | "_  lapp xs ys = LCons (lhd (if lnull xs then ys else xs))
         (if xs = LNil then ltl ys else lapp (ltl xs) ys)"

text ‹
\noindent
With the constructor view, we must distinguish between the constLNil and
the constLCons case. The condition for constLCons is
left implicit, as the negation of that for constLNil.

For this example, the constructor view is slightly more involved than the
code equation. Recall the code view version presented in
Section~\ref{sssec:primcorec-simple-corecursion}.
% TODO: \[{thm code_view.lapp.code}\]
The constructor view requires us to analyze the second argument (termys).
The code equation generated from the constructor view also suffers from this.
% TODO: \[{thm lapp.code}\]

In contrast, the next example is arguably more naturally expressed in the
constructor view:
›

    primcorec
      random_process :: "'a stream  (int  int)  int  'a process"
    where
      "n mod 4 = 0  random_process s f n = Fail"
    | "n mod 4 = 1 
         random_process s f n = Skip (random_process s f (f n))"
    | "n mod 4 = 2 
         random_process s f n = Action (shd s) (random_process (stl s) f (f n))"
    | "n mod 4 = 3 
         random_process s f n = Choice (random_process (every_snd s) f (f n))
           (random_process (every_snd (stl s)) f (f n))"
(*<*)
    end
(*>*)

text ‹
\noindent
Since there is no sequentiality, we can apply the equation for constChoice
without having first to discharge termn mod (4::int)  0,
termn mod (4::int)  1, and
termn mod (4::int)  2.
The price to pay for this elegance is that we must discharge exclusiveness proof
obligations, one for each pair of conditions
term(n mod (4::int) = i, n mod (4::int) = j)
with termi < j. If we prefer not to discharge any obligations, we can
enable the sequential› option. This pushes the problem to the users of
the generated properties.
%Here are more examples to conclude:
›


subsubsection ‹Destructor View
  \label{ssec:primrec-destructor-view}›

(*<*)
    locale dtr_view
    begin
(*>*)

text ‹
The destructor view is in many respects dual to the constructor view. Conditions
determine which constructor to choose, and these conditions are interpreted
sequentially or not depending on the sequential› option.
Consider the following examples:
›

    primcorec literate :: "('a  'a)  'a  'a llist" where
      "¬ lnull (literate _ x)"
    | "lhd (literate _ x) = x"
    | "ltl (literate g x) = literate g (g x)"

text ‹\blankline›

    primcorec siterate :: "('a  'a)  'a  'a stream" where
      "shd (siterate _ x) = x"
    | "stl (siterate g x) = siterate g (g x)"

text ‹\blankline›

    primcorec every_snd :: "'a stream  'a stream" where
      "shd (every_snd s) = shd s"
    | "stl (every_snd s) = stl (stl s)"

text ‹
\noindent
The first formula in the constliterate specification indicates which
constructor to choose. For constsiterate and constevery_snd, no such
formula is necessary, since the type has only one constructor. The last two
formulas are equations specifying the value of the result for the relevant
selectors. Corecursive calls appear directly to the right of the equal sign.
Their arguments are unrestricted.

The next example shows how to specify functions that rely on more than one
constructor:
›

    primcorec lapp :: "'a llist  'a llist  'a llist" where
      "lnull xs  lnull ys  lnull (lapp xs ys)"
    | "lhd (lapp xs ys) = lhd (if lnull xs then ys else xs)"
    | "ltl (lapp xs ys) = (if xs = LNil then ltl ys else lapp (ltl xs) ys)"

text ‹
\noindent
For a codatatype with $n$ constructors, it is sufficient to specify $n - 1$
discriminator formulas. The command will then assume that the remaining
constructor should be taken otherwise. This can be made explicit by adding
›

(*<*)
    end

    locale dtr_view2
    begin

    primcorec lapp :: "'a llist  'a llist  'a llist" where
      "lnull xs  lnull ys  lnull (lapp xs ys)"
    | "lhd (lapp xs ys) = lhd (if lnull xs then ys else xs)"
    | "ltl (lapp xs ys) = (if xs = LNil then ltl ys else lapp (ltl xs) ys)" |
(*>*)
      "_  ¬ lnull (lapp xs ys)"

text ‹
\noindent
to the specification. The generated selector theorems are conditional.

The next example illustrates how to cope with selectors defined for several
constructors:
›

    primcorec
      random_process :: "'a stream  (int  int)  int  'a process"
    where
      "n mod 4 = 0  random_process s f n = Fail"
    | "n mod 4 = 1  is_Skip (random_process s f n)"
    | "n mod 4 = 2  is_Action (random_process s f n)"
    | "n mod 4 = 3  is_Choice (random_process s f n)"
    | "cont (random_process s f n) = random_process s f (f n)" of Skip
    | "prefix (random_process s f n) = shd s"
    | "cont (random_process s f n) = random_process (stl s) f (f n)" of Action
    | "left (random_process s f n) = random_process (every_snd s) f (f n)"
    | "right (random_process s f n) = random_process (every_snd (stl s)) f (f n)"

text ‹
\noindent
Using the of› keyword, different equations are specified for constcont depending on which constructor is selected.

Here are more examples to conclude:
›

    primcorec
      even_infty :: even_enat and
      odd_infty :: odd_enat
    where
      "even_infty  Even_EZero"
    | "un_Even_ESucc even_infty = odd_infty"
    | "un_Odd_ESucc odd_infty = even_infty"

text ‹\blankline›

    primcorec iterateii :: "('a  'a llist)  'a  'a treeii" where
      "lblii (iterateii g x) = x"
    | "subii (iterateii g x) = lmap (iterateii g) (g x)"
(*<*)
    end
(*>*)


subsection ‹Command Syntax
  \label{ssec:primcorec-command-syntax}›

subsubsection ‹\keyw{primcorec} and \keyw{primcorecursive}
  \label{sssec:primcorecursive-and-primcorec}›

text ‹
\begin{matharray}{rcl}
  @{command_def "primcorec"} & : & local_theory → local_theory› \\
  @{command_def "primcorecursive"} & : & local_theory → proof(prove)›
\end{matharray}

rail‹
  (@@{command primcorec} | @@{command primcorecursive}) target? 
    @{syntax pcr_options}? fixes @'where' (@{syntax pcr_formula} + '|')
  ;
  @{syntax_def pcr_options}: '(' ((@{syntax plugins} | 'sequential' | 'exhaustive' | 'transfer') + ',') ')'
  ;
  @{syntax_def pcr_formula}: thmdecl? prop (@'of' (term * ))?

\medskip

\noindent
The @{command primcorec} and @{command primcorecursive} commands introduce a set
of mutually corecursive functions over codatatypes.

The syntactic entity \synt{target} can be used to specify a local context,
\synt{fixes} denotes a list of names with optional type signatures,
\synt{thmdecl} denotes an optional name for the formula that follows, and
\synt{prop} denotes a HOL proposition cite"isabelle-isar-ref".

The optional target is optionally followed by a combination of the following
options:

\begin{itemize}
\setlength{\itemsep}{0pt}

\item
The plugins› option indicates which plugins should be enabled
(only›) or disabled (del›). By default, all plugins are enabled.

\item
The sequential› option indicates that the conditions in specifications
expressed using the constructor or destructor view are to be interpreted
sequentially.

\item
The exhaustive› option indicates that the conditions in specifications
expressed using the constructor or destructor view cover all possible cases.
This generally gives rise to an additional proof obligation.

\item
The transfer› option indicates that an unconditional transfer rule
should be generated and proved by transfer_prover›. The
[transfer_rule]› attribute is set on the generated theorem.
\end{itemize}

The @{command primcorec} command is an abbreviation for @{command
primcorecursive} with by auto?› to discharge any emerging proof
obligations.

%%% TODO: elaborate on format of the propositions
%%% TODO: elaborate on mutual and nested-to-mutual
›


subsection ‹Generated Theorems
  \label{ssec:primcorec-generated-theorems}›

text ‹
The @{command primcorec} and @{command primcorecursive} commands generate the
following properties (listed for constliterate):

\begin{indentblock}
\begin{description}

\item[f.›\hthm{code} [code]›\rm:] ~ \\
@{thm literate.code[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

\item[f.›\hthm{ctr}\rm:] ~ \\
@{thm literate.ctr[no_vars]}

\item[f.›\hthm{disc} [simp, code]›\rm:] ~ \\
@{thm literate.disc[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}). The [simp]› attribute is set only
for functions for which f.disc_iff› is not available.

\item[f.›\hthm{disc_iff} [simp]›\rm:] ~ \\
@{thm literate.disc_iff[no_vars]} \\
This property is generated only for functions declared with the
exhaustive› option or whose conditions are trivially exhaustive.

\item[f.›\hthm{sel} [simp, code]›\rm:] ~ \\
@{thm literate.sel[no_vars]} \\
The [code]› attribute is set by the code› plugin
(Section~\ref{ssec:code-generator}).

\item[f.›\hthm{exclude}\rm:] ~ \\
These properties are missing for constliterate because no exclusiveness
proof obligations arose. In general, the properties correspond to the
discharged proof obligations.

\item[f.›\hthm{exhaust}\rm:] ~ \\
This property is missing for constliterate because no exhaustiveness
proof obligation arose. In general, the property corresponds to the discharged
proof obligation.

\item[\begin{tabular}{@ {}l@ {}}
  f.›\hthm{coinduct} [consumes m, case_names t1 … tm,› \\
  \phantom{f.›\hthm{coinduct} [›}case_conclusion D1 …
  Dn]›\rm:
\end{tabular}] ~ \\
This coinduction rule is generated for nested-as-mutual corecursive functions
(Section~\ref{sssec:primcorec-nested-as-mutual-corecursion}).

\item[\begin{tabular}{@ {}l@ {}}
  f.›\hthm{coinduct_strong} [consumes m, case_names t1 … tm,› \\
  \phantom{f.›\hthm{coinduct_strong} [›}case_conclusion D1 …
  Dn]›\rm:
\end{tabular}] ~ \\
This coinduction rule is generated for nested-as-mutual corecursive functions
(Section~\ref{sssec:primcorec-nested-as-mutual-corecursion}).

\item[\begin{tabular}{@ {}l@ {}}
  f1_…_fm.›\hthm{coinduct} [case_names t1 … tm,› \\
  \phantom{f.›\hthm{coinduct} [›}case_conclusion D1 …
  Dn]›\rm:
\end{tabular}] ~ \\
This coinduction rule is generated for nested-as-mutual corecursive functions
(Section~\ref{sssec:primcorec-nested-as-mutual-corecursion}). Given $m > 1$
mutually corecursive functions, this rule can be used to prove $m$ properties
simultaneously.

\item[\begin{tabular}{@ {}l@ {}}
  f1_…_fm.›\hthm{coinduct_strong} [case_names t1 … tm,› \\
  \phantom{f.›\hthm{coinduct_strong} [›}case_conclusion D1 …
  Dn]›\rm:
\end{tabular}] ~ \\
This coinduction rule is generated for nested-as-mutual corecursive functions
(Section~\ref{sssec:primcorec-nested-as-mutual-corecursion}). Given $m > 1$
mutually corecursive functions, this rule can be used to prove $m$ properties
simultaneously.

\end{description}
\end{indentblock}

\noindent
For convenience, @{command primcorec} and @{command primcorecursive} also
provide the following collection:

\begin{indentblock}
\begin{description}

\item[f.›\hthm{simps}] = f.disc_iff› (or f.disc›) f.sel›

\end{description}
\end{indentblock}
›


(*
subsection ‹Recursive Default Values for Selectors
  \label{ssec:primcorec-recursive-default-values-for-selectors}›

text ‹
partial_function to the rescue
›
*)


section ‹Registering Bounded Natural Functors
  \label{sec:registering-bounded-natural-functors}›

text ‹
The (co)datatype package can be set up to allow nested recursion through
arbitrary type constructors, as long as they adhere to the BNF requirements
and are registered as BNFs. It is also possible to declare a BNF abstractly
without specifying its internal structure.
›


subsection ‹Bounded Natural Functors
  \label{ssec:bounded-natural-functors}›

text ‹
Bounded natural functors (BNFs) are a semantic criterion for where
(co)re\-cur\-sion may appear on the right-hand side of an equation
cite"traytel-et-al-2012" and "blanchette-et-al-2015-wit".

An $n$-ary BNF is a type constructor equipped with a map function
(functorial action), $n$ set functions (natural transformations),
and an infinite cardinal bound that satisfy certain properties.
For example, typ'a llist is a unary BNF.
Its predicator llist_all ::
  ('a ⇒ bool) ⇒
  'a llist ⇒ bool›
extends unary predicates over elements to unary predicates over
lazy lists.
Similarly, its relator llist_all2 ::
  ('a ⇒ 'b ⇒ bool) ⇒
  'a llist ⇒ 'b llist ⇒ bool›
extends binary predicates over elements to binary predicates over parallel
lazy lists. The cardinal bound limits the number of elements returned by the
set function; it may not depend on the cardinality of typ'a.

The type constructors introduced by @{command datatype} and
@{command codatatype} are automatically registered as BNFs. In addition, a
number of old-style datatypes and non-free types are preregistered.

Given an $n$-ary BNF, the $n$ type variables associated with set functions,
and on which the map function acts, are \emph{live}; any other variables are
\emph{dead}. Nested (co)recursion can only take place through live variables.
›


subsection ‹Introductory Examples
  \label{ssec:bnf-introductory-examples}›

text ‹
The example below shows how to register a type as a BNF using the @{command bnf}
command. Some of the proof obligations are best viewed with the bundle
"cardinal_syntax" included.

The type is simply a copy of the function space typ'd  'a, where typ'a
is live and typ'd is dead. We introduce it together with its map function,
set function, predicator, and relator.
›

    typedef ('d, 'a) fn = "UNIV :: ('d  'a) set"
      by simp

text ‹\blankline›

    setup_lifting type_definition_fn

text ‹\blankline›

    lift_definition map_fn :: "('a  'b)  ('d, 'a) fn  ('d, 'b) fn" is "(∘)" .
    lift_definition set_fn :: "('d, 'a) fn  'a set" is range .

text ‹\blankline›

    lift_definition
      pred_fn :: "('a  bool)  ('d, 'a) fn  bool"
    is
      "pred_fun (λ_. True)" .

    lift_definition
      rel_fn :: "('a  'b  bool)  ('d, 'a) fn  ('d, 'b) fn  bool"
    is
      "rel_fun (=)" .

text ‹\blankline›

    bnf "('d, 'a) fn"
      map: map_fn
      sets: set_fn
      bd: "natLeq +c card_suc |UNIV :: 'd set|"
      rel: rel_fn
      pred: pred_fn
    proof -
      show "map_fn id = id"
        by transfer auto
    next
      fix f :: "'a  'b" and g :: "'b  'c"
      show "map_fn (g  f) = map_fn g  map_fn f"
        by transfer (auto simp add: comp_def)
    next
      fix F :: "('d, 'a) fn" and f g :: "'a  'b"
      assume "x. x  set_fn F  f x = g x"
      then show "map_fn f F = map_fn g F"
        by transfer auto
    next
      fix f :: "'a  'b"
      show "set_fn  map_fn f = (`) f  set_fn"
        by transfer (auto simp add: comp_def)
    next
      show "card_order (natLeq +c card_suc |UNIV :: 'd set| )"
        by (rule card_order_bd_fun)
    next
      show "cinfinite (natLeq +c card_suc |UNIV :: 'd set| )"
        by (rule Cinfinite_bd_fun[THEN conjunct1])
    next
      show "regularCard (natLeq +c card_suc |UNIV :: 'd set| )"
        by (rule regularCard_bd_fun)
    next
      fix F :: "('d, 'a) fn"
      have "|set_fn F| ≤o |UNIV :: 'd set|" (is "_ ≤o ?U")
        by transfer (rule card_of_image)
      also have "?U <o card_suc ?U"
        by (simp add: card_of_card_order_on card_suc_greater)
      also have "card_suc ?U ≤o natLeq +c card_suc ?U"
        using Card_order_card_suc card_of_card_order_on ordLeq_csum2 by blast
      finally show "|set_fn F| <o natLeq +c card_suc |UNIV :: 'd set|" .
    next
      fix R :: "'a  'b  bool" and S :: "'b  'c  bool"
      show "rel_fn R OO rel_fn S  rel_fn (R OO S)"
        by (rule, transfer) (auto simp add: rel_fun_def)
    next
      fix R :: "'a  'b  bool"
      show "rel_fn R = (λx y. z. set_fn z  {(x, y). R x y}  map_fn fst z = x  map_fn snd z = y)"
        unfolding fun_eq_iff relcompp.simps conversep.simps
        by transfer (force simp: rel_fun_def subset_iff)
    next
      fix P :: "'a  bool"
      show "pred_fn P = (λx. Ball (set_fn x) P)"
        unfolding fun_eq_iff by transfer simp
    qed

text ‹\blankline›

    print_theorems
    print_bnfs

text ‹
\noindent
Using \keyw{print_theorems} and @{command print_bnfs}, we can contemplate and
show the world what we have achieved.

This particular example does not need any nonemptiness witness, because the
one generated by default is good enough, but in general this would be
necessary. See 🗏‹~~/src/HOL/Basic_BNFs.thy›,
🗏‹~~/src/HOL/Library/Countable_Set_Type.thy›,
🗏‹~~/src/HOL/Library/FSet.thy›, and
🗏‹~~/src/HOL/Library/Multiset.thy› for further examples of BNF
registration, some of which feature custom witnesses.

For many typedefs and quotient types, lifting the BNF structure from the raw type
to the abstract type can be done uniformly. This is the task of the @{command lift_bnf}
command. Using @{command lift_bnf}, the above registration of typ('d, 'a) fn as a
BNF becomes much shorter:
›

(*<*)
    context early
    begin
(*>*)
    lift_bnf (*<*)(no_warn_wits) (*>*)('d, 'a) fn
      by force+
(*<*)
    end
(*>*)

text ‹
For type copies (@{command typedef}s with termUNIV as the representing set),
the proof obligations are so simple that they can be
discharged automatically, yielding another command, @{command copy_bnf}, which
does not emit any proof obligations:
›

(*<*)
    context late
    begin
(*>*)
    copy_bnf ('d, 'a) fn
(*<*)
    end
(*>*)

text ‹
Since record schemas are type copies, @{command copy_bnf} can be used to
register them as BNFs:
›

    record 'a point =
      xval :: 'a
      yval :: 'a

text ‹\blankline›

    copy_bnf (*<*)(no_warn_transfer) (*>*)('a, 'z) point_ext
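
text ‹
\noindent
Once the record is registered, (co)datatypes can nest recursion through it.
The definition below is merely a hypothetical illustration of this and is not
used elsewhere in this tutorial:
›

    (* Hypothetical example: a tree whose nodes are points of subtrees. *)
    datatype 'a point_tree =
      Point_Leaf 'a
    | Point_Node "'a point_tree point"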

text ‹
In the general case, the proof obligations generated by @{command lift_bnf} are
simpler than the actual BNF properties. In particular, no cardinality reasoning
is required. Consider the following type of nonempty lists:
›

    typedef 'a nonempty_list = "{xs :: 'a list. xs  []}" by auto

text ‹
The @{command lift_bnf} command requires us to prove that the set of nonempty lists
is closed under the map function and the zip function. The latter only
occurs implicitly in the goal, in the form of the variable
termzs :: ('a × 'b) list.
›

    lift_bnf (*<*)(no_warn_wits,no_warn_transfer) (*>*)'a nonempty_list
    proof -
      fix f (*<*):: "'a  'c"(*>*)and xs :: "'a list"
      assume "xs  {xs. xs  []}"
      then show "map f xs  {xs. xs  []}"
        by (cases xs(*<*) rule: list.exhaust(*>*)) auto
    next
      fix zs :: "('a × 'b) list"
      assume "map fst zs  {xs. xs  []}" "map snd zs  {xs. xs  []}"
      then show "zs'{xs. xs  []}.
            set zs'  set zs 
            map fst zs' = map fst zs 
            map snd zs' = map snd zs"
        by (cases zs (*<*)rule: list.exhaust(*>*)) (auto intro!: exI[of _ zs])
    qed

text ‹The @{command lift_bnf} command also supports quotient types. Here is an example
that defines the option type as a quotient of the sum type. The proof obligations
generated by @{command lift_bnf} for quotients are different from the ones for typedefs.
You can find additional examples of usages of @{command lift_bnf} for both quotients and subtypes
in the session \emph{HOL-Datatype_Examples}.›

    inductive ignore_Inl :: "'a + 'a  'a + 'a  bool" where
      "ignore_Inl (Inl x) (Inl y)"
    | "ignore_Inl (Inr x) (Inr x)"

    (*<*)
    inductive_simps ignore_Inl_simps[simp]:
      "ignore_Inl (Inl x) y"
      "ignore_Inl (Inr x') y"
      "ignore_Inl y (Inl x)"
      "ignore_Inl y (Inr x')"
    (*>*)

    lemma ignore_Inl_equivp:
      "ignore_Inl x x"
      "ignore_Inl x y  ignore_Inl y x"
      "ignore_Inl x y  ignore_Inl y z  ignore_Inl x z"
      by (cases x; cases y; cases z; auto)+

    quotient_type 'a myoption = "'a + 'a" / ignore_Inl
      unfolding equivp_reflp_symp_transp reflp_def symp_def transp_def
      by (blast intro: ignore_Inl_equivp)

    lift_bnf 'a myoption (*<*)[wits: "Inl undefined :: 'a + 'a"](*>*)

    proof -
      fix P :: "'a  'b  bool" and Q :: "'b  'c  bool"
      assume "P OO Q  bot"
      then show "rel_sum P P OO ignore_Inl OO rel_sum Q Q
          ignore_Inl OO rel_sum (P OO Q) (P OO Q) OO ignore_Inl"
        by (fastforce(*<*) elim!: ignore_Inl.cases simp add: relcompp_apply fun_eq_iff rel_sum.simps(*>*))
    next
      fix S :: "'a set set"
      let ?eq = "{(x, x'). ignore_Inl x x'}"
      let ?in = "λA. {x. Basic_BNFs.setl x  Basic_BNFs.setr x  A}"
      assume "S  {}" " S  {}"
      show "(AS. ?eq `` ?in A)  ?eq `` ?in ( S)"
      proof (intro subsetI)
        fix x
        assume "x  (AS. ?eq `` ?in A)"
        with  S  {} show "x  ?eq `` ?in ( S)"
          by (cases x) (fastforce(*<*) dest!: spec[where x="Inl _"](*>*))+
      qed
    (*<*)next
      fix a :: 'a
      assume "a  (mx{mx. ignore_Inl (map_sum Inr Inr (Inl undefined)) mx}.
         (Basic_BNFs.setr ` (Basic_BNFs.setl mx  Basic_BNFs.setr mx)))"
      then show False
        by (auto simp: setr.simps)(*>*)
    qed


text ‹
The next example declares a BNF axiomatically. This can be convenient for
reasoning abstractly about an arbitrary BNF. The @{command bnf_axiomatization}
command below introduces a type ('a, 'b, 'c) F›, three set constants,
a map function, a predicator, a relator, and a nonemptiness witness that depends only on
typ'a. The type 'a ⇒ ('a, 'b, 'c) F› of the witness can be read
as an implication: Given a witness for typ'a, we can construct a witness for
('a, 'b, 'c) F›. The BNF properties are postulated as axioms.
›

    bnf_axiomatization (setA: 'a, setB: 'b, setC: 'c) F
      [wits: "'a  ('a, 'b, 'c) F"]

text ‹\blankline›

    print_theorems
    print_bnfs
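
text ‹
\noindent
Such an axiomatically declared BNF can participate in (co)datatype definitions
like any other registered BNF, with the recursion passing through its live
arguments. The definition below is a hypothetical illustration and is not used
elsewhere in this tutorial:
›

    (* Hypothetical example: recursion through the live arguments 'b and 'c of F. *)
    datatype 'a wrap = Wrap "('a, 'a wrap, 'a wrap) F"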


subsection ‹Command Syntax
  \label{ssec:bnf-command-syntax}›

subsubsection ‹\keyw{bnf}
  \label{sssec:bnf}›

text ‹
\begin{matharray}{rcl}
  @{command_def "bnf"} & : & local_theory → proof(prove)›
\end{matharray}

rail@@{command bnf} target? (name ':')? type 
    'map:' term ('sets:' (term +))? 'bd:' term 
    ('wits:' (term +))? ('rel:' term)? 
    ('pred:' term)? @{syntax plugins}?

\medskip

\noindent
The @{command bnf} command registers an existing type as a bounded natural
functor (BNF). The type must be equipped with an appropriate map function
(functorial action). In addition, custom set functions, predicators, relators, and
nonemptiness witnesses can be specified; otherwise, default versions are used.

The syntactic entity \synt{target} can be used to specify a local context,
\synt{type} denotes a HOL type, and \synt{term} denotes a HOL term
cite"isabelle-isar-ref".

The @{syntax plugins} option indicates which plugins should be enabled
(only›) or disabled (del›). By default, all plugins are enabled.

%%% TODO: elaborate on proof obligations
›

subsubsection ‹\keyw{lift_bnf}
  \label{sssec:lift-bnf}›

text ‹
\begin{matharray}{rcl}
  @{command_def "lift_bnf"} & : & local_theory → proof(prove)›
\end{matharray}

rail@@{command lift_bnf} target? lb_options? 
    @{syntax tyargs} name wit_terms?  
    ('via' thm thm?)? @{syntax map_rel_pred}?
  ;
  @{syntax_def lb_options}: '(' ((@{syntax plugins} | 'no_warn_wits' | 'no_warn_transfer') + ',') ')'
  ;
  @{syntax_def wit_terms}: '[' 'wits' ':' terms ']'
\medskip

\noindent
The @{command lift_bnf} command registers as a BNF an existing type (the
\emph{abstract type}) that was defined as a subtype of a BNF (the \emph{raw
type}) using the @{command typedef} command or as a quotient type of a BNF (also, the
\emph{raw type}) using the @{command quotient_type} command. To achieve this, it lifts the BNF
structure on the raw type to the abstract type following a termtype_definition or a
termQuotient theorem. The theorem is usually inferred from the type, but can
also be explicitly supplied by means of the optional via› clause. In case of quotients, it
is sometimes also necessary to supply a second theorem of the form termreflp eq
that expresses the reflexivity (and thus totality) of the equivalence relation. In
addition, custom names for the set functions, the map function, the predicator, and the relator,
as well as nonemptiness witnesses can be specified.

Nonemptiness witnesses are not lifted from the raw type's BNF, as this would be
incomplete. They must be given as terms (on the raw type) and proved to be
witnesses. The command warns about witness types that are present in the raw
type's BNF but not supplied by the user. The warning can be disabled by
specifying the no_warn_wits› option.
›

subsubsection ‹\keyw{copy_bnf}
  \label{sssec:copy-bnf}›

text ‹
\begin{matharray}{rcl}
  @{command_def "copy_bnf"} & : & local_theory → local_theory›
\end{matharray}

rail@@{command copy_bnf} target? cb_options? 
    @{syntax tyargs} name ('via' thm)? @{syntax map_rel_pred}?
  ;
  @{syntax_def cb_options}: '(' ((@{syntax plugins} | 'no_warn_transfer') + ',') ')'
\medskip

\noindent
The @{command copy_bnf} command performs the same lifting as @{command lift_bnf}
for type copies (@{command typedef}s with termUNIV as the representing set),
without requiring the user to discharge any proof obligations or provide
nonemptiness witnesses.
›

subsubsection ‹\keyw{bnf_axiomatization}
  \label{sssec:bnf-axiomatization}›

text ‹
\begin{matharray}{rcl}
  @{command_def "bnf_axiomatization"} & : & local_theory → local_theory›
\end{matharray}

rail@@{command bnf_axiomatization} target? ('(' @{syntax plugins} ')')? 
    @{syntax tyargs}? name @{syntax wit_types}? 
    mixfix? @{syntax map_rel_pred}?
  ;
  @{syntax_def wit_types}: '[' 'wits' ':' types ']'

\medskip

\noindent
The @{command bnf_axiomatization} command declares a new type and associated constants
(map, set, predicator, relator, and cardinal bound) and asserts the BNF properties for
these constants as axioms.

The syntactic entity \synt{target} can be used to specify a local context,
\synt{name} denotes an identifier, \synt{typefree} denotes a fixed type variable
(typ'a, typ'b, \ldots), \synt{mixfix} denotes the usual parenthesized
mixfix notation, and \synt{types} denotes a space-separated list of types
cite"isabelle-isar-ref".

The @{syntax plugins} option indicates which plugins should be enabled
(only›) or disabled (del›). By default, all plugins are enabled.

Type arguments are live by default; they can be marked as dead by entering
dead› in front of the type variable (e.g., (dead 'a)›)
instead of an identifier for the corresponding set function. Witnesses can be
specified by their types. Otherwise, the syntax of @{command bnf_axiomatization}
is identical to the left-hand side of a @{command datatype} or
@{command codatatype} definition.

The command is useful to reason abstractly about BNFs. The axioms are safe
because there exist BNFs of arbitrarily large arities. Applications must import
the 🗏‹~~/src/HOL/Library/BNF_Axiomatization.thy› theory
to use this functionality.
›


subsubsection ‹\keyw{print_bnfs}
  \label{sssec:print-bnfs}›

text ‹
\begin{matharray}{rcl}
  @{command_def "print_bnfs"} & : & local_theory →›
\end{matharray}

rail@@{command print_bnfs}


section ‹Deriving Destructors and Constructor Theorems
  \label{sec:deriving-destructors-and-constructor-theorems}›

text ‹
The derivation of convenience theorems for types equipped with free constructors,
as performed internally by @{command datatype} and @{command codatatype},
is available as a stand-alone command called @{command free_constructors}.

%  * need for this is rare but may arise if you want e.g. to add destructors to
%    a type not introduced by ...
%
%  * @{command free_constructors}
%    * plugins›, discs_sels›
%    * hack to have both co and nonco view via locale (cf. ext nats)
›


(*
subsection ‹Introductory Example
  \label{ssec:ctors-introductory-example}›
*)


subsection ‹Command Syntax
  \label{ssec:ctors-command-syntax}›

subsubsection ‹\keyw{free_constructors}
  \label{sssec:free-constructors}›

text ‹
\begin{matharray}{rcl}
  @{command_def "free_constructors"} & : & local_theory → proof(prove)›
\end{matharray}

rail@@{command free_constructors} target? @{syntax dt_options} 
    name 'for' (@{syntax fc_ctor} + '|') 
  (@'where' (prop + '|'))?
  ;
  @{syntax_def fc_ctor}: (name ':')? term (name * )
›

\medskip

\noindent
The @{command free_constructors} command generates destructor constants for
freely constructed types as well as properties about constructors and
destructors. It also registers the constants and theorems in a data structure
that is queried by various tools (e.g., \keyw{function}).

The syntactic entity \synt{target} can be used to specify a local context,
\synt{name} denotes an identifier, \synt{prop} denotes a HOL proposition, and
\synt{term} denotes a HOL term cite"isabelle-isar-ref".

The syntax resembles that of @{command datatype} and @{command codatatype}
definitions (Sections
\ref{ssec:datatype-command-syntax}~and~\ref{ssec:codatatype-command-syntax}).
A constructor is specified by an optional name for the discriminator, the
constructor itself (as a term), and a list of optional names for the selectors.

Section~\ref{ssec:datatype-generated-theorems} lists the generated theorems.
For bootstrapping reasons, the generally useful [fundef_cong]›
attribute is not set on the generated case_cong› theorem. It can be
added manually using \keyw{declare}.
›


subsubsection ‹\keyw{simps_of_case}
  \label{sssec:simps-of-case}›

text ‹
\begin{matharray}{rcl}
  @{command_def "simps_of_case"} & : & local_theory → local_theory›
\end{matharray}

rail@@{command simps_of_case} target? (name ':')? 
    (thm + ) (@'splits' ':' (thm + ))?

\medskip

\noindent
The @{command simps_of_case} command provided by theory
🗏‹~~/src/HOL/Library/Simps_Case_Conv.thy› converts a single equation with
a complex case expression on the right-hand side into a set of pattern-matching
equations. For example,
›

(*<*)
    context late
    begin
(*>*)
    simps_of_case lapp_simps: lapp.code

text ‹
\noindent
translates @{thm lapp.code[no_vars]} into
%
\begin{gather*}
  @{thm lapp_simps(1)[no_vars]} \\
  @{thm lapp_simps(2)[no_vars]}
\end{gather*}
›


subsubsection ‹\keyw{case_of_simps}
  \label{sssec:case-of-simps}›

text ‹
\begin{matharray}{rcl}
  @{command_def "case_of_simps"} & : & local_theory → local_theory›
\end{matharray}

rail@@{command case_of_simps} target? (name ':')? 
    (thm + )
›

\medskip

\noindent
The \keyw{case_of_simps} command provided by theory
🗏‹~~/src/HOL/Library/Simps_Case_Conv.thy› converts a set of
pattern-matching equations into a single equation with a complex case expression
on the right-hand side (cf.\ @{command simps_of_case}). For example,
›

    case_of_simps lapp_case: lapp_simps

text ‹
\noindent
translates
%
\begin{gather*}
  @{thm lapp_simps(1)[no_vars]} \\
  @{thm lapp_simps(2)[no_vars]}
\end{gather*}
%
into @{thm lapp_case[no_vars]}.
›
(*<*)
    end
(*>*)


(*
section ‹Using the Standard ML Interface
  \label{sec:using-the-standard-ml-interface}›

text ‹
The package's programmatic interface.
›
*)


section ‹Selecting Plugins
  \label{sec:selecting-plugins}›

text ‹
Plugins extend the (co)datatype package to interoperate with other Isabelle
packages and tools, such as the code generator, Transfer, Lifting, and
Quickcheck. They can be enabled or disabled individually using the
@{syntax plugins} option to the commands @{command datatype},
@{command primrec}, @{command codatatype}, @{command primcorec},
@{command primcorecursive}, @{command bnf}, @{command bnf_axiomatization}, and
@{command free_constructors}. For example:
›

    datatype (plugins del: code "quickcheck") color = Red | Black

text ‹
Beyond the standard plugins, the \emph{Archive of Formal Proofs} includes a
\keyw{derive} command that derives class instances of datatypes
cite"sternagel-thiemann-2015".
›

subsection ‹Code Generator
  \label{ssec:code-generator}›

text ‹
The \hthm{code} plugin registers freely generated types, including
(co)datatypes, and (co)recursive functions for code generation. No distinction
is made between datatypes and codatatypes. This means that for target languages
with a strict evaluation strategy (e.g., Standard ML), programs that attempt to
produce infinite codatatype values will not terminate.

For types, the plugin derives the following properties:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{eq.refl} [code nbe]›\rm:] ~ \\
@{thm list.eq.refl[no_vars]}

\item[t.›\hthm{eq.simps} [code]›\rm:] ~ \\
@{thm list.eq.simps(1)[no_vars]} \\
@{thm list.eq.simps(2)[no_vars]} \\
@{thm list.eq.simps(3)[no_vars]} \\
@{thm list.eq.simps(4)[no_vars]} \\
@{thm list.eq.simps(5)[no_vars]} \\
@{thm list.eq.simps(6)[no_vars]}

\end{description}
\end{indentblock}

In addition, the plugin sets the [code]› attribute on a number of
properties of freely generated types and of (co)recursive functions, as
documented in Sections \ref{ssec:datatype-generated-theorems},
\ref{ssec:primrec-generated-theorems}, \ref{ssec:codatatype-generated-theorems},
and~\ref{ssec:primcorec-generated-theorems}.
›
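
text ‹
\noindent
As a quick illustration (not part of the plugin's output), the corecursive
constlapp function from Section~\ref{sssec:primcorec-simple-corecursion} can
be exported, for example to Haskell, whose lazy evaluation strategy matches
the semantics of codatatypes:
›

    (* Illustration only; generates Haskell code without writing it to a file. *)
    export_code lapp in Haskell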


subsection ‹Size
  \label{ssec:size}›

text ‹
For each datatype t›, the \hthm{size} plugin generates a generic size
function t.size_t› as well as a specific instance
size :: t ⇒ nat› belonging to the size› type class. The
\keyw{fun} command relies on constsize to prove termination of recursive
functions on datatypes.

The plugin derives the following properties:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{size} [simp, code]›\rm:] ~ \\
@{thm list.size(1)[no_vars]} \\
@{thm list.size(2)[no_vars]} \\
@{thm list.size(3)[no_vars]} \\
@{thm list.size(4)[no_vars]}

\item[t.›\hthm{size_gen}\rm:] ~ \\
@{thm list.size_gen(1)[no_vars]} \\
@{thm list.size_gen(2)[no_vars]}

\item[t.›\hthm{size_gen_o_map}\rm:] ~ \\
@{thm list.size_gen_o_map[no_vars]}

\item[t.›\hthm{size_neq}\rm:] ~ \\
This property is missing for typ'a list. If the termsize function
always evaluates to a non-zero value, this theorem has the form
prop¬ size x = 0.

\end{description}
\end{indentblock}

The t.size› and t.size_t› functions generated for datatypes
defined by nested recursion through a datatype u› depend on
u.size_u›.

If the recursion is through a non-datatype u› with type arguments
'a1, …, 'am, by default u› values are given a size of 0. This
can be improved upon by registering a custom size function of type
('a1 ⇒ nat) ⇒ … ⇒ ('am ⇒ nat) ⇒ u ⇒ nat› using
the ML function MLBNF_LFP_Size.register_size or
MLBNF_LFP_Size.register_size_global. See theory
🗏‹~~/src/HOL/Library/Multiset.thy› for an example.
›
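
text ‹
\noindent
For example, on the predefined typ'a list datatype, the generated size›
function evaluates to the length of the list. The lemma below is a small
sanity check based on the t.size› rules listed above:
›

    (* Sanity check only; follows from the list.size simplification rules. *)
    lemma "size [a, b, c] = 3"
      by (simp add: eval_nat_numeral)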


subsection ‹Transfer
  \label{ssec:transfer}›

text ‹
For each (co)datatype with live type arguments and each manually registered BNF,
the \hthm{transfer} plugin generates a predicator t.pred_t› and
properties that guide the Transfer tool.

For types with at least one live type argument and \emph{no dead type
arguments}, the plugin derives the following properties:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{Domainp_rel} [relator_domain]›\rm:] ~ \\
@{thm list.Domainp_rel[no_vars]}

\item[t.›\hthm{left_total_rel} [transfer_rule]›\rm:] ~ \\
@{thm list.left_total_rel[no_vars]}

\item[t.›\hthm{left_unique_rel} [transfer_rule]›\rm:] ~ \\
@{thm list.left_unique_rel[no_vars]}

\item[t.›\hthm{right_total_rel} [transfer_rule]›\rm:] ~ \\
@{thm list.right_total_rel[no_vars]}

\item[t.›\hthm{right_unique_rel} [transfer_rule]›\rm:] ~ \\
@{thm list.right_unique_rel[no_vars]}

\item[t.›\hthm{bi_total_rel} [transfer_rule]›\rm:] ~ \\
@{thm list.bi_total_rel[no_vars]}

\item[t.›\hthm{bi_unique_rel} [transfer_rule]›\rm:] ~ \\
@{thm list.bi_unique_rel[no_vars]}

\end{description}
\end{indentblock}

For (co)datatypes with at least one live type argument, the plugin sets the
[transfer_rule]› attribute on the following (co)datatype properties:
t.case_›\allowbreak transfer›,
t.sel_›\allowbreak transfer›,
t.ctr_›\allowbreak transfer›,
t.disc_›\allowbreak transfer›,
t.rec_›\allowbreak transfer›, and
t.corec_›\allowbreak transfer›.
For (co)datatypes that further have \emph{no dead type arguments}, the plugin
sets [transfer_rule]› on
t.set_›\allowbreak transfer›,
t.map_›\allowbreak transfer›, and
t.rel_›\allowbreak transfer›.

For @{command primrec}, @{command primcorec}, and @{command primcorecursive},
the plugin implements the generation of the f.transfer› property,
conditioned by the transfer› option, and sets the
[transfer_rule]› attribute on these.
›


subsection ‹Lifting
  \label{ssec:lifting}›

text ‹
For each (co)datatype and each manually registered BNF with at least one live
type argument \emph{and no dead type arguments}, the \hthm{lifting} plugin
generates properties and attributes that guide the Lifting tool.

The plugin derives the following property:

\begin{indentblock}
\begin{description}

\item[t.›\hthm{Quotient} [quot_map]›\rm:] ~ \\
@{thm list.Quotient[no_vars]}

\end{description}
\end{indentblock}

In addition, the plugin sets the [relator_eq]› attribute on a
variant of the t.rel_eq_onp› property, the [relator_mono]›
attribute on t.rel_mono›, and the [relator_distr]› attribute
on t.rel_compp›.
›


subsection ‹Quickcheck
  \label{ssec:quickcheck}›

text ‹
The integration of datatypes with Quickcheck is accomplished by the
\hthm{quick\-check} plugin. It combines a number of subplugins that instantiate
specific type classes. The subplugins can be enabled or disabled individually.
They are listed below:

\begin{indentblock}
\hthm{quickcheck_random} \\
\hthm{quickcheck_exhaustive} \\
\hthm{quickcheck_bounded_forall} \\
\hthm{quickcheck_full_exhaustive} \\
\hthm{quickcheck_narrowing}
\end{indentblock}
›


subsection ‹Program Extraction
  \label{ssec:program-extraction}›

text ‹
The \hthm{extraction} plugin provides realizers for induction and case analysis,
to enable program extraction from proofs involving datatypes. This functionality
is only available with full proof objects, i.e., with the \emph{HOL-Proofs}
session.
›


section ‹Known Bugs and Limitations
  \label{sec:known-bugs-and-limitations}›

text ‹
This section lists the known bugs and limitations of the (co)datatype package at
the time of this writing.

\begin{enumerate}
\setlength{\itemsep}{0pt}

\item
\emph{Defining mutually (co)recursive (co)datatypes can be slow.} Fortunately,
it is always possible to recast mutual specifications to nested ones, which are
processed more efficiently.

\item
\emph{Locally fixed types and terms cannot be used in type specifications.}
The limitation on types can be circumvented by adding type arguments to the local
(co)datatypes to abstract over the locally fixed types.

\item
\emph{The \emph{\keyw{primcorec}} command does not allow user-specified names and
attributes next to the entered formulas.} The less convenient syntax, using the
\keyw{lemmas} command, is available as an alternative.

\item
\emph{The \emph{\keyw{primcorec}} command does not allow corecursion under
case›--of› for datatypes that are defined without
discriminators and selectors.}

\item
\emph{There is no way to use an overloaded constant from a syntactic type
class, such as 0›, as a constructor.}

\item
\emph{There is no way to register the same type as both a datatype and a
codatatype.} This affects types such as the extended natural numbers, for which
both views would make sense (for a different set of constructors).

\item
\emph{The names of variables are often suboptimal in the properties generated by
the package.}

\item
\emph{The compatibility layer sometimes produces induction principles with a
slightly different ordering of the premises than the old package.}

\end{enumerate}
›


text ‹
\section*{Acknowledgment}

Tobias Nipkow and Makarius Wenzel encouraged us to implement the new
(co)datatype package. Andreas Lochbihler provided lots of comments on earlier
versions of the package, especially on the coinductive part. Brian Huffman
suggested major simplifications to the internal constructions. Ond\v{r}ej
Kun\v{c}ar implemented the transfer› and lifting› plugins.
Christian Sternagel and Ren\'e Thiemann ported the \keyw{derive} command
from the \emph{Archive of Formal Proofs} to the new datatypes. Gerwin Klein and
Lars Noschinski implemented the @{command simps_of_case} and @{command
case_of_simps} commands. Florian Haftmann, Christian Urban, and Makarius
Wenzel provided general advice on Isabelle and package writing. Stefan Milius
and Lutz Schröder found an elegant proof that eliminated one of the BNF
proof obligations. Mamoun Filali-Amine, Gerwin Klein, Andreas Lochbihler,
Tobias Nipkow, and Christian Sternagel suggested many textual improvements to
this tutorial.
›

end