Theory Scala

(*:maxLineLen=78:*)

theory Scala
imports Base
begin

chapter ‹Isabelle/Scala systems programming \label{sec:scala}›

text ‹
  Isabelle/ML and Isabelle/Scala are the two main implementation languages of
  the Isabelle environment:

     Isabelle/ML is for ‹mathematics›, to develop tools within the context
    of symbolic logic, e.g.\ for constructing proofs or defining
    domain-specific formal languages. See the ‹Isabelle/Isar implementation
    manual› cite"isabelle-implementation" for more details.

     Isabelle/Scala is for ‹physics›, to connect with the world of systems
    and services, including editors and IDE frameworks.

  There are various ways to access Isabelle/Scala modules and operations:

     Isabelle command-line tools (\secref{sec:scala-tools}) run in a separate
    Java process.

     Isabelle/ML antiquotations access Isabelle/Scala functions
    (\secref{sec:scala-functions}) via the PIDE protocol: execution happens
    within the running Java process underlying Isabelle/Scala.

     The ‹Console/Scala› plugin of Isabelle/jEdit cite"isabelle-jedit"
    operates on the running Java application, using the Scala
    read-eval-print-loop (REPL).

  The main Isabelle/Scala/jEdit functionality is provided by
  🗏‹$ISABELLE_HOME/lib/classes/isabelle.jar›. Further underlying Scala and
  Java libraries are bundled with Isabelle, e.g.\ to access SQLite or
  PostgreSQL via JDBC.

  Add-on Isabelle components may augment the system environment by providing
  suitable configuration in path‹etc/settings› (GNU bash script). The
  shell function bash_functionclasspath helps to write
  path‹etc/settings› in a portable manner: it refers to library ‹jar›
  files in standard POSIX path notation. On Windows, this is converted to
  native platform format, before invoking Java (\secref{sec:scala-tools}).

  
  There is also an implicit build process for Isabelle/Scala/Java modules,
  based on path‹etc/build.props› within the component directory (see also
  \secref{sec:scala-build}).
›


section ‹Command-line tools \label{sec:scala-tools}›

subsection ‹Java Runtime Environment \label{sec:tool-java}›

text ‹
  The @{tool_def java} tool is a direct wrapper for the Java Runtime
  Environment, within the regular Isabelle settings environment
  (\secref{sec:settings}) and Isabelle classpath. The command line arguments
  are that of the bundled Java distribution: see option ‹-help› in
  particular.

  The ‹java› executable is taken from @{setting ISABELLE_JDK_HOME}, according
  to the standard directory layout for regular distributions of OpenJDK.

  The shell function bash_functionisabelle_jdk allows shell scripts to
  invoke other Java tools robustly (e.g.\ ‹isabelle_jdk jar›), without
  depending on accidental operating system installations.
›


subsection ‹Scala toplevel \label{sec:tool-scala}›

text ‹
  The @{tool_def scala} tool is a direct wrapper for the Scala toplevel,
  similar to @{tool java} above. The command line arguments are that of the
  bundled Scala distribution: see option ‹-help› in particular. This allows
  to interact with Isabelle/Scala interactively.
›

subsubsection ‹Example›

text ‹
  Explore the Isabelle system environment in Scala:
  @{verbatim [display, indent = 2] ‹$ isabelle scala›}
  @{scala [display, indent = 2]
‹import isabelle._

val isabelle_home = Isabelle_System.getenv("ISABELLE_HOME")

val options = Options.init()
options.bool("browser_info")
options.string("document")›}


subsection ‹Scala script wrapper›

text ‹
  The executable @{executable "$ISABELLE_HOME/bin/isabelle_scala_script"}
  allows to run Isabelle/Scala source files stand-alone programs, by using a
  suitable ``hash-bang'' line and executable file permissions. For example:
  @{verbatim [display, indent = 2] ‹#!/usr/bin/env isabelle_scala_script›}
  @{scala [display, indent = 2]
‹val options = isabelle.Options.init()
Console.println("browser_info = " + options.bool("browser_info"))
Console.println("document = " + options.string("document"))›}

  This assumes that the executable may be found via the @{setting PATH} from
  the process environment: this is the case when Isabelle settings are active,
  e.g.\ in the context of the main Isabelle tool wrapper
  \secref{sec:isabelle-tool}. Alternatively, the full
  🗏‹$ISABELLE_HOME/bin/isabelle_scala_script› may be specified in expanded
  form.
›


subsection ‹Scala compiler \label{sec:tool-scalac}›

text ‹
  The @{tool_def scalac} tool is a direct wrapper for the Scala compiler; see
  also @{tool scala} above. The command line arguments are that of the
  bundled Scala distribution.

  This provides a low-level mechanism to compile further Scala modules,
  depending on existing Isabelle/Scala functionality; the resulting ‹class›
  or ‹jar› files can be added to the Java classpath using the shell function
  bash_functionclasspath.

  A more convenient high-level approach works via path‹etc/build.props›
  (see \secref{sec:scala-build}).
›


section ‹Isabelle/Scala/Java modules \label{sec:scala-build}›

subsection ‹Component configuration via path‹etc/build.props›

text ‹
  Isabelle components may augment the Isabelle/Scala/Java environment
  declaratively via properties given in path‹etc/build.props› (within the
  component directory). This specifies an output ‹jar› ‹module›, based on
  Scala or Java ‹sources›, and arbitrary ‹resources›. Moreover, a module can
  specify ‹services› that are subclasses of
  scala_type‹isabelle.Isabelle_System.Service›; these have a particular
  meaning to Isabelle/Scala tools.

  Before running a Scala or Java process, the Isabelle system implicitly
  ensures that all provided modules are compiled and packaged (as jars). It is
  also possible to invoke @{tool scala_build} explicitly, with extra options.

  
  The syntax of path‹etc/build.props› follows a regular Java properties
  file🌐‹https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/Properties.html#load(java.io.Reader)›,
  but the encoding is ‹UTF-8›, instead of historic ‹ISO 8859-1› from the API
  documentation.

  The subsequent properties are relevant for the Scala/Java build process.
  Most properties are optional: the default is an empty string (or list). File
  names are relative to the main component directory and may refer to Isabelle
  settings variables (e.g. ‹$ISABELLE_HOME›).

     ‹title› (required) is a human-readable description of the module, used
    in printed messages.

     ‹module› specifies a ‹jar› file name for the output module, as result
    of the specified sources (and resources). If this is absent (or
    ‹no_build› is set, as described below), there is no implicit build
    process. The contributing sources might be given nonetheless, notably for
    @{tool scala_project} (\secref{sec:tool-scala-project}), which includes
    Scala/Java sources of components, while suppressing ‹jar› modules (to
    avoid duplication of program content).

     ‹no_build› is a Boolean property, with default ‹false›. If set to
    ‹true›, the implicit build process for the given ‹module› is ‹omitted›
    --- it is assumed to be provided by other means.

     ‹scalac_options› and ‹javac_options› augment the default settings
    @{setting_ref ISABELLE_SCALAC_OPTIONS} and @{setting_ref
    ISABELLE_JAVAC_OPTIONS} for this component; option syntax follows the
    regular command-line tools ‹scalac› and ‹javac›, respectively.

     ‹main› specifies the main entry point for the ‹jar› module. This is
    only relevant for direct invocation like ``‹java -jar test.jar›''.

     ‹requirements› is a list of ‹jar› modules that are needed in the
    compilation process, but not provided by the regular classpath (notably
    @{setting ISABELLE_CLASSPATH}).

    A ‹normal entry› refers to a single ‹jar› file name, possibly with
    settings variables as usual. E.g. 🗏‹$ISABELLE_SCALA_JAR› for the main
    🗏‹$ISABELLE_HOME/lib/classes/isabelle.jar› (especially relevant for
    add-on modules).

    A ‹special entry› is of the form ‹env:›variable› and refers to a
    settings variable from the Isabelle environment: its value may consist of
    multiple ‹jar› entries (separated by colons). Environment variables are
    not expanded recursively.

     ‹resources› is a list of files that should be included in the resulting
    ‹jar› file. Each item consists of a pair separated by colon:
    source›‹:›target› means to copy an existing source file (relative to
    the component directory) to the given target file or directory (relative
    to the ‹jar› name space). A file› specification without colon
    abbreviates file›‹:›file›, i.e. the file is copied while retaining its
    relative path name.

     ‹sources› is a list of ‹.scala› or ‹.java› files that contribute to
    the specified module. It is possible to use both languages simultaneously:
    the Scala and Java compiler will be invoked consecutively to make this
    work.

     ‹services› is a list of class names to be registered as Isabelle
    service providers (subclasses of
    scala_type‹isabelle.Isabelle_System.Service›). Internal class names of
    the underlying JVM need to be given: e.g. see method @{scala_method (in
    java.lang.Object) getClass}.

    Particular services require particular subclasses: instances are filtered
    according to their dynamic type. For example, class
    scala_type‹isabelle.Isabelle_Scala_Tools› collects Scala command-line
    tools, and class scala_type‹isabelle.Scala.Functions› collects Scala
    functions (\secref{sec:scala-functions}).
›


subsection ‹Explicit Isabelle/Scala/Java build \label{sec:tool-scala-build}›

text ‹
  The @{tool_def scala_build} tool explicitly invokes the build process for
  all registered components.
  @{verbatim [display]
‹Usage: isabelle scala_build [OPTIONS]

  Options are:
    -f           force fresh build
    -q           quiet mode: suppress stdout/stderr

  Build Isabelle/Scala/Java modules of all registered components
  (if required).
›}

  For each registered Isabelle component that provides
  path‹etc/build.props›, the specified output ‹module› is checked against
  the corresponding input ‹requirements›, ‹resources›, ‹sources›. If
  required, there is an automatic build using ‹scalac› or ‹javac› (or both).
  The identity of input files is recorded within the output ‹jar›, using SHA1
  digests in ‹META-INF/isabelle/shasum›.

  
  Option ‹-f› forces a fresh build, regardless of the up-to-date status of
  input files vs. the output module.

  
  Option ‹-q› suppresses all output on stdout/stderr produced by the Scala or
  Java compiler.

   Explicit invocation of @{tool scala_build} mainly serves testing or
  applications with special options: the Isabelle system normally does an
  automatic the build on demand.
›


subsection ‹Project setup for common Scala IDEs \label{sec:tool-scala-project}›

text ‹
  The @{tool_def scala_project} tool creates a project configuration for all
  Isabelle/Java/Scala modules specified in components via
  path‹etc/build.props›, together with additional source files given on
  the command-line:

  @{verbatim [display]
‹Usage: isabelle scala_project [OPTIONS] [MORE_SOURCES ...]

  Options are:
    -D DIR       project directory (default: "$ISABELLE_HOME_USER/scala_project")
    -G           use Gradle as build tool
    -L           make symlinks to original source files
    -M           use Maven as build tool
    -f           force update of existing directory

  Setup project for Isabelle/Scala/jEdit --- to support common IDEs such
  as IntelliJ IDEA. Either option -G or -M is mandatory to specify the
  build tool.›}

  The generated configuration is for Gradle🌐‹https://gradle.org› or
  Maven🌐‹https://maven.apache.org›, but the main purpose is to import it
  into common IDEs like IntelliJ IDEA🌐‹https://www.jetbrains.com/idea›.
  This allows to explore the sources with static analysis and other hints in
  real-time.

  The generated files refer to physical file-system locations, using the path
  notation of the underlying OS platform. Thus the project needs to be
  recreated whenever the Isabelle installation is changed or moved.

  
  Option ‹-G› selects Gradle and ‹-M› selects Maven as Java/Scala build
  tool: either one needs to be specified explicitly. These tools have a
  tendency to break down unexpectedly, so supporting both increases the
  chances that the generated IDE project works properly.

  
  Option ‹-L› produces ‹symlinks› to the original files: this allows to
  develop Isabelle/Scala/jEdit modules within an external IDE. The default is
  to ‹copy› source files, so editing them within the IDE has no permanent
  effect on the originals.

  
  Option ‹-D› specifies an explicit project directory, instead of the default
  path‹$ISABELLE_HOME_USER/scala_project›. Option ‹-f› forces an existing
  project directory to be ‹purged› --- after some sanity checks that it has
  been generated by @{tool "scala_project"} before.
›


subsubsection ‹Examples›

text ‹
  Create a project directory and for editing the original sources:

  @{verbatim [display] ‹isabelle scala_project -f -L›}

  On Windows, this usually requires Administrator rights, in order to create
  native symlinks.
›


section ‹Registered Isabelle/Scala functions \label{sec:scala-functions}›

subsection ‹Defining functions in Isabelle/Scala›

text ‹
  The service class scala_type‹isabelle.Scala.Functions› collects Scala
  functions of type scala_type‹isabelle.Scala.Fun›: by registering
  instances via ‹services› in path‹etc/build.props›
  (\secref{sec:scala-build}), it becomes possible to invoke Isabelle/Scala
  from Isabelle/ML (see below).

  An example is the predefined collection of
  scala_type‹isabelle.Scala.Functions› in
  🗏‹$ISABELLE_HOME/etc/build.props›. The overall list of registered functions
  is accessible in Isabelle/Scala as
  scala_object‹isabelle.Scala.functions›.

  The general class scala_type‹isabelle.Scala.Fun› expects a multi-argument
  / multi-result function
  scala_type‹List[isabelle.Bytes] => List[isabelle.Bytes]›; more common are
  instances of scala_type‹isabelle.Scala.Fun_Strings› for type
  scala_type‹List[String] => List[String]›, or
  scala_type‹isabelle.Scala.Fun_String› for type
  scala_type‹String => String›.
›


subsection ‹Invoking functions in Isabelle/ML›

text ‹
  Isabelle/PIDE provides a protocol to invoke registered Scala functions in
  ML: this works both within the Prover IDE and in batch builds.

  The subsequent ML antiquotations refer to Scala functions in a
  formally-checked manner.

  \begin{matharray}{rcl}
  @{ML_antiquotation_def "scala_function"} & : & ML_antiquotation› \\
  @{ML_antiquotation_def "scala"} & : & ML_antiquotation› \\
  \end{matharray}

  rail‹
    (@{ML_antiquotation scala_function} |
     @{ML_antiquotation scala}) @{syntax embedded}

   @{scala_function name}› inlines the checked function name as ML string
    literal.

   @{scala name}› and @{scala_thread name}› invoke the checked function via
  the PIDE protocol. In Isabelle/ML this appears as a function of type
  ML_typestring list -> string list or ML_typestring -> string,
  depending on the definition in Isabelle/Scala. Evaluation is subject to
  interrupts within the ML runtime environment as usual. A scala‹null›
  result in Scala raises an exception MLScala.Null in ML. The execution
  of @{scala}› works via a Scala future on a bounded thread farm, while
  @{scala_thread}› always forks a separate Java/VM thread.

  The standard approach of representing datatypes via strings works via XML in
  YXML transfer syntax. See Isabelle/ML operations and modules @{ML
  YXML.string_of_body}, @{ML YXML.parse_body}, @{ML_structure XML.Encode},
  @{ML_structure XML.Decode}; similarly for Isabelle/Scala. Isabelle symbols
  may have to be recoded via Scala operations
  scala_method‹isabelle.Symbol.decode› and
  scala_method‹isabelle.Symbol.encode›.
›


subsubsection ‹Examples›

text ‹
  Invoke the predefined Scala function scala_functionecho:
›

ML val s = "test";
  val s' = scalaecho s;
  assert (s = s')

text ‹
  Let the Scala compiler process some toplevel declarations, producing a list
  of errors:
›

ML val source = "class A(a: Int, b: Boolean)"
  val errors =
    scalascala_toplevel source
    |> YXML.parse_body
    |> let open XML.Decode in list string end;

  assert (null errors)

text ‹
  The above is merely for demonstration. See MLScala_Compiler.toplevel
  for a more convenient version with builtin decoding and treatment of errors.
›


section ‹Documenting Isabelle/Scala entities›

text ‹
  The subsequent document antiquotations help to document Isabelle/Scala
  entities, with formal checking of names against the Isabelle classpath.

  \begin{matharray}{rcl}
  @{antiquotation_def "scala"} & : & antiquotation› \\
  @{antiquotation_def "scala_object"} & : & antiquotation› \\
  @{antiquotation_def "scala_type"} & : & antiquotation› \\
  @{antiquotation_def "scala_method"} & : & antiquotation› \\
  \end{matharray}

  rail‹
    (@@{antiquotation scala} | @@{antiquotation scala_object})
      @{syntax embedded}
    ;
    @@{antiquotation scala_type} @{syntax embedded} types
    ;
    @@{antiquotation scala_method} class @{syntax embedded} types args
    ;
    class: ('(' @'in' @{syntax name} types ')')?
    ;
    types: ('[' (@{syntax name} ',' +) ']')?
    ;
    args: ('(' (nat | (('_' | @{syntax name}) + ',')) ')')?

   @{scala s}› is similar to @{verbatim s}›, but the given source text is
  checked by the Scala compiler as toplevel declaration (without evaluation).
  This allows to write Isabelle/Scala examples that are statically checked.

   @{scala_object x}› checks the given Scala object name (simple value or
  ground module) and prints the result verbatim.

   @{scala_type T[A]}› checks the given Scala type name (with optional type
  parameters) and prints the result verbatim.

   @{scala_method (in c[A]) m[B](n)}› checks the given Scala method m› in
  the context of class c›. The method argument slots are either specified by
  a number n› or by a list of (optional) argument types; this may refer to
  type variables specified for the class or method: A› or B› above.

  Everything except for the method name m› is optional. The absence of the
  class context means that this is a static method. The absence of arguments
  with types means that the method can be determined uniquely as ‹(›m›‹ _)›
  in Scala (no overloading).
›


subsubsection ‹Examples›

text ‹
  Miscellaneous Isabelle/Scala entities:

     object: scala_object‹isabelle.Isabelle_Process›
     type without parameter: @{scala_type isabelle.Console_Progress}
     type with parameter: @{scala_type List[A]}
     static method: scala_method‹isabelle.Isabelle_System.bash›
     class and method with type parameters:
      @{scala_method (in List[A]) map[B]("A => B")}
     overloaded method with argument type: @{scala_method (in Int) "+" (Int)}

end