<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://purl.org/rss/1.0/"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
>
  <channel rdf:about="http://www.cl.cam.ac.uk/techreports/">
    <title>Computer Laboratory Technical Reports</title>
    <link>http://www.cl.cam.ac.uk/techreports/</link>
    <description>Recent research reports published by the Computer Laboratory at the University of Cambridge.</description>
    <items>
      <rdf:Seq>
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-813.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-812.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-811.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-810.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-809.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-808.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-807.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-806.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-805.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-804.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-803.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-802.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-801.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-800.html" />
        <rdf:li resource="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-799.html" />
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-813.html">
    <title>Reconstructing compressed photo and video data</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-813.pdf</link>
    <dc:creator>Lewis, Andrew B.</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2012-02</dc:date>
    <description>
        Forensic investigators sometimes need to verify the integrity
        and processing history of digital photos and videos. The
        multitude of storage formats and devices they need to access
        also presents a challenge for evidence recovery. This thesis
        explores how visual data files can be recovered and analysed in
        scenarios where they have been stored in the JPEG or H.264
        (MPEG-4 AVC) compression formats.
        
        My techniques make use of low-level details of lossy compression
        algorithms in order to tell whether a file under consideration
        might have been tampered with. I also show that limitations of
        entropy coding sometimes allow us to recover intact files from
        storage devices, even in the absence of filesystem and container
        metadata.
        
        I first show that it is possible to embed an imperceptible
        message within a uniform region of a JPEG image such that the
        message becomes clearly visible when the image is recompressed
        at a particular quality factor, providing a visual warning that
        recompression has taken place.
        
        I then use a precise model of the computations involved in JPEG
        decompression to build a specialised compressor, designed to
        invert the computations of the decompressor. This recompressor
        recovers the compressed bitstreams that produce a given
        decompression result, and, as a side-effect, indicates any
        regions of the input which are inconsistent with JPEG
        decompression. I demonstrate the algorithm on a large database
        of images, and show that it can detect modifications to
        decompressed image regions.
        
        Finally, I show how to rebuild fragmented compressed bitstreams,
        given a syntax description that includes information about
        syntax errors, and demonstrate its applicability to H.264/AVC
        Baseline profile video data in memory dumps with randomly
        shuffled blocks.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-812.html">
    <title>Abstracting information on body area networks</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-812.pdf</link>
    <dc:creator>Brandão, Pedro</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2012-01</dc:date>
    <description>
        Healthcare is changing; correction, healthcare is in need of
        change. Population ageing, the increase in chronic and heart
        diseases, and the sheer increase in population size will
        overwhelm the current hospital-centric healthcare.
        
        There is growing interest among individuals in monitoring their
        own physiology, not only for sport activities but also to
        control their own diseases. They are changing from passive
        healthcare receivers to proactive self-healthcare takers. The
        focus is shifting from hospital-centred treatment to
        patient-centric healthcare monitoring.
        
        Continuous, everyday, wearable monitoring and actuating is part
        of this change. In this setting, sensors that monitor the heart,
        blood pressure, movement, brain activity, dopamine levels, and
        actuators that pump insulin, ‘pump’ the heart, deliver drugs to
        specific organs, stimulate the brain are needed as pervasive
        components in and on the body. They will tend to people’s need
        for self-monitoring and facilitate healthcare delivery.
        
        These components around a human body that communicate to sense
        and act in a coordinated fashion make a Body Area Network (BAN).
        In most cases, and in our view, a central, more powerful
        component will act as the coordinator of this network. These
        networks aim to augment the power to monitor the human body and
        react to problems discovered through this observation. One key
        advantage of these systems is their overarching view of the whole
        network. That is, the central component can have an
        understanding of all the monitored signals and correlate them to
        better evaluate and react to problems. This is the focus of our
        thesis.
        
        In this document we argue that this multi-parameter correlation
        of the heterogeneous sensed information is not being handled in
        BANs. The current view depends exclusively on the application
        that is using the network and its understanding of the
        parameters. This means that every application will oversee the
        BAN’s heterogeneous resources, managing them directly without
        taking into consideration other applications, their needs and
        knowledge.
        
        There are several physiological correlations already known by
        the medical field. Correlating blood pressure and the
        cross-sectional area of blood vessels to calculate blood
        velocity, and estimating oxygen delivery from cardiac output and
        oxygen saturation, are such examples. This knowledge should be
        available in a BAN and shared by the several applications that
        make use of the network. This architecture implies a central
        component that manages the knowledge and the resources. And this
        is, in our view, missing in BANs.
        
        Our proposal is a middleware layer that abstracts the underlying
        BAN’s resources to the application, providing instead an
        information model to be queried. The model describes the
        correlations for producing new information that the middleware
        knows about. Naturally, the raw sensed data is also part of the
        model. The middleware hides the specifics of the nodes that
        constitute the BAN by making their sensed output available.
        Applications are able to query for information, attaching
        requirements to these requests. The middleware is then
        responsible for satisfying the requests while optimising the
        resource usage of the BAN.
        
        Our architecture proposal is divided into two corresponding
        layers: one that abstracts the nodes’ hardware (hiding the
        nodes’ particularities) and an information layer that describes
        the information available and how it is correlated. A prototype
        implementation of the architecture was built to illustrate the
        concept.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-811.html">
    <title>Active electromagnetic attacks on secure hardware</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-811.pdf</link>
    <dc:creator>Markettos, A. Theodore</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-12</dc:date>
    <description>
        The field of side-channel attacks on cryptographic hardware has
        been extensively studied. In many cases it is easier to derive
        the secret key from these attacks than to break the cryptography
        itself. One such side-channel attack is the electromagnetic
        side-channel attack, giving rise to electromagnetic analysis
        (EMA).
        
        EMA, otherwise known as ‘TEMPEST’ or ‘compromising
        emanations’, has a long history in the military context over
        almost the whole of the twentieth century. The US military also
        mention three related attacks, believed to be: HIJACK
        (modulation of secret data onto conducted signals), NONSTOP
        (modulation of secret data onto radiated signals) and TEAPOT
        (intentional malicious emissions).
        
        In this thesis I perform a fusion of TEAPOT and HIJACK/NONSTOP
        techniques on secure integrated circuits. An attacker is able to
        introduce one or more frequencies into a cryptographic system
        with the intention of forcing it to misbehave or to radiate
        secrets.
        
        To perform the reception, I assess a variety of electromagnetic
        sensors to perform EMA. I choose an inductive hard drive head
        and a metal foil electric field sensor to measure near-field EM
        emissions.
        
        I demonstrate two approaches to this attack:
        
        The first approach, named the re-emission attack, injects
        frequencies into the power supply of a device to cause it to
        modulate up baseband signals. In this way I detect
        data-dependent timing from a ‘secure’ microcontroller. Such
        up-conversion enables a more compact and more distant receiving
        antenna.
        
        The second approach involves injecting one or more frequencies
        into the power supply of a random number generator that uses
        jitter of ring oscillators as its random number source. I am
        able to force injection locking of the oscillators, greatly
        diminishing the entropy available.
        
        I demonstrate this with the random number generators on two
        commercial devices. I cause a 2004 EMV banking smartcard to fail
        statistical test suites by generating a periodicity. For a
        secure 8-bit microcontroller that has been used in banking ATMs,
        I am able to reduce the random number entropy from 2³² to 225.
        This enables a 50% probability of a successful attack on cash
        withdrawal in 15 attempts.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-810.html">
    <title>Proximity Coherence for chip-multiprocessors</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-810.pdf</link>
    <dc:creator>Barrow-Williams, Nick</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-11</dc:date>
    <description>
        Many-core architectures provide an efficient way of harnessing
        the growing numbers of transistors available in modern
        fabrication processes; however, the parallel programs run on
        these platforms are increasingly limited by the energy and
        latency costs of communication. Existing designs provide a
        functional communication layer but do not necessarily implement
        the most efficient solution for chip-multiprocessors, placing
        limits on the performance of these complex systems. In an era of
        increasingly power-limited silicon design, efficiency is now a
        primary concern that motivates designers to look again at the
        challenge of cache coherence.
        
        The first step in the design process is to analyse the
        communication behaviour of parallel benchmark suites such as
        Parsec and SPLASH-2. This thesis presents work detailing the
        sharing patterns observed when running the full benchmarks on a
        simulated 32-core x86 machine. The results reveal considerable
        locality of shared data accesses between threads with
        consecutive operating system assigned thread IDs. This pattern,
        although of little consequence in a multi-node system,
        corresponds to strong physical locality of shared data between
        adjacent cores on a chip-multiprocessor platform.
        
        Traditional cache coherence protocols, although often used in
        chip-multiprocessor designs, have been developed in the context
        of older multi-node systems. By redesigning coherence protocols
        to exploit new patterns such as the physical locality of shared
        data, it is possible to improve the efficiency of communication,
        specifically in chip-multiprocessors. This thesis explores such a
        design – Proximity Coherence – a novel scheme in which L1 load
        misses are optimistically forwarded to nearby caches via new
        dedicated links rather than always being indirected via a
        directory structure.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-809.html">
    <title>Distributed virtual environment scalability and security</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-809.pdf</link>
    <dc:creator>Miller, John L.</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-10</dc:date>
    <description>
        Distributed virtual environments (DVEs) have been an active area
        of research and engineering for more than 20 years. The most
        widely deployed DVEs are network games such as Quake, Halo, and
        World of Warcraft (WoW), with millions of users and billions of
        dollars in annual revenue. Deployed DVEs remain expensive
        centralized implementations despite significant research
        outlining ways to distribute DVE workloads.
        
        This dissertation shows that previous DVE research evaluations are
        inconsistent with deployed DVE needs. Assumptions about avatar
        movement and proximity – fundamental scale factors – do not
        match WoW's workload, and likely the workload of other deployed
        DVEs. Alternate workload models are explored and preliminary
        conclusions presented. Using realistic workloads, it is shown
        that a fully decentralized DVE cannot be deployed to today's
        consumers, regardless of its overhead.
        
        Residential broadband speeds are improving, and this limitation
        will eventually disappear. When it does, appropriate security
        mechanisms will be a fundamental requirement for technology
        adoption.
        
        A trusted auditing system (“Carbon”) is presented which has good
        security, scalability, and resource characteristics for
        decentralized DVEs. When performing exhaustive auditing, Carbon
        adds 27% network overhead to a decentralized DVE with a WoW-like
        workload. This resource consumption can be reduced
        significantly, depending upon the DVE's risk tolerance. Finally,
        the Pairwise Random Protocol (PRP) is described. PRP enables
        adversaries to fairly resolve probabilistic activities, an
        ability missing from most decentralized DVE security proposals.
        
        Thus, this dissertation's contribution is to address two of the
        obstacles for deploying research on decentralized DVE
        architectures. First, the lack of evidence that research results
        apply to existing DVEs. Second, the lack of security systems
        combining appropriate security guarantees with acceptable
        overhead.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-808.html">
    <title>Resource-sensitive synchronisation inference by abduction</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-808.pdf</link>
    <dc:creator>Botinčan, Matko</dc:creator>
    <dc:creator>Dodds, Mike</dc:creator>
    <dc:creator>Jagannathan, Suresh</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2012-01</dc:date>
    <description>
        We present an analysis which takes as its input a sequential
        program, augmented with annotations indicating potential
        parallelization opportunities, and a sequential proof, written
        in separation logic, and produces a correctly-synchronized
        parallelized program and proof of that program. Unlike previous
        work, ours is not an independence analysis; we insert
        synchronization constructs to preserve relevant dependencies
        found in the sequential program that may otherwise be violated
        by a naïve translation. Separation logic allows us to
        parallelize fine-grained patterns of resource-usage, moving
        beyond straightforward points-to analysis.
        
        Our analysis works by using the sequential proof to discover
        dependencies between different parts of the program. It
        leverages these discovered dependencies to guide the insertion
        of synchronization primitives into the parallelized program, and
        ensure that the resulting parallelized program satisfies the
        same specification as the original sequential program. Our
        analysis is built using frame inference and abduction, two
        techniques supported by an increasing number of separation logic
        tools.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-807.html">
    <title>Second-order algebraic theories</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-807.pdf</link>
    <dc:creator>Mahmoud, Ola</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-10</dc:date>
    <description>
        Second-order universal algebra and second-order equational logic
        respectively provide a model theory and a formal deductive
        system for languages with variable binding and parameterised
        metavariables. This dissertation completes the algebraic
        foundations of second-order languages from the viewpoint of
        categorical algebra.
        
        In particular, the dissertation introduces the notion of
        second-order algebraic theory. A main role in the definition is
        played by the second-order theory of equality M, representing
        the most elementary operators and equations present in every
        second-order language. We show that M can be described
        abstractly via the universal property of being the free
        cartesian category on an exponentiable object. Thereby, in the
        tradition of categorical algebra, a second-order algebraic
        theory consists of a cartesian category TH and a strict
        cartesian identity-on-objects functor from M to TH that
        preserves the universal exponentiable object of M.
        
        At the syntactic level, we establish the correctness of our
        definition by showing a categorical equivalence between
        second-order equational presentations and second-order algebraic
        theories. This equivalence, referred to as the Second-Order
        Syntactic Categorical Type Theory Correspondence, involves
        distilling a notion of syntactic translation between
        second-order equational presentations that corresponds to the
        canonical notion of morphism between second-order algebraic
        theories. Syntactic translations provide a mathematical
        formalisation of notions such as encodings and transforms for
        second-order languages.
        
        On top of the aforementioned syntactic correspondence, we
        furthermore establish the Second-Order Semantic Categorical Type
        Theory Correspondence. This involves generalising Lawvere's
        notion of functorial model of algebraic theories to the
        second-order setting. By this semantic correspondence,
        second-order functorial semantics is shown to correspond to the
        model theory of second-order universal algebra.
        
        We finally show that the core of the theory surrounding Lawvere
        theories generalises to the second order as well. Instances of
        this development are the existence of algebraic functors and
        monad morphisms in the second-order universe. Moreover, we
        define a notion of translation homomorphism that allows us to
        establish a 2-categorical type theory correspondence.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-806.html">
    <title>On joint diagonalisation for dynamic network analysis</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-806.pdf</link>
    <dc:creator>Fay, Damien</dc:creator>
    <dc:creator>Kunegis, Jérôme</dc:creator>
    <dc:creator>Yoneki, Eiko</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-10</dc:date>
    <description>
        Joint diagonalisation (JD) is a technique used to estimate an
        average eigenspace of a set of matrices. Whilst it has been used
        successfully in many areas to track the evolution of systems via
        their eigenvectors, its application in network analysis is
        novel. The key focus in this paper is the use of JD on matrices
        of spanning trees of a network. This is especially useful in the
        case of real-world contact networks in which a single underlying
        static graph does not exist. The average eigenspace may be used
        to construct a graph which represents the ‘average spanning
        tree’ of the network or a representation of the most common
        propagation paths. We then examine the distribution of
        deviations from the average and find that this distribution in
        real-world contact networks is multi-modal, thus indicating
        several modes in the underlying network. These modes are
        identified and are found to correspond to particular times. Thus
        JD may be used to decompose the behaviour, in time, of contact
        networks and produce average static graphs for each time. This
        may be viewed as a mixture between a dynamic and static graph
        approach to contact network analysis.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-805.html">
    <title>A model personal energy meter</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-805.pdf</link>
    <dc:creator>Hay, Simon</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-09</dc:date>
    <description>
        Every day each of us consumes a significant amount of energy,
        both directly through transport, heating and use of appliances,
        and indirectly from our needs for the production of food,
        manufacture of goods and provision of services.
        
        This dissertation investigates a personal energy meter which can
        record and apportion an individual's energy usage in order to
        supply baseline information and incentives for reducing our
        environmental impact.
        
        If the energy costs of large shared resources are split evenly
        without regard for individual consumption, each person minimises
        his own losses by taking advantage of others. Context awareness
        offers the potential to change this balance and apportion energy
        costs to those who cause them to be incurred. This dissertation
        explores how sensor systems installed in many buildings today
        can be used to apportion energy consumption between users,
        including an evaluation of a range of strategies in a case study
        and elaboration of the overriding principles that are generally
        applicable. It also shows how second-order estimators combined
        with location data can provide a proxy for fine-grained sensing.
        
        A key ingredient for apportionment mechanisms is data on energy
        usage. This may come from metering devices or buildings
        directly, or from profiling devices and using secondary
        indicators to infer their power state. A mechanism for profiling
        devices to determine the energy costs of specific activities,
        particularly applicable to shared programmable devices, is
        presented, which can make this process simpler and more accurate.
        By combining crowd-sourced building-inventory information and a
        simple building energy model it is possible to estimate an
        individual's energy use disaggregated by device class with very
        little direct sensing.
        
        Contextual information provides crucial cues for apportioning
        the use and energy costs of resources, and one of the most
        valuable sources from which to infer context is location. A key
        ingredient for a personal energy meter is a low cost, low
        infrastructure location system that can be deployed on a truly
        global scale. This dissertation presents a description and
        evaluation of the new concept of inquiry-free Bluetooth tracking
        that has the potential to offer indoor location information with
        significantly less infrastructure and calibration than other
        systems.
        
        Finally, a suitable architecture for a personal energy meter on
        a global scale is demonstrated using a mobile phone application
        to aggregate energy feeds based on the case studies and
        technologies developed.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-804.html">
    <title>The HasGP user manual</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-804.pdf</link>
    <dc:creator>Holden, Sean B.</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-09</dc:date>
    <description>
        HasGP is an experimental library implementing methods for
        supervised learning using Gaussian process (GP) inference, in
        both the regression and classification settings. It has been
        developed in the functional language Haskell as an investigation
        into whether the well-known advantages of the functional
        paradigm can be exploited in the field of machine learning,
        which traditionally has been dominated by the
        procedural/object-oriented approach, particularly involving
        C/C++ and Matlab. HasGP is open-source software released under
        the GPL3 license. This manual provides a short introduction on
        how to install the library, and how to apply it to supervised
        learning problems. It also provides some more in-depth
        information on the implementation of the library, which is aimed
        at developers. In the latter, we also show how some of the
        specific functional features of Haskell, in particular the
        ability to treat functions as first-class objects, and the use
        of typeclasses and monads, have informed the design of the
        library. This manual applies to HasGP version 0.1, which is the
        initial release of the library.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-803.html">
    <title>Computational approaches to figurative language</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-803.pdf</link>
    <dc:creator>Shutova, Ekaterina V.</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-08</dc:date>
    <description>
        The use of figurative language is ubiquitous in natural language
        text and it is a serious bottleneck in automatic text
        understanding. A system capable of interpreting figurative
        language would be extremely beneficial to a wide range of
        practical NLP applications. The main focus of this thesis is on
        the phenomenon of metaphor. I adopt a statistical data-driven
        approach to its modelling, and create the first open-domain
        system for metaphor identification and interpretation in
        unrestricted text. In order to verify that similar methods can
        be applied to modelling other types of figurative language, I
        then extend this work to the task of interpretation of logical
        metonymy.
        
        The metaphor interpretation system is capable of discovering
        literal meanings of metaphorical expressions in text. For the
        metaphors in the examples “All of this stirred an unfathomable
        excitement in her” or “a carelessly leaked report”, the system
        produces the interpretations “All of this provoked an unfathomable
        excitement in her” and “a carelessly disclosed report”
        respectively. It runs on unrestricted text and to my knowledge
        is the only existing robust metaphor paraphrasing system. It
        does not employ any hand-coded knowledge, but instead derives
        metaphorical interpretations from a large text corpus using
        statistical pattern-processing. The system was evaluated with
        the aid of human judges and operates with an accuracy of
        81%.
        
        The metaphor identification system automatically traces the
        analogies involved in the production of a particular
        metaphorical expression in a minimally supervised way. The
        system generalises over the analogies by means of verb and noun
        clustering, i.e. identification of groups of similar concepts.
        This generalisation makes it capable of recognising previously
        unseen metaphorical expressions in text, e.g. having once seen
        the metaphor ‘stir excitement’ the system concludes that ‘swallow
        anger’ is also used metaphorically. The system identifies
        metaphorical expressions with a high precision of 79%.
        
        The logical metonymy processing system produces a list of
        metonymic interpretations disambiguated with respect to their
        word sense. It then automatically organises them into a novel
        class-based model of logical metonymy inspired by both empirical
        evidence and linguistic theory. This model provides more
        accurate and generalised information about possible
        interpretations of metonymic phrases than previous approaches.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-802.html">
    <title>Latent semantic sentence clustering for multi-document summarization</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-802.pdf</link>
    <dc:creator>Geiß, Johanna</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-07</dc:date>
    <description>
        This thesis investigates the applicability of Latent Semantic
        Analysis (LSA) to sentence clustering for Multi-Document
        Summarization (MDS). In contrast to more shallow approaches like
        measuring similarity of sentences by word overlap in a
        traditional vector space model, LSA takes word usage patterns
        into account. So far LSA has been successfully applied to
        different Information Retrieval (IR) tasks like information
        filtering and document classification (Dumais, 2004). In the
        course of this research, different parameters essential to
        sentence clustering using a hierarchical agglomerative
        clustering algorithm (HAC) in general and in combination with
        LSA in particular are investigated. These parameters include,
        inter alia, information about the type of vocabulary, the size
        of the semantic space and the optimal number of dimensions to
        be used in LSA. These parameters have not previously been
        studied and evaluated in combination with sentence clustering
        (chapter 4).
        
        This thesis also presents the first gold standard for sentence
        clustering in MDS. To be able to evaluate sentence clusterings
        directly and classify the influence of the different parameters
        on the quality of sentence clustering, an evaluation strategy is
        developed that includes gold standard comparison using different
        evaluation measures (chapter 5). To this end, the first compound
        gold standard for sentence clustering was created. Several human
        annotators were asked to group similar sentences into clusters
        following guidelines created for this purpose (section 5.4). The
        evaluation of the human-generated clusterings revealed that the
        human annotators agreed on clustering sentences above chance.
        Analysis of the strategies adopted by the human annotators
        revealed two groups – hunters and gatherers – who differ clearly
        in the structure and size of the clusters they created (chapter
        6).
        
        On the basis of the evaluation strategy the parameters for
        sentence clustering and LSA are optimized (chapter 7). A final
        experiment in which the performance of LSA in sentence
        clustering for MDS is compared to the simple word matching
        approach of the traditional Vector Space Model (VSM) revealed
        that LSA produces better quality sentence clusters for MDS than
        VSM.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-801.html">
    <title>Software lock elision for x86 machine code</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-801.pdf</link>
    <dc:creator>Roy, Amitabha</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-07</dc:date>
    <description>
        More than a decade after becoming a topic of intense research
        there is no transactional memory hardware nor any examples of
        software transactional memory use outside the research
        community. Using software transactional memory in large pieces
        of software needs copious source code annotations and often
        means that standard compilers and debuggers can no longer be
        used. At the same time, overheads associated with software
        transactional memory fail to motivate programmers to expend the
        needed effort to use software transactional memory. The only way
        around the overheads in the case of general unmanaged code is
        the anticipated availability of hardware support. On the other
        hand, architects are unwilling to devote power and area budgets
        in mainstream microprocessors to hardware transactional memory,
        pointing to transactional memory being a “niche” programming
        construct. A deadlock has thus ensued that is blocking
        transactional memory use and experimentation in the mainstream.
        
        This dissertation covers the design and construction of a
        software transactional memory runtime system called SLE_x86 that
        can potentially break this deadlock by decoupling transactional
        memory from programs using it. Unlike most other STM designs,
        the core design principle is transparency rather than
        performance. SLE_x86 operates at the level of x86 machine code,
        thereby becoming immediately applicable to binaries for the
        popular x86 architecture. The only requirement is that the
        binary synchronise using known locking constructs or calls such
        as those in Pthreads or OpenMP libraries. SLE_x86 provides
        speculative lock elision (SLE) entirely in software, executing
        critical sections in the binary using transactional memory.
        Optionally, the critical sections can also be executed without
        using transactions by acquiring the protecting lock.
        
        The dissertation makes a careful analysis of the impact on
        performance due to the demands of the x86 memory consistency
        model and the need to transparently instrument x86 machine code.
        It shows that both of these problems can be overcome to reach a
        reasonable level of performance, where transparent software
        transactional memory can perform better than a lock. SLE_x86 can
        ensure that programs are ready for transactional memory in any
        form, without being explicitly written for it.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-800.html">
    <title>Improving cache utilisation</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-800.pdf</link>
    <dc:creator>Srinivasan, James R.</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-06</dc:date>
    <description>
        Microprocessors have long employed caches to help hide the
        increasing latency of accessing main memory. The vast majority
        of previous research has focussed on increasing cache hit rates
        to improve cache performance, while lately decreasing power
        consumption has become an equally important issue. This thesis
        examines the lifetime of cache lines in the memory hierarchy,
        considering whether they are live (will be referenced again
        before eviction) or dead (will not be referenced again before
        eviction). Using these two states, the cache utilisation
        (proportion of the cache which will be referenced again) can be
        calculated.
        
        This thesis demonstrates that cache utilisation is relatively
        poor over a wide range of benchmarks and cache configurations.
        By focussing on techniques to improve cache utilisation, cache
        hit rates are increased while overall power consumption may also
        be decreased.
        
        Key to improving cache utilisation is an accurate predictor of
        the state of a cache line. This thesis presents a variety of
        such predictors, mostly based upon the mature field of branch
        prediction, and compares them against previously proposed
        predictors. The most appropriate predictors are then
        demonstrated in two applications: improving victim cache
        performance through filtering, and reducing cache pollution
        during aggressive prefetching.
        
        These applications are primarily concerned with improving cache
        performance and are analysed using a detailed microprocessor
        simulator. Related applications, including decreasing power
        consumption, are also discussed, as is the applicability of
        these techniques to multiprogrammed and multiprocessor systems.
    </description>
  </item>
  <item rdf:about="http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-799.html">
    <title>A separation logic framework for HOL</title>
    <link>http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-799.pdf</link>
    <dc:creator>Tuerk, Thomas</dc:creator>
    <dc:publisher>University of Cambridge, Computer Laboratory</dc:publisher>
    <dc:date>2011-06</dc:date>
    <description>
        Separation logic is an extension of Hoare logic due to O’Hearn
        and Reynolds. It was designed for reasoning about mutable data
        structures. Because separation logic supports local reasoning,
        it scales better than classical Hoare logic and can easily be
        used to reason about concurrency. There are automated separation
        logic tools as well as several formalisations in interactive
        theorem provers. Typically, the automated separation logic tools
        are able to reason about shallow properties of large programs.
        They usually consider just the shape of data structures, not
        their data-content. The formalisations inside theorem provers
        can be used to prove interesting, deep properties. However, they
        typically lack automation. Another shortcoming is that there are
        a lot of slightly different separation logics. For each
        programming language and each interesting property a new kind of
        separation logic seems to be invented.
        
        In this thesis, a general framework for separation logic is
        developed inside the HOL4 theorem prover. This framework is
        based on Abstract Separation Logic, an abstract, high level
        variant of separation logic. Abstract Separation Logic is a
        general separation logic such that many other separation logics
        can be based on it. This framework is instantiated in a first
        step to support a stack with read and write permissions
        following ideas of Parkinson, Bornat and Calcagno. Finally, the
        framework is further instantiated to build a separation logic
        tool called Holfoot. It is similar to the tool Smallfoot, but
        extends it from reasoning about shape properties to fully
        functional specifications.
        
        To my knowledge this work presents the first formalisation of
        Abstract Separation Logic inside a theorem prover. By building
        Holfoot on top of this formalisation, I could demonstrate that
        Abstract Separation Logic can be used as a basis for realistic
        separation logic tools. Moreover, this work demonstrates that it
        is feasible to implement such separation logic tools inside a
        theorem prover. Holfoot is highly automated. It can verify
        Smallfoot examples automatically inside HOL4. Moreover, Holfoot
        can use the full power of HOL4. This allows Holfoot to verify
        fully functional specifications. Simple fully functional
        specifications can be handled automatically using HOL4’s tools
        and libraries or external SMT solvers. More complicated ones can
        be handled using interactive proofs inside HOL4. In contrast,
        most other separation logic tools can reason just about the
        shape of data structures. Others reason only about data
        properties that can be solved using SMT solvers.
    </description>
  </item>
</rdf:RDF>

