Technical reports
Modelling orchestration
May 2025, 116 pages
This technical report is based on a dissertation submitted August 2024 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Trinity College.
DOI: https://doi.org/10.48456/tr-998
Abstract
Modern cloud services operate at significant and increasing scale. The growth of these services has led to the need for automated management to keep them operational across many thousands of nodes and multiple geo-distributed sites. Orchestrators are the platforms designed to automate this management and standardise the workflows involved.
The significant uptake of modern orchestrators means that they have expanded their scope beyond private datacenters, into the public cloud, and now even towards the edge of the network. These are environments for which they were not designed, and while they share some characteristics with private datacenters, the differences are significant enough to require rethinking the design of the orchestrators.
In this dissertation, I examine orchestrator design, focusing on the global state that orchestrators maintain in their central datastores. To do this, I propose a definition of the orchestration problem and provide a lightweight formalisation using model checking. I use this model to explore the properties of an existing orchestrator, explaining observed failures arising from changes in the consistency model. I then explore how variations to the consistency model of the global state affect the properties and performance of the model checking.
Using insights from this model and its consistency analysis, I then propose two new datastores to support the control plane of orchestration platforms: one for the public cloud and one for the near-edge. In the public cloud, data confidentiality is paramount, so I aim to minimise the actors within the trust boundary to enable secure, trusted deployments. For the near-edge, I focus on the availability of a single cluster, enabling individual locations to process requests without relying on persistent non-local communication.
Together, these components (the model and the two datastores) allow orchestration platforms to be optimised for their environments, enabling more widespread use.
Full text
PDF (3.6 MB)
BibTeX record
@TechReport{UCAM-CL-TR-998,
  author =      {Jeffery, Andrew},
  title =       {{Modelling orchestration}},
  year =        2025,
  month =       may,
  url =         {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-998.pdf},
  institution = {University of Cambridge, Computer Laboratory},
  doi =         {10.48456/tr-998},
  number =      {UCAM-CL-TR-998}
}