Causal, Adaptive, Distributed, and Efficient Tracing System (CADETS)


Contemporary computer systems face attackers who are sophisticated, pervasive, and persistent, and who exploit information asymmetries that allow them to “hide in plain sight” owing to the opacity of current software designs. Security and systems researchers have little to boast about; current auditing and forensic tools are trapped in the 1980s and 1990s because: (1) they focus on symptomatic events that are easy to log (e.g., the system calls that were the focus of the Orange Book or Common Criteria) rather than on causal relationships (e.g., relating application-internal security behavior to system-wide events); (2) they remain fundamentally local even though our most critical systems (and hence attacks on them) have become distributed, and they scale poorly with respect to both performance and analysis capabilities; (3) they are unresponsive to changes in analyst requirements, especially as forensic investigators shift their focus within live systems; and (4) they fail to support a virtuous cycle in which analysts and software authors improve the self-descriptive capabilities of deployed systems as experience is gained, making those systems more responsive to analysis over time.

We propose a Causal, Adaptive, Distributed, and Efficient Tracing System (CADETS), which will address flaws in current audit and information-flow systems through fundamental improvements in dynamic instrumentation, scalable distributed tracing, and programming-language support. CADETS has three major components. Event Query (EQ) is a new query language, loosely based on DTrace’s D, that will drive in-application, whole-system, and distributed tracing using temporal expressions and information flow. Watchman is a host-based tracing framework that dynamically introduces variable-granularity instrumentation within and between executing programs. DEQUE distributes EQ expressions over many hosts to track inter-node information flows and temporal sequences, implementing post-hoc trace aggregation or, as needed, tagging of TCP/IP packets, filesystem RPCs, and application-layer protocols with temporal and information-flow labels.
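
To make the idea of tagging inter-node messages with temporal and information-flow labels concrete, the following is a minimal illustrative sketch in Python. It is not the CADETS, Watchman, or DEQUE design: the names Tag, Host, and merge_traces are hypothetical, and a Lamport logical clock is used here only as one simple stand-in for a temporal label.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tag:
    """Hypothetical label attached to a message that crosses hosts."""
    clock: int            # Lamport logical timestamp (assumed temporal label)
    labels: frozenset     # information-flow labels carried with the data


@dataclass
class Host:
    """Hypothetical traced host that keeps a local event log."""
    name: str
    clock: int = 0
    labels: set = field(default_factory=set)
    trace: list = field(default_factory=list)

    def local_event(self, what, new_labels=()):
        # A local event advances the clock and may introduce new labels.
        self.clock += 1
        self.labels.update(new_labels)
        self.trace.append((self.clock, self.name, what))

    def send(self, what):
        # Sending stamps the message with the sender's clock and labels.
        self.clock += 1
        self.trace.append((self.clock, self.name, f"send {what}"))
        return Tag(self.clock, frozenset(self.labels))

    def receive(self, what, tag):
        # Receiving moves the clock past the sender's and inherits its labels.
        self.clock = max(self.clock, tag.clock) + 1
        self.labels |= tag.labels
        self.trace.append((self.clock, self.name, f"recv {what}"))


def merge_traces(*hosts):
    """Post-hoc aggregation: order all events by (logical clock, host name)."""
    return sorted(event for host in hosts for event in host.trace)


a, b = Host("a"), Host("b")
a.local_event("read config", {"secret"})
b.local_event("startup")
b.receive("rpc", a.send("rpc"))
for event in merge_traces(a, b):
    print(event)
```

In this sketch the receive event on host b is always ordered after the matching send on host a in the merged trace, and b inherits the "secret" label, which is the kind of cross-host causal and information-flow relationship that purely local logs cannot express.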