ELI: Bare-metal performance for I/O virtualization
IBM

Virtualisation is cool, but costs performance. Hardware virtualization tools help, but I/O-intensive workloads still struggle. Cause of the overhead: privileged instructions still trap, and I/O-intensive workloads execute lots of them. Direct device assignment still only gets 60% of bare-metal performance on some workloads.

The problem is interrupts. They managed to find a workload which needs 48k interrupts per second. They propose exitless interrupts (ELI) for unmodified, untrusted guests, covering both delivery and completion.

Delivery: the IDT works like a normal IDT. ELI puts a shadow IDT inside the guest, and the hardware is configured so that the IDT register points at the shadow IDT. Non-guest interrupts are configured so that they generate immediate exceptions, which lets the hypervisor regain control.

Completion: the guest tries to write to the EOI register in the LAPIC. LAPIC registers are MMIO, and there are other registers on the same page, so the page tables can't be used to give the guest access to just the EOI register. Alternative: x2APIC uses MSRs for EOI, and those can be trapped at register granularity.

Results: netperf goes from 100k exits/second with 60% of time in the guest to 800 exits/second with 98% of time in the guest. Apache: 91k exits/second down to 1k exits/second. UDP ping-pong latency dropped from 36us to 29us; bare metal is 28us.

Ah, now they're talking about interrupt moderation. They admit that ELI is less effective if you have an aggressive moderation policy, but it's still worth a few percent.

Q: Malicious guests? How trusted are they?
A: Details of threat model in paper. Claim it's safe.

-------------------------------

DejaVu: accelerating resource allocation in virtualized environments
Nedeljko Vasic
EPFL

Virtualization enables cloud computing. Different workloads have different demands, and therefore need different resources. Allocation is non-trivial, especially as workloads change dynamically. Some automated tuning systems exist, but they require lots of computational resources.
Also need to coordinate between tenants, who may not know about each other. Lack of SLOs is a popular reason not to use the cloud, according to market research.

Aims of DejaVu:
-- Minimise resource use, i.e. don't spend lots of time tuning, but also adapt quickly to changing workload.
-- General approach, i.e. not app-specific.
-- Estimate interference from other tenants and react to it.

Basic idea: memoisation of configurations. Implemented in the obvious way: cluster workloads, train on the clusters, then characterise the workload online and select an appropriate cluster from the training data. If there's nothing good, go back to training mode.

Characterisation is done by duplicating a fraction of the workload. Suppose you have a service in the cloud with some clients. Introduce a proxy so that you can send some requests to the characterisation sub-cluster.

Workload signatures: use a vector of low-level metrics, e.g. perf counters. Use feature selection to find out which ones matter for this workload, then run a clustering algorithm. The training phase is expensive, but hopefully you don't run it all that often.

Checking for interference: check the SLA and re-train if we've missed it. Use that to estimate an interference index == perf in production / perf in isolation. Use that as another entry in the workload allocation table.

Eval: using real-world traces from Hotmail and MSN. Results are results. Claim to allow resource allocation to closely track load while maintaining the SLA. Reacts more quickly than RightScale, by more than an order of magnitude.

Q: Size of profiling cluster vs production cluster?
A: Profiling cluster is one machine.
Q: Is the load on the profiling machine the same as the load on the production cluster?
A:
Q: Do you share the same backend between production and profiling cluster?
A:
Q: Anomalous surge of traffic not seen in training, e.g. slashdotting?
A: Need to do re-tuning the first time around.
Q: What kind of interference do you consider?
A: Don't look at specific types, just look at SLA.

----------------------------------

Jakub Szefer
Princeton
Architectural support for hypervisor-secure virtualisation

Hypervisor is a high-value target for attacks. Want to provide better isolation between virtual machines in a semi-hostile cloud. Trusted computing crap to protect VMs from a malicious hypervisor, plus some attestation stuff. Includes:
-- protect memory from hypervisor
-- secure communication between VM and hypervisor
-- attestation
-- processor registers etc.

Customer specifies a lump of memory which it wants to protect. Implementation involves making part of memory accessible to hardware only. Amounts to a shadow of part of the page tables, checked on TLB fill.

Secure communication amounts to getting custom hardware to generate an appropriate host key for the VM.

There's some attestation stuff. Not sure why you'd do that, rather than just relying on the hardware being correct, since it's the hardware doing the attestation anyway.

They've simulated it and got reasonable performance out of it. Simulation platform: OpenSPARC.

Q: Capability-based hardware. Would it help?
A: Yes.
Q: Protect stuff already in memory. How does it get there?
A: Start off with no confidential information. Can then generate safe keys and push stuff in over the network.
Q: Live migration support?
A: Can't do hypervisor-driven migration, but could do self-migration.
Q: Any feedback from processor manufacturers? Likely implementation costs?
A: No feedback. Costs should be plausible.
Q: What if binary translation subverts the new crypto instructions?
A: Attestation says you start in the right place, and the hypervisor then can't access memory to do the emulation.

-------------------------------------------

Region scheduling: efficiently using the cache architecture via page-level affinity
Min Lee
Georgia Institute of Technology

Last-level cache is important for performance. Suppose you have a task with memory regions R1 and R2.
Aim is to balance caches.

Q: Baseline for perf tests?
A: Just running the default (credit) Xen scheduler.
Q: Conflict between Xen scheduler and region scheduling?
A: Explained in paper. Scheduling is still done by the credit scheduler; region scheduling is hooked in at the load balancer level.
Q: LLC is shared and doesn't have these problems?
A: Westmere has two LLCs. Don't consider the case where there's only one LLC.
Q: Forming regions. Interference <...>
A: Regions strongly correlated with threads.
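
Aside, going back to the DejaVu talk: the memoised lookup and the interference index described there can be sketched roughly as below. This is only an illustration of the idea as noted down, not the authors' implementation. The function names, the Euclidean distance, the `max_distance` threshold and the example allocations are all assumptions made for the sketch. The notes say the interference index becomes another entry in the allocation table; how the table is keyed on it is left open here.

```python
import math


def nearest_cluster(signature, centroids):
    """Return (index, distance) of the centroid closest to the signature.

    Euclidean distance is an assumption; the notes only say "clustering".
    """
    best = min(range(len(centroids)),
               key=lambda i: math.dist(signature, centroids[i]))
    return best, math.dist(signature, centroids[best])


def lookup_allocation(signature, centroids, table, max_distance):
    """Memoised lookup: reuse the allocation learned for the nearest cluster.

    Returns None when no cluster is close enough ("nothing good"), which is
    the cue to go back to training mode. max_distance is a hypothetical knob.
    """
    idx, dist = nearest_cluster(signature, centroids)
    if dist > max_distance:
        return None
    return table[idx]


def interference_index(perf_production, perf_isolation):
    """Interference index as given in the talk: production perf / isolated perf."""
    return perf_production / perf_isolation


# Hypothetical training output: two cluster centroids over two selected
# metrics, each mapped to a made-up resource allocation.
centroids = [(1.0, 0.0), (0.0, 1.0)]
table = {0: "2 small VMs", 1: "4 large VMs"}

known = lookup_allocation((0.9, 0.1), centroids, table, max_distance=0.5)
unknown = lookup_allocation((5.0, 5.0), centroids, table, max_distance=0.5)
index = interference_index(80.0, 100.0)
```

If perf is a throughput-style metric (an assumption), an index below 1 means the workload runs slower in production than in isolation, i.e. other tenants are interfering.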