Classifying data races with portend EPFL Data race == accesses to a single location where at least one is a write and there is no suitable synchronisation. 1000 races loading firefox, so want automatic harmless/harmful classification. PLDI2007 Narayanasamy -- 80% misclassified. Finer classificatrions: harmful, output differs, harmless, not a race. Harmful == violates specification. Harmless is defined relative to some number of witness executions. Two approaches: multi-path, multi-schedule data race analysis. Symbolic output comparison. Second one not in talk, but in paper. Prior work: single-path analysis. Essentially, run both ways of doing the race, then compare memory snapshots. Their work: consider different input values. First stage: just consider the two schedules and compare its outputs. Can also use this to check whether the race is actually possible (e.g. custom synchronisation primitives). Second stage: do symbolic execution forwards to explore lots of paths after the race. Third stage: do a kind of inverse symbolic execution to find alternative paths to the race. Use KLEE as their symbolic execution engine. They think there are races in memcached? Summary: lots of no-races, a few output diff, very few spec violations. Q: Have you looked at the interactions of multiple races? A: No, but that would be interesting. Q: What do you mean by specification? A: Common one: doesn't crash, doesn't deadlock, etc. Could use more powerful spec if it were available. Q: How does that differ from output variation? A: Output variation is app-specific, spec violation is pretty app-agnostic. Q: You use k witnesses. How do you choose k? A: For apps we used, k=5 was pretty good. Don't have a good justification for that beyond empirical one. Q: Space and time complexity of algorithm? A: Depends on number of preemption points and number of relevant branches. More details in paper. Q: Commercialisation? Release? A: Will release later. Q: Some say any data race is a bug. Thoughts? A: Valid argument, but there are lots of races in real programs. Need to prioritise. --------------------------------------- Scalable address spaces using RCU balanced trees Austin Clements CSAIL Linux VM system doesn't scale. Want to fix that. Aim to fix mmap, munmap, and page faults. Current design: PTEs pulled in lazily by page fault handler. Two invariants: internal consistency of VMA tree, consistency between VM representations. Protect with an rwlock. mmap,munmap hold for write, page fault holds for read. Address space RCU tricky: lots of pointer updates, and the page fault handler isn't quite read-only. VMA tree. Balancing is not RCU-friendly. Fix -> bonsai tree. Mutation creates a new structure. Obvious way needs to copy O(logn) nodes. Can eliminate some of the copies by pushing the copy further down: you're allowed a single pointer mutation, so don't need to duplicate the entire path to the root. How do we do page tables? Mostly care about page fault vs munmap races. Can work around it by doing a final check just before writing to the page table Eval: Seems to help scalability of metis quite a bit, once you get past ~20 cores: 76x speedup on 80 cores rather than 22x. Also Psearchy and dedup. Microbenchmark: have one core do mmap/munmap, and the others doing lots of page faults. Seems to behave pretty well almsot regardless of what you hit it with. Q: Metis: why not pre-allocate things, rather than taking lots of page faults? A: Could do, but pre-allocating gives you a large startup cost. Q: A: Q: You're introducing races. Most languages don't give semantics for them. Did you do a proof of correctness? A: No, relied on testing. Languages don't assign semantics to races, but there are lots of kernel things which don't really have language semantics. Q: Do you know what exactly you're depending on? A: Not formally. Working on it. Q: This is a better mousestrap. Any unpleasant side-effects? e.g. greater susceptibility to bugs. A: Well, hardware bugs and bugs in kernel could cause problems. Interface to userspace unchanged; minimal changes to rest of kernel. Q: Is anyone else using it? A: Not yet. ---------------------------------- Applying transactional memory to concurrency bugs Haris Volos Uwisc at Madison Transactional memory is a thing. Concurrency bugs are a thing. Looking at deadlocks and atomicity violations. Source of bugs: Lu, Asplos08 Starting from bug reports. Apply local reasoning. Introduce some transactions to fix it. proc_info race in mysql> Deadlock fix: detect the deadlock and roll back far enough to release a lock in the cycle i.e. dynamic approach rather than trying to statically insert transactions. Depends on using ``preeemptible lock''; not sure quite what they mean by that. ah: it's prior work. txLock, ASPLOS08. Ah, no, now he's saying that the atomic() blocks are inserted statically, but then used dynamically for deadlock recovery. Their atomicity fixer relies on a global lock; fine. Semantics are a bit odd; need to check paper for details. Eval: 22 deadlock bugs, fixes ~55% of bugs. They claim that the fixes are often ``better'', in some sense, but that's a bit in-the-eye-of-the-beholder. Atomicity violations: try 38. Seems to provide a fix in ~80% of cases. When does it not work? Long latency calls lead to long transactions. ``Nested-monitor lockout'' -> look at the paper. Q: Atomic regions can get quite big with this kind of thing. Did you have problems with that? A: Q: Danger of just sprinkling random transactions in, which might make things more complicated if you have complex STM semantics. A: Working with relatively simple STM semantics. Don't do unsafe operations from inside transactions. Q: Privatisation safety? A: Use an STM which does that correctly. Q: What happens with very short lock-based critical sections? TM turns into a no-op, rather than waiting for previous csects to finish. A: Haven't run into that. Q: How do you decide when you apply transactions and when not? A: