272 abstracts, 172 full papers. 3 rounds of reviews. Dropped 71 in first round. Author response, then dropped another 27 in second round. Accepted 37 papers total.

Pocketweb: instant web browsing for mobile devices
MSR

Pocket cloudlets -- replicate cloud services onto mobile devices. Current system only does search. New system more general.

Prefetching a bad idea: drains battery. User-centric approach: per-user predictions of what the user is going to do next.

Start by analysing 8000 logs of real users (even split smartphone/desktop). 60% of visits are repeats for 70% of users. Define a URL to be targeted iff it's accessed at least 5 times a month. Most people have fewer than a dozen targeted URLs, and they usually account for ~70% of total traffic. Web traffic tends to batch: 40% of visits are within 6 minutes of another visit (not necessarily to same URL). Discovered that users behave differently in important ways.

Architecture:
Offline: take a trace from this user, do feature extraction, then apply those features to the trace, then use MART to generate a model.
Online: monitor user action, generate feature vector, and then apply the MART-generated model.
Feature extraction: lots of ad-hoc things.

Predictions are reasonably accurate: 80% of targeted accesses can be predicted for 80% of users. Also good precision: 80% of predictions are correct for 80% of users.

Important features: spatiotemporal, popularity, temporal, spatial, in descending order of importance. Spatial features basically useless.

For 80% of users, 80% reduction in effective load time of light pages, 50% reduction in time for heavy pages. *Reduced* radio energy: 65-90% for lightweight pages, 45-85% for heavyweight. Seems to be mostly a batching benefit.

Q: How do you deal with private data e.g. logins?
A: We don't deal with that. We don't get user passwords, so can't prefetch things which require them.
Q: Do you obey caching rules?
A: Yes.
Q: How does that affect results? e.g. cached page vs. uncached page
A: Didn't eval that.
Q: Have you looked at how relevant user location is?
A: No, our data doesn't include that, for privacy reasons.
Q: Do you know the type of wireless connection e.g. 3G vs WiFi?
A: Only looked at 3G networks. Suspect win is smaller when dealing with WiFi, but probably happens.

-----------------------------
Reflex: using low-power processors in smartphones without knowing them
Rice

Diverse smartphone apps. Some are very demanding, but also want support for simple periodic tasks. Current approach: powerful processor for heavy apps, but wastes power for simple ones. Alternative: chip-level heterogeneity e.g. OMAP4. Board-level heterogeneity. Trend to more heterogeneous systems, often without coherent shared memory. Manufacturers (TI) use message passing API.

Reflex: emulate a single system with shared memory. Aim is an energy-efficient DSVM. Goal: keep strong processor asleep as much as possible -> host objects on weak processor whenever possible. All coherence traffic issued by strong processor. In particular, if the strong processor owns an object, the weak processor must just wait for the strong one to return it -> weak never has to wake up strong?

Memory model: release consistency (i.e. acquire/release object primitives). Strong processor writes things back eagerly (i.e. when it releases), weak processor lazy (i.e. when strong processor does an acquire).

Eval: N900 with the camera replaced by an external two-processor board. Obviously. External board is slower than main processor. All running same instruction set. (600MHz, 72MHz, 3MHz) Comparison is implementing an app on Reflex versus implementing it purely on the N900 main processor. Fairly useful performance win, as you might expect. Also try a DSVM with either strong or last-owning processor owning objects, which does much worse. Adding Reflex doesn't seem to make things much more complicated?
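
The Reflex ownership and release-consistency rules noted above can be sketched as a toy model. This is only an illustration of the rules as written in these notes, not the Reflex implementation: the class and method names (SharedObject, ToyDSVM, strong_acquire, and so on) are hypothetical, and real waiting, sleeping and message transport are left out.

```python
class SharedObject:
    """Toy shared object whose home copy lives in the weak processor's memory."""

    def __init__(self, value):
        self.home_value = value   # copy held on the weak processor
        self.strong_copy = None   # working copy while the strong processor owns it
        self.owner = "weak"       # objects are hosted on the weak side by default


class ToyDSVM:
    """Hypothetical model: only the strong processor issues coherence messages."""

    def __init__(self):
        self.messages = []        # (sender, kind) for every coherence message

    def strong_acquire(self, obj):
        # The strong processor asks for the object; the weak side hands over
        # its latest value only now (lazy write-back by the weak processor).
        self.messages.append(("strong", "acquire"))
        obj.strong_copy = obj.home_value
        obj.owner = "strong"

    def strong_write(self, obj, value):
        if obj.owner != "strong":
            raise RuntimeError("strong processor must acquire before writing")
        obj.strong_copy = value

    def strong_release(self, obj):
        # Eager write-back: the new value goes home as soon as strong releases.
        if obj.owner != "strong":
            raise RuntimeError("strong processor does not own the object")
        self.messages.append(("strong", "release"))
        obj.home_value = obj.strong_copy
        obj.strong_copy = None
        obj.owner = "weak"

    def weak_can_access(self, obj):
        # The weak processor sends nothing; it just checks and otherwise waits.
        return obj.owner == "weak"

    def weak_write(self, obj, value):
        if not self.weak_can_access(obj):
            raise RuntimeError("strong processor owns the object; weak must wait")
        obj.home_value = value
```

In this sketch the weak side never appends to the message log, which mirrors the note that the weak processor never has to wake the strong one.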
www.cs.rice.edu/~x16/reflex

Q: Your lower-power processors have fewer clocks per joule? Would it be better to run on the main processor, get out fast, and then put it to sleep?
A: e.g. soundsense benchmark involves lots of very short tasks, so you get lots of sleep/wakeup, and that doesn't really work.
Q: Is this the right approach, or is it a workaround for the lack of hardware coherency?
A: Hardware coherency imposes costs on big processor. e.g. Tegra3 -- hardware coherency, but relatively small opportunities for power saving.

---------------------------------------
Totally green: Evaluating and designing servers for lifecycle environmental impact
Justin Meza
CMU+HPL

Interested in environmental impact of computing. Need to consider whole lifecycle: make vs. operate costs. They're using exergy as a measure of environmental damage. Combines energy and entropy somehow; not sure of details.

Previous work did a per-material breakdown. They want to do it per component instead. They have a model, but the numbers which go into it appear to have been pulled out of thin air?

What changes if you look at the whole lifecycle? Compare a standard container data center, one based on low-power blades, and one based on "dematerialised" blades. Predicted cooling and operational energy, plus performance, using simulators, basically. Validated on a miniature prototype. LP blade saves 58% of exergy, demat design saves 71%.

Their conclusion seems to be that optimising for reducing embedded exergy improves operational exergy, but that really doesn't follow. Also found that they sometimes conflict. Example: power proportionality and server consolidation. Consider best case for both. Proportionality reduces operating power more than consolidation, but if you include the cost of making the servers then consolidation works better.

Q: How do you deal with peripherals like external graphics cards and NICs?
A: Our systems didn't have any, but they'd just show up as extra components.
Q: If you were in charge of architecting something, what would you do?
A: The same as we did when we evaluated the optimisations we tried.
Q: Ignoring dollar cost of systems?
A: Not really the focus of the work, no. Some other work looks at that.
Q: Do the optimisations you proposed increase or decrease costs?
A: Haven't considered that.
Q: Assume a particular operational lifetime?
A: Yes, assume 3 years here. Didn't try to compute optimal time. Trying different lifespans didn't make that much difference.
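
The proportionality versus consolidation point in the Totally green notes is easier to see with a little arithmetic. The sketch below is not the speakers' model: the Design class is hypothetical and every number in it is invented purely for illustration, in arbitrary exergy units. Only the 3-year lifetime comes from the Q&A.

```python
from dataclasses import dataclass

LIFETIME_YEARS = 3  # operational lifetime mentioned in the Q&A


@dataclass
class Design:
    """Hypothetical data-center design; all figures are made-up illustrations."""
    name: str
    servers: int
    embedded_per_server: int           # exergy to make one server (arbitrary units)
    operational_per_server_year: int   # exergy to run one server for a year

    def embedded(self):
        return self.servers * self.embedded_per_server

    def operational(self, years=LIFETIME_YEARS):
        return self.servers * self.operational_per_server_year * years

    def lifecycle(self, years=LIFETIME_YEARS):
        # Whole-lifecycle cost = cost of making the servers + cost of running them.
        return self.embedded() + self.operational(years)


# Invented numbers: proportionality keeps all servers but runs them cheaply;
# consolidation uses fewer servers that each run hotter.
baseline = Design("baseline", servers=10, embedded_per_server=100,
                  operational_per_server_year=50)
proportional = Design("proportional", servers=10, embedded_per_server=100,
                      operational_per_server_year=20)
consolidated = Design("consolidated", servers=4, embedded_per_server=100,
                      operational_per_server_year=70)

for d in (baseline, proportional, consolidated):
    print(d.name, d.embedded(), d.operational(), d.lifecycle())
```

With these invented figures the proportional design has the lowest operational total (600 versus 840 for consolidation), but consolidation has the lower lifecycle total (1240 versus 1600) because it needs fewer servers to be made.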