Data Centric Systems and Networking (2015-2016 Michaelmas Term)
|
OverviewThis module provides an introduction to data centric systems, where data is a token in the programming flow and networking, and its impact on the computer system's architecture. Large-scale distributed applications with big data processing will grow ever more in importance and become a pervasive aspect of the lives of millions of users. Supporting the design and implementation of robust, secure, and heterogeneous large-scale distributed systems is essential. This course provides various perspectives on data centric systems and networking, including data-flow programming, stream processing, (content-based routing) and large-scale graph data processing, thus providing a solid basis to work on the next generation of distributed systems and communication paradigms. This course provides various perspectives on data centric systems and networking, including data-flow programming, stream processing, content-based routing and large-scale graph data processing, thus providing a solid basis to work on the next generation of distributed systems and communication paradigms. On completion of this module, the students should:
Module StructureThe module consists of 8 sessions, with 5 sessions on specific aspects of data-centric systems and networking research. Each session discusses ~3 papers, led by the assigned students. One session is a hands-on tutorial on data flow programming with Amazon EC2. The 1st session advises on how to read/review a paper together with a brief introduction of different perspectives in data-centric systems. The last session is dedicated to the presentation of the open-source project studies presented by the students. One guest lecture is planned, covering inspiring current research on stream processing systems. Schedule and Reading ListWe’ll meet in GS15 every Tuesday (from October 13 to December 1) in 2015. The time slot is 15:00-17:00 (except October 27 starting at 16:00). 2015/10/13 Session 1: Introduction to Data Centric Systems and Networking
2015/10/20 Session 2: Programming in Data Centric Environment
James Trever
(slides)
1. Yuan Yu, Michael Isard, D. Fetterly, M. Budiu, U.
Erlingsson, P.K. Gunda, J. Currey:
DryadLINQ: A System for General-Purpose
Distributed Data-Parallel Computing Using a High-Level Language, OSDI, 2008.
Christopher Little
(slides)
2. M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma,
M. McCauley, M. Franklin, S. Shenker, I. Stoica:
Resilient Distributed
Datasets: A Fault-Tolerant Abstraction for In-Memory
Cluster Computing,
NSDI, 2013. 4. J. Dean, S. Ghemawat: MapReduce: Simplified Data Processing on Large Clusters, OSDI, 2004. 5. Derek Murray, Malte Schwarzkopf, Christopher Smowton, Steven Smith, Anil Madhavapeddy and Steven Hand: Ciel: a universal execution engine for distributed data-flow computing, NSDI 2011.
Frank McSherry's Talk on Differential Dataflow is
here. 6.1. Frank McSherry, Rebecca Isaacs, Michael Isard, and Derek G. Murray, Composable Incremental and Iterative Data-Parallel Computation with Naiad, no. MSR-TR-2012-105, 2012. Olivia Wiles (slides) 6.2. D. Murray, F. McSherry, R. Isaacs, M. Isard, P. Barham, M. Abadi: Naiad: A Timely Dataflow System, SOSP, 2013. 7. P. Bhatotia, A. Wieder, R. Rodrigues, U. A. Acar, and R. Pasquini: Incoop: MapReduce for incremental computation, ACM SOCC, 2011. 8. Dionysios Logothetis, Christopher Olston, Benjamin Reed, Kevin Webb and Kenneth Yocum: Stateful Bulk Processing for Incremental Analytics, SOCC, 2010. 2015/10/27 Session 3: Processing Models of Large-Scale Graph Data
Muthu Karunarathna (slides) 1. G. Malewicz, M. Austern, A. Bik, J. Dehnert, I. Horn, N. Leiser, and G. Czajkowski: Pregel: A System for Large-Scale Graph Processing, SIGMOD, 2010. 2. U. Kang, C. E. Tsourakakis, C. Faloutsos: PEGASUS: A peta-scale graph mining system - Implementation and observations, ICDM , 2009. Kenneth Liu (slides) 3. Z. Qian, X. Chen, N. Kang, M. Chen, Y. Yu, T. Moscibroda, Z.Zhang: MadLINQ: large-scale distributed matrix computation for the cloud, EuroSys, 2012. 4. Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, J. Hellerstein: Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud, VLDB, 2012.
James Trever
(slides)
7. J. Gonzalez, Y. Low, H. Gu, D.
Bickson, and C. Guestrin:
Powergraph: distributed
graph-parallel computation on natural Olivia Wiles (slides) 5. .J. Shun and G. Blelloch: Ligra: A Lightweight Graph Processing Framework for Shared Memory, PPoPP, 2013. 6. J. Kim, W. Han, S. Lee, K. Park, H. Yu: OPT: A New Framework for Overlapped and Parallel Triangulation in Large-scale Graphs, SIGMOD, 2014. 7. J. Gonzalez, R. Xin, A. Dave, D. Crankshaw, M. Franklin, I. Stoica: GraphX: Graph Processing in a Distributed Dataflow Framework, OSDI, 2014. 8. M. Han, K. Daudjee, K. Ammar, M. Ozsu, X. Wang, T. Jin: An Experimental Comparison of Pregel like Graph Processing Systems, SIGMOD, 2014. 9. B. Shao, H. Wang, Y. Li: Trinity: A Distributed Graph Engine on a Memory Cloud, SIGMOD, 2013. 10. Arabesque: A System for Distributed Graph Mining, SOSP, 2015. , , , , , : 2015/11/03 Session 4: Data Flow Programming Handson Tutorial with Amazon EC2
2015/11/10 Session 5: Stream Data Processing and Data/Query Model
1. T. Akidau, A. Balikov, K. Bekiroglu, S. Chernyak, J. Haberman, R. Lax, S. McVeety, D. Mills, P. Nordstrom, S. Whittle: MillWheel: Fault-Tolerant Stream Processing at Internet Scale , VLDB, 2013.
2. V. Zaharia, T.
Das, H. Li, T.
Hunter, S. Shenker, I.
Stoica:
Discretized Streams:
Fault-Tolerant Streaming Computation at Scale,
SOSP, 2013. Kenneth Liu (slides) 6. B.Gedik, H. Andrade, K. Wu, P. Yu, and M. Doo: SPADE: the system S Declarative Stream Processing Engine , SIGMOD. 2008. Christopher Little (slides) 7. T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. Fernandez-Moctezuma, R. Lax, S. McVeety, D. Mills, F. Perry, E. Schmidt, S. Whittle: The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing, VLDB, 2015. 8. R. Cheng, J. Hong, A. Kyrola, Y. Miao, X. Weng, M. Wu, F. Yang, L. Zhou, F. Zhao, E. Chen: Kineograph: Taking the Pulse of a Fast-Changing and Connected World, EuroSys, 2012. 2015/11/17 Session 6: Optimised Approaches in Data Processing
James Trever (slides)1. A. Kyrola and G. Blelloch: Graphchi: Large-scale graph computation on just a PC, OSDI, 2012. Olivia Wiles (slides) 2. A. Roy, I. Mihailovic, W. Zwaenepoel: X-Stream: Edge-Centric Graph Processing using Streaming Partitions, SOSP, 2013. 3. A. Roy, L. Bindschaedler, J. Malicevic and W. Zwaenepoel: Chaos: Scale-out Graph Processing from Secondary Storage , SOSP, 2015. 4. F. McSherry, M. Isard and D. Murray: Scalability! But at what COST? , HOTOS, 2015. 5. X. Hu, Y. Tao, C. Chung: Massive Graph Triangulation, SIGMOD, 2013. 6. W. Xie, G. Wang, D.Bindel, A. Demers, J. Gehrke: Fast Iterative Graph Computation with Block Updates, VLDB, 2014.
7. W. Han, S. Lee, K. Park, J. Lee, M. iKim, J. Kim,
H. Yu: TurboGraph: A Fast
Parallel Graph Engine Handling 8. A. Buluc, K. Madduri: Parallel Breadth-First Search on Distributed Memory Systems, SC, 2005. 9. C. Luk, S. Hong, H. Kim: Qilin: Exploiting Parallelism on Heterogeneous Multiprocessors with Adaptive Mapping, PLDI, 2009. Muthu Karunarathna (slides) 10. S. Hong, H. Chafi, E. Sedlar, K.Olukotun: Green-Marl: A DSL for Easy and Efficient Graph Analysis, ASPLOS, 2012. Christopher Little (slides) 11. D. Prountzos, R. Manevich, K. Pingali: Elixir: A System for Synthesizing Concurrent Graph Programs, OOPSLA, 2012. 12. D. Nguyen, A. Lenharth, K. Pingali: A Lightweight Infrastructure for Graph Analytics, SOSP 2013. 13. M. Kulkarni, P. Carribault, K. Pingali, G. Ramanarayanan, B. Walter, K. Bala, L. P. Chew: Scheduling Strategies for Optimistic Parallel Execution of Irregular Programs, SPAA, 2008. 14. S. Salihoglu, J. Widom: Optimizing Graph Algorithms on Pregellike Systems, VLDB, 2014. 15. D. Merrill, M. Garland, A. Grimshaw: Scalable GPU Graph Traversal, PPoPP, 2012. 16. J. Zhong, B. He: Medusa: Simplified Graph Processing on GPUs, IEEE TPDS, 2013. 17. A. Gharaibeh, E. Santos-Neto, L. Costa, M. Ripeanu: Efficient Large-Scale Graph Processing on Hybrid CPU and GPU Systems, IEEE TPC, 2014. 18. C. Rossbach, Y. Yu, J. Currey, J-P. Martin, D. D. Fetterly: Dandelion: a Compiler and Runtime for Heterogeneous Systems, SOSP 2013. 19. C. COZ: Finding Code that Counts with Causal Profiling, SOSP 2015. , : 2015/11/24 Session 7: Machine Learning for Computer System's Optimisation
Olivia Wiles (slides) 1. N.J. Yadwadkar, B. Hariharan, J. Gonzalez and R. Katz: Faster Jobs in Distributed Data Processing using Multi-Task Learning, SDM, 2015. 2. N. Roy et al.: Efficient Autoscaling in the Cloud using Predictive Models for Workload Forecasting, CLOUD, 2011. 3. E. Barrett et al.: Applying reinforcement learning towards automating resource allocation and application scalability in the cloud, Concurrency and Computation Practice and Experience, Wiley, 2013. Kenneth Liu (slides) 4. K. LaCurts et al.: Cicada: Introducing Predictive Guarantees for Cloud Networks, HOTCLOUD, 2014. Christopher Little (slides) 5. M. Carvalho et al.: Long-term SLOs for reclaimed cloud computing resources, SOCC, 2014. 6. H. Hoffmann et al.: Dynamic Knobs for Responsive Power-Aware Computing, Asplos, 2011. 7. S. Alici et al:Timestamp-based Result Cache Invalidation for Web Search Engines, SIGIR, 2011. Muthu Karunarathna (slides) 8. X. Dutreih et al.: Using Reinforcement Learning for Autonomic Resource Allocation in Clouds: Towards a Fully Automated Workflow, ICAS, 2011. 9. G. Tesauro et al.: Utility-Function-Driven Resource Allocation in Autonomic Systems, ICAC, 2005. 10. G. Tesauro et al.: A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation, ICAC, 2006. 11. G. Gouriten et al.: Scalable, Generic, and Adaptive Systems for Focused Crawling, HT, 2014. 12. J. Eastep et al.: Smartlocks: Lock Acquisition Scheduling for Self-Aware Synchronization, ICAC, 2010. 13. J. Eastep et al.: Smart Data Structures: An Online Machine Learning Approach to Multicore Data Structures, ICAC, 2011. 14. H. Hoffmann et al.: SEEC: A Framework for Self-aware Management of Multicore Resources, MIT Technical Report, 2011. 15. F. Hutter et al.: Algorithm runtime prediction: Methods&evaluation, Elsevier J. AI, 2014. 16. F. Hutter et al.: Sequential Model-Based Optimization for General Algorithm Configuration, LION, 2011. 17. E. Ipek et al.: Self-Optimizing Memory Controllers: A Reinforcement Learning Approach, ISCA, 2008. 18. J. Snoek et al.: Practical Bayesian Optimization of Machine Learning Algorithms, Archive, 2012. 2015/12/01 Session 8: Presentation of Open Source Project Study
Coursework 1 (Reading Club)The reading club will require you to read between 1
and 3 papers every week. You need to fill out simple review_log
(MS word format,
text
format) prior to each session and email me by the end of Sunday. The minimum requirement of review_log is one per session, but
you can read as many as you want and fill the review_log for each paper you
read. review_log is not marked but 'tick'.
Coursework 2 (Reports)The following three reports are required, which could be extended from the reading assignment of the reading club or a different one within the scope of data centric networking.
The reports 1 should be handed in by the end of 5th week (November 13, 2015 - 16:00) and the report 2 by 7th week (November 27, 2015 - 16:00 ...extended to December 7, 2015 16:00! The report 3 should be by the end of the Michaelmas term (January 15, 2016 - 16:00 - but if you could finish by 21st of December, that will be good!). AssessmentThe final grade for the course will be provided as a letter grade or percentage and the assessment will consist of two parts:
Open Source ProjectsSee the candidates of Open Source Projects in data centric networking. The list is not exhausted. If you take anything other than the one in the list, please discuss with me. The purpose of this assignment is to understand the prototype of the proposed architecture, algorithms, and systems through running an actual prototype and present/explain to the other people how the prototype runs, any additional work you have done including your own applications and setup process of the prototype. This experience will give you better understanding of the project. These Open Source Projects come with a set of published papers and you should be able to examine your interests in the paper through running the prototype. Some projects are rather large and may require extensive environment and time; make sure you are able to complete this assignment. How to Read/Review a PaperThe following papers aid how to read/review a paper.
Further supplement: see ‘how to read/review a paper’ section in Advanced Topics in Computer Systems by Steven Hand. PresentationsPresentations should be about 20-25 minutes long, where you need to cover the following aspects.
The following document aids in presenting a review.
How to write a survey paperA survey paper provides the readers with an exposition of existing work that is comprehensive and organized. It must expose relevant details associated in the surveying area, but it is important to keep a consistent level of details and to avoid simply listing the different works. Thus a good survey paper should demonstrate a summary of recent research results in a novel way that integrates and adds understanding to work in the field. For example, you can take an approach by classifying the existing literature in your own way; develop a perspective on the area, and evaluate trends. Thus, after defining the scope of your survey, 1) classify and organize the trend, 2) critical evaluation of approaches (pros/cons), and 3) add your analysis or explanation (e.g. table, figure). Also adding reference and pointer to further in-depth information is important (summary from Rich Wolski’s note). Papers for OS Principles (Distributed Storage and Deterministic Parallelism)
3. F. Chang et al: BigTable: A Distributed Storage System for Structured Data, USENIX OSDI 2006. 4. G. DeCandia et al: Dynamo: Amazon's Highly Available Key-value Store, ACM SOSP 2007.
6. A. Aviram, et al: Efficient System-Enforced Determistic Parallelism, USENIX OSDI 2010. 7. T. Liu et al: Dthreads: Efficient and Determistic Multithreading, ACM SOSP 2011. Contact EmailPlease email to eiko.yoneki@cl.cam.ac.uk for your submission of course work or any question. |