Computer Laboratory

Data Centric Networking (2011-2012 Lent Term)

DCN - R202

Overview

This module provides an introduction to data centric networking, in which data itself is the communication token, and to its impact on computer system architecture. Data centric networking in distributed systems relies on content addressing instead of host addressing, thus providing network independence for applications. Integration of complex data processing with networking is a key vision for future computing. This course covers various aspects of data centric networking, ranging from content-based routing and data-flow programming to graph structured data processing, providing a solid basis for working on the next generation of communication paradigms. On completion of this module, you should:

  • Understand key concepts of data centric approaches in future networking
  • Obtain a clear understanding of building distributed systems using a data centric approach

Module Structure

The module consists of 8 sessions, of which 6 each focus on a specific aspect of data centric networking research. Each session discusses 2-3 papers, led by the assigned students. Each student will present about 2 paper reviews during the course. The first session gives advice on how to read/review a paper and a brief introduction to different perspectives in data centric networking. The last session is dedicated to the presentation of the open source project studies by the students. One hands-on session on data-flow programming and two guest lectures are planned (subject to change), covering inspiring current research in the data centric networking domain.

Schedule and Reading List

We’ll meet in SW01 every Tuesday from January 24 to March 13, 2012. The time slot is 14:00-16:00.

 2012/01/24 Session 1: Introduction to Data Centric Networking (DCN)

  • Introduction (slides)
  • Assignment details
  • Guidance on how to read/review/present a paper
  • Various Faces of Data Centric Networking (slides)

 2012/01/31 Session 2: Content-Centric Networking (CCN) and Content Distribution Networks (CDN)

1. T. Koponen, M. Chawla, B. Chun, K. Kim, S. Shenker, A. Ermolinskiy, I. Stoica: A Data-Oriented (and Beyond) Network Architecture, SIGCOMM, 2007.

Xinghong Fang (Slides)
2.1. V. Jacobson, D. Smetters, J. Thornton, M. Plass, N. Briggs, R. Braynard: Networking Named Content, CoNEXT, 2009.
2.2. V. Jacobson, D. Smetters, J. Thornton, M. Plass, N. Briggs, R. Braynard: Networking Named Content, CACM, January, 2012.

Valentin Dalibard (Slides)
3. A. Ghodsi, T. Koponen, B. Raghavan, S. Shenker, A. Singla, and J. Wilcox: Information-Centric Networking: Seeing the Forest for the Trees, HotNets, 2011.
 
4. P. Jokela, A. Zahemszky, C. E. Rothenberg, S. Arianfar, and P. Nikander: LIPSIN: Line Speed Publish/Subscribe Inter-networking, SIGCOMM, 2009.

  •  Content distribution overlay

1.1. A. Carzaniga, D.S. Rosenblum, A.L. Wolf: Achieving scalability and expressiveness in an internet-scale event notification service, PODC, 2001.
1.2. A. Carzaniga, M.J. Rutherford, A.L. Wolf: A Routing Scheme for Content-Based Networking, INFOCOM, 2004.

Thomas Pasquier (Slides)
1.3. A. Carzaniga, A.L. Wolf: Forwarding in a content-based network, SIGCOMM, 2003.

2. M. Castro, M. B. Jones, A-M. Kermarrec, A. Rowstron, M. Theimer, H. Wang and A. Wolman: An Evaluation of Scalable Application-level Multicast Built Using Peer-to-peer overlays, INFOCOM, 2003.

3.1. S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker: A scalable content addressable network, SIGCOMM, 2001.
3.2. S. Ratnasamy, M. Handley, R. Karp, S. Shenker: Application-level multicast using content addressable networks, NGC, 2001.
 
4.1. M.J. Freedman, E. Freudenthal, D. Mazières: Democratizing Content Publication with Coral, NSDI, 2004.
4.2. M.J. Freedman: Experiences with CoralCDN: A Five-Year Operational View, NSDI, 2010.

 2012/02/07 Session 3: MapReduce Hands-on Tutorial using CIEL with Amazon EC2



 2012/02/14 Session 4: Programming in Data Centric Environment

  • Network meets programming

Arman Idani (Slides)
1. Yuan Yu, Michael Isard, D. Fetterly, M. Budiu, U. Erlingsson, P.K. Gunda, J. Currey: DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language, OSDI, 2008.

2.1. Boon Thau Loo, Tyson Condie, Joseph M. Hellerstein, Petros Maniatis, Timothy Roscoe, and Ion Stoica: Implementing Declarative Overlays, SOSP, 2005.
2.2. Boon Thau Loo, Tyson Condie, Minos Garofalakis, David E. Gay, Joseph M. Hellerstein, Petros Maniatis, Raghu Ramakrishnan, Timothy Roscoe, Ion Stoica: Declarative Networking, Communications of the ACM, Vol. 52 No. 11, pp. 87-95, 2009.

3. Peter Alvaro, Tyson Condie, Neil Conway, Khaled Elmeleegy, Joseph M. Hellerstein, Russell Sears: BOOM Analytics: exploring data-centric, declarative programming for the cloud, EuroSys, 2010.
 
Robert Hoff (Slides)
4. J. Dean, S. Ghemawat: MapReduce: Simplified Data Processing on Large Clusters, OSDI, 2004.

Jonathan Humphrey (Slides)
5. Derek Murray, Malte Schwarzkopf, Christopher Smowton, Steven Smith, Anil Madhavapeddy and Steven Hand: CIEL: a universal execution engine for distributed data-flow computing, NSDI, 2011.

 2012/02/21 Session 5: Stream Data Processing and Data/Query Model 

1. V. Gulisano, R. Jimenez-Peris, M. Patiño-Martinez, P. Valduriez: StreamCloud: A Large Scale Data Streaming System, ICDCS, 2010.

2. Peter Pietzuch, Jonathan Ledlie, Jeffrey Shneidman, Mema Roussopoulos, Matt Welsh, and Margo Seltzer: Network-Aware Operator Placement for Stream-Processing Systems, ICDE, 2006.

3.1. Geoffrey Mainland, Greg Morrisett, Matt Welsh: Flask: Staged Functional Programming for Sensor Networks, ICFP, 2008.
3.2. Geoffrey Mainland, Matt Welsh, Greg Morrisett: Flask: A Language for Data-driven Sensor Network Programs, Harvard University Technical Report TR-13-06, 2006.
 
4. S. Babu, J. Widom: Continuous Queries over Data Streams, SIGMOD Record 30(3), 2001.  

Hao Zhang (Slides)
5. T. Condie, N. Conway, P. Alvaro, J. M. Hellerstein, K. Elmeleegy, and R. Sears: MapReduce Online, NSDI, 2010.

Thomas Pasquier (Slides)
6. E. Zeitler and T. Risch: Massive scale-out of expensive continuous queries, VLDB, 2011.

 2012/02/28 Session 6: Graph Structured Data: Network, Storage, and Query Processing 

  • Scalable distribution of graph structured data for query, storage, and networking

1. J. Pujol, V. Erramilli, G. Siganos, X. Yang, N. Laoutaris, P. Chhabra, P. Rodriguez: The Little Engine(s) That Could: Scaling Online Social Networks, SIGCOMM, 2010.

Arman Idani (Slides)
2. G. Malewicz, M. Austern, A. Bik, J. Dehnert, I. Horn, N. Leiser, and G. Czajkowski: Pregel: A System for Large-Scale Graph Processing, SIGMOD, 2010.

Valentin Dalibard (Slides)
3. Kunwadee Sripanidkulchai, Bruce Maggs, Hui Zhang: Efficient content location using interest-based locality in peer-to-peer systems, INFOCOM, 2003.
 
Valentin Dalibard (Slides)
4. A. Lakshman, P. Malik: Cassandra - A Decentralized Structured Storage System, LADIS, 2009.

 2012/03/06 Session 7: Network holds Data in Delay Tolerant Networks (DTN)

  • Network holds data

1.1. E. Nordström, P. Gunningberg, C. Rohner: A Search-based Network Architecture for Mobile Devices, Uppsala University Technical Report 2009-003, 2009.
Jonathan Humphrey (Slides)
1.2. N. Ristanovic, G. Theodorakopoulos and J.-Y. Le Boudec: Traps and Pitfalls of Using Contact Traces in Performance Studies of Opportunistic Networks, INFOCOM, 2012.


Xinghong Fang (Slides)
2.1. N. Laoutaris, G. Smaragdakis, P. Rodriguez, R. Sundaram: Delay Tolerant Bulk Data Transfers on the Internet, SIGMETRICS, 2009.
2.2. N. Laoutaris, M. Sirivianos, X. Yang, P. Rodriguez: Inter-Datacenter Bulk Transfers with NetStitcher, SIGCOMM, 2011.

Hao Zhang (Slides)
3. M. Grossglauser, D. Tse: Mobility increases the capacity of ad-hoc wireless networks, IEEE/ACM Trans. on Networking, 10:477–486, 2002.
 
4. K. Fall: A delay-tolerant network architecture for challenged internets, SIGCOMM, 2003.

Compact routing (Optional)

5.1. Dmitri Krioukov, kc claffy, Kevin Fall, Arthur Brady: On compact routing for the Internet, ACM SIGCOMM Computer Communication Review 37(3), 2007.
5.2. Dmitri Krioukov, Kevin Fall, Xiaowei Yang: Compact routing on Internet-like graphs, INFOCOM, 2004.  

 2012/03/13 Session 8: Presentation of Open Source Project Study

  • Presentation of Open Source Project Study by all (10 minutes of presentation and 5 minutes of Q&A for each student)

  1. 14:00 Valentin Dalibard (CIEL) Implementing the Bulk Synchronous Parallel Model in CIEL (Slides)
  2. 14:15 Xinghong Fang (Blackadder) Bandwidth Efficient Multimedia Communication Tools using Blackadder (Slides)
  3. 14:30 Robert Hoff (XML Blaster)
  4. 14:45 Jonathan Humphrey (DTN) Delay-tolerant Networking: Routing Protocol Development With The One Simulator (Slides)
  5. 15:00 Arman Idani (CIEL/MapReduce) Multimedia Data Processing on CIEL (Slides)
  6. 15:15 Thomas Pasquier (CCN/NDN) CCN/NDN (Slides)
  7. 15:30 Hao Zhang (CoralCDN) CORAL Content Distribution Network: The Design and Performance Analysis (Slides)

    15:45-16:00 Wrap-up Discussion (Slides)

 Coursework 1 (Reading Club)

The reading club will require you to read between 1 and 3 papers every week. Before each session, fill out a review_log (MS Word format, text format), except sections 6&7, and email it to me by 12:00 noon on Monday. After the session, complete sections 6&7 and email it to me by the end of the following Wednesday. At least one review_log is required, but you can read as many papers as you want and fill out a review_log for each paper you read.

At each session, around 3 papers are selected under the session topic. If you are assigned to present your review work, please prepare slides for a 20-25 minute presentation. Your presented material should also be emailed by the following day (Wednesday). You will present your review work approximately twice during the course. Papers are of the following two types, and you can focus on the specified aspects when reviewing each.

  1. Full length papers 
    • What is the significant contribution?
    • What is the difference from existing work?
  2. Short length papers 
    • What is the novel idea?
    • What is required to complete the work?

 Coursework 2 (Reports)

The following three reports are required. Each may extend a reading assignment from the reading club or address a different topic within the scope of data centric networking.

  1. Review report on a full length paper (1800 words)
    • Describe the contribution of paper in depth with criticism
    • Crystallise the significant novelty in contrast to the other related work
    • Suggest future work
  2. Survey report on a sub-topic in data centric networking (aim for 1500-2000 words; max 2000 words)
    • Pick up to 5 papers as core papers in your survey scope
    • Read the above and expand your reading through related work
    • Form your own view and write it up as your survey paper
    • See how to write a survey paper
  3. Project study and exploration of a prototype (2500 words)
    • What is the significance of the project in the research domain?
    • Compare with similar and succeeding projects
    • Demonstrate the project by exploring its prototype
    • Please email your project selection (MS Word format, text format; <150 words) by February 10, 2012
    • Project presentation on March 13, 2012

Reports 1 and 2 should be handed in by the end of the 5th week (Feb 21, 2012 - 12:00 noon) and the 7th week (March 19, 2012 - 12:00 noon) of the course (not in any particular order). Report 3 should be handed in by the end of the Lent term (March 28, 2012 - 12:00 noon).

 Assessment

The final grade for the course will be provided as a letter grade or percentage and the assessment will consist of two parts:

  1. 25%: for the reading club (Presentation, participation and review_log)
  2. 75%: for the three reports
    • 20%: Intensive review report
    • 25%: Survey report
    • 30%: Project study

Open Source Projects

See the candidate Open Source Projects in data centric networking. The list is not exhaustive. If you choose a project that is not on the list, please discuss it with me. The purpose of this assignment is to understand the proposed architecture, algorithms, and systems by running an actual prototype, and to present and explain to others how the prototype runs, the setup process of the prototype, and any additional work you have done, including your own applications. This experience will give you a better understanding of the project. These Open Source Projects come with a set of published papers, and you should be able to examine the aspects of the papers that interest you by running the prototype. Some projects are rather large and may require an extensive environment and considerable time; make sure you are able to complete this assignment.

How to Read/Review a Paper

The following papers give guidance on how to read/review a paper.

Further supplement: see the ‘how to read/review a paper’ section in Advanced Topics in Computer Systems by Steven Hand.

Presentations

Presentations should be about 20-25 minutes long and should cover the following aspects.

  1. What are the background and the problem domain of the paper? What is the motivation of the presented work? What is the difference from existing work? What is the novel idea? How did the paper change (or fail to change) research in the community?

  2. What is the significant contribution? How did the authors tackle the problem? Did the authors obtain the expected results from their evaluation?

  3. What do you think of the paper, and why? What is the takeaway message for you (and for the research community)? What is required to complete the work?

The following document gives guidance on presenting a review.

How to write a survey paper

A survey paper provides readers with a comprehensive and organized exposition of existing work. It must present the relevant details of the surveyed area, but it is important to keep a consistent level of detail and to avoid simply listing the different works. A good survey paper should therefore summarize recent research results in a novel way that integrates and adds understanding to work in the field. For example, you can classify the existing literature in your own way, develop a perspective on the area, and evaluate trends. Thus, after defining the scope of your survey, 1) classify and organize the trends, 2) critically evaluate the approaches (pros/cons), and 3) add your own analysis or explanation (e.g. table, figure). Adding references and pointers to further in-depth information is also important (summary from Rich Wolski’s note).

Contact Email

Please email eiko.yoneki@cl.cam.ac.uk to submit coursework or to ask any questions.