Computer Laboratory

Course pages 2012–13

Data Centric Networking

Principal lecturer: Dr Eiko Yoneki
Taken by: MPhil ACS, Part III
Code: R202
Hours: 16 (8 × two-hour seminar sessions)
Class limit: 30 students
Prerequisites: An undergraduate network architectures course

Aims

This module provides an introduction to data-centric networking, in which data itself acts as the communication token of the network and the large volume of data processing involved shapes the architecture of computer systems. Integrating complex data processing with networking is a key vision for future computing systems. The course covers various aspects of data-centric networking, ranging from content-based routing and data-flow parallel computing (e.g. MapReduce) to the processing of large graph-structured data.

Syllabus

The module consists of 8 sessions, 5 of which cover specific aspects of data-centric networking research. Each of these sessions discusses 2-3 papers, led by the assigned students, and each student will present about 2 paper reviews during the course. The 1st session advises on how to read and review a paper and gives a brief introduction to the different perspectives in data-centric networking. The 3rd session is a hands-on tutorial on MapReduce using data flow programming with Amazon EC2. The last session is dedicated to the students' presentations of their open-source project studies. Two guest lectures are planned, covering inspiring current research on data-centric networking.

  1. Introduction to Data-Centric Networking
    • Data centric networking from different perspectives
  2. Content-Centric Networking (CCN) and Content Distribution Networks (CDN)
    • Content-based routing
    • Content distribution overlay
    • Naming – content as the network identifier
    • Publish/Subscribe
    • Caching – Network as a storage
  3. MapReduce Tutorial
    • Hands-on tutorial session of MapReduce parallel computing using CIEL/Skywriting data flow programming
  4. Programming in Data Centric Environment
    • Network meets data flow programming
    • Parallel data processing (e.g. Map/Reduce, Dryad/LINQ, CIEL)
    • Declarative networking (e.g. P2, Declarative Sensor Network)
  5. Stream data processing and data/query model
    • Stream data processing and continuous query processing
    • Advanced data processing in networks (e.g. data model)
  6. Big Graph Data Processing
    • Graph Specific Data Parallel Programming
    • Graphs for the storage and querying of data – graph database
    • Distributed parallel query/storage platform for graph data
  7. Network holds data in delay tolerant networks
    • Delay tolerant data
    • Networked storage
    • Opportunistic networking
  8. Presentation of Open Source project study

Objectives

On completion of this module, students should:

  • understand key concepts of data centric approaches in future networking and systems;
  • obtain a clear understanding of building distributed systems using data centric programming and communication.

Coursework

Reading Club:

The reading club covers 1 to 3 papers every week. For each session, around 3 papers are selected on the given topic, and the students present their reviews of them.

Reports:

The following three reports are required. Each may extend a reading club assignment or address a different topic within the scope of data centric networking.

  1. Review report on a full-length paper (max 1800 words)
    • Describe the contribution of the paper in depth with criticisms
    • Crystallise the significant novelty in contrast to other related work
    • Suggest directions for future work
  2. Survey report on a sub-topic in data centric networking (max 2000 words)
    • Pick up to 5 papers as core papers in the survey scope
    • Read the above and expand reading through related work
    • Synthesise the views and write your own survey paper
  3. Project study and exploration of a prototype (max 2500 words)
    • What is the significance of the project in the research domain?
    • Compare with similar and succeeding projects
    • Demonstrate the project by exploring its prototype

Reports 1 and 2 should be handed in by the end of the 5th and 7th weeks of the course (in either order). Report 3 should be handed in by the end of the Lent term.

Assessment

The final grade for the course will be provided as a percentage and the assessment will consist of two parts:

  1. 25%: for reading club (participation)
  2. 75%: for the three reports
    • 20%: Intensive review report
    • 25%: Survey report
    • 30%: Project study

Recommended reading

[1] Malewicz, G., Austern, M., Bik, A., Dehnert, J., Horn, I., Leiser, N., & Czajkowski, G. (2010) Pregel: A System for Large-Scale Graph Processing, SIGMOD, 2010.
[2] Jacobson, V., Smetters, D.K., Thornton, J.D., Plass, M.F., Briggs, N.H., & Braynard, R.L. (2009) Networking named content, CoNEXT, 2009.
[3] Murray, D., Schwarzkopf, M., Smowton, C., Smith, S., Madhavapeddy, A., & Hand, S. (2010) Ciel: a universal execution engine for distributed data-flow computing, NSDI, 2010.

A complete list can be found on the course web page.

Notes:

R202 Data Centric Networking cannot be taken in conjunction with L21 Interactive Formal Verification in 2012-13.