Malcolm Scott

Supervision exercises: Computer Networking (2017)

You are expected to produce solutions to all the questions for each supervision. I don't expect long essay-style answers or intricate diagrams (I've listed a number of marks for each question as a clue to how much you should write); spend longer thinking about the questions than writing your answers.

Email your solutions to me at least one working day before the supervision starts (e.g. for a supervision at 1pm on Monday, submit work by 1pm on Friday). If you prefer to hand-write your answers, I'd much prefer it if you could email scans. I prefer PDF, formatted with wide margins and plenty of white space, so that I can annotate it.

When emailing me regarding supervisions (or any other lab business) please only use my lab address, or your email will be misfiled and may slip by unnoticed:

Some of these questions were inspired by, adapted from or blatantly stolen from numerous sources in the lab and elsewhere, to whom I am grateful.

Supervision 1

NB: these questions start around Topic 4 (network layer) and assume big-picture knowledge of protocol layering etc.

1.0. Modern applications vs. the hourglass model (review question)

The Internet was designed according to an "hourglass model" in which IP acts as a lowest-common-denominator network layer protocol, underneath a variety of transport layer protocols and countless application layers (and, in turn, masking the details of various different data-link and physical layers from the application).

However, modern Internet applications have moved the neck of the hourglass upwards, often using HTTP (or HTTPS) as a lowest-common-denominator protocol, despite HTTP being originally intended only for the transfer of web pages. For example, Twitch.tv streams realtime live video over HTTP, Facebook uses HTTPS for text chat, Gmail provides an HTTPS interface to email and Skype will in some circumstances use HTTPS for realtime audio/video calls.

Why did the designers of these and other applications choose to use HTTP for a purpose other than that for which it was designed? What benefit does this bring? [4 marks]
What are the drawbacks of using HTTP for realtime audio/video? [2 marks]
What effect might this ubiquity of HTTP have on future software developers, network engineers and/or computer scientists? [3 marks]

1.1. Network layer protocols

Draw diagrams showing the structure of an IPv4 packet and of an IPv6 packet. Briefly state the purpose of each header field. (You might want to make use of Wireshark to examine some real-life IP headers and how they relate to the raw data in a packet.) [10 marks]
Where in relation to this structure would you look to find a link-layer header (e.g. Ethernet MAC) and a transport-layer header (e.g. TCP or UDP)? [2 marks]
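
If you don't have Wireshark to hand, the relationship between raw bytes and header fields can also be explored in a few lines of Python. The following is only a minimal sketch: the function name `parse_ipv4_header` and the `SAMPLE` bytes are made up for illustration (the addresses come from the reserved documentation ranges and the checksum is left as zero rather than computed), and it decodes only the fixed 20-byte part of an IPv4 header, ignoring any options.

```python
import ipaddress
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    # "!" selects network (big-endian) byte order, as used on the wire.
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,                      # top 4 bits of byte 0
        "header_length_bytes": (ver_ihl & 0x0F) * 4,  # IHL counts 32-bit words
        "tos": tos,
        "total_length": total_len,
        "identification": ident,
        "flags": flags_frag >> 13,                    # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,       # low 13 bits
        "ttl": ttl,
        "protocol": proto,
        "header_checksum": checksum,
        "source": str(ipaddress.IPv4Address(src)),
        "destination": str(ipaddress.IPv4Address(dst)),
    }


# Hypothetical hand-made header, NOT a real capture; the checksum is left as 0.
SAMPLE = bytes.fromhex("4500003c1c4640004006" "0000" "c0000201" "c6336407")

for field, value in parse_ipv4_header(SAMPLE).items():
    print(f"{field:>20}: {value}")
```

Try pasting in the first 20 bytes of a real packet (Wireshark's "Copy as Hex Stream" is one way to obtain them) and compare the output with what Wireshark shows.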
Automatic configuration is the process by which a node obtains an IP address and other details necessary to communicate at the network layer.
1. IPv4 autoconfiguration is handled by DHCP. Describe the main configuration parameters which DHCP provides to nodes, and the purpose of each. [4 marks]
2. Compare and contrast DHCP with IPv6 autoconfiguration, in terms of
  - how IP addresses are assigned to nodes,
  - the configuration parameters provided to nodes,
  - the interactions between nodes and the configuration server, and
  - the design motivation
  [8 marks]
Describe the evolution of IPv4 addressing (class-based networks, subnetting, CIDR) including, for each:
- the basic operation
- the motivation for this scheme's introduction
- how a router looks up the next hop for a packet
[9 marks]
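
As a starting point for the CIDR part of the question above, here is a small sketch of a longest-prefix-match lookup. The forwarding table `TABLE` and the function `next_hop` are hypothetical (addresses are taken from the documentation ranges), and the linear scan is chosen for readability only; real routers use much faster data structures.

```python
import ipaddress

# Hypothetical forwarding table of (prefix, next hop) pairs; not from a real router.
TABLE = [
    ("0.0.0.0/0", "192.0.2.1"),          # default route
    ("198.51.100.0/24", "192.0.2.2"),
    ("198.51.100.128/25", "192.0.2.3"),  # more specific than the /24 above
]


def next_hop(destination, table=TABLE):
    """Return the next hop of the longest matching prefix, or None if no match."""
    dest = ipaddress.ip_address(destination)
    best = None  # (network, hop) of the most specific match seen so far
    for prefix, hop in table:
        net = ipaddress.ip_network(prefix)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return best[1] if best else None


for address in ("198.51.100.200", "198.51.100.5", "203.0.113.9"):
    print(address, "->", next_hop(address))
```

Consider how this lookup would differ (and what could be simplified) under the original class-based scheme.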
During the last few years, almost all of the remaining IPv4 address space has been allocated. State a few consequences this could have for the efficiency of IPv4 addressing and routing. [3 marks]

1.2. Routing protocols

State the purpose of routing protocols and routing tables. [2 marks]
What is an Autonomous System (AS)? Contrast interior and exterior routing without describing in detail specific routing algorithms or protocols. [6 marks]
What are the pros and cons of distance vector versus link state routing protocols? Give examples from protocols in use today. [10 marks]
Where are hybrid schemes employed and why? [5 marks]
Find out how to view the IPv4 routing table on your computer. How might this differ (in ways other than size) from the routing table on a router on the university network, and from the routing table on a router connected to an Internet Exchange Point (e.g. LINX)? [3 marks]

(First part of question is loosely adapted from 2008 Paper 8 Question 3)

Supervision 2

2.0. Transport layer protocol selection

Explain in what situations you would choose to use TCP, and when you would use UDP. You might like to consider example cases such as:
- Fast transactions (where you want to send a short notification to another system with minimal latency)
- Bulk data transfers
- Timeliness (suppose data is being output from a sensor at one sample per second, and it is important for the receiver to have the most recent value of the sensor's reading rather than all values—i.e. it is better for the receiver to get the current value rather than an outdated value)
[4 marks]—inspired by Computer Networking, 3rd ed, review questions 2.3 & 2.5
UDP provides very few features above those provided by IP, and is used by applications which do not require TCP's reliability and flow control (e.g. live video streaming). Why could such an application not just use IP directly without a transport layer protocol? (Hint: state a feature provided by UDP but not by IP.) [2 marks]
DNS (the Domain Name System, an application layer protocol used to look up the corresponding IP addresses of host names such as www.cl.cam.ac.uk) originally only used UDP. Suggest reasons why DNS might have been designed based on UDP rather than TCP. [4 marks]
Recently the internet has started to deploy DNSSEC (DNS Security extensions), which involves transmitting cryptographic signatures along with DNS data. These signatures are sometimes multiple kilobytes in size. Why might this motivate the use of TCP rather than UDP for DNS? (No specific knowledge of DNSSEC is required to answer this question.) [4 marks]

2.1. TCP performance

Use the approximate equation for throughput as a function of drop rate:

throughput = (√1.5 × MSS) / (RTT × √p)

Assume RTT, the round trip time, is 40 ms and MSS, the maximum segment size, is 1000 bytes. p is the proportion of packets which are dropped: one packet is dropped in every 1/p packets. In the following questions ignore packet headers in your calculations.
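
To check your arithmetic, the equation above can be written as a small helper. This is only a sketch: the function name `tcp_throughput_bps` is made up, the defaults are the RTT and MSS given above, and the example drop rate of 1% is arbitrary and deliberately not one of the values asked for below. Note the conversion from bytes to bits.

```python
import math


def tcp_throughput_bps(p, rtt_s=0.040, mss_bytes=1000):
    """Approximate TCP throughput in bits per second for drop rate p.

    Implements throughput = (sqrt(1.5) * MSS) / (RTT * sqrt(p)),
    with MSS converted from bytes to bits and headers ignored.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be a proportion in the range (0, 1]")
    return math.sqrt(1.5) * mss_bytes * 8 / (rtt_s * math.sqrt(p))


# Arbitrary example: one packet in every hundred is dropped.
print(f"p = 0.01 gives {tcp_throughput_bps(0.01) / 1e6:.2f} Mbit/s")
```

With p = 0.01 this comes to roughly 2.45 Mbit/s. Because throughput is proportional to 1/√p, the drop rate must fall by a factor of four to double the throughput.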

What drop rate p would lead to a throughput of 1 Gbps (1 Gigabit per second)? [1 mark]
What drop rate p would lead to a throughput of 10 Gbps? [1 mark]
If the connection is sending data at a rate of 10 Gbps, how long on average is the time interval between drops? [1 mark]
What window size W (measured in multiples of the MSS) would be required to maintain a sending rate of 10 Gbps? [1 mark]
If a connection suffered a packet drop on reaching 10 Gbps, how long would it take for it to return to 10 Gbps after undergoing a fast retransmit? [1 mark]
What can you conclude about TCP's performance on high-capacity links? [4 marks]

2.2. Simplified TCP flow control

This question is adapted from 2002 Paper 9 Question 3, which is actually a Digital Communication II question. You should still be able to answer it, but you may give less detail than a Part II student might!

TCP incurs loss to discover the available capacity.

Why does it do this? [1 mark]
How can packet loss usually be detected without the sender waiting for an ACK timeout? How quickly can an overflowing router buffer be detected? [2 marks]
Describe the AIMD (Additive Increase, Multiplicative Decrease) mechanism, and fast retransmit, and show how this leads to the characteristic "saw tooth" throughput behaviour of TCP over time. [5 marks]
Consider a TCP connection operating in steady-state whereby each time the congestion window increases to W segments a single packet loss occurs (i.e. fast retransmit is working as intended, the congestion window is oscillating around the maximum capacity of the channel, and there is no other traffic competing for the channel). In terms of W and the round trip time (R), derive a simple formula for the time between the minimum and maximum data rates achieved. In this optimal scenario, how many packets are sent between each loss event? [4 marks]
Derive the connection's average throughput in terms of the fraction of packets lost (p), the connection's round trip time (R), and the segment size (B). [4 marks]
Under what conditions is this very simplistic model likely to be accurate? [3 marks]
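
Once you have attempted the derivations, you may like to compare them against a simulation. The following is a deliberately crude sketch of the idealised steady state described above, under stated assumptions: the window grows by one segment per round trip, exactly one loss occurs in the round where the window reaches W, and the window is then halved. The function name `aimd_trace` is made up, and slow start, timeouts and competing traffic are all ignored.

```python
def aimd_trace(w_max, rounds):
    """Return the congestion window (in segments) at each round trip."""
    cwnd = w_max // 2  # assume we start just after a loss
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd >= w_max:
            cwnd = max(1, cwnd // 2)  # multiplicative decrease after the loss
        else:
            cwnd += 1                 # additive increase: one segment per RTT
    return trace


# Crude text plot of the saw tooth for a small, arbitrary W.
for rtt, window in enumerate(aimd_trace(w_max=8, rounds=12)):
    print(f"RTT {rtt:2d}: {'#' * window}")
```

Summing the windows over one cycle counts the segments sent between consecutive losses, which you can compare with your formula for larger values of W.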

Supervision 3

3.0. Physical & Data Link Layers

Consider a communication network consisting of a room full of people, where one or more people are exchanging thoughts and ideas with one or more others by talking.
1. For each of the abstract terms
  - node
  - channel
  - entity
  - layer
  - transmission (the act thereof)
  - coding
  - addressing
  - multiplexing
  identify one or more corresponding concrete components or activities within the system. If the correspondence is not exact, give the closest approximation you can, and explain why it is not exact. [4 marks]
2. Briefly compare and contrast this network with a shared-media wireless Ethernet and with a single Ethernet link, for the following channel criteria:
  - physical medium [1 mark]
  - total capacity [1 mark]
  - maximum user-to-user capacity [1 mark]
  - medium access control [2 marks]
  - geographical area [1 mark]
  - failure modes [2 marks]
1. What kind of network traffic is suited to synchronous time-division multiplexing? [2 marks]
2. Give three ways in which asynchronous time-division multiplexing is more complex than synchronous time-division multiplexing. Why is asynchronous TDM used on the Internet? [5 marks]

Supervision 4

4.0. DNS details

A tool called "dig" is used to interrogate DNS servers for debugging purposes. It's fairly commonly available on UNIX systems (including linux.ds.cam.ac.uk which you can probably SSH to). Familiarise yourself with the dig command. (The manual page, also viewable with the command man dig, provides documentation; this quick tutorial might also be useful.) If you don't have access to a machine with dig installed, there is a web-based version available.

Find examples of at least five different types of DNS record via dig and briefly explain their purpose. [5 marks]
Manually perform an iterative DNS query of a domain name of your choice, by running a separate dig command for each iteration. Explain the process. [3 marks]
Why are iterative queries necessary? [8 marks]
Why are iterative queries typically not done by individual client hosts (i.e. your desktop, laptop or phone)? What do client hosts do instead? [3 marks]
Bonus question slightly outside the course: Clients may in some circumstances want to perform their own iterative queries regardless, instead of using the recursive DNS server provided by the network to which they are connected. Why? What extra guarantees can this provide?
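
If it helps to see the shape of an iterative lookup before trying it with dig, here is a toy model that runs entirely in memory and sends no real DNS traffic. Everything in it is hypothetical: the three "servers" in `SERVERS`, the name `www.example.test.` (the `.test` top-level domain is reserved for this kind of use) and the address, which comes from a documentation range. Real servers return referrals as NS records, often with glue addresses, which this sketch glosses over.

```python
# Toy stand-in for the DNS hierarchy; all names and addresses are hypothetical.
SERVERS = {
    "root": {"answers": {}, "referrals": {"test.": "tld"}},
    "tld": {"answers": {}, "referrals": {"example.test.": "auth"}},
    "auth": {"answers": {"www.example.test.": "192.0.2.80"}, "referrals": {}},
}


def resolve_iteratively(name, start="root", max_hops=10):
    """Follow referrals from `start` until some server answers for `name`.

    Returns (address, list of servers asked, in order).
    """
    server, asked = start, []
    for _ in range(max_hops):
        asked.append(server)
        zone = SERVERS[server]
        if name in zone["answers"]:
            return zone["answers"][name], asked
        # Pick the most specific delegation that covers the name.
        matches = [z for z in zone["referrals"]
                   if name == z or name.endswith("." + z)]
        if not matches:
            raise LookupError(f"{server} has no answer or referral for {name}")
        server = zone["referrals"][max(matches, key=len)]
    raise LookupError("too many referrals")


address, asked = resolve_iteratively("www.example.test.")
print(" -> ".join(asked), "=>", address)
```

Each step of the loop corresponds to one dig command in your manual query: you ask a server, read the referral in its response, and direct the next query at the server it names.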

4.1. Client/server vs. peer-to-peer

Compare (briefly) the relative merits of client/server and peer-to-peer application models. [4 marks]
Classify each of the systems below as client/server, peer-to-peer or hybrid, and explain why the designers may have decided to implement the system in that way:
1. eBay
2. Skype
3. BitTorrent
4. SSH (Secure Shell)
5. DNS
For those which are peer-to-peer, explain how a new client joins the network and locates the data it is interested in. [10 marks]—adapted from Computer Networking, 3rd ed, review question 2.1
The Cambridge supervision system can be thought of as a peer-to-peer network. The department or your college acts as a rendezvous point or tracker, keeping lists of students and supervisors and enabling them to communicate directly, with the intention that the supervisor will (in theory!) pass knowledge data to the student. Furthermore a student who has taken the course may then choose to become a graduate student and rejoin the network as a supervisor in order to reshare the data. Comment on the durability and scalability of this system, comparing it with one of the peer-to-peer systems from the previous question. [4 marks]

4.2. Scaling HTTP

Traditionally, a website would be served from a single HTTP server. A single server is sometimes not sufficient to handle the load of a very busy modern website, though. The most obvious concern—but not the only concern—is the sheer quantity of network traffic involved in serving a media-rich website.

List some other ways (besides the volume of network traffic exceeding the capabilities of the server's network connection) in which a single server may struggle to meet the demands of a busy website. [4 marks]
List some advantages and disadvantages of using a load balancer to allow multiple web servers to host a single website. [4 marks]
List some advantages and disadvantages of using a content distribution network (CDN) to cache this website. [2 marks]
Is anycast useful to improve the scalability of a website? Why / why not? [6 marks]

Review questions

For your revision supervision I suggest you answer a few past exam questions. Here's a selection for you to choose from.

Bear in mind that the course was substantially refactored in 2011; several of these are from older courses but have been selected to roughly align with the material you have covered. Digital Communication II is actually a closer match for parts of the new course than Digital Communication I is.

Please tell me at least 2 working days before the supervision which exam questions you will be attempting / have attempted (some of the examiner's notes need digging out of a locked filing cabinet ~~stuck in a disused lavatory with a sign on the door saying 'Beware of the Leopard'~~).

TCP

2011 Paper 5 Question 6 (TCP and IP fragmentation)
2010 Paper 5 Question 8 parts (b), (c) from Digital Communication I (TCP/IP packet processing on hosts)
1998 Paper 8 Question 2 from Digital Communication II (interesting high-level question on TCP)
2004 Paper 7 Question 2 from Digital Communication II (TCP)
2003 Paper 8 Question 3 from Digital Communication II (TCP)

Generic flow control

2012 Paper 5 Question 5 part (a) (latency, capacity and layer interactions)—don't bother answering part (b) as you've already done a superset of this question
2006 Paper 7 Question 2 from Digital Communication II (flow control)—possibly uses terminology not covered in this course, but you can probably work out what it means

Generic error control

2011 Paper 5 Question 5 (design a transport layer protocol)
1998 Paper 5 Question 3 from Digital Communication I (packet loss and error control)—note that "ARQ" is a somewhat old term for "error control" i.e. retransmission of lost packets; in the case of TCP, this is tied somewhat to flow control but is logically separate

Layering

2010 Paper 5 Question 8 part (a) from Digital Communication I (TCP/IP packet processing on hosts)
2007 Paper 5 Question 3 from Digital Communication I (layering and multiplexing)
1996 Paper 6 Question 3 from Digital Communication I (layers / protocol design)
2008 Paper 7 Question 3 from Digital Communication II (OSI model bookwork, and layer violations)
1995 Paper 5 Question 3 from Digital Communication I (layering + routing/addressing)—note that a "MAC level bridge" would these days more usually be called a switch

Routing, forwarding and addressing

2013 Paper 5 Question 6 parts (b)-(f) (routing loop avoidance in Ethernet and IP)
2008 Paper 6 Question 3 from Digital Communication I (addressing and routing)
2010 Paper 9 Question 6 from Digital Communication II (routing protocols)—note that part (c) is not covered by this course, but I have mentioned it in supervisions
1998 Paper 6 Question 3 from Digital Communication I (protocol design)
1995 Paper 5 Question 3 from Digital Communication I (layering + routing/addressing)—note that a "MAC level bridge" would these days more usually be called a switch

Media access control

2012 Paper 5 Question 6 part (b) (multiplexing)
2009 Paper 8 Question 5 from Digital Communication II (media access control and QoS)

Application Layer

2013 Paper 5 Question 5 (HTTP, caching and Content Distribution Networks)
2012 Paper 5 Question 6 part (a) (DNS)
2010 Paper 5 Question 7 from Digital Communication I (Skype; protocol design and tradeoffs)

Miscellanea

2011 Paper 5 Question 4 (essay question on whether the Internet has been a "success"—exactly the kind of question you shouldn't touch with a barge-pole in the exam, but try it if you like!)
2007 Paper 8 Question 4 from Digital Communication II (routing protocols + impact on other protocols)

Computer Laboratory