Chapter 25 LINUX, Solaris and contemporary UNIX

Exercises

25-1 (a) What are the advantages of structuring an OS using kernel loadable modules?  Are there any disadvantages?

It allows new functionality to be introduced without requiring that the machine be rebooted.  This is particularly important for supporting ‘plug-and-play’ devices which can be added or removed without needing to turn off the machine.  Loadable modules also allow the system to support a wide range of devices, file systems, network protocols and so on without requiring that the code that implements them remains resident in memory.  The main disadvantage is that, once loaded, there is no isolation between a module and the remainder of the kernel: a buggy or malicious module can corrupt arbitrary kernel state.

(b) Suggest three components of the kernel which could not easily be implemented in this way.

Some portions of the kernel must always be resident for the system to operate at all – for instance the process scheduler, the basic IPC mechanisms and the core memory-management code.  These could not easily be implemented as loadable modules because the mechanism for loading a module itself depends on them.

 

25-2 (a) What did the designers of BSD 4.2 seek to achieve by performing network protocol processing at a lower interrupt priority level than the level at which the device driver initially executes? 

Switching to a lower interrupt priority level (IPL) allows device interrupts to be re-enabled more quickly, reducing the likelihood that the internal buffers on a network device will be exhausted.

(b) Describe the phenomenon of receive livelock and a situation in which it could occur.

Receive livelock occurs when there is insufficient time to finish the work initiated by interrupt delivery – for instance, although there may be time to accept a vast number of Ethernet packets from the device, there may not be time to perform higher-level protocol processing on them.  The system spends all of its time handling interrupts and makes no useful progress.

(c) A designer suggests that receive livelock could be avoided by performing network protocol processing within a new system process ‘nwproto’ executing in kernel mode but under the control of the CPU scheduler.  Do you agree that this would be a worthwhile change?

This would allow the amount of time spent in network protocol processing to be controlled (assuming that the CPU scheduler provides such facilities).  Although it may still not be possible to complete network protocol processing for all packets that are received, it may be possible to allow other processes to continue acceptable operation.  The scheme may work better if it also limits the rate at which interrupts are received from the network device so that protocol processing is not even initiated on more packets than can be managed correctly.

 

25-3 Compare and contrast the IPC mechanisms provided by SVr4 UNIX and the mmap interface developed for BSD.  What are the strengths and weaknesses of each?

See the introduction to sections 25.3 and 25.4.

 

25-4 Figure 25.5 shows two processes that are operating in a system with three shared memory segments defined. (a) What system call would the processes use to make shared segment 1 available to both of them? (b) Process A maps the segment at address 0x10000 and Process B maps it at address 0x20000 in its virtual address space.  What problems would arise storing a linked-list data structure in this segment?  Suggest how these problems could be overcome.

The shmat system call would be used to attach to an existing shared segment.  If the two processes map the segment at different virtual addresses then they cannot directly use pointer-based data structures within this region because the pointers will be interpreted differently by each process.  These problems could be overcome by storing addresses in the shared segment in a relative form, perhaps relative to the start of the segment or relative to the address that holds the pointer.

 

25-5 A ‘big reader’ lock provides multiple-reader-single-writer semantics optimised for workloads in which updates are rare.  Using the compare and swap instruction sketch a design of such a lock by analogy with the simple spin lock design presented in Figure 10.5.

(A full implementation can be found in the Linux kernel source code.)

In outline, aside from correctly implementing MRSW semantics, it is worthwhile ensuring that each CPU accesses separate memory locations when attempting to gain access to the lock in the common ‘reader’ mode.  For example, each CPU could have an individual flag indicating whether or not it is currently reading from the protected data structure.  This will interact well with the processor data caches on a multi-processor machine because each flag can remain local to its associated CPU (assuming each is on a separate cache line).  In order to acquire the lock in write mode, a thread would have to update all of the flags to a special value indicating that a write is in progress.  This makes writing more costly, but may improve the performance of read operations.

 

25-6 A server process is going to deal with a small number of concurrent clients, each of which makes a complex series of interactions with it over a TCP connection.  Which of the structures suggested in Figure 25.8 would be most appropriate?  Justify your answer.

There are only a few clients so the frequency of context switches is less of a concern than in a server with many clients.  The complexity of the interactions may make it easier to structure the server as a number of threads, with one for each client, since each thread can directly maintain state about the associated client.

 

25-7 The sendfile system call, provided on some UNIX systems, transfers data from one file descriptor to another without further intervention from the process.  Why are the source and destination specified using file descriptors rather than file names?

It allows sendfile to be used with resources that are not named in the file system – for example with a file descriptor representing a TCP network connection that the process has open.

 

25-8 Some UNIX systems and tools attempt to intercept accesses to files with names such as ‘/dev/tcp/www.cl.cam.ac.uk/80’ as requests to open a TCP connection to the specified server on a specified port.  (a) Is this functionality best provided in the shell, in a kernel loadable module or as a core part of the file subsystem? (b) Describe the strengths and weaknesses of this approach in comparison to using the sockets API directly.

This functionality is probably best provided by a kernel loadable module, perhaps as a special kind of file system.  However, implementing it may not be straightforward, depending on the environment in which the file system code is executed – in particular, whether it is able to initiate network connections.  The shell could instead provide the functionality by interpreting such names itself when opening the standard input, output or error streams for a process.  However, this leads to a confused naming scheme because other processes would not be aware of the special interpretation of such names.  Compared with using the sockets API directly, this scheme may be easier for application programmers to use and avoids the need for small ‘wrapper’ applications which open a network connection and then fork an existing application to communicate over it.

 

25-9 The JVM is running on an otherwise unloaded 4-processor machine running the Solaris operating system. Would you expect each instance of java.lang.Thread to be associated with a separate LWP?

You would certainly expect that multiple threads running within the JVM could execute with genuine parallelism on the 4-processor machine.  It may be that the JVM provides 4 LWPs and then performs additional multiplexing of threads over those, or it may be that it exposes each thread directly as an LWP.  The former solution may be preferable when there are a large number of threads.  The latter solution may simplify the JVM implementation.