Chapter 25 LINUX, Solaris and contemporary UNIX
Exercises
25-1 (a) What are the advantages of structuring an OS using kernel loadable modules? Are there any disadvantages?
Loadable modules allow new functionality to be introduced without requiring that the machine be rebooted. This is particularly important for supporting ‘plug-and-play’ devices, which can be added or removed without turning off the machine. Loadable modules also allow the system to support a wide range of devices, file systems, network protocols and so on without requiring that the code implementing them remains resident in memory. The main disadvantage is that, once loaded, there is no isolation between a module and the remainder of the kernel: a faulty or malicious module can crash or corrupt the whole system.
(b) Suggest three components of the kernel which could not easily be implemented in this way.
Some portions of the kernel must always be available for correct operation and so cannot be loaded on demand – for instance, the process scheduler, the basic IPC mechanisms and support for memory management.
25-2 (a) What did the designers of BSD 4.2 seek to achieve by performing network protocol processing at a lower interrupt priority level than the level at which the device driver initially executes?
Switching to a lower IPL allows interrupts to be re-enabled more quickly, reducing the likelihood that the internal buffers on a network device will be exhausted.
(b) Describe the phenomenon of receive livelock and a situation in which it could occur.
Receive livelock occurs when there is insufficient time to finish performing the operations that are initiated by interrupt delivery – for instance, although there may be time to receive a vast number of Ethernet packets, there may not be time to perform higher level protocol processing on them. The system makes no useful progress.
(c) A designer suggests that receive livelock could be avoided by performing network protocol processing within a new system process ‘nwproto’ executing in kernel mode but under the control of the CPU scheduler. Do you agree that this would be a worthwhile change?
This would allow the amount of time spent in network protocol processing to be controlled (assuming that the CPU scheduler provides such facilities). Although it may still not be possible to complete network protocol processing for all packets that are received, it may be possible to allow other processes to continue acceptable operation. The scheme may work better if it also limits the rate at which interrupts are received from the network device so that protocol processing is not even initiated on more packets than can be managed correctly.
25-3 Compare and contrast the IPC mechanisms provided by SVr4 UNIX and the mmap interface developed for BSD. What are the strengths and weaknesses of each?
See the introduction to sections 25.3 and 25.4.
25-4 Figure 25.5 shows two processes that are operating in a system with three shared memory segments defined. (a) What system call would the processes use to make shared segment 1 available to both of them? (b) Process A maps the segment at address 0x10000 and Process B maps it at address 0x20000 in its virtual address space. What problems would arise storing a linked-list data structure in this segment? Suggest how these problems could be overcome.
The shmat system call would be used to attach to an existing shared segment. If the two processes map the segment at different virtual addresses then they cannot directly use pointer-based data structures within this region because the pointers will be interpreted differently by each process. These problems could be overcome by storing addresses in the shared segment in a relative form, perhaps relative to the start of the segment or relative to the address that holds the pointer.
25-5 A ‘big reader’ lock provides multiple-reader-single-writer semantics optimised for workloads in which updates are rare. Using the compare and swap instruction, sketch a design of such a lock by analogy with the simple spin lock design presented in Figure 10.5. (A full implementation can be found in the Linux kernel source code.)
In outline, aside from correctly implementing MRSW semantics, it is worthwhile ensuring that each CPU accesses separate memory locations when attempting to gain access to the lock in the common ‘reader’ mode. For example, each CPU could have an individual flag indicating whether or not it is currently reading from the protected data structure. This will interact well with the processor data caches on a multi-processor machine because the flags can each remain local to their associated CPU (assuming each is on a separate cache line). In order to acquire the lock in write mode, that thread would have to update all of the flags to a special value indicating that a write is in progress. This makes writing more costly, but may improve the performance of read operations.
25-6 A server process is going to deal with a small number of concurrent clients, each of which makes a complex series of interactions with it over a TCP connection. Which of the structures suggested in Figure 25.8 would be most appropriate? Justify your answer.
There are only a few clients, so the frequency of context switches is less of a concern than in a server with many clients. The complexity of the interactions may make it easier to structure the server as a number of threads, with one for each client, since each thread can directly maintain state about the associated client.
25-7 The sendfile system call, provided on some UNIX systems, transfers data from one file descriptor to another without further intervention from the process. Why are the source and destination specified using file descriptors rather than file names?
It allows sendfile to be used with resources that are not named in the file system – for example with a file descriptor representing a TCP network connection that the process has open.
25-8 Some UNIX systems and tools attempt to intercept accesses to files with names such as ‘/dev/tcp/www.cl.cam.ac.uk/80’ as requests to open a TCP connection to the specified server on a specified port. (a) Is this functionality best provided in the shell, in a kernel loadable module or as a core part of the file subsystem? (b) Describe the strengths and weaknesses of this approach in comparison to using the sockets API directly.
This functionality is probably best provided by a kernel loadable module, perhaps as a special kind of file system. However, implementing this may not be straightforward depending on the environment in which the file system code is executed – in particular whether it is able to initiate network connections. The shell could provide the functionality by allowing such names to be opened as the standard input, output or error streams for a process. However, this may lead to a confused naming scheme because other processes would not be aware of the special interpretation of such names. Compared with using the sockets API directly, this scheme may be easier for application programmers to use and avoids the need to construct small ‘wrapper’ applications which open a network connection and then execute an existing program to communicate over it.
25-9 The JVM is running on an otherwise unloaded 4-processor machine running the Solaris operating system. Would you expect each instance of java.lang.Thread to be associated with a separate LWP?
You would certainly expect that multiple threads running within the JVM could execute with genuine parallelism on the 4-processor machine. It may be that the JVM provides 4 LWPs and then performs additional multiplexing of threads over those, or it may be that it exposes each thread directly as an LWP. The former solution may be preferable when there are a large number of threads; the latter may simplify the JVM implementation.