PART IV CASE STUDIES

PART IV CASE STUDIES

Chapter 24 UNIX

Exercises

24-1 What is meant by saying that UNIX is a procedural system? How might the hardware support or assume this style of system design (see Sections 8.3 and 4.11, for an example)?

We must have at least two privilege states of processor execution and trap instructions which cause a change to privilege state in order to enter the OS. There may also be support in the memory management hardware for protected OS regions.

How does procedural execution of the kernel affect the design of the mechanism for synchronisation between the hardware and processes?

If there are no dedicated device handler processes the interrupt service routine cannot just signal the one interested process. We need a more general event structure.

How might a procedural design make it difficult to guarantee a bound for the time between an event occurring and a process that was waiting for it running at user-level?

Everything is done in-process. Your thread of control (which was sleeping, waiting for an event) may have to undertake work for other processes between being awakened (scheduled as a kernel thread) and returning to user mode.

24-2 Assume that all the directories and files in the following pathname and all the inodes except that of the root are on disk rather than in main memory. Assume that each directory may be read by a single disk access.

How many disk reads would it take to resolve the pathname:

/usr/guru/examples/C-examples/C-prog

We need to access 4 additional inodes and read 5 directories (root, usr, guru, examples, C-examples). The number of disk reads depends on whether any of the inodes are in the same disk page and whether the relevant part of each directory can be read with a single disk read.

24-3 Suppose a filing system uses a block size of 4K bytes and a block pointer occupies 4 bytes. What size of file can be accessed from 12 direct pointers in an inode?

48Kbytes

What size of file can be accessed from 12 direct pointers and 2 indirect pointers (see Section 24.6)?

Assume that indirection is via a block which is also of size 4Kbytes, so each indirect block can contain up to 1K block pointers. The maximum file size is 48K+2*(1K * 4K) bytes, or (48K + 2**24) bytes or 12+2K blocks=2060 blocks.

What size of file could be accessed with an additional double and triple indirect pointer?

Add on 1K*1K*4K bytes = 2**32 bytes for a double indirect pointer.

Add on 1K*1K*1K*4Kbytes = 2**42 bytes for a triple indirect pointer.

Figure 24.11 shows the system open file table. Each entry contains a pointer into an open file for use in reading and writing. If the table is designed so that this pointer occupies 4 bytes, how large can a file be?

The maximum offset is 2**32 bytes.

24-4 What are the advantages and disadvantages of using process-relative names, such as small integers, for open files, pipes, sockets etc?

Small numbers are more convenient to handle than long bit-strings.

They have no meaning outside the context of the owning process(es), so cannot be passed around generally to achieve co-operative sharing. UNIX assumes that this will be achieved by parent-child co-operation - the parent’s open file id’s are passed to the child on create.

24-5 Contrast the effect of the UNIX fork system call with that of the language-level fork described in Sections 4.16, 11.2.4 and 12.8.

UNIX fork creates a new address space with a separate copy of the data space, open file table and so on. A language level fork creates a new process (or thread) which runs in the same address space as its parent. It shares the same memory management tables, open file tables etc. A new stack is created for the new process. The OS may or may not be aware of the new process.

24-6 The UNIX command interpreter (shell) is run as a user-level process. Explain how the shell sets up a process to execute a command and how the command’s arguments might be passed to it. How would you run your own command interpreter instead of the standard shell?

We assume that a shell instantiation has received a command by means of standard input. The command program is to be executed by a new process and arguments must be made available to it. Recall that on fork, the parent’s data space is replicated in the child. This can be used to make the command arguments available to the child. After fork completes the child (process-id=0 returned by fork) execve’s the command program which can pick up its arguments from a place agreed by convention.

The login procedure looks in your password file entry to find out which command interpreter you wish to use.

24-7 UNIX keeps a pool of character buffers of a defined size (say, 32 characters) for terminal I/O. Work out in some detail how a character would be delivered by a device, accepted by an interrupt handler, checked to see whether any special action should be taken (for example on a break or flow control character) and put in a buffer.

A buffer pool of c-blocks is used. For a given terminal a c-list is set up, with c-blocks added as required, for input and for output. On input, each character must be examined; control characters are acted on, not buffered. A process may be blocked on output, before passing all the characters, if its output list is getting too long (until the device has done output to reduce the size of the list).

M J Bach (1986) and Leffler at al. (1989) give details.

24-8 UNIX keeps a cache of disk block buffers. Each buffer has a data area which is the size of a disk block and a header for control information. What information would you keep in the header to control the use of the buffers and to ensure mutually exclusive access to them?

An example is given after Chapter 10.

24-9 A process creates and opens a file and a pipe then creates two child processes. Which system calls can the children use to access both the file and the pipe? Which system calls can be used for the file but not for the pipe?

Read, write and close may be used for both the file and the pipe.

Seek may be used for the file but not for the pipe.

24-10 Extend Figure 24.11 to show two independent sharers of an open file and two sharers of a file which have inherited it from a common ancestor.

The figure should show the independent sharers with a separate entry in the open file table and therefore a pointer each. The children who have inherited the file share a single entry and therefore a single pointer.

24-11 Are UNIX signals a general interprocess synchronisation mechanism? Discuss.

No. In their use they are more akin to exceptions in programming languages. Signals tend to be associated with error conditions.

Re the signal mechanism: a process does not block, waiting for a signal. It may test to see whether a signal has been sent, if not it continues. This typically happens when a process is about to return from kernel to user mode and sometimes during the sleep/wakeup process.

24-13 Why is there no need for semaphores in the UNIX kernel?

Processes executing the kernel are scheduled non-pre-emptively. We always have control of which process is accessing which data structure. Interrupts are forbidden so that a process and interrupt service routine do not access the same data structure at the same time.

24-13 How might a crash make a UNIX filing system inconsistent? Consider a sequence of events that might lose a block to the filing system or might cause a block to be recorded as allocated both to a file and to the free list. How does the way UNIX records the blocks allocated to files help to make consistency checks relatively easy?

Assume main memory is lost on a crash. The inode for a file, the free list and file data are all changed first in main memory, then the changes are written to disk. Blocks may be held in the buffer cache instead of being written to disk synchronously with an operation.

How to lose a block: suppose a block is freed that was in use in some file. The main memory copy of the inode is changed, the block is added to the record of the free list in main memory, the inode is written to disk, there is a crash.

How to doubly allocate a block: a new block is needed, say to append to a file. The main memory inode and free list are changed. The inode is written to disk. There is a crash.

All the blocks allocated to files can be found by reading the inode table (and indirect blocks). There are no chains of blocks to follow.

24-14 Why can a directory held on one file system not contain a hard link to a file on a separately-mounted file system? Suggest how the file subsystem could be extended to provide this facility and describe whether your scheme introduces any disadvantages.

A hard-link is the usual kind of directory entry in which the inode of the linked-to file is held in the directory containing the link. The inode number is local to a particular file system. It would be possible to support hard-links to files on separately-mounted file systems by including some kind of ‘file system identifier’ (FSID) specifying in which file system to look up the specified inode. However, this raises the problem of how to form these FSIDs and how the system should behave if they refer to a file system held on removable media.

24-15 Some UNIX systems have a vfork system call that creates a new process but requires that the child does not return from the function that called vfork and that it invokes only the exit or execve system calls. Why is this curious system call a useful addition? How and why do modern CPUs reduce the need for vfork.

The restrictions imposed by the vfork system call mean that the child process can initially share its address space with its parent, rather than requiring that the OS makes a copy – the time spent doing that copy would be wasted if the child performs an exit or execve system call soon after. Vfork is less important in modern systems that support copy on write.

24-16 The fork system call creates a new process by making a near-identical copy of its parent. Assess the advantages and disadvantages of this operation in contrast to an simple create_process system call that would take the name of a program to load and execute.

The fork system call may initially appear more complex to use because it makes loading and executing a program in a new process a two-stage operation – first of all forking a new process and then using execve to load and execute the specified program. The benefit of this is that fork can be used to create hierarchies of co-operating processes from the same executable image (i.e. when fork is not followed by execve) whereas a simple create_process operation could only load and execute files that can be named.