Figure: Remote File Access
Note that protocols are involved in several places: between the
client and server machines, between the client application and the
client filesystem access code, and between the server and the
filestore. Typically, the client and server protocols are made as
similar to local access as possible (for example, by using remote
procedure calls that are nearly the same as the system procedures for
accessing local files).
However, there are more ways in which the system can fail than with
local access, and so the failure semantics change.
There are also more opportunities for concurrency, and therefore more
chances of inconsistency, since a server cannot tell what a client
application is doing with data once it has handed that data over, and
there may be multiple clients of a given server (and of a given file on
that server).
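The point about keeping remote access close to local access, while
changing the failure semantics, can be sketched as follows. This is a
minimal in-process illustration, not any real RPC system: the names
LocalFile, RemoteFile, FileServer and ServerUnreachable are all
hypothetical, and the "network" is just a method call that can be made
to fail.

```python
class ServerUnreachable(Exception):
    """A failure mode with no equivalent in local access (hypothetical name)."""


class LocalFile:
    """Local access: read(offset, length) on data held on this machine."""

    def __init__(self, data):
        self.data = data

    def read(self, offset, length):
        return self.data[offset:offset + length]


class FileServer:
    """Stand-in for the server machine and its filestore."""

    def __init__(self, files):
        self.files = files  # name -> bytes
        self.up = True      # flipped to False to simulate a crash or partition

    def handle_read(self, name, offset, length):
        return self.files[name][offset:offset + length]


class RemoteFile:
    """Client stub: same read() signature as LocalFile, but the call can
    fail in a way that a local read cannot."""

    def __init__(self, server, name):
        self.server = server
        self.name = name

    def read(self, offset, length):
        if not self.server.up:
            raise ServerUnreachable(self.name)
        return self.server.handle_read(self.name, offset, length)


def first_word(f):
    # Application code is written once against the shared interface.
    return f.read(0, 5)
```

The application function first_word works unchanged on either kind of
file; what differs is that the remote version may raise
ServerUnreachable, which the application has to be prepared for.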
Table: Changing from local to remote access
Since disks are relatively slow, even local access is typically cached
in memory. When file access is remote, this leads to the further
choice of whether to cache at the server, at the client, or at
both. The choice depends on the semantics of remote file access.
Table: Examples of choice
When designing a network or distributed filesystem, performance is a
key parameter. To select the right paradigm, we must first look at the
underlying access patterns for local file access. Then we try to
predict whether these patterns will remain the same for remote access,
and if not, how they will change.
There have been many studies of file access patterns, though most have
been on the same kinds of system (Unix). They mainly find that there
is a very high degree of what is called "locality of reference", both in
time and space. Put simply, if you access a part of a file you are
much more likely to access the next part of the file, and soon, than
some other part of the file, or another file, at some far off time in
the future (through symmetry arguments, the past behavior resembles
the future).
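One way a remote access scheme can exploit this locality is to fetch
the next few blocks along with the one requested. The following is a
minimal sketch under assumed names (CountingServer, ReadAheadClient
and sequential_scan are hypothetical); it only counts round trips to
the server and does not model timing or cache eviction.

```python
class CountingServer:
    """Hypothetical server that counts how many requests it receives."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.round_trips = 0

    def fetch(self, first, count):
        # One request returns up to `count` consecutive blocks.
        self.round_trips += 1
        last = min(first + count, len(self.blocks))
        return {i: self.blocks[i] for i in range(first, last)}


class ReadAheadClient:
    """On a cache miss, fetch the wanted block plus `read_ahead` more."""

    def __init__(self, server, read_ahead):
        self.server = server
        self.read_ahead = read_ahead
        self.cache = {}

    def read_block(self, index):
        if index not in self.cache:
            self.cache.update(self.server.fetch(index, 1 + self.read_ahead))
        return self.cache[index]


def sequential_scan(read_ahead, n_blocks=8):
    """Read blocks 0..n_blocks-1 in order; return the server round trips."""
    server = CountingServer([bytes([i]) for i in range(n_blocks)])
    client = ReadAheadClient(server, read_ahead)
    for i in range(n_blocks):
        client.read_block(i)
    return server.round_trips
```

For a sequential scan of 8 blocks, sequential_scan(0) costs 8 round
trips while sequential_scan(3) costs 2. The saving only appears
because the access pattern is sequential; for random access the extra
blocks fetched would mostly be wasted.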
Another part of the picture is that the majority of files are opened
for reading only, and rarely for writing. This is very important when
considering how expensive a concurrency control scheme one should use,
since there is no need to invoke it for read-only file access.
When looking at the service used to access the file, we should
distinguish between the service seen by the application and that
actually carried out by the system. This is no different from local
access: in many systems, local file access is presented through a
stream abstraction. Underneath, hardware access to the file
consists of a possibly arbitrary scattering of blocks across a disk,
but layers of software conspire to hide this. A Remote File Access
protocol may well preserve the stream appearance of access, while
actually translating it into access to individual blocks (NFS works
approximately this way). Alternatively,
it may provide a lower layer side-effect, whereby the entire
file is moved from server to client as a stream, and access at the
client is mapped into access to the copy (AFS works roughly like
this). See figure #fn103#2375>.
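The two approaches can be contrasted in a short sketch. Both classes
below offer the application the same stream-style read(n), but one
turns each read into per-block requests and the other copies the whole
file at open time. All names (Server, BlockStream, WholeFileStream,
BLOCK_SIZE) are hypothetical, the block size is tiny for illustration,
and neither class is meant as a description of the actual NFS or AFS
protocols.

```python
BLOCK_SIZE = 4  # deliberately tiny, for illustration only


class Server:
    """Hypothetical server supporting both styles of request."""

    def __init__(self, files):
        self.files = files  # name -> bytes
        self.requests = 0

    def read_block(self, name, index):
        self.requests += 1
        start = index * BLOCK_SIZE
        return self.files[name][start:start + BLOCK_SIZE]

    def fetch_file(self, name):
        self.requests += 1
        return self.files[name]


class BlockStream:
    """Stream appearance, implemented as requests for individual blocks."""

    def __init__(self, server, name):
        self.server, self.name, self.pos = server, name, 0

    def read(self, n):
        out = bytearray()
        while len(out) < n:
            index, skip = divmod(self.pos, BLOCK_SIZE)
            block = self.server.read_block(self.name, index)
            chunk = block[skip:skip + n - len(out)]
            if not chunk:  # past end of file
                break
            out += chunk
            self.pos += len(chunk)
        return bytes(out)


class WholeFileStream:
    """Stream appearance, implemented on a local copy of the whole file."""

    def __init__(self, server, name):
        self.copy = server.fetch_file(name)  # one transfer at open time
        self.pos = 0

    def read(self, n):
        chunk = self.copy[self.pos:self.pos + n]
        self.pos += len(chunk)
        return chunk
```

The application sees the same bytes either way. The block version
contacts the server on every read (a real client would normally add a
block cache), while the whole-file version contacts it once and then
works on its copy, which is cheap for repeated access but moves data
the application may never touch.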