If you have used a WWW client such as Mosaic, you have probably already used a proxy client. Mosaic and other clients built upon LibWWW can contact servers for protocols such as FTP and Gopher, and then convert the output of those servers into HTML for formatting and display on your screen (see figure 5.5).
Figure 5.5: A WWW Proxy Client contacting an FTP Server
Proxy servers take this one step further - instead of your client contacting remote servers directly, your client makes an HTTP request to a proxy server. The proxy server then contacts the relevant FTP or Gopher server and converts the results to HTML before transferring them back to your client (see figure 5.6).
Figure 5.6: An FTP Proxy Server answering an HTTP request
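The exchange described above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular client or server: the function names and the example host `ftp.example.org` are hypothetical. The point it shows is that a request sent to a proxy carries the full URL, scheme included, so the proxy can tell which remote server to contact and which protocol to speak to it.

```python
from urllib.parse import urlsplit


def build_proxy_request(url):
    """Build the request a client might send to a proxy (hypothetical helper).

    The request line carries the complete URL rather than just a path,
    which is how the proxy learns the remote protocol and host.
    """
    return ("GET %s HTTP/1.0\r\n\r\n" % url).encode("ascii")


def route_proxy_request(request):
    """Work out, on the proxy side, which remote server to contact.

    Returns a (scheme, host, path) tuple taken from the request line.
    """
    request_line = request.split(b"\r\n", 1)[0].decode("ascii")
    method, url, version = request_line.split(" ")
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


if __name__ == "__main__":
    raw = build_proxy_request("ftp://ftp.example.org/pub/README")
    print(route_proxy_request(raw))
```

A real proxy would go on to open an FTP or Gopher connection to the named host, fetch the item, and wrap the result in an HTTP response. Those steps are left out here.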
A proxy server can also make connections to remote HTTP servers. At first glance this wouldn't appear to benefit you, as the proxy then performs no conversion, but it offers a way to give network services to machines on a secure subnet without those machines needing direct access to the outside world. Thus secure sites can run a proxy server on their firewall machine, or SOCKSify only their proxy server, without needing to modify the WWW client programs for all their different architectures (see figure 5.7). "SOCKSify" is the term used for taking a communications program written using the socket or winsock API and making it more secure using a public domain package called "socks". This package allows a server to be reached indirectly, so that it can operate behind a firewall. A firewall is a router (or system of routers) that performs a number of extra checks on traffic to and from a site, making the site potentially more secure.
Figure 5.7: A Proxy Server on a Firewall
Even if you do not need this level of security, CERN's HTTPD can also provide caching facilities for clients using the server as a proxy. Caching facilities in the World Wide Web are currently in their infancy: many servers do not return expiry date information with documents, so deciding how long data should be cached before going back to look at the original is not a clear-cut issue. However, CERN's server uses whatever information is available to it to make a decision about cache timeouts. Although it doesn't always do the right thing, it gets it right most of the time and substantially improves performance for frequently accessed pages (see figure 5.8).
Figure 5.8: A Caching Proxy Server
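To make the cache timeout problem concrete, here is a minimal sketch of one way a caching proxy might decide how long to keep a document. It is not CERN HTTPD's actual algorithm. The function name, the one-hour default and the 10% factor are all assumptions made for illustration. The sketch prefers an explicit expiry date when the remote server sends one, otherwise guesses from how long ago the document last changed, and otherwise falls back to a fixed default.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Both values are assumptions chosen for illustration only.
DEFAULT_LIFETIME = timedelta(hours=1)
LAST_MODIFIED_FACTOR = 0.1


def cache_lifetime(headers, now):
    """Guess how long a fetched document may stay in the cache.

    headers is a dict of response header names to values;
    now is a timezone-aware datetime.
    """
    expires = headers.get("Expires")
    if expires:
        try:
            # The server told us when the document expires: trust it.
            remaining = parsedate_to_datetime(expires) - now
            return max(remaining, timedelta(0))
        except (TypeError, ValueError):
            pass  # unparsable date, fall through to the heuristics

    last_modified = headers.get("Last-Modified")
    if last_modified:
        try:
            # A document that has not changed for a long time is
            # assumed unlikely to change soon.
            age = now - parsedate_to_datetime(last_modified)
            if age > timedelta(0):
                return age * LAST_MODIFIED_FACTOR
        except (TypeError, ValueError):
            pass

    # No usable information at all.
    return DEFAULT_LIFETIME


if __name__ == "__main__":
    now = datetime(1995, 1, 11, tzinfo=timezone.utc)
    print(cache_lifetime({"Last-Modified": "Sun, 01 Jan 1995 00:00:00 GMT"}, now))
```

The last-modified guess is exactly the kind of decision that will sometimes be wrong: a page untouched for months may change tomorrow, and the cache will serve the old copy until its lifetime runs out.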