Administrative databases and web services

A discussion paper for the Computer Laboratory Admin Support Panel

Markus Kuhn

Status quo

The Computer Laboratory has for many years operated a Microsoft SQL Server that hosts a database called "Administration". It currently runs Microsoft SQL Server 2008 on FARRINGDON (farringdon.ad.cl.cam.ac.uk). The "Administration" database contains currently over 140 tables, including information about

The contents and configuration of databases on this server can be browsed and configured by any interested staff using the Microsoft SQL Server Management Studio software, which is installed, for example, on the Windows terminal server ts01.ad.cl.cam.ac.uk. Graham Titmus administers the SQL Server and can grant interested staff access permissions.

Existing users:

Shortcomings of the existing SQL server arrangements:

Requirements and available technologies

Data storage and exchange

Machine-readable data shared between human users and automatically running software applications can and is being maintained in the department in a number of ways:

Using a dedicated storage server process (whether SQL or Subversion) has particular advantages for some applications:

As we may not have the resources to develop custom-made editors for each data set, an important requirement is also that generic tools are available for directly browsing and editing data. With relational databases systems, these tools should in particular allow users to edit table joins, such that the normalization of data does not hinder direct access by human users.

Some existing users are very comfortable with Microsoft Access as a front-end, others strongly prefer plain-text files edited with e.g. emacs.

Web front-ends

Over the last few years, a number of very sophisticated and comfortable web application frameworks have emerged for the development of web applications and services, in particular those that interact closely with relational databases. These include usually:

Ruby on Rails was perhaps the first widely used MVC-based framework that integrated many of these techniques into a single package. This approach has since been ported to many other programming languages, such as Pylons and Django under Python and Catalyst under Perl. They all promise to minimize the work involved in designing new database web front-ends and simplify collaboration on such projects by making the structure of such applications very predictable.

Pylons looks particularly promising, firstly because the underlying programming language (Python) is very widely used already in the department, and secondly because it follows a rather modular approach and allows a choice of components from separately maintained Python libraries. Python's standardized Web Server Gateway Interface (WSGI) makes this modular approach particularly practical. Pylons is normally used with the SQLAlchemy database toolkit, which offers a very flexible ORM.

Others considered: Perl would seem to have the advantage of having been a widely used sys-admin tool for two decades. But the only state-of-the-art Perl framework around, Catalyst, uses very advanced OO features of Perl, turning it essentially into a different language from what sys-admin are accustomed to. This substantially reduces the advantage of not having to learn a new language. Django for Python is an all-in-one solution that leaves less flexibility in case we don't like one particular component, e.g. a templating engine that does not work nicely with ucampas. There are lots of others, but given that it is usually more work to first familiarize oneself with a framework than to then develop a medium complexity application, we are unlikely to be able to look at many and have to rely on the recommendations received from experienced users in the department, who named mostly the above candidates.

Suggested action items

  1. Familiarization with candidate MVC/ORM web application frameworks, probably starting with Pylons. In particular pay attention to interaction between template engine and ucampas for easy application of house styles.
  2. Development of 1–2 demonstration applications, probably:

    (One small and one medium complex application, both immediately useful and largely independent of existing applications.)

  3. Finding a solution for improved Linux connectivity to the Microsoft SQL Server (in particular to Python and SQLAlchemy), which is a precondition on its use in the many new teaching-related web applications that we will have to implement in the foreseeable future.
    Candidate database connectivity library stacks:
    See also:
  4. Explore mod_wsgi, a recommended way of running production Python web applications with Apache web servers.
    (For prototyping, we can use some Python webserver or existing Apache CGI facilities with a WSGI wrapper.)
  5. Start thinking about better documentation of our databases and associated processes.
  6. Get together small technical working groups that start looking after individual applications, workflows, table sets, and their documentation, probably starting with various student-admin tables.
  7. Explore Linux alternatives to MS Access as generic table editors (e.g., rekall, open office)

created 2009-12-01 – last modified 2009-12-02 – http://www.cl.cam.ac.uk/~mgk25/admin-support/databases.html