
Introduction to Grammars

Grammars are something we may have learned at school. However, those sorts of grammars are for natural languages, like Chinese, Spanish or English. We remember rules like "Subject, Verb, Object" and how to conjugate a verb or decline a noun (through cases such as nominative, vocative, accusative, therapeutic etc).

Computer languages, among which we can include communications protocols, are specified using similar meta-rules. Computer languages are a bit simpler, and most are specified using a Grammar of Grammars, invented by two great computer scientists, Backus and Naur, and named after them: Backus Naur Form, or BNF.

A BNF grammar consists of a set of production rules for sentences in the language. A finished sentence (think of it as a program statement, or in the case of HTML, a document) consists of a collection of terminals, or base words from the language, put together according to the production rules. The words may be drawn from a simpler grammar called a lexicon, which is basically your spelling rules, and they are made up from characters (letters) drawn from an alphabet. In any computer grammar, we define all three of these in tiny, exhaustive (tedious) detail. Alphabets are just lists of the letters allowed, and lexicons are fairly obvious too, but grammars are less so.
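To make the two simpler layers concrete, here is a minimal sketch in Python of an alphabet and a lexicon. The alphabet, the four words and the function names are all invented for illustration; they come from no real language definition.

```python
import string

# Hypothetical definitions, invented for illustration only.
ALPHABET = set(string.ascii_lowercase)        # the letters allowed
LEXICON = {"cat", "dog", "sees", "chases"}    # the words (terminals) allowed


def in_alphabet(word):
    """True if the word is non-empty and every character is in the alphabet."""
    return bool(word) and all(ch in ALPHABET for ch in word)


def in_lexicon(word):
    """True if the word is spelled from the alphabet and listed in the lexicon."""
    return in_alphabet(word) and word in LEXICON


def lexical_errors(sentence):
    """Return the whitespace-separated words that the lexicon rejects."""
    return [w for w in sentence.split() if not in_lexicon(w)]
```

Note that a word such as "zebra" passes the alphabet check but fails the lexicon check, and that neither check says anything about word order. Word order is the job of the grammar proper.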

Production rules build up a tree of rules to generate legal sentences. An example of a BNF grammar for a simple language might be:

Table A.1: Trivial Nadsat Grammar

So legal utterances in this language would include:

fire holds dangerousthings
women puzzles fire
dangerousthings amuses women
...

We can see that a compact grammar can describe a huge vista of utterances!
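As a rough illustration of how production rules generate sentences, the sketch below writes a small grammar as a Python dictionary and expands it exhaustively. The rules are an assumption: they are a guess that happens to produce the three utterances listed above, not a reproduction of Table A.1, and the names GRAMMAR and expand are our own.

```python
from itertools import product

# Hypothetical grammar, guessed from the example utterances (not Table A.1).
# In BNF notation it would read:
#   <sentence> ::= <noun> <verb> <noun>
#   <noun>     ::= fire | women | dangerousthings
#   <verb>     ::= holds | puzzles | amuses
GRAMMAR = {
    "<sentence>": [["<noun>", "<verb>", "<noun>"]],
    "<noun>": [["fire"], ["women"], ["dangerousthings"]],
    "<verb>": [["holds"], ["puzzles"], ["amuses"]],
}


def expand(symbol):
    """Return every word sequence derivable from symbol.

    Only suitable for non-recursive grammars, where the set of
    sentences is finite.
    """
    if symbol not in GRAMMAR:
        # Anything without a production rule is a terminal (a base word).
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then take every combination.
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append([word for seq in combo for word in seq])
    return results


SENTENCES = [" ".join(words) for words in expand("<sentence>")]
```

Under these assumed rules, three productions over six words yield 3 x 3 x 3 = 27 sentences, and adding a single extra noun would raise that to 4 x 3 x 4 = 48.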

Jon Crowcroft
Wed May 10 11:46:29 BST 1995