BNF syntax
This is a BNF-like description of the Uniform Resource Locator syntax. A vertical line "|" indicates alternatives, and [brackets] indicate optional parts. Spaces are represented by the word "space", and the vertical line character by "vline". Single letters stand for single letters. All words of more than one letter below are entities described somewhere in this description.
The "generic" production gives a higher level parsing of the same URLs as the other productions. The "national" and "punctuation" characters do not appear in any productions and therefore may not appear in URLs.
The "afsaddress" production is left in as a historical note, but is not a URL production.
fragmentaddress uri [ # fragmentid ]
uri url
url generic | httpaddress | ftpaddress | newsaddress | prosperoaddress | telnetaddress | gopheraddress | waisaddress
generic scheme : path [ ? search ]
scheme ialpha
httpaddress h t t p : / / hostport [ / path ] [ ? search ]
ftpaddress f t p : / / login / path
afsaddress a f s : / / cellname / path
newsaddress n e w s : groupart
waisaddress waisindex | waisdoc
waisindex w a i s : / / hostport / database [ ? search ]
waisdoc w a i s : / / hostport / database / wtype / digits / path
groupart * | group | article
group ialpha [ . group ]
article xalphas @ host
database xalphas
wtype xalphas
prosperoaddress prosperolink
prosperolink p r o s p e r o : / / hostport / hsoname [ % 0 0 version [ attributes ] ]
hsoname path
version digits
attributes attribute [ attributes ]
attribute alphanums
telnetaddress t e l n e t : / / login
gopheraddress g o p h e r : / / hostport [ / gtype [ selector ] ] [ ? search ]
login [ user [ : password ] @ ] hostport
hostport host [ : port ]
host hostname | hostnumber
cellname hostname
hostname ialpha [ . hostname ]
hostnumber digits . digits . digits . digits
port digits
selector path
path void | xpalphas [ / path ]
search xalphas [ + search ]
user xalphas
password xalphas
fragmentid xalphas
gtype xalpha
xalpha alpha | digit | safe | extra | escape
xalphas xalpha [ xalphas ]
xpalpha xalpha | +
xpalphas xpalpha [ xpalphas ]
ialpha alpha [ xalphas ]
alpha a | b | c | d | e | f | g | h | i | j | k |
l | m | n | o | p | q | r | s | t | u | v |
w | x | y | z | A | B | C | D | E | F | G |
H | I | J | K | L | M | N | O | P | Q | R |
S | T | U | V | W | X | Y | Z
digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
safe $ | - | _ | @ | . | &
extra ! | * | " | ' | ( | ) | : | ; | , | space
escape % hex hex
hex digit | a | b | c | d | e | f | A | B | C | D | E | F
national { | } | vline | [ | ] | \ | ^ | ~
punctuation < | >
digits digit [ digits ]
alphanum alpha | digit
alphanums alphanum [ alphanums ]
void
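As an illustration of the higher level parsing that the "generic" production describes, the following Python sketch splits an address into the parts named by "fragmentaddress" and "generic". It is a minimal sketch under stated assumptions, not a reference parser: the function name split_generic and the sample address are hypothetical, the split is made at the first ":" (the grammar itself permits ":" inside xalphas, so this is one possible reading), and no character-level validation against xalpha is attempted.

```python
import string


def split_generic(address):
    """Split an address following, loosely:

        fragmentaddress  uri [ # fragmentid ]
        generic          scheme : path [ ? search ]
        search           xalphas [ + search ]

    Assumption: the scheme ends at the first ":". Characters are not
    checked against the xalpha production.
    """
    # "#" is not an xalpha, so the first "#" starts the fragmentid.
    uri, hash_mark, fragmentid = address.partition("#")

    # scheme is an ialpha, which must begin with an ASCII letter.
    scheme, colon, rest = uri.partition(":")
    if not colon or not scheme or scheme[0] not in string.ascii_letters:
        raise ValueError("no scheme found in %r" % address)

    # "?" is not an xalpha either, so the first "?" starts the search part.
    path, question_mark, search = rest.partition("?")

    return {
        "scheme": scheme,
        "path": path,
        # search terms are separated by "+"
        "search": search.split("+") if question_mark else None,
        "fragmentid": fragmentid if hash_mark else None,
    }


parts = split_generic("http://info.cern.ch/search?uri+bnf#top")
print(parts["scheme"], parts["path"], parts["search"], parts["fragmentid"])
```

Optional parts that are absent are reported as None rather than as empty strings, so that a caller can tell a missing search part from an empty one.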
URIs, including URLs, will ideally be transmitted through protocols which accept them and data formats which define a context for them. However, in practice there are many occasions when URLs are included in plain ASCII non-marked-up text such as electronic mail and usenet news messages.
In this case, it is convenient to have a separate wrapper syntax to define delimiters which will enable the human or automated reader to recognize that the URI is a URI.
The recommendation is that the angle brackets (less than and greater than signs) of the ASCII set be used for this purpose.
These wrappers do not form part of the URL, are not mandatory, and should not be used in contexts (such as SGML parameters, HTTP requests, etc.) in which delimiters are already specified.
Example:
Yes, Jim, I found it under <ftp://info.cern.ch/pub> but you can probably pick it up from <ftp://ds.internic.net/rfc>.
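An automated reader could recognize such wrapped URLs along the following lines. This Python sketch is illustrative only: the pattern WRAPPED_URL and the function find_wrapped_urls are hypothetical names, and the pattern assumes that a wrapped URL begins with a letter immediately after "<", contains a ":" ending the scheme, and contains no white space. That is stricter than the grammar, which lists "space" among the extra characters.

```python
import re

# Assumption: "<", then a scheme starting with an ASCII letter, then ":",
# then the rest of the URL, then ">". White space and angle brackets are
# not allowed inside the wrapper.
WRAPPED_URL = re.compile(r"<([A-Za-z][^<>\s:]*:[^<>\s]*)>")


def find_wrapped_urls(text):
    """Return the URLs found between angle brackets, without the brackets."""
    return WRAPPED_URL.findall(text)


message = (
    "Yes, Jim, I found it under <ftp://info.cern.ch/pub> but "
    "you can probably pick it up from <ftp://ds.internic.net/rfc>."
)
for url in find_wrapped_urls(message):
    print(url)
```

Requiring a scheme directly after the opening bracket keeps ordinary uses of "<" and ">" in running text, such as comparisons, from being mistaken for wrappers.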
The URL scheme does not in itself pose a security threat. Users should beware that there is no general guarantee that a URL which at one time points to a given object will continue to do so. Because objects move on servers, it may even point to a different object at some later time.
The use of URLs containing passwords is clearly unwise.
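A tool that handles URLs might warn when one carries a password. The Python sketch below is one possible check, based on the "login" production ([ user [ : password ] @ ] hostport). The function name has_password and the sample URLs are hypothetical. The sketch assumes the login part follows "://" and ends at the first "/", "?" or "#", and that the last "@" separates the user information from the hostport. The grammar itself allows "@" and ":" inside xalphas, so this is a heuristic rather than a complete parse.

```python
import re


def has_password(url):
    """Return True if the URL appears to carry a password in its login part.

    Assumed reading of:  login  [ user [ : password ] @ ] hostport
    """
    _, separator, rest = url.partition("://")
    if not separator:
        # Schemes such as news: have no login part.
        return False

    # The login part ends at the first "/", "?" or "#".
    login = re.split(r"[/?#]", rest, maxsplit=1)[0]

    # Everything before the last "@" is taken as user [ : password ].
    userinfo, at_sign, _hostport = login.rpartition("@")
    if not at_sign:
        return False
    return ":" in userinfo


# Hypothetical URLs, for illustration only.
for candidate in (
    "ftp://user:secret@ds.internic.net/rfc",
    "ftp://anonymous@info.cern.ch/pub",
    "ftp://info.cern.ch:21/pub",
):
    print(candidate, has_password(candidate))
```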