Writing CD-ROMs on snow.cl.cam.ac.uk

Here are some simple instructions on how to create and write a CD-ROM on the CD writer attached to snow.cl.cam.ac.uk in the Rainbow Room.

For personal use, you can buy media from Staples or HCS global, but someone will usually have bought a batch of about 50 CD-Rs and be selling them on at somewhere near cost. If you are writing CDs for Lab use, rrw has some in his office.

The current local experts in CD writing are rrw and bms: ask them if anything goes horribly wrong.


Introduction

Ricoh have a fairly good introduction to CD-ROM issues in their CD Recordable Handbook: this goes into far greater depth than I can here.

Physical layout

A CD-ROM contains a single spiral track of data, in which data bits are formed by adjusting the reflectivity of sections of the track to laser light. Logical `tracks' are layered onto this using the Red Book protocol, which provides for up to 99 tracks of sequential 2,352-byte sectors, and defines an EDC (error detection code) and ECC (error correction code) which allow for the detection and correction of sector errors.

The Red Book specifies only how audio (CD-DA) tracks are written. A separate document, the Yellow Book (CD-ROM), specifies how data tracks are to be written (strictly speaking, CD-ROM Mode 1 for data, and Mode 2 for compressed audio/video). Yellow Book Mode 1 reduces the user data per sector to 2048 bytes: the remainder is taken up with yet another level of ECC. CD-I and CD-ROM XA formats use Yellow Book Mode 2 tracks.
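
The sector arithmetic can be checked directly. The field sizes below are the commonly quoted Mode 1 layout; they are an assumption here, since the text above gives only the 2352 and 2048 figures:

```shell
# Commonly quoted Yellow Book Mode 1 sector layout, in bytes (assumed, see above).
SYNC=12       # synchronisation pattern
HEADER=4      # sector address (3 bytes) and mode (1 byte)
DATA=2048     # user data
EDC=4         # error detection code
RESERVED=8    # unused, zero-filled
ECC=276       # error correction code

SECTOR=$((SYNC + HEADER + DATA + EDC + RESERVED + ECC))
OVERHEAD=$((SECTOR - DATA))
echo "sector=$SECTOR bytes, user data=$DATA bytes, overhead=$OVERHEAD bytes"
```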

Similarly, the Green Book specifies CD-I, and the Orange Book deals with writable media like magneto-optical disk (CD-MO) and write-once disk (CD-WO).

A CD-ROM alters reflectivity using an aluminium-covered disc into which pits are pressed. This provides high reflectivity changes, but requires expensive masters to be made. CD-Rs use thermally active dyes: when a CD-R is burnt, a writing laser is focussed on parts of the disk, causing them to change reflectivity (by various means).

This results in far lower reflectivity changes, so some old CD-ROM drives will not read CD-Rs (some CD-ROM drives shipped with Dell machines c. 1992 seem to have this problem).

CD-RW uses a similar technique (except that the physical change in the dye is reversible), and provides even lower reflectivity change, so CD-RWs can only be read in CD-RW-capable CD-ROM drives.

Filesystems

Each Yellow Book mode 1 track contains nothing more than a linear collection of data. For this to be useful, we must write our data to disk in some sensible format.

Although you can write any format as a Yellow Book Mode 1 track (tar format, ext2fs image ...), the standard format is ISO9660: this is readable by practically every system.

You will typically build a filesystem image in a disk file and then write it to CD-R with a program like cdrecord.

ISO9660 comes in two interesting flavours. Level 1 is basically DOS filesystem mode: 8.3 case-insensitive filenames, and a limited nesting depth for directories. The higher levels relax the 8.3 restriction: Level 2 allows filenames of up to 31 characters, and Level 3 additionally allows a file to be split into several extents. All forms of ISO9660 give a version number to each file (denoted by e.g. ';1' after the filename), but these are usually ignored by OS driver software.
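
To see whether a name would survive in Level 1 unchanged, a rough check like the following can help. It is a sketch only: the function name is made up, and it assumes that only letters, digits and underscore are allowed in Level 1 names, which the text above does not spell out.

```shell
# Succeeds if the argument fits the 8.3 pattern described above.
# Assumption: only letters, digits and underscore are allowed in names.
# Case is folded first, since Level 1 names are case-insensitive.
is_level1_name() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' |
        grep -Eq '^[A-Z0-9_]{1,8}(\.[A-Z0-9_]{1,3})?$'
}
```

For example, is_level1_name README.TXT succeeds, while is_level1_name a-long-filename.html fails.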

ISO9660 itself supports neither UNIX nor MS-specific file attributes, nor long or Unicode filenames. The UNIX community has an extension, called Rock Ridge, which provides UNIX-like naming information (and deep directory relocation) on systems that support it. (High Sierra, with which it is sometimes confused, is the predecessor of ISO9660 rather than an extension to it.) Microsoft has also specified an extension (called Joliet) which supports long names, Unicode names, and MS-specific file attributes.

Both Rock Ridge and Joliet can coexist on a disk, so you can build a single disk which works correctly on UNIX, Windows, and other systems (which just fall back to the underlying ISO9660).
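
A minimal sketch of building such an image, assuming mkisofs is installed (the text does not say which image builder snow provides, and the function name is made up for illustration). In mkisofs, -r adds Rock Ridge information with sanitised ownership and permissions, -J adds Joliet, and -o names the output file:

```shell
# Build an ISO9660 image carrying both Rock Ridge and Joliet extensions.
# Assumes mkisofs is on the PATH.
# usage: make_image SOURCE_DIR OUTPUT_FILE
make_image() {
    src=$1
    out=$2
    if [ ! -d "$src" ]; then
        echo "make_image: $src is not a directory" >&2
        return 1
    fi
    mkisofs -r -J -o "$out" "$src"
}
```

For example, make_image ./cdfiles myfile would build an image called myfile from a (hypothetical) directory cdfiles holding the files you want on the CD.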

Macs have their own special problems: mail Richard.Watts@cl.cam.ac.uk for details.

Media and lifetime

CD-R media seem to be fairly critical: different CD-ROM drives have different levels of tolerance of the low reflectivities found in CD-Rs, and some will not read CD-Rs at all. The type of CD-R used seems to make a considerable difference, so be sure to get the right media.

Robert King at the Computing Service has run some tests on various CD-R blanks, and found:

CD type              Transfer rate (k/s)
Commercial 1 (CD)    613
Commercial 2 (CD)    613
Sony 1 (CD-R)        613
Sony 2 (CD-R)        601.3
Maxell 1 (CD-R)      146.9
Maxell 2 (CD-R)      217.2
Maxell 3 (CD-R)      227.7
Maxell 4 (CD-R)      183.2
KAO (CD-R)           611
TDK (CD-R)           612.9

Since then, we've been using Maxell CDR-74XLs, which seem to be working fine.

You should also note that CD-Rs have an expected life of between 5 and 10 years: do not use them for long-term storage. Use pelican.cam or consult the COs instead.

Single vs. multi-session

CD-Rs are usually single-session, single-track: you will typically write a single data track containing all your data. Any remaining space on the CD-R is lost.

It is possible to write multi-session CD-Rs: this requires cooperation between the filesystem image (which must provide hooks for future sessions) and the program you use to actually write the CD. mkhybrid and cdrecord can (nearly) do this: mail Richard.Watts@cl.cam.ac.uk for details.

Bootable CD-ROMs

There is now a standard for bootable CD-ROMs, called El Torito. Basically, you just make a 1.44Mbyte file containing the image of a boot floppy, and give your cd recording program that file as an argument. On boot, the system will boot from the floppy image. Newer versions of cdrecord have support: mail Richard.Watts@cl.cam.ac.uk for details.
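
A 1.44Mbyte floppy holds 2880 sectors of 512 bytes, so the boot image should be exactly 1474560 bytes long. A quick sanity check along those lines, before the image goes anywhere near a CD-R (the function name is made up for illustration):

```shell
# Succeeds if the file is exactly the size of a 1.44Mbyte floppy image.
# usage: is_144_floppy_image FILE
is_144_floppy_image() {
    expected=$((2880 * 512))            # 1474560 bytes
    actual=$(wc -c < "$1") || return 1  # fails if the file cannot be read
    [ "$((actual))" -eq "$expected" ]
}
```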

Capacity

CD-Rs are 74 minutes long, or 764Mbytes raw (i.e. in Red Book format). Yellow Book Mode 1 (i.e. data format) brings this down to 666Mbyte, and manufacturing margins, catalogue data and such take the prudent maximum down to about 640Mbyte of Yellow Book data. In practice, you should allow 20Mbyte for each of ISO9660 directory information, Rock Ridge extensions, and Joliet extensions.

Personally, I tend to avoid writing CD-Rs much above 560Mbyte of raw data (ie. files before transfer to the ISO image). YMMV.
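
The capacity figures can be reproduced from the playing time, given that a CD plays 75 sectors per second (a standard figure, though not one stated above). Note that the 764 and 666 quoted above come out, after truncation, only if a Mbyte is read as 1000 x 1024 bytes; that reading is an inference from the arithmetic, not something the text says:

```shell
MINUTES=74
SECTORS=$((MINUTES * 60 * 75))       # 75 sectors per second of playing time
RAW_BYTES=$((SECTORS * 2352))        # Red Book sectors
DATA_BYTES=$((SECTORS * 2048))       # Yellow Book Mode 1 user data
UNIT=$((1000 * 1024))                # assumed size of a "Mbyte" in the text

echo "sectors:      $SECTORS"
echo "raw capacity: $RAW_BYTES bytes ($((RAW_BYTES / UNIT)) Mbyte)"
echo "Mode 1 data:  $DATA_BYTES bytes ($((DATA_BYTES / UNIT)) Mbyte)"
```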


How to blow a CD

You can use the following process to blow a data CD-R on snow's CD-Writer in the Rainbow room.

1. Collect the files you want to blow onto CD-R, and make a filesystem image on snow.cl's disk. There should be enough space in /local/scratch on snow - use sudo /usr/bin/mkscratchdir to make a directory for yourself.

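2. Check your filesystem image, for instance by mounting it read-only (via the loopback device, if your system supports that) and looking through the contents.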

3. Check your CD-R blank: make sure it's free of lint and specks of dust. Either of these can seriously snarf your CD-R.

Make sure the CD-writer isn't about to get any nasty shocks: it needs stability. Likewise, boot anyone else off snow - the cdrecord process needs to feed data to the CD writer at a constant rate, and though everything in the system has a fair amount of buffering, a buffer underrun will likely trash your CD.

4. Write your data to CD using cdrecord. cdrecord needs to be run from sudo at present - mail pb for permission, or ask someone who has permission - rrw, bms, and mjb are the three people who have it at present (check whether you do with sudo -l). You can find out about cdrecord's options from its manpage or by doing cdrecord --help on snow.

You will want to run something like:

$ cdrecord dev=1,0 speed=2 -v myfile

Here dev=1,0 reflects the fact that our CD writer is on SCSI ID 1, LUN 0; speed=2 is the speed at which it works best (though feel free to try 1 or 4); and -v gives you the pretty 'Mbyte done' counter.

You can add the -dummy option to just pretend you're making a disk (do everything with the laser off), so you can check for things like buffer underruns. This shouldn't normally be necessary, though.

5. Check your CD. Note that different CD-ROM drives will have different degrees of error depending on how well they cope with the particular type of low reflectivity on your CD-R.

The way I recommend that you do this is to run an md5sum on the image and the data on CD: this ensures (with overwhelming probability) that the CD is an exact duplicate of your image.

Summing the image is easy:

$ md5sum myfile
Summing the CD-ROM is more difficult, because you'll get the end of the last sector read out, followed by I/O errors, if you try md5sum /dev/cdrom. There is a cdcheck utility in /homes/rrw1000/public/cdcheck2 that takes the name of the CD-ROM device and the number of bytes in the image and does an md5sum for you:

$ /homes/rrw1000/public/cdcheck2/cdcheck /dev/cdrom length

where /dev/cdrom stands for your CD-ROM device and length is the size of myfile in bytes: it will typically be somewhere in the 500000000-650000000 byte range.

The executable above is compiled for Lab linux systems: if you need it for another architecture, copy the contents of /homes/rrw1000/public/cdcheck2, and do gcc -o cdcheck -g -O2 *.c -lm to build it.

As it sums the file, cdcheck will give you a running bandwidth in k/s. On modern CD-ROM drives, this will drop as the disk is scanned from the inside to the outside, but any sudden `dips' are indicative of large numbers of retried reads, and hence a slightly dodgy disk: you may also find warnings about this in your kernel logs.

If the sums out of md5sum and cdcheck agree, your disk should be fine. If they don't, it's likely corrupted - contact rrw if you want to investigate further.
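
On systems where cdcheck is not to hand, the same comparison can be approximated with standard tools. This is a sketch only, not a description of how cdcheck works: it assumes md5sum and a head that accepts -c (as GNU head does), and it gives no running bandwidth figure. The function names are made up for illustration:

```shell
# Print the MD5 sum of a whole image file.
# usage: image_md5 IMAGE
image_md5() {
    md5sum < "$1" | cut -d' ' -f1
}

# Print the MD5 sum of the first LENGTH bytes of DEVICE, so that whatever
# lies beyond the end of the image on the CD is not included in the sum.
# usage: device_md5 DEVICE LENGTH
device_md5() {
    head -c "$2" "$1" | md5sum | cut -d' ' -f1
}

# Succeeds if the first bytes of DEVICE have the same MD5 sum as IMAGE.
# usage: cd_matches_image IMAGE DEVICE
cd_matches_image() {
    length=$(wc -c < "$1") || return 1
    [ "$(image_md5 "$1")" = "$(device_md5 "$2" "$((length))")" ]
}
```

For example, cd_matches_image myfile /dev/cdrom && echo "CD matches image", where /dev/cdrom stands for whatever your CD-ROM device is called.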


Audio and multi-track data CDs

Each file given to cdrecord is interpreted as a track, so if you want to write multi-track CDs (perhaps an ISO9660 track followed by tar followed by ext2), do something like:

$ cdrecord dev=1,0 speed=2 -v track1.iso track2.tar track3.ext2
... but note that most systems will only mount the first track on the CD without special software.

You can record Red Book audio tracks by putting them in CD-DA format: 16-bit 44.1Ksample/s stereo data with byte order MSB left, LSB left, MSB right, LSB right, ..., in a file that is an integer multiple of 2352 bytes long. You can also record .wav or .au files (in stereo 16-bit 44.1kHz, naturally): see the cdrecord manpage for more. Audio tracks should be prefixed with '-audio', and data tracks with '-data':

$ cdrecord dev=1,0 speed=2 -v -data track1.iso -audio track2.au -data track3.iso
If you want to do this, please read the cdrecord manpage (in /local/scratch/rrw1000/cdrecord/cdrecord-1.6 on snow if not on the Lab's systems). Mail rrw if you have any problems.
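
The length requirement on raw CD-DA files follows from the sector size: at 44100 samples/s, 2 channels and 2 bytes per sample, one second of audio is 176400 bytes, which is exactly 75 sectors of 2352 bytes. A small check along those lines (the function name is made up for illustration, and it does not look at the byte order):

```shell
# Print the playing time in whole seconds of a raw CD-DA file, or fail
# if its length is not a whole number of 2352-byte sectors.
# usage: cdda_seconds FILE
cdda_seconds() {
    size=$(wc -c < "$1") || return 1
    size=$((size))
    if [ $((size % 2352)) -ne 0 ]; then
        echo "cdda_seconds: $1 is not a multiple of 2352 bytes" >&2
        return 1
    fi
    echo $((size / 2352 / 75))          # 75 sectors per second
}
```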


Richard Watts <Richard.Watts@cl.cam.ac.uk>
Last modified: Wed Dec 9 19:35:51 GMT 1998