7th International Python Conference
Trip report by Frank Stajano
Having had not one but three contributions accepted, I had no excuse not to attend the 7th International Python Conference in Houston (actually League City), Texas. About 100 people. Mostly white males: women, blacks, asians were all < 5% each, which I found slightly disappointing -- I hoped that Python might cover a broader spectrum. No suits. "Strong refereeing process" for the papers, said the program chair, compared to previous years.
For those who don't know yet, Python is a high-level scripting language like Tcl and Perl. It is elegantly designed and very readable (thanks in part to its indentation-based syntax, à la Occam). Like other scripting languages it offers powerful primitives (sort, regular expressions...) and data structures (lists, associative arrays...). As I argued in the final section of the paper I presented, it is better suited than Tcl to the development of large software projects thanks to its object orientation, its support for modularity and embedded documentation, and particularly its rich standard library, which makes it even more of a practical very-high-level language. Compared to Perl, it is just as flexible and versatile, but it's easier on the brain: you can read the source code and it makes sense, and the language has a coherent design. Biased opinion, of course, but what else did you expect?
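For readers who have not seen the language, here is a minimal sketch of those primitives at work. It uses present-day Python syntax, which differs slightly from the 1998 release, and the sample sentence and the function name `word_frequencies` are made up for illustration. It combines a regular expression, an associative array and a sort:

```python
import re

def word_frequencies(text):
    """Return (word, count) pairs, most frequent first, ties alphabetical."""
    counts = {}  # associative array (a dict in Python terms)
    # Regular expression: pull out runs of letters and apostrophes.
    for word in re.findall(r"[a-z']+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    # Sort by descending count, then alphabetically by word.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

sample = "the cat and the hat and the bat"  # invented sample text
for word, count in word_frequencies(sample):
    print(word, count)
```

The whole thing fits in a dozen lines and needs nothing outside the standard library.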
Python is younger, less used and less known than Tcl. This is something
that I palpably perceived at the conference, much more so than in the
newsgroup. The user community is quite small, still mostly geeks and very
few big players. Still, I rate it as a good penny-share bet: the language
has the potential to grow into something very successful. It has a more
bazaar-like slant than Tcl, which means that it improves at a faster rate
(if you haven't read Eric Raymond's eye-opening paper "The
cathedral and the bazaar" on the development patterns of free
software, do so: it provides unique insights).
As I discovered at the conference, on the organisational front Python has
the backing of Albert Vezza (ex MIT - Laboratory for Computer Science), now
Guido van Rossum's boss at CNRI. Vezza was behind the X Consortium at MIT
as well as behind the W3C -- he was the one who hired Tim Berners-Lee out
of CERN. Past history shows that he knows how to successfully fertilise,
support and fund major computing projects in the spirit of Open Source. The
fact that he has now started the Python Consortium (so that
Guido can feel free to hack Python all day instead of having to do other
bread-earning stuff) is a good thing for Python's future.
Speaking of Guido: to properly pronounce the Master's name, first render the "G" as the "throat spit" phoneme (like the X in TeX if you follow Knuth's own instructions, as in "the terminal may become slightly moist"), then ignore the "u" and finally say "ido". But few do, even among his colleagues... Clearly for most people there's too much inertia against using a phoneme that's not in the set from one's own language.
The first day was devoted to tutorials: one (out of four parallel tracks: Intro, Numeric, CORBA, COM) in the morning and one (out of three: SWIG, JPython, XML) in the afternoon. I attended COM and XML respectively, but later bought the course material for all the others too, from the leftovers pile. ORL people: you're welcome to borrow any that you're interested in, as long as you promise to return them.
The COM tutorial by Python celebrity Christian Tismer (webmaster of Starship Python) taught us an immediately practical and useful skill -- how to control the Microsoft Office applications from a Python script. In theory, any other Win32 application can be so controlled, including Netscape, Internet Explorer, Photoshop, Corel Draw etc. In practice, though, the API exported to the COM level (and thus available for scripting) is rarely documented in the application unless one buys that application's developer's kit. For the programs in the Office suite, one way to find out what the available methods are is to go into Visual Basic for Applications. Christian is working on an inspector tool to be used on other programs: he will send it on a CD, together with extra goodies, to the people who were at his tutorial. As an alternative, one can have a peek at the automatically generated Python interface module, though some dynamic stuff may be missing. As an example, the following code (nothing omitted!) will start up a copy of Word, make a document in it and write some text in the document.
>>> import win32com.client
>>> word = win32com.client.Dispatch("Word.Application")
>>> word.Visible = 1
>>> doc = word.Documents.Add()
>>> doc.Range().Text = "Hello world"
Lots more info along these lines for Excel and Access. Overall a very worthwhile tutorial giving practical and useful skills.
The XML tutorial by Paul Prescod and Sean McGrath was mostly an introduction to XML (extensible markup language, HTML's big brother or more accurately SGML's little brother) and marginally an overview of what the available Python XML tools let you do. (This suited me very well, by the way -- I wouldn't have liked a fully pythonesque tutorial assuming full prior knowledge of XML.) Paul is co-author of The XML handbook; Sean is the author of XML by example: building e-commerce applications and of PARSEME.1ST: SGML for Software Developers, as well as an invited expert to the W3C's XML SIG.
XML is a flexible textual markup language based on nested tags. It aims to be a portable (OS-, architecture- and language-independent) file format for structured documents and to become a transparent and ubiquitous infrastructure for carrying documents, just like ASCII, only at a slightly higher level. For example, a word processor could save the document and its structure in XML instead of its proprietary format.
XML plus XSL (rendering) can replace HTML. Initially, browsers won't interpret XML directly but you can still hold the source in XML and render it to HTML before making it available (or dynamically on the server). The same XML source can generate HTML, LaTeX, text, PostScript, PDF and so on.
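To make the "one source, many renderings" idea concrete, here is a minimal sketch in present-day Python. It is a hand-written stand-in for the rendering step, not XSL, and the element names (`report`, `title`, `para`) and the functions `to_html` and `to_text` are invented for illustration:

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample document; the tag names are assumptions, not a real DTD.
SOURCE = """<report>
  <title>Trip report</title>
  <para>Python comes with batteries included.</para>
  <para>XML &amp; XSL can replace HTML.</para>
</report>"""

def to_html(root):
    """Render the document as a fragment of HTML."""
    parts = ["<h1>%s</h1>" % escape(root.findtext("title"))]
    parts += ["<p>%s</p>" % escape(p.text) for p in root.findall("para")]
    return "\n".join(parts)

def to_text(root):
    """Render the same document as plain text with an underlined title."""
    title = root.findtext("title")
    lines = [title, "=" * len(title)]
    lines += [p.text for p in root.findall("para")]
    return "\n".join(lines)

root = ET.fromstring(SOURCE)  # parse once, render as often as needed
print(to_html(root))
print(to_text(root))
```

The source is parsed once, and each output format is just another function over the same tree.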
The DTD (document type definition, which you can define yourself) specifies what is valid in a given XML document. It defines a comparatively simple grammar (hand-writable and machine-verifiable, but much easier than YACC's stuff) against which the document can be checked and declared valid or invalid. Contrast this with HTML (lots of proprietary extensions), Word's RTF (on incorrect documents, Word crashes instead of reporting an error) and TeX (too complex: you can't even parse it without the full TeX engine).
The language offering the best support for XML is currently Java. The second-best is Python. If you are into SGML/XML you want to know about James Clark (not a Python person yet) who has written the best parsers that exist, is technical lead for the W3C's SGML activities and is generally regarded as God by Sean. Python tools exist (included in the newest distributions, of course) that eat an XML document and parse it, returning a tree and/or invoking user defined callbacks whenever certain elements of the document occur.
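The two parsing styles mentioned above (build a tree, or fire callbacks as elements go by) can be sketched as follows. This uses the `xml.dom.minidom` and `xml.sax` modules of today's standard library, which are not necessarily the tools shown at the tutorial; the sample document and the names `speakers_from_tree`, `TalkHandler` and `speakers_from_callbacks` are invented for illustration:

```python
import xml.sax
from xml.dom import minidom

# Invented sample document for illustration only.
DOC = ("<talks>"
       "<talk speaker='Raymond'>Homesteading the noosphere</talk>"
       "<talk speaker='Manheimer'>Mailman</talk>"
       "</talks>")

def speakers_from_tree(xml_text):
    """Tree style: parse the whole document, then walk the result."""
    dom = minidom.parseString(xml_text)
    return [t.getAttribute("speaker") for t in dom.getElementsByTagName("talk")]

class TalkHandler(xml.sax.ContentHandler):
    """Callback style: the parser calls startElement for every opening tag."""
    def __init__(self):
        super().__init__()
        self.speakers = []

    def startElement(self, name, attrs):
        if name == "talk":
            self.speakers.append(attrs.get("speaker"))

def speakers_from_callbacks(xml_text):
    handler = TalkHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.speakers

print(speakers_from_tree(DOC))
print(speakers_from_callbacks(DOC))
```

The tree style is more convenient for small documents; the callback style never holds the whole document in memory, which matters for very large inputs.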
Overall, a good tutorial giving an understandable overview of the XML ideas (and alphabet soup). XML looks like a technology to watch, and possibly even use, although early adopters may have to spend time writing tools (to pull all the bits together) instead of contents.
Other tutorials, which I couldn't attend, introduced among other things FNORB (a CORBA ORB written in Python) and JPython (Python reimplemented in 100% pure Java, with amazing integration and access to the Java library from Python). There was an invited talk on the latter the next day (part of the reason why I went to another tutorial), which was very interesting. More on that below.
The paper presentation sessions occupied most of the next two days. There were 13 refereed papers and 4 invited talks. The invited talks were all great -- shame that the organisers didn't ask the invited speakers to write them up so that they would also go in the proceedings!
At any rate, you'll find the first (and, to me, best overall) talk written
up on the author's own web site: Homesteading the
noosphere by Eric Raymond was
a fascinating exploration of the reasons that motivate people to give
software away and of the social customs that the community evolved to
protect and reward this gift culture. The presentation was very interactive
and ESR said he would incorporate some of this feedback (notably Paul
Dubois's intuition on "young bachelor predators") in a revised version of
his paper (which he regards as continually evolving anyway).
My own talk was, I'm pleased to say, very well received by the
audience. Out of an attendance of about 100, at least 30 came to me
individually over the next couple of days, after the talk itself or at
meals or informal gatherings, to say that they had found it interesting and
entertaining. And my meme of "Python is great
because it comes with batteries included" caught on pretty fast, with other
pythoneers adopting it there and then and reusing it as now-established
jargon (it's now even the title of Cameron Laird's
SunWorld OnLine column
covering the conference). ESR later told me that he had been one of the
reviewers of my paper and had recommended that my talk be given a
"prime-time" spot because he felt it was going to be good. Cool! (You can read the paper online in HTML or PDF, or read
the slides from the talk. If you're
interested in the short message and websucking stuff, as opposed to the Tcl
vs Python and batteries included stuff, you will also want to read my other paper for ACM Mobile Computing and
Communications Review, which presents the system's architecture in greater
detail.)
Contrary to my expectations, nobody took up the hacker's selfgen challenge to produce a Python program that would generate a Tcl program that would generate the original Python program. Guido did, however, suggest the variant of writing a self-generating program whose text is at the same time legal Python and legal Tcl.
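The building block of any such challenge is a self-reproducing program (a quine). As a minimal sketch of the single-language case only, not a solution to the Python-to-Tcl-and-back challenge, the two-line program held in `QUINE` below prints its own source. The helper `run_and_capture` is a hypothetical name introduced here just to check that claim:

```python
import contextlib
import io

# Source text of a classic two-line Python quine. Written out, it reads:
#   s = 's = %r\nprint(s %% s)'
#   print(s % s)
QUINE = "s = 's = %r\\nprint(s %% s)'\nprint(s % s)\n"

def run_and_capture(source):
    """Execute source in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()

# Compare the program's output with its own source text.
print(run_and_capture(QUINE) == QUINE)
```

The trick is that `%r` re-inserts the string's own quoted representation into itself; the two-language version has to play the same game across two sets of quoting rules.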
The next talk, by Ken Manheimer, presented Mailman, a versatile mailing list manager written entirely in Python which has been adopted as the official GNU mailing list software. The administrative interface, both for the list owner and for the individual users, is web-based and looks very pretty and functional. The software looks quite interesting and I intend to give it a try.
Sean McGrath gave substance to his previous tutorial by presenting a real-life case of XML and Python at work: capturing multi-gigabyte legacy data (Proceedings of the Irish parliament, 600 volumes, 125 feet of shelf space) into a form suitable for electronic publication.
Guido van Rossum talked (with great modesty) about what he sees in the
future of Python. Dismissing the title as a joke (nobody can predict this
far ahead in computing), he gave a rough schedule of the next few releases:
Tkinter threading is ok on Unix, broken on Windows: "I understand exactly why it hangs, but I have no clue on how to fix it (without hacking Tk to make it work, which I don't want to do)".
What's the point of reimplementing Python in Java other than
buzzword compliance? Especially since it will probably run like slow
molasses, if at all? Well, let Jim
Hugunin have a go at it and you'll be impressed. What you get is access
to all the libraries written for Java and portability to any platform that
has a JVM (including browsers, so yes, you can write applets in Python and
they will run without the need for a plugin). But what is most amazing is
the raw speed of Jim's code: not 1/50 or 1/10 but a very respectable half
the speed of CPython!
Another interesting aspect of this work is that it provides Python with another plausible implementation, which helps immensely in defining what is language law and what is implementation accident, something that Python needs to mature, as a real language, beyond the stage of an ad-hoc brilliant hack. Finally, some of the optimisation techniques pioneered by Jim (including ML-style type inference) point the way to a potential >10x (10x, not 10%) performance improvement, possibly at the price of minor language changes towards partial static typing.
As part of the Stajano World Tour 1998 I had to leave the conference early
to give a presentation on another continent for the publication of my second book
on Walt Disney comics, so I missed the last couple of presentations
(including an intriguing talk on the Mayan calendar) and, to my greatest
disappointment, the developers' day.
For the demos session, everybody was crammed into a comparatively small room. Exhibitors were given either a table or some wall space, depending on whether they had a demo or a poster, and visitors took turns looking at the stuff. I presented two posters: Nothing better than a Python to write a Serpent and VCK: the visual cryptography kit (the links let you download the abstracts that appeared in the proceedings, the posters themselves and the related free software). I spent the whole session demonstrating and explaining the two projects to interested groups of visitors, so I didn't get much of a chance to properly see the work of the other exhibitors, which was a shame.
I did however catch a glimpse of Jason
Asbahr's amazing Beyond demo, a video game framework based on a 3D
virtual world similar to a medieval DOOM on steroids, in which the game
logic was controlled by Python. Having been heavily involved in DOOM stuff for years in a
previous life, I felt my jaw drop.
I also had a peek at Jeff Bauer's demo of PythonCE (written by Brian Lloyd, who wasn't there) on various palmtops. Exciting! Exciting! (I've been carrying a palmtop in my pocket for years, though I'm trying to stay away from the beasts right now because of RSI problems.) And Jeff added a remote interpreter so that, as a developer, you can work on a proper keyboard and screen while genuinely running stuff on your little CE machine (not just under emulation). He also started a mailing list.
Meeting all those great Python people face to face was probably the best reward for coming. I collected a neat little stack of business cards, including the brilliant fanfold ones from Håkan Karlsson and Fredrik Lundh of Secret Labs / Pythonware, makers of PIL, edible little blue dolphins and many other fine things...
Back to Frank Stajano's home page at ORL or at the University of Cambridge