
Evolving DSLs


October 17 2006

Recently I have heard several people musing about the difficulties of evolving a DSL. Actually, "musing" is rather polite: I have heard several people arguing vehemently that creating a DSL inevitably leads to doom when its requirements change. This is a very interesting point because, in my opinion, the ability of a DSL to evolve is critical.

Paul Hudak got a lot right in a sequence of papers he published on DSLs in the mid-1990s (including this web-friendly version). Specifically, he notes that DSLs generally start by tackling a small problem, then need to grow as more aspects of the problem are tackled. [He also noted, and I think he may have been the first to articulate this so eloquently, that DSLs eventually tend to evolve into badly designed general purpose languages, but that's beyond the scope of this entry.] The way in which this evolution of requirements happens is almost always unpredictable, because it is the act of building and using the DSL that gives users the insight to change their requirements. In my mind, Hudak is entirely correct; I believe that the terms "DSL" and "need to evolve" go hand in hand.

The question one has to ask oneself is thus fairly simple: are DSLs hard to evolve? I think the answer is that, yes, today DSLs are hard to evolve. Why is this? Well, there are two types of DSL in common use. The first is standalone (often called external) DSLs such as Make. The second is integrated (often called internal) DSLs such as those that Hudak talks about, or those that are frequently talked about in conjunction with Ruby. These two types of DSL are fundamentally different. Standalone DSLs are flexible, but an awful lot of work to create because each is, in a sense, a complete implementation of a mini programming language. Integrated DSLs aren't very much work to create, but they tend to have an unhealthy coupling to their host language, which imposes many limitations on both what they can express and how they can express it. The difference between most standalone and integrated DSLs is so severe that I sometimes wonder if the umbrella term DSL is entirely helpful, but that's an argument for another time.
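To make the distinction concrete, here is a minimal sketch in Python of the same toy "build rules" language implemented both ways. Everything in it is hypothetical and chosen purely for illustration: the rule syntax, the names parse_rules and Target, and the use of the << operator are my assumptions, not taken from Make, Hudak's papers, or any Ruby DSL.

```python
# Hypothetical toy language for declaring build dependencies, implemented twice.


def parse_rules(text):
    """Standalone style: we choose the syntax, so we must also write the parser."""
    rules = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if ":" not in line:
            raise SyntaxError(f"line {lineno}: expected 'target: deps'")
        target, deps = line.split(":", 1)
        rules[target.strip()] = deps.split()
    return rules


class Target:
    """Integrated style: borrow the host language's parser via operator overloading."""

    registry = {}  # shared table of target name -> list of dependency names

    def __init__(self, name):
        self.name = name
        Target.registry.setdefault(name, [])

    def __lshift__(self, other):
        # `app << lib` is read as "app depends on lib".
        # Returning self lets `app << lib << util` add two dependencies to app.
        Target.registry[self.name].append(other.name)
        return self


standalone = parse_rules("""
# target: dependencies
app: lib util
lib: util
util:
""")

app, lib, util = Target("app"), Target("lib"), Target("util")
app << lib << util
lib << util
integrated = Target.registry
```

Both versions build the same table of dependencies. The standalone version is free to use whatever notation it likes, at the cost of a hand-written parser and error reporting. The integrated version needs no parser at all, but it can only use notation that Python already accepts, and the meaning of << here is a convention the reader has to be told about.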

The irony is that standalone and integrated DSLs have one thing in common: their implementations are typically hard to change. The reasons for this, though, are rather different. Standalone DSLs have fairly large implementations, and are subject to all the problems that any non-trivial implementation suffers from; for example, the interaction between components is often complex and brittle. Integrated DSLs, on the other hand, are often small but rather hackish in nature; they frequently rely on stretching somewhat obscure language features to near breaking point. At some point, either the language feature can be stretched no further or, worse, the whole hackish facade comes crumbling down. Please don't get me wrong: I enjoy a cunning hack as much as the next man, but cunning hacks are not what I want to base a whole approach on.

In my opinion, for as long as DSLs are implemented in one of these two ways, they will always be hard to evolve. Therefore I agree with those who point out that, at the moment, implementing a DSL is an almost guaranteed way of giving oneself huge problems when evolution rears its inconvenient head. My argument is that the approaches I've outlined above are fundamentally flawed. What one wants is a way of implementing DSLs that needs relatively little code but doesn't rely on abusing language features. Since small, well-written programs are generally considered fairly evolvable, this should give one a reasonable chance of having DSLs that are evolvable. This has been one of my goals in the new version of Converge; it's far too early to tell if that's been achieved yet, but I'm fairly convinced already that this approach is, at the very least, no worse than the traditional approaches.
