The Muddle of UML
At some point in 2008, I began to wonder whether UML was worth the effort. Maybe it’s taken me longer than the rest of the tech community, and everyone else out there is laughing at me for continuing to use it, but a light bulb finally went on that said: "surely there could be something better?"
This article explores where my frustrations come from. I'll say upfront that perhaps this isn't constructive criticism. And it is criticism: I'm focusing on the bad stuff here rather than any of the good stuff. That's clearly unfair, but hey, this is already going to be a fairly long article. So, if you're still intent on understanding what I think the main problems are in UML, then read on.
Now, don’t get me wrong: I love diagrams. At least as much as the next man, and probably a lot more than the next man is likely to admit. In fact, if I’m the one holding the marker pen in the meeting, the next man will probably be revising downwards his love of diagrams as the whiteboard becomes ever more crowded with multicoloured lines.
If there’s one sure-fire way of taking over a meeting, then it’s to have a diagram. There is something almost supernatural in the way a diagram can hypnotise a group of sane people into thinking about a problem in exactly the way you want them to. Visually, there’s usually little to focus on in a meeting, apart from doodling in the margin or looking for your colleagues’ facial tics, so a diagram hogs all the visual bandwidth once introduced. Introduce the right diagram yourself and all the discussion will tend to use the diagram as the starting point, even if only to disagree with it.
This probably helps explain why UML got so popular in the 90s. Martin Fowler’s entire career seems to be based on his initial success with UML Distilled, and the idea was really simple: let’s introduce a common visual language to the diagrams we throw up on whiteboards, so different sets of developers have something common to work from. However, that’s almost where the problems start.
Problem 1: Mystery Meat
Back in the 90s again, I was introduced to the expression ‘mystery meat’, which described a design flaw in website navigation. The idea was that each link on the site’s home page was an image, and only when you rolled over the image with the mouse pointer would text appear telling you what the link did. This was all achieved with the heavily over-used ‘onMouseOver’ and ‘onMouseOut’ event attributes. Remember those? Those were the days, eh?
You don’t get to use the term “mystery meat” a lot in life, so to be able to invoke it here is simply a joy akin to being reacquainted with a long-lost friend. UML has lots of mystery meat. There are diamonds (filled and unfilled), various different types of arrow (including filled heads and outline heads), dotted lines, stick men, boxes, ovals and so on.
No one in the world knows what any of them mean.
Actually, that’s not true. Many developers have a passing familiarity with some of them, and know that a triangle-ended arrow means inheritance whilst a black diamond means composition. Sadly, John from marketing was away the day they did UML notation, so this immediately puts up a barrier between him and the diagram, and puts him off still further from coming to meetings with the IT guys.
It gets even worse when the IT chaps try to explain that the stick man with ‘Zeus’ written under it actually does not mean a man, or even a god in this context, but actually a legacy mailshot program which was given the name ‘Zeus’ and runs every fortnight to send out the bills. When is a stick-man not a man? When it’s an “actor” which is a different computer system. Great.
UML is full of this stuff, and it seems to pervade the whole community. Alistair Cockburn (who I have a lot of time for) introduced a further notation including fish, clam-shells, kites and clouds for marking up use cases depending on their position in a hierarchy of use cases. Great if you know what it means, but if the MD puts his head around the door during the meeting, some poor sucker is going to have to explain why the whiteboard looks like a child’s picture of a trip to the seaside.
Problem 2: UML doesn’t fit my language
UML was conceived initially for object modelling, and showing the relationships and dependencies between various objects in a system.
That’s all very well, but objects are not created equal in all languages. Before Java 5, for example, there were no generics. That is, a collection could contain anything: cats, sausages, love, Wednesday. Java 5 made some effort to stop this being the case. Until then, however, when I drew a diagram showing that a client can have many open complaints, the "of complaints" part was not something I could enforce in Java without extra code: the diagram could contain more information about the relationship than the code.
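To make the point about collections concrete, here is a minimal sketch. The class names (Complaint, RawClient, TypedClient) are invented for illustration and do not come from any real system, and the code uses modern Java syntax for brevity even where it mimics the pre-Java 5 style. The raw list happily accepts a String, so the "many complaints" rule from the diagram has to be checked by hand at run time; the generic list lets the compiler do it.

```java
import java.util.ArrayList;
import java.util.List;

public class ComplaintsExample {

    // Hypothetical domain class standing in for the "complaint" box in the diagram.
    static class Complaint {
        final String text;

        Complaint(String text) {
            this.text = text;
        }
    }

    // Pre-Java 5 style: a raw List accepts any object at all.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static class RawClient {
        private final List openComplaints = new ArrayList();

        void add(Object anything) {
            openComplaints.add(anything);
        }

        // The "extra code": inspect every element at run time.
        boolean containsOnlyComplaints() {
            for (Object o : openComplaints) {
                if (!(o instanceof Complaint)) {
                    return false;
                }
            }
            return true;
        }
    }

    // Java 5 and later: the element type is part of the declaration.
    static class TypedClient {
        private final List<Complaint> openComplaints = new ArrayList<>();

        void add(Complaint complaint) {
            openComplaints.add(complaint);
        }

        int count() {
            return openComplaints.size();
        }
    }

    public static void main(String[] args) {
        RawClient raw = new RawClient();
        raw.add(new Complaint("Late delivery"));
        raw.add("Wednesday"); // compiles without complaint
        System.out.println(raw.containsOnlyComplaints()); // prints false

        TypedClient typed = new TypedClient();
        typed.add(new Complaint("Late delivery"));
        // typed.add("Wednesday"); // would be rejected by the compiler
        System.out.println(typed.count()); // prints 1
    }
}
```

Even the typed version only captures the element type. The "open" in "open complaints" is still something the diagram says and the declaration does not.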
Conversely, there are things you can express in the code which don’t readily fit into a UML diagram. UML has the notion of packages. Great, are those the same as Java packages? Or Java JARs? Maybe.
What about components? If I’m describing a Swing interface in UML (heaven forbid), should I be using components for my JTable? Now that we’re all using Spring, about half my classes are components. Are these components in the same sense that UML understands them?
So the nub of problem 2 is that if we’re using UML to communicate to our fellow coders, or just people trying to understand what we’ve built, then there is a translation needed: you need to translate out of the native language that you wrote the code in, and turn it into UML, which everyone can read… and then back again.
Unless you had a really great tool which could do it for you, right?
Problem 3: Code Generation Sucks
Imagine my profound disappointment the first time I tried turning my UML diagram into a set of Java classes (or was it database tables? I forget). What a mess. Stub code everywhere, tons of badly named variables, broken Javadocs and crazy nonsensical package structures. This is going to take a while to clean up. Never mind, best get on with it.
The problem with code generation is that you don’t get something for nothing: garbage in equals garbage out. The second problem with code generation is that now you’ve got two versions of everything: the original UML diagram, which contains the model you spent hours crafting, and the Java code, which also contains the model you spent hours crafting… minus a few bits that didn’t make it through because the languages don’t fit (see problem 2). I can’t remember who came up with the really good software concept of “Once and Only Once”, but this clearly violates it, because now it’s twice, it’s in two places, and it’s a hell of a lot less clear in the Java place than it was in the UML diagram.
Oh well, we can’t afford to worry too much about that – there’s a deadline on this after all and I have to introduce a new entity in the diagram that we didn’t think of before… and hey, where have all my code changes gone?
In the early days, this was a common story, with the generative tool randomly blatting your code, either because it couldn’t understand it or because you’d missed some (mystery meat) Javadoc tag telling it to keep it.
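For anyone lucky enough to have missed it, the failure looks roughly like the toy sketch below. It is not the behaviour of any particular tool: the "@keep" marker, the method names and the map-of-method-bodies representation are all assumptions made up for illustration. On each run, the generator rewrites every method in the model as a stub unless the existing body carries the marker.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RegenerationSketch {

    // Hypothetical marker; real tools each had their own conventions.
    static final String KEEP_MARKER = "@keep";

    // Rebuilds the method bodies of a class from the model.
    // A hand-written body survives only if it contains the marker.
    static Map<String, String> regenerate(List<String> modelMethods,
                                          Map<String, String> existingBodies) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String name : modelMethods) {
            String old = existingBodies.get(name);
            if (old != null && old.contains(KEEP_MARKER)) {
                result.put(name, old); // marked: preserved as written
            } else {
                result.put(name, "// TODO generated stub for " + name); // everything else is blatted
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> existing = new LinkedHashMap<>();
        existing.put("sendBill", "/* @keep */ mailer.send(bill);");
        existing.put("closeComplaint", "complaint.setOpen(false);"); // marker forgotten

        // The model gains one new method, so the class is regenerated.
        List<String> model = Arrays.asList("sendBill", "closeComplaint", "escalateComplaint");

        Map<String, String> regenerated = regenerate(model, existing);
        for (Map.Entry<String, String> entry : regenerated.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

Running it shows sendBill intact and the hand-written closeComplaint replaced by a stub, which is the "where have all my code changes gone?" moment in miniature.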
With a few minor exceptions, generated code is usually a disaster, and expecting people to make changes to the generated code is usually an even bigger disaster. But what if we could somehow get the UML model to completely describe the system, or describe it so that there is very little to change after you’ve built the model? Sounds like a great idea…
I'm sure you can't wait until part two to see the remaining issues I have to get off my chest.