Re-use is a mantra for a lot of software developers. The key idea is that where the same functionality is used in different places, you should centralise and share the necessary code so that it only has to be maintained in one place. This means that when a change needs to be made, that change is automatically applied in all the places that the functionality is used. This is well-known stuff; I'm not telling you anything new here.
Robustness is something that nature is very good at. Animals and plants show great adaptability to new and changing conditions. Nature makes use of re-use as a technique: our DNA has much in common with that of most of the animals on the planet. However, there is a big difference here from software re-use. Nature does not centralise the DNA. Nature copies-and-pastes DNA from one animal to the next, and then lets it change from there. If a species of animal suddenly evolves wings, that doesn't mean that you or I will suddenly evolve wings, because we are now working from separate copies of the DNA.
Am I suggesting that nature has it right and software developers have it wrong? That instead of centralised re-use, we should do copy-and-paste re-use? Well, yes and no. What I am suggesting is that both of these techniques are necessary in software development, and it is important to think about which is the better re-use technique in different circumstances.
There are various ways to implement centralised re-use of software. The traditional way is to share libraries of particular functions. With the rise of object-oriented programming languages, inheritance mechanisms are now a favoured way of sharing functionality. Other re-use mechanisms can be more subtle. For example, sometimes a single XML schema is used to represent a superset of the many messages that can be sent between systems. That is a kind of sharing; sharing and re-use are just two names for the same thing in this context.
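To make that last mechanism concrete, here is a minimal sketch of what such a superset schema might look like. The message names and namespace are invented for illustration; the point is that a single root element, in a single shared schema, covers every kind of message that any of the systems might send.

```xml
<!-- messages.xsd: one schema shared by every system. A single root
     element whose content is a choice over every message type in
     the superset. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:messages"
           xmlns="urn:example:messages"
           elementFormDefault="qualified">
  <xs:element name="Message">
    <xs:complexType>
      <xs:choice>
        <xs:element name="OrderRequest" type="xs:string"/>
        <xs:element name="OrderConfirmation" type="xs:string"/>
        <xs:element name="ShippingNotice" type="xs:string"/>
        <!-- ...and so on, for every message any system sends -->
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```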
So what is wrong with centralised re-use, then? Well, there are a couple of things to watch out for. A centralised library of functions means that you can apply a bug fix in just one place in the code. That's good, right? It would be good, except that centralisation also allows you to introduce new bugs that impact everyone who is sharing that library. When your library is used by mission-critical systems, that can be an unacceptable risk.
Suppose you have 10 applications that share a library, but each only uses 50% of the functionality in the library. They won't all be using the same 50%, of course. There may be a bug in a function that affects only 1 of the systems. If you roll out a new version of the library, you risk introducing new bugs that could affect all 10 systems. Is it worth affecting 10 systems to fix the bug in 1 system? In some cases, it just won't be worth it. This is the kind of situation to look for, where you should follow nature's lead and copy-and-paste instead, at least for the functionality that should no longer be shared.
The rule here is that some of your current shared libraries will need to be forked in future, i.e. copied-and-pasted to produce multiple versions of some of the previously shared functionality, each version catering to a different use. Even without the risk driver for this approach, it is well known that if you have too many separate sets of requirements for a single shared piece of code, what you end up producing is complex and hard to maintain. Your shared code may not be able to be all things to all people. A key reason for centralisation is to simplify maintenance, but you have to watch out for centralised code that has become so complex that it is harder to maintain than a set of separate, simpler copied-and-pasted versions of the code would be.
For the XML world, the single monolithic schema approach to message design has the same issues. If systems have to share a single XML format for many functions, then the format has to change whenever just one of those functions changes, and the change could impact every system unnecessarily. Having many small schemas instead of one large schema may sound like more work to write and maintain, but the payoff comes when you can change the schema for just one system without having to touch the schemas that affect the other systems. That means you avoid having to re-test all of the other systems, and you avoid introducing any extra operational risk to those systems. Testing is never a 100% catch-all, so if you change anything, you can't avoid the risk that you will introduce a bug that brings down a production system, possibly with expensive consequences.
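By contrast, here is a sketch of the copy-and-paste end of the spectrum: each message gets its own small schema in its own namespace (names again invented for illustration). A change to this message now touches one file, and only the systems that actually exchange this message need re-testing.

```xml
<!-- order-request.xsd: a small schema owned by the one exchange
     that uses it. Changing it leaves every other schema, and
     every other system, untouched. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:order-request"
           xmlns="urn:example:order-request"
           elementFormDefault="qualified">
  <xs:element name="OrderRequest">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ItemCode" type="xs:string"/>
        <xs:element name="Quantity" type="xs:positiveInteger"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```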
There is clearly a balance that you need to find. My personal solution is to start by sharing code, because centralisation is good wherever different systems really do want the same thing, and really do want the same bugs fixed at the same time. As soon as there is too much divergence between the requirements of the different clients of the code, that is when you need to fork the "DNA" and do some managed copying-and-pasting.
Some of these issues were touched upon in a presentation I gave on "Schema design when the goal is loose coupling" at the recent XML Summer School in Oxford. They will also be part of my talk at XML 2005 in November.
Why mention this at all? Simply because in my day-to-day consultancy and standards work, I continually come across people who insist on sharing code or using a single monolithic schema, even when the complexity of doing so is clearly hurting them. Sometimes you have to put those textbook mantras aside, and really consider all of your options.