One thing I've done a few times for banking clients is to build systems for quality control and release of large sets of XML Schemas. Just as you check code before a release - compile it, test it, run style checks over it - you also need to do the same kind of checking for XML Schemas. You test that they are valid Schemas, you regression test them using XML examples, you run style checks over them, and perhaps also generate code or other artefacts from them. What you do, whether it's for Java/C# or whether it's for XML Schemas, is to check some kind of IT resource using some set of rules. You can call it validation, but that's just a long word for checking.
If you are checking XML documents using a set of structural/formatting rules defined by an XML Schema, you call it validation. People are used to validators that check XML and log errors, just as they are used to compilers that check code and log errors. It's a very familiar, common model for how such tools work. However, I've found that model to be a problem when I'm building XML Schema checking frameworks. Why?
One reason is that, with checks like style checks, today's disallowed style is tomorrow's allowed style, and vice-versa. As organisations develop their XML Schemas over time, they adjust their style checks. Type extension or substitution groups might be banned today and allowed tomorrow. To build a flexible checking framework, you want to be able to quickly change the interpretation or the severity of a particular feature or style, without rewriting the code that detects it. This is why I found it best to break things up into three stages - "Interrogate, Report, Act".
- Interrogate
- Understand the resource(s) you are checking, the structure, the formatting, the style, whatever you need to understand. Don't make any judgements at this stage - the judgements are the things you need to be able to change, so you don't want to embed them into the interrogations.
- Report
- Put the results of your interrogations into a common reporting format. If you have XML Schema validation results, Schematron validation results, ad-hoc XQuery/XSLT validation results, and Schema compiler validation results (as an example), they won't all be in the same format. Some might be XML, others will be text. Get them into a common format for reporting. That's important so that you can slice and dice your interrogation results, and display them in a consistent way that gives developers, testers and managers the most appropriate summaries. It's also important so that you can easily add new results without impact on your presentation and drill-down code.
- Act
- Once all interrogation results are in a common reporting format, make your judgements and perform any consequent actions. Throw exceptions, log errors or warnings, or choose to ignore particular results if that is what the system's users have configured (there can be good reasons for ignoring particular test results for particular resources - but for sanity, make sure you capture the reasons for doing so in the configuration information).
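To make the three stages concrete, here is a minimal sketch in Python. It is an illustration only, not code from any client system: the `Finding` record, the `interrogate`, `summarise` and `act` functions, the feature names, the policy layout and the sample schema are all assumptions invented for this example. The interrogation only records which features a schema uses, the record type is the common reporting format, and the judgement lives entirely in a policy that can be changed without touching the other two stages.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from dataclasses import dataclass

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical sample schema, inlined so the sketch is self-contained.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="payment" type="xs:string"/>
  <xs:element name="wire" type="xs:string" substitutionGroup="payment"/>
  <xs:complexType name="Base"><xs:sequence/></xs:complexType>
  <xs:complexType name="Derived">
    <xs:complexContent>
      <xs:extension base="Base"/>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>"""


@dataclass(frozen=True)
class Finding:
    """Report: one common record shape for every kind of interrogation."""
    source: str    # which interrogation produced the finding
    feature: str   # what was observed, with no judgement attached
    resource: str  # which schema it was observed in
    detail: str


def interrogate(resource, schema_text):
    """Interrogate: note which features are used, without judging them."""
    root = ET.fromstring(schema_text)
    findings = []
    for el in root.iter(XS + "extension"):
        findings.append(Finding("style-scan", "type-extension", resource,
                                "base=" + el.get("base", "")))
    for el in root.iter(XS + "element"):
        if el.get("substitutionGroup"):
            findings.append(Finding("style-scan", "substitution-group", resource,
                                    "element=" + el.get("name", "")))
    return findings


def summarise(findings):
    """Report: slice the common format, here as a count per feature."""
    return Counter(f.feature for f in findings)


def act(findings, policy):
    """Act: apply the configured severity to each finding."""
    outcome = {"error": [], "warning": [], "ignore": []}
    for f in findings:
        rule = policy.get(f.feature, {"severity": "warning"})
        # An ignored finding must carry the reason it is being ignored.
        if rule["severity"] == "ignore" and not rule.get("reason"):
            raise ValueError("no reason recorded for ignoring " + f.feature)
        outcome[rule["severity"]].append(f)
    return outcome


# Hypothetical policy: this is the part that changes from one release to the next.
POLICY = {
    "type-extension": {"severity": "error"},
    "substitution-group": {"severity": "ignore",
                           "reason": "example: agreed exception for this schema set"},
}

findings = interrogate("payments.xsd", SCHEMA)
summary = summarise(findings)
outcome = act(findings, POLICY)
```

If type extension is allowed tomorrow, only the `POLICY` entry changes; `interrogate` and the `Finding` format stay exactly as they are. A real framework would also convert the output of Schema validators, Schematron and schema compilers into the same record type before the Act stage sees it.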
This is the layering of concerns that has worked well for me. I've used it for XML and XML Schemas, but it is a general approach that can be applied to any kind of validation or checking process. By not making too early a judgement of thumbs up or thumbs down, you will have a more flexible checking framework that is more easily configurable and extensible than it would otherwise have been.