Skip to main content

On Defining Messages

“Defining Message Formats” is the title of a message posted on the Service Oriented Architecture mailing list [1]  which attracted a lot of attention.  The post summarizes  the dilemma faced by solution architects when they have to define a service interface:

1. Base the internal message format on the external message standard (MISMO for this industry). This message set is bloated, but is mature, supports most areas of the business, and is extensible for attributes specific to this company and its processes.
2. Create an XML-based message set based on the company's enterprise data model, which the company has invested a great amount of money into, and includes a very high percentage of all attributes needed in the business. In a nutshell, we've generated an XML Schema from ER Studio, and will tinker with that construct types that define the payloads for messages.
3. Use MISMO mainly for its entity definitions, but simplify the structure to improve usability. We benefit from the common vocabulary, but would ultimately define dozens of transactions over the coming years that MISMO has already defined.

I am arguing in favor of a variant of option 2 and provides details of how it can be implemented.

First, contrary to some architects recommend, I don’t think that an industry standard is a good option for internal integration. These standards tend to be bloated, ambiguous semantically or too generic. But most importantly, they tend to have one, grand view of the land when in fact different actors see entities In different ways. For example, in telecom, in the service layer a Service entity “uses” a Resource while in the resource layer, a Resource “hosts” Service(s). The TMF model models the first relationship but not the second.

Option 2 relies on the fact that the organization has an enterprise data-model. Sometimes there is a formally defined enterprise data-model with a significant effort allocated to maintenance and governance. But more often there is no formal enterprise data-model, instead the collection of all the data-models is the “de facto” enterprise data-model.

Data modelling can be done at several levels of abstraction: conceptual (or ontology), logical and physical. Another way to categorize the data model is by scope: often different sub-systems have different data-model. I call a “shared data-model” the intersection of the sub-system data-models. The “shared conceptual data-model” is in the realm of the enterprise architecture and business architecture. It translates in a shared logical data-model which is used by integration architects to create the service interfaces (XSD, Java or C# classes).

The “shared logical data-model” is the starting point for the definition of messages exposed by a service. But how to get from the data-model to the message format?

In the same message thread, Michael Poulin suggests [2] that each component can/should have a “personalized” view of the shared logical data-model. Based on the recommendation, the message format is the physical data-model (XSD) of the logical view used by the component. This is a sensible approach for message definition and often used in practice. The challenge is creating views of a logical OO model. While the concept is very common to the database area, it is not used at all in the OO domain.

The decision on how to define the API is project specific. In my experience I found that starting from a shared logical data model of the critical applications in the ecosystem to be a practical and extensible approach.

1 Defining Message Formats.
2 Re: [service-orientated-architecture] Defining Message Formats.

Popular posts from this blog

Performance Testing a New CRM

Performance testing  is challenging, frustrating, often underestimated typically getting attention only after an incident. How did we the performance test and what did we learn during the development and implementation of  web-services for a new CRM system?

What Can Category Theory Do for Me?

Category Theory is one of the hot topics in computer science. There are many blogs, youtube videos and books about it. It is an elusive subject, with the potential to be the ultimate unifier of everything (math, quantum physics, social justice), to allow us to write the best possible programs and many other lofty goals. I decided to explore the field and see how it can help me. Below is a summary of 12 weeks of reading and watching presentations about Category Theory. It took me some time to select the study material. In the end I decided to use David Spivak’s book (Category Theory for Sciences, David Spivak, 2014) and the youtube recording of a 2017 workshop. These provided both a rigorous approach and a lighter version in the youtube videos. In parallel we explored other sources for a more practical perspective. The CTfS book is a systematic introduction to Category Theory, with definitions and proofs. I liked that, it is a solid foundation. The examples are mostly from math (Set,