
On Defining Messages

“Defining Message Formats” is the title of a message posted on the Service Oriented Architecture mailing list [1] which attracted a lot of attention. The post summarizes the dilemma faced by solution architects when they have to define a service interface:


1. Base the internal message format on the external message standard (MISMO for this industry). This message set is bloated, but is mature, supports most areas of the business, and is extensible for attributes specific to this company and its processes.
2. Create an XML-based message set based on the company's enterprise data model, in which the company has invested a great amount of money, and which includes a very high percentage of all the attributes needed in the business. In a nutshell, we've generated an XML Schema from ER Studio, and will tinker with that to construct the types that define the payloads for messages.
3. Use MISMO mainly for its entity definitions, but simplify the structure to improve usability. We benefit from the common vocabulary, but would ultimately define dozens of transactions over the coming years that MISMO has already defined.

I argue in favor of a variant of option 2 and provide details of how it can be implemented.

First, contrary to what some architects recommend, I don’t think that an industry standard is a good option for internal integration. These standards tend to be bloated, semantically ambiguous, or too generic. But most importantly, they tend to impose a single, grand view of the domain when in fact different actors see entities in different ways. For example, in telecom, in the service layer a Service entity “uses” a Resource, while in the resource layer a Resource “hosts” Service(s). The TMF model captures the first relationship but not the second.
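The two directions of that relationship can be sketched as follows. This is an illustrative Python sketch with hypothetical class names, not part of the TMF model itself; each layer navigates the same association from its own side.

```python
# Hypothetical classes illustrating the two layer-specific views of the
# same Service/Resource relationship.

class Resource:
    def __init__(self, name):
        self.name = name
        self.hosted_services = []  # resource-layer view: Resource "hosts" Services

class Service:
    def __init__(self, name, resource):
        self.name = name
        self.uses = resource       # service-layer view: Service "uses" a Resource
        resource.hosted_services.append(self)

blade = Resource("blade-7")
voip = Service("VoIP", blade)

# The service layer asks "which resource does this service use?";
# the resource layer asks "which services does this resource host?".
assert voip.uses is blade
assert voip in blade.hosted_services
```

A standard that models only one direction forces the other layer to reconstruct its natural view from the inverse navigation.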

Option 2 relies on the organization having an enterprise data model. Sometimes there is a formally defined enterprise data model with significant effort allocated to its maintenance and governance. More often, however, there is no formal enterprise data model; instead, the collection of all the application data models forms the “de facto” enterprise data model.

Data modelling can be done at several levels of abstraction: conceptual (or ontology), logical, and physical. Data models can also be categorized by scope: different sub-systems often have different data models. I call the intersection of the sub-system data models the “shared data model.” The “shared conceptual data model” belongs in the realm of enterprise architecture and business architecture. It translates into a shared logical data model, which integration architects use to create the service interfaces (XSD, Java, or C# classes).
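The idea of the shared data model as an intersection can be made concrete. The sketch below uses hypothetical attribute inventories for two sub-systems (a CRM and a billing system); the shared model is simply the entities and attributes both sub-systems agree on.

```python
# Hypothetical attribute inventories for two sub-system data models.
crm_model = {"Customer": {"id", "name", "email", "loyalty_tier"}}
billing_model = {"Customer": {"id", "name", "email", "payment_terms"}}

def shared_model(*models):
    """The 'shared data model': entities present in every sub-system model,
    restricted to the attributes every sub-system defines for them."""
    entities = set.intersection(*(set(m) for m in models))
    return {e: set.intersection(*(m[e] for m in models)) for e in entities}

shared = shared_model(crm_model, billing_model)
# Only the attributes common to both sub-systems survive.
assert shared == {"Customer": {"id", "name", "email"}}
```

In practice the intersection is negotiated rather than computed, but the principle is the same: the shared model contains only what the participating sub-systems hold in common.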

The “shared logical data model” is the starting point for defining the messages exposed by a service. But how do we get from the data model to the message format?

In the same message thread, Michael Poulin suggests [2] that each component can/should have a “personalized” view of the shared logical data model. Under this recommendation, the message format is the physical data model (XSD) of the logical view used by the component. This is a sensible approach to message definition and one often used in practice. The challenge lies in creating views of a logical OO model: while the concept of a view is very common in the database world, it is essentially absent from the OO domain.
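One way to approximate an OO “view” is to project a component-specific subset of the shared entity into the message payload. The sketch below is an assumption-laden illustration in Python: the `Customer` class and the billing view are hypothetical, and plain dicts stand in for the XSD-generated payload types.

```python
from dataclasses import dataclass

# Shared logical data model (hypothetical): the full Customer entity.
@dataclass
class Customer:
    id: str
    name: str
    email: str
    credit_limit: float
    segment: str

# A component's "personalized" view: only the attributes the billing
# component needs. Serializing this projection yields its message format.
BILLING_VIEW = ("id", "name", "credit_limit")

def to_billing_message(customer: Customer) -> dict:
    """Project the shared entity onto the billing component's view."""
    return {field: getattr(customer, field) for field in BILLING_VIEW}

msg = to_billing_message(Customer("c-1", "Acme", "ops@acme.com", 5000.0, "SMB"))
assert msg == {"id": "c-1", "name": "Acme", "credit_limit": 5000.0}
```

The benefit is that every component's payload is derived from, and therefore stays consistent with, the single shared model, while no component is forced to carry attributes it does not use.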

The decision on how to define the API is project specific. In my experience, starting from a shared logical data model of the critical applications in the ecosystem is a practical and extensible approach.

References:
[1] Defining Message Formats. http://tech.groups.yahoo.com/group/service-orientated-architecture/message/14783
[2] Re: [service-orientated-architecture] Defining Message Formats. http://tech.groups.yahoo.com/group/service-orientated-architecture/message/14784
