SOA and metadata data architecture issues
- Mark Skilton
- Sep 7, 2005
- 7 min read
Data quality and data architecture have a significant impact on the success of SOA.
The integration of data models is what enables the development of services.
Definitions
Service definition
Service between participants
A bus
A contract
Service taxonomy
Enterprise service
Business service
IT service – coarse grained
IT service – fine grained
Definition of a service
Contract
Exposed API
Public Schema
Business logic
Definition of data within that service concept
Data semantics and syntax
Definition of the service function
Definition of business logic
Input and output posting conditions
From a data architecture point of view, SOA services may be message based or document based.
All messages carried by services are described and constrained by the integrated data model, and expressed in XML Schema. This metadata has very high semantic and strategic value because it describes the content and rules of business process interactions.
In message based SOA, the best practice is to achieve optimum efficiency in the development cycle by:
Externalising the schemas
Exposing the meta models
Standardizing
Federating
A to Z of SOA
Insert the top 10 things of SOA
Insert the top 10 things of SOA failure
Performance curve of SOA
Dimensions
Building optimal efficiency in SOA involves a number of competing dimensions:
Establishing a critical mass of services through reuse in the development cycle
Constraining and leveraging the business processes of an organisation by having architects, developers and project managers use a coherent set of XML schemas, and driving all service development from those schemas.
Establish a performance architecture that enables.
Balancing the extensibility of message formats against the point at which they become unmanageable
{insert volume, performance vs SLA targets matrix SOA performance curve}
Granularity
Granularity refers to the identification and management of specific components that can be combined to make services. See Service taxonomy.
With granularity, the impact on performance lies in the balance between defining and managing fine grained IT services and deploying those services through to specific end-point channels, whether a portal, web site, business process engine etc.
The choice of services and the way they are used affect the design and performance of an architecture. The SOA perspective here is that services are a core element of the approach, and that the link between IT services and their use at the end-point channels is in effect the SOA deployment architecture – the services framework composed of different technology layers: data, database, network, integration servers, applications etc. The complexity of the service flows between these layers affects the performance and SLA dynamics of the solution. In SOA terminology, granularity is often defined at design time through meta models. A more difficult challenge is managing and reusing the run-time components of a service. (See the other article on data architecture and SOA.)
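As a rough illustration of granularity, the sketch below composes two fine grained IT services into one coarse grained service that an end-point channel would call. All names (lookup_customer, check_credit, place_order) and the in-memory customer data are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


# Fine grained IT services: each does one small, reusable thing.
def lookup_customer(customer_id: str) -> dict:
    # Hypothetical in-memory stand-in for a customer data service.
    customers = {"C1": {"id": "C1", "name": "Acme Ltd", "credit_limit": 500}}
    return customers[customer_id]


def check_credit(customer: dict, amount: int) -> bool:
    return amount <= customer["credit_limit"]


@dataclass
class OrderResult:
    customer_id: str
    amount: int
    accepted: bool


# Coarse grained service: a single call from the end-point channel
# (portal, web site, process engine) that composes the fine grained ones.
def place_order(customer_id: str, amount: int) -> OrderResult:
    customer = lookup_customer(customer_id)
    accepted = check_credit(customer, amount)
    return OrderResult(customer_id, amount, accepted)
```

Each additional fine grained call inside place_order adds to the service flow between layers, which is where the performance and SLA trade-off described above shows up.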
{insert a conceptual service flow model here}
Coupling
Tight versus loose coupling refers to the degree of connection between different components of the architecture, particularly around message interactions between applications. It is most often discussed in B2B solutions, where loose coupling between trading parties is a typical feature. Coupling refers to the act of joining together and, in this context, to the level of interdependency and flexibility this brings.
Two rules typically affect this:
The level of multiple interaction of a message or component
Where specific interactions are established there may not need to be loose coupling
The complexity of the interaction itself
Where there are complex control and data structures then the level of coupling may be tight in order to minimise the level of
Tight versus loose coupling

Designing for performance in SOA
So does this mean SOA is applicable to all architectural situations? How can performance be achieved in SOA projects?
The assumption here is that yes, it is, in that SOA is first and foremost a strategic paradigm.
Coupling can be seen as referring to features of specific types of service patterns within a SOA framework.
Specifically the design of service components can then be driven through tightly coupled architecture.
Organisationally there are issues of governance when multiple teams of developers and project teams collaborate on developing services. Managing this through a definition of schemas and models, and driving services from these, establishes the model for SOA development management.
The service interaction
Between Client and Server (sender and receiver; producer and consumer)
This is a transaction between the two entities. The transaction needs a definition of its type and of the operation that is to be performed, e.g. create customer account, transfer money, receive order.
There are two main ways to do this:
Interface semantics
The requested activity is encoded in the operation of the server component’s interface
Payload semantics
The requested operation is embedded in the message itself
Interface semantics
Use self-descriptive names, e.g. SaveCustomer(), retrieveCustomer(), transferMoney().
RPC style interfaces provide this type of semantically rich interface
Payload semantics
The requested transaction is embedded into the message
This can be done in two ways
as part of the message header
as part of the application specific payload
This is widely used in MOMs that provide APIs with functions such as MQGET() / MQPUT() or sendMessage() / onMessage() / receiveMessage()
The semantics of these are purely technical.
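The two styles can be contrasted in a short sketch. This is only an illustration under assumptions: the class names, the message layout (a "header" carrying an "operation" field plus a "body") and the handler table are hypothetical, not taken from any particular RPC framework or MOM product.

```python
# Interface semantics: the requested activity is named in the interface itself.
class CustomerService:
    def __init__(self):
        self.customers = {}

    def save_customer(self, customer_id, name):
        self.customers[customer_id] = {"name": name}

    def retrieve_customer(self, customer_id):
        return self.customers[customer_id]


# Payload semantics: one purely technical entry point; the requested
# operation travels inside the message (here, in a header field).
class MessageEndpoint:
    def __init__(self, service):
        self._handlers = {
            "SaveCustomer": lambda body: service.save_customer(**body),
            "RetrieveCustomer": lambda body: service.retrieve_customer(**body),
        }

    def on_message(self, message):
        operation = message["header"]["operation"]
        return self._handlers[operation](message["body"])
```

A caller of MessageEndpoint only ever sees on_message(); supporting a new operation means registering another handler, while the entry point itself stays unchanged.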
Interface semantics vs Payload semantics
Interface semantics
Provide well defined interfaces that are easy to understand
Changes to these interfaces require modifications to all applications that depend on the particular interface
Payload semantics
Changes to message formats can have lesser impact on different components of the system
New functions can be added to a system by creating new message types. (Consumers not dependent on a message type remain unaltered.)
Payload semantics have a weaker coupling at the type level
Document centric messages
Self-descriptive data structures such as XML are an approach to handling document centric messages
Document centric messages are semantically rich messages where the operation name, its parameters, and the return type are self-descriptive.
The passed parameters and returned object can be extremely flexible and can include any number of optional parameters. SOAP is particularly suited to this type of solution.
As long as the XML schema imposes only loose constraints on the document structure, parameters can be extended in any manner without breaking compatibility with previous software versions.
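The compatibility point can be shown with a small sketch using Python's standard xml.etree.ElementTree module. The transferMoney document and its element names are invented for illustration; the consumer is assumed to read only the elements it knows and to ignore anything else.

```python
import xml.etree.ElementTree as ET

V1_MESSAGE = """
<transferMoney>
  <fromAccount>111</fromAccount>
  <toAccount>222</toAccount>
  <amount>50.00</amount>
</transferMoney>
"""

# A later version adds an optional element the older consumer knows nothing about.
V2_MESSAGE = """
<transferMoney>
  <fromAccount>111</fromAccount>
  <toAccount>222</toAccount>
  <amount>50.00</amount>
  <reference>Invoice 42</reference>
</transferMoney>
"""


def read_transfer(xml_text):
    """Older consumer: reads the elements it needs and ignores the rest."""
    root = ET.fromstring(xml_text)
    return {
        "operation": root.tag,  # the operation name is carried by the document
        "from": root.findtext("fromAccount"),
        "to": root.findtext("toAccount"),
        "amount": root.findtext("amount"),
    }
```

Both versions of the document yield the same result for this consumer. A strict schema that rejected unknown elements would break that property.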
Document centric messages – XML centric schemas
Advantages
Extensible with minimal impact to applications
XML is a pervasive standard, and organisations have invested heavily in XML metadata-driven systems.
Disadvantages
Each extension makes the schema increasingly more complex
The format can increase payload record size and volume, and so affect the performance of the solution, where a minimal format could reduce message size.
The complexity of the schemas can grow to a point where they become unmanageable
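The payload size disadvantage is easy to see on a toy record. The sketch below serialises the same hypothetical order once as self-descriptive XML and once as a minimal positional format; the field names and values are made up, and real overheads depend on the schema and the data.

```python
record = {"orderId": "1001", "customer": "C1", "amount": "50.00"}

# Self-descriptive XML form: every value carries its element name twice.
xml_payload = "<order>" + "".join(
    f"<{name}>{value}</{name}>" for name, value in record.items()
) + "</order>"

# Minimal positional form: field order is agreed out of band.
compact_payload = ",".join(record.values())

# Ratio of XML bytes to compact bytes for this one record.
overhead = len(xml_payload.encode("utf-8")) / len(compact_payload.encode("utf-8"))
```

The compact form is smaller, but it gives up the self-description that makes the XML form extensible, which is the trade-off listed above.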
The Data implications for Organisations implementing SOA
While the management of the service message definitions can be developed, management of the metadata is less easy. This area is less well served in the IT landscape.
Metadata representing the physical data and the logical data representations evolves with business process development in organisations.
What is needed is the ability to manage changes to the data models and to deploy these versions through to all instances using that data.
This metadata evolution has to be managed across multiple development teams inside organisations. It also arises from ongoing internal business changes and from ongoing external integration and business changes.
This is not a schema versioning problem but about the scalability of metamodels to meet business needs and services execution.
Establish an enterprise data dictionary
Facilities for loading existing metadata as an integrated model
Methods for removing duplications and redundancy
Tools for supporting data use in model-driven architectures
Control tools for release management and collaboration.
It is essential to have the ability to control and syndicate metadata in a managed way, because the quality of services in an SOA is directly affected by the quality of the data and its semantic relevance.
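As a minimal sketch of what loading existing metadata into an integrated model and removing duplication could look like, the code below merges field lists from several systems and flags names defined more than once. The normalisation rule (ignore case, spaces and underscores) and the system names are assumptions made for the example; a real data dictionary would need much richer matching.

```python
from collections import defaultdict


def normalise(name):
    # Assumed rule: names differing only in case, spaces or underscores are duplicates.
    return name.replace("_", "").replace(" ", "").lower()


def build_dictionary(sources):
    """Merge per-system field lists into one dictionary keyed by normalised name."""
    merged = defaultdict(list)
    for system, fields in sources.items():
        for field in fields:
            merged[normalise(field)].append((system, field))
    return dict(merged)


def duplicates(dictionary):
    """Entries defined by more than one system are candidates for consolidation."""
    return {
        key: uses
        for key, uses in dictionary.items()
        if len({system for system, _ in uses}) > 1
    }
```

For example, build_dictionary({"crm": ["Customer_ID"], "billing": ["customerId"]}) places both fields under the single key "customerid", and duplicates() then reports that key for review.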
The extensible Service Structure
Top level: WSDL and Policy metadata define
These tend to be more static and define the interaction of the service.
Bottom level: The payload
In a document centric payload the schema is implicit and no XML schema is needed to define it.
In a message centric payload a schema (XSD) is needed.
Advantages of externalised schemas for message-based SOA are
Enforceable contracts for processing behaviour
Visible specifications for developers
Public interfaces for new partners in the SOA
Schema-based access to standard infrastructure such as parsers, transformation engines, etc
Insulation for services from changes to schemas
Support for business analysts when planning changes
Disadvantages of a metadata driven application environment are the same as the limitations of metadata and XML schemas:
XML schemas describing payloads are application specific, bespoke metadata that is subject to change and manual involvement
Schemas change, and with the large volume of schema families and associated assets such as transformations, this presents a significant maintenance management problem
e.g. Order-to-invoice trading transaction involving multiple players.
Externalised Schemas and transformations
one for each service
Describe and reference same data objects every time
Service Abstraction strategy
The Interface WSDL is exposed in a reusable fashion
Data Abstraction strategy
Making the schema as generic as possible
Value/pairs
Versioning and Impact problems
Managing XML infrastructure
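The data abstraction strategy of a generic value/pairs schema might be sketched as follows, using Python's standard xml.etree.ElementTree module. The record and field element names and their attributes are hypothetical choices for this example, not a published format.

```python
import xml.etree.ElementTree as ET


def to_generic_xml(record_type, fields):
    """Serialise any record as <record type="..."> with <field name="..." value="..."/> children."""
    root = ET.Element("record", {"type": record_type})
    for name, value in fields.items():
        ET.SubElement(root, "field", {"name": name, "value": str(value)})
    return ET.tostring(root, encoding="unicode")


def from_generic_xml(xml_text):
    """Read the generic form back into a (type, fields) pair; all values come back as strings."""
    root = ET.fromstring(xml_text)
    fields = {f.get("name"): f.get("value") for f in root.findall("field")}
    return root.get("type"), fields
```

The schema for such a document never changes when a new field is added, which eases versioning. The cost is that field names and types move out of the schema and into the data, so a schema validator can no longer check them.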
Paradigms of architecture
Establishing an approach for your organisation can drive the architectural considerations from different perspectives.
Process driven architecture
Driven by business process design
Typically utilising a BPMS solution
Service driven architecture
Driven by application and integration dynamics
Typically driven by multi-channel formats and composite portal and web site services
Event driven architecture
Driven by high interaction, high volume transactions
Typically driven by high availability performance architectures
Process, service, people, portal, data and other dimensions are all elements that need to be considered in developing an enterprise architecture.
Your choice of technologies drives the architectural paradigm and the types of services you develop.
Scenarios
Enterprise Services
Developing B2B and collaborative exchanges
Business Services
Development of process interactions through a business process management system
Use of process fragments
IT Services – coarse grained
Development of Integration services that are syndicated to multiple users
The establishment of service bus technologies
IT Services – fine grained
Development of COTS and other platform based web services and message exchanges