EA and Agile Projects

A recent comment on my post about the value of enterprise architecture suggested that enterprise architects should be directly involved in Scrum projects. In that post I suggested that the primary value of EA is “continuity” and “self-correction” across all the IT initiatives. If these qualities represent the value of EA, then the mechanism for realising that value must involve communication.

Communication is the key to ensuring that IT project deliverables provide cohesion across organizational, geographical and temporal boundaries. I have previously written that communication and governance are effectively the same thing. The problem is that “governance” needs to overcome the image of heavy-handed, top-down, hierarchical “rules” handed down by “out of touch” enterprise architects – like commandments on tablets of stone. EA must be seen as an enabler, not a roadblock.

But communication and governance must also be relevant to the organizational culture. So if the culture embraces agile development methods such as Scrum, then on the face of it the governance structure ought to conform with agile practices which favour direct verbal communication over documentation. If it is true that an embedded business representative can communicate requirements more effectively than a written requirements document, then perhaps it is also true that an embedded enterprise architect can “govern” more effectively than a written EA document.

However, on reflection, I don’t think the analogy of EA as a “customer” of the project is quite as strong as I first thought. There are a couple of differences between “business requirements” and “EA requirements”.

First, business requirements for a project are communicated primarily to that project. Few if any other IT projects would be interested in these particular business requirements which we are trying to encode into an IT solution. However, EA requirements must be communicated consistently across multiple projects, and consistency demands that the EA requirements – the “governance principles” – be encoded in some written form, independent of an individual architect or even a team of architects.

Second, business requirements change very frequently within the timespan of an IT project. Indeed, the development process itself can change business requirements. This is why agile methods value verbal communication over written – to allow for frequent change in business requirements. EA requirements will not change as rapidly. Even though the Enterprise Architecture will evolve over time, it should not change over the course of a single project. Indeed, if your Enterprise Architecture undergoes sudden rapid shifts (e.g. adopting an overnight SOA mandate) then you have a recipe for IT disaster. Hence the agile “insight” that written business requirements cannot keep up with reality is not quite as relevant when it comes to EA requirements.

So my response to Paul’s comment is still: Yes! I agree that an EA representative on a Scrum project would be a good idea – to communicate, elucidate and interpret EA requirements to the project and, importantly, to take new lessons learned back from the project into the governance structures. However, this is no substitute for EA “documentation” – artefact-based governance – which is necessary to ensure consistency and continuity.

The Value of Enterprise Architecture

There has lately been a lot of discussion about Enterprise Architecture (hereafter EA) in my neighbourhood of the blogosphere. Perhaps this is a sign of adolescence for EA. Its existence is apparent, and now EA needs to decide who it should hang out with and what it wants to be when it grows up. As part of a wider discussion, Richard Veryard has posted some excellent thoughts on the importance of a value proposition for EA. I totally agree with Richard’s view that EA is not just about making projects successful, and since this is my blog, I’ll put forward my own view of the value of EA. EA provides two perspectives which are vital in any long-lived and complex system such as an enterprise.

Continuity: EA provides continuity across organizational boundaries, geographies and – most importantly – the span of time. This is something that any individual project – successful or not – could not do. During the lifetime of an enterprise there are multitudes of projects executed in different places, by different people – outsourced and insourced. Ultimately all these projects must pull in approximately the same direction to support the business strategy.

To take a hypothetical example: “The data warehouse project in Bangalore being built by Tatosys must support the new marketing system being built in Sydney by Bluehair Consulting. And both are critical components for the mobile commerce platform slated for 2012 go-live by either PDQ Global Services or IBS (an HQ company) depending on who bids the lowest fixed price. And by-the-way the core transaction systems run on a mainframe that we are phasing out in the next three years as part of our 2002 strategic plan.”

Each of these projects is a major undertaking in its own right. Even if every project is 100% successful, you cannot be certain they will all fit together in a coherent and efficient manner. Making them fit is more than just a business problem. It goes beyond getting the functionality right and encompasses deeper technical layers of standards, frameworks and systems partitioning which are neither the domain of the business nor of business analysts.

Self Correction: The second value proposition of EA that I see is the role of “external observer and governor”. Enterprises exist in a constant flux of technologies, fashions, methodologies and business requirements. Someone needs to have the ability to evaluate new practices and make the decision to incorporate those that are valuable and relevant. This is not something that individual projects can do, although they may have a role in trialling new practices.

Each of these areas of change exists in the domain of a different part of the organization. The business is hopefully across changes in business requirements, the Project Management Office is (perhaps) across changes in software development methodologies, and the IT department always wants to try out the cool new tools. But normally these parts of the organization don’t interact outside the constraints of an individual project. You need some constituency that is across all these areas and can help guide the amalgamation of new “best practices” into the enterprise. I think that EA is the natural venue for this interaction.

So is this just the “old fashioned” value proposition of “EA-as-IT-planning”, instead of the cool new view of “EA-as-business-strategy”? I see it as something midway between the two. Certainly “EA-as-IT-strategy” is close, but also “EA-as-trusted-advisor-to-business-strategy”. Leave the business strategy to the business, but when it comes to the mechanics of implementing business strategy in terms of systems, processes and technologies, the strategy-makers should be able to rely on EA to guide them as to what works, what doesn’t – and who might take them for a ride.

Richard mentions that he’s “not convinced that the EA value proposition is understood by its customers”. I would go so far as to say I don’t think the EA value proposition is even trusted by its customers. EA needs to engage with its customers, figure out the value proposition and then execute.

Dimensions of Coupling

Coupling is one of the most fundamental measures of “quality” for an information system. The concepts of coupling and cohesion have featured in software design best practice for at least a couple of decades, and they are also vital to the development of distributed systems. Yet as core as the concept of coupling is, it is difficult to find a real definition in the distributed systems context. Coupling is like obscenity – we can’t define it, but we know it when we see it.

Which is why I was pleased to see Ian Robinson’s post, which presents coupling as lying along two dimensions – temporal and behavioural – and even sets out some characteristics that help you put a rough measure on the degree of coupling. Coincidentally, I had drafted my own version of this some time ago, but it had never made it to publication.

Like Ian, I was trying to quantify coupling: to understand what constitutes a tightly or loosely coupled system, to have some way of measuring it, and therefore to have a method for deciding between design trade-offs in satisfying the various requirements of our distributed systems. While Ian presents a conceptually clean two-dimensional picture, I felt the true story involves multiple interacting dimensions.

While I was researching this, I happened to find a book extract which covers what I wanted to say and more. The full extract is well worth reading, but is summarised in the following table:

Level | Tight Coupling | Loose Coupling
Physical coupling | Direct physical link required | Physical intermediary
Communication style | Synchronous | Asynchronous
Type system | Strong type system (e.g., interface semantics) | Weak type system (e.g., payload semantics)
Interaction pattern | OO-style navigation of complex object trees | Data-centric, self-contained messages
Control of process logic | Central control of process logic | Distributed logic components
Service discovery and binding | Statically bound services | Dynamically bound services
Platform dependencies | Strong OS and programming language dependencies | OS- and programming-language-independent

Here we have no fewer than seven dimensions to the coupling equation.
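To make a couple of these dimensions concrete, here is a minimal sketch – my own illustration, not from the extract – contrasting interface semantics over a synchronous call (tight) with payload semantics over an asynchronous, self-contained message (loose). The class, channel and message names are hypothetical.

```python
import json
import queue

# Tight coupling: the caller is statically bound to a typed interface and
# blocks on a synchronous call; any change to the signature breaks clients.
class CustomerService:
    def get_customer(self, customer_id: int) -> dict:
        return {"id": customer_id, "name": "Alice"}

def tightly_coupled_client(service: CustomerService) -> dict:
    return service.get_customer(42)          # direct, synchronous, typed call

# Loose coupling: the sender only knows a generic channel and a data-centric,
# self-describing payload; the receiver interprets it whenever it is available.
message_bus: "queue.Queue[str]" = queue.Queue()   # stands in for a MOM product

def loosely_coupled_sender() -> None:
    message_bus.put(json.dumps({"type": "GetCustomer", "customer_id": 42}))

def loosely_coupled_receiver() -> None:
    msg = json.loads(message_bus.get())
    if msg.get("type") == "GetCustomer":
        pass  # handle asynchronously; sender and receiver need not run in lock-step
```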

The final paragraph of the extract highlights the costs of loose coupling (and only some of the benefits):

However, in most cases, the increased flexibility achieved through loose coupling comes at a price, due to the increased complexity of the system. Additional efforts for development and higher skills are required to apply the more sophisticated concepts of loosely coupled systems. Furthermore, costly products such as queuing systems are required. However, loose coupling will pay off in the long term if the coupled systems must be rearranged quite frequently.

I think this understates the benefit. “Rearranged frequently” seems to cover only design-time changes. But it should also cover “runtime rearrangement” such as partitioning across redundant components for the purpose of load balancing and fault tolerance. In such cases, loose coupling provides significant value in the form of higher uptime and scalability for distributed systems.
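As a rough sketch of that runtime value (an assumed setup, not from the article): because producers only know the queue, redundant consumers can be added or removed while the system is running, giving load balancing and a degree of fault tolerance without touching the producers.

```python
import queue
import threading

work_queue: "queue.Queue[str]" = queue.Queue()    # stands in for a durable MOM queue

def worker(name: str) -> None:
    # Any available worker takes the next message; producers never know which one.
    while True:
        job = work_queue.get()
        print(f"{name} processed {job}")
        work_queue.task_done()

# Redundant consumers can be started (or retired) at runtime without changing producers.
for n in ("worker-1", "worker-2"):
    threading.Thread(target=worker, args=(n,), daemon=True).start()

for i in range(10):
    work_queue.put(f"job-{i}")
work_queue.join()
```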

Update May 14, 2009: Richard Veryard has pointed me to his paper “Component Based Service Engineering” (subscription required) which discusses an even wider range of coupling beyond the technical layers into Process, Organizational and Business layers. The CBDi Wiki has a table summarizing all the coupling dimensions identified in Richard’s paper.

One section of the paper struck a chord with me:

How can we have loose coupling and hard-wiring at the same time? The answer comes as soon as we recognize that coupling is multidimensional or multilayered. My head is connected (coupled) to the rest of my body in several different ways. Even if I could introduce some technology to decouple the nervous system, that doesn’t allow me to remove my head….With Web Services, SOAP simply removes one set of the hard-wired connections. Other forms of coupling remain.

This was written in 2003 and has proved quite prescient: many SOA projects in the interim have failed to achieve their goals because they simply adopted out-of-the-box “web services”, which address only one or two of the many dimensions of coupling.

Ian Robinson on Coupling

In my opinion, coupling is the most fundamental attribute of a system architecture and tight coupling is probably the most common architectural problem I see in distributed systems. The manner in which system components interact can be a chief determinant of the scalability and reliability of the final system.

So I really like Ian Robinson’s post on Temporal and Behavioural Coupling where he uses two coupling dimensions and the inevitable magic quadrant to classify systems based on their degree of temporal and behavioural coupling.

See Ian’s post for the slick professional graphics, but to summarise – event-oriented systems with low coupling occupy the “virtuous” third quadrant of the matrix. Conversely, the brittle “3-tier” applications that many of us struggle with occupy the “evil” first quadrant, where coupling in both dimensions is high.

However I’m a little miffed to see no mention of my favourite “document-oriented message” in Ian’s diagram. As Bill Poole writes, document messages have lower behavioural coupling than command messages, but more than event messages. So would you put document-oriented messages near the middle top of the matrix, between command-oriented and event-oriented messages? Unfortunately that would break the symmetry. But it also highlights another problem.

Any type of message – document, command or event-oriented – could be temporally tightly or loosely coupled. Temporal coupling is more a property of the message transport than of the message type. So I suggest that the two coupling dimensions are characterised as follows:

  • Temporal coupling – characterised by message transport from RPC (tight coupling) through to MOM (loose coupling).
  • Behavioural coupling – characterised by the message type from command-oriented (tight) through document-oriented to event-oriented (loose).

It so happens that distributed 3-tier systems generally employ both command-oriented messages and RPC transports – hence making them inherently “evil”. Whereas events (being asynchronous) are naturally virtuous by typically being carried over MOM transports (it’s difficult to request an event notification).
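As a rough sketch of the behavioural axis (my own examples, not Ian’s or Bill’s), the same business fact can be expressed as a command, a document or an event; the further down the list, the less the sender assumes about what the receiver will do with it.

```python
# Command message (tight): the sender dictates exactly what the receiver must do.
command = {"type": "CreateInvoice", "customer_id": 42, "amount": 99.50}

# Document message (middling): the sender hands over data; the receiver decides.
document = {
    "type": "PurchaseOrder",
    "customer_id": 42,
    "lines": [{"sku": "ABC-1", "qty": 3}],
}

# Event message (loose): the sender merely announces that something happened.
event = {"type": "OrderPlaced", "order_id": 1234, "occurred_at": "2009-05-14T10:00:00Z"}

# Whichever shape is chosen, temporal coupling is then decided by the transport:
# an RPC call delivering any of these is temporally tight; a queue is loose.
```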

Between heaven and hell, it is in the murky mortal realms of SOA where we need to be constantly mindful of the interactions between message type and transport – lest our system ends up in limbo.

56 Architecture Case Studies

The recent brouhaha about Twitter scalability has highlighted the growth of the latest spectator sport in the blogosphere – “armchair architect”. Everyone’s a Monday morning expert on which language/database/framework is/isn’t the secret to extreme scalability. The real secret is the architecture and organizational maturity. Here are some case studies to prove it:

A Conversation with Werner Vogels where Werner talks about using service-orientation to scale out massively distributed services which power the Amazon e-commerce platform. This is one of my favourites because it covers organizational as well as technical aspects of scalability. One of the unique attributes of Amazon is that service-orientation pervades everything – even their organizational structure. Developers are responsible for running their own services. Werner characterises the adoption of services as a challenging and major learning experience, but it has become one of their main strategic advantages. Key lessons learned:

  • service-orientation is an excellent technique to achieve isolation and high levels of ownership and control
  • prohibiting direct database access allows scaling and reliability improvements without affecting clients (see the sketch after this list)
  • a single unified service access mechanism supports service aggregation, routing & tracking
  • service orientation improves development and operational processes leading to more agility
  • giving developers operational responsibility enhances the quality of the services.
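As an illustration of the “no direct database access” lesson, here is a hedged sketch – my own, not Amazon’s code – of the idea: clients depend only on the service’s interface, so the owning team can re-partition or re-platform the storage behind it without affecting anyone.

```python
class OrderService:
    """Owns its data store; nothing outside this service touches the database."""

    def __init__(self, storage) -> None:
        self._storage = storage            # one database today, sharded tomorrow

    def get_order(self, order_id: str) -> dict:
        return self._storage.load(order_id)

# Clients call the service...
#   order = order_service.get_order("A-1001")
# ...never the database directly:
#   cursor.execute("SELECT * FROM orders WHERE id = ?", (order_id,))   # not allowed
```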

The eBay Architecture (PDF) covers the evolution of eBay from 1998 to 2006. It’s a great example of how continuous reinvention is needed to keep up with rapidly growing scaling requirements. Frank Sommers writes a good summary and discussion where he argues that organizational capability is just as important as technical architecture for scalability. Another key ingredient of the eBay story is the ability to discard “conventional wisdom” when required. This is covered in an interview with Dan Pritchett, revealing some of the “rules” that eBay bends in order to scale.

  • eBay.com doesn’t use transactions – mainly for scalability and availability reasons
  • different data is treated in different ways – so best effort suffices for some
  • references the CAP theorem – consistency, availability, partition tolerance – pick any two
  • many have arrived at the same idea – and transactions are the first to go

Scalable Web Architectures is a great presentation on scalable web architectures in general and Flickr in particular. Also check out Cal Henderson’s list of his other presentations.

Architectures You’ve Always Wondered About provides slides from QCon London 2009 presentations. Case studies about eBay, Second Life, Yahoo!, LinkedIn and Orbitz.

Avoiding the Fail Whale is a video in which Robert Scoble interviews architects from FriendFeed, Technorati and iLike.

Improving Running Components at Twitter – Evan Weaver describes how Twitter learned to scale by moving to a messaging architecture.

Real Life Architectures at High Scalability – provides a huge collection of pointers to architecture case studies from around the web.