Entries Tagged 'services' ↓

The Power of Later

Steve Vinoski nominates RPC as an historically bad idea, yet the synchronous request reply message pattern is undeniably the most common pattern out there in our client-server world. Steve puts this down to “convenience” but I actually think it goes deeper than that. Because RPC is actually not convenient – it causes an awful lot of problems.

In addition to the usual problems cited by Steve, RPC makes heavy demands on scalability. Consider the SLA requirements for an RPC service provider. An RPC request must respond in a reasonable time interval – typically a few tens of seconds at most. But what happens when the service is under heavy load? What strategies can we use to ensure a reasonable level of service availability?

  1. Keep adding capacity to the service so as to maintain the required responsiveness…damn the torpedoes and the budget!
  2. The client times-out, leaving the request in an indeterminate state. In the worst case, the service may continue working only to return to find the client has given up. Under continued assault, the availability of both the client and the server continues to degrade.
  3. The service stops accepting requests beyond a given threshold. Clients which have submitted a request are responded to within the SLA. Later clients are out of luck until the traffic drops back to manageable levels. They will need to resubmit later (if it is still relevant).
  4. The client submits the request to a proxy (such as a message queue) and then carries on with other work. The service responds when it can and hopefully the response is still relevant at that point.

Out of all these coping strategies, it seems that Option 2 is the most common, even when in many cases one of the other strategies is more efficient or cost-effective. Option 1 might be the preferred option given unlimited funds (and ignoring the fact it is often technically infeasible). In the real world Option 2 more often becomes the default.

The best choice depends on what the service consumer represents and what is the cost of any of the side-effects when the service fails to meet its SLA.

When the client is a human – say ordering something at our web site:

  • Option 2 means that we get a pissed off user. That may represent a high, medium or low cost to the organization depending on the value of that user. In addition there is the cost of indeterminate requests. What if a request was executed after the client timed-out? There may be a cost of cleaning up or reversing those requests.
  • Option 3 means that we also get a pissed off user – with the associated costs. We may lose a lot of potential customers who visit us during the “outage”. On the positive side, we minimise the risk/cost of indeterminate outcomes.
  • Option 4 is often acceptable to users – they know we have received their request and are happy to wait for a notification in the future. But there are some situations where immediate gratification is paramount.

On the other hand, if the client is a system participating in a long-running BPM flow, then we have a different cost/benefit equation.

  • For Option 2, we don’t have a “pissed off” user. But the transaction times out into an “error bucket” and is left in an indeterminate state. We must spend time and effort (usually costly human effort) to determine where that request got to and remediate that particular process. This can be very costly.
  • Option 3  once again has no user impact, and we minimise the risk of indeterminate requests. But what happens to the halted processes? Either they error out and must be restarted – which is expensive. Alternatively they must be queued up in some way – in which case Option 3 becomes equivalent to Option 4.
  • In the BPM scenario, option 4 represents the smoothest path. Requests are queued up and acted upon when the service can get to it. All we need is patience and the process will eventually complete without the need for unusual process rollbacks or error handling. If the queue is persistent then we can even handle a complete outage and restoration of the service.

So if I am a service designer planning to handle service capacity constraints, for human clients I would probably choose (in order) Option 3, 4 and consider the costs of option 2. For BPM processes where clients are “machines” then I would prefer Option 4 every time. Why make work for myself handling timeouts?

One problem I see so often is that solution designers go for Option 2 by default – the worst of all the options available to them.

Standalone Web Services and the Decline of the Browser

Back around 1997 I speculated to a colleague that the general purpose web browser would disappear in 10 years. My rationale was that the user experience of HTML and the web at the time was so limited that users would be enticed by installable widgets that combined the advantages of the desktop for rich interactivity plus an HTTP “back end” for access to web-based services.

Skip ahead 12 years and obviously my prediction hasn’t come true – quite. The main reason is that Rich Internet Applications (RIAs) have been achieved inside the browser using technologies like Flash and AJAX and the challenges of installation and maintenance have held back widespread adoption of desktop widgets.

However there is one other reason why widgets have not taken off and it has nothing to do with the widget. It all has to do with the hitherto tight coupling between the server and the client in web-based “services”. All web applications to-date have had the back-end services and the front-end “browser experience” developed and marketed by the one company as a complete unit.

Until recently there were no standalone web services. Very few, if any public web services have been designed with multiple different and independent front-ends in mind. Where they have emerged, they tended to be “hacks” created by individuals who came up with creative ways to re-purpose back-end services. Perhaps the pioneer was the use of Gmail as a file system.

I think this situation is about to change dramatically, driven by a number of factors:

  1. New platforms such as iPhone and Android make web-based widgets more compelling compared with the generic browser on those platforms.
  2. There are now well-established standards for web services in platform neutral formats (I’m thinking JSON and XML)
  3. Cloud computing further reduces the entry cost for service providers.

The main element that remains is for the business models to emerge around pure service delivery. Twitter is the most visible example that a company can provide web-based services as its primary mission and leave the development of user interfaces to others. Biz Stone of Twitter states that:

“The API has been arguably the most important, or maybe even inarguably, the most important thing we’ve done with Twitter. It has allowed us, first of all, to keep the service very simple and create a simple API so that developers can build on top of our infrastructure and come up with ideas that are way better than our ideas…”

Biz then goes on to describe how the API is central to their business model plans.

Another good example of standalone web services is how OpenID provides authentication services which are completely independent of the consuming business/application. Sites like StackOverflow can then outsource the boring authentication bits of their site and concentrate on more important content.

This is where the service-oriented architecture of the web really becomes interesting…a loosely coupled ecosystem of service consumers and providers providing the real architecture for serendipity.

Update 04/05/09: I recently found an interesting slide presentation on this topic entitled “The Rise of the Widgets” – well worth a look.

Gartner’s Top 10 Strategic Technologies for 2009

Gartner has nominated their top 10 strategic technologies for 2009. In priority order they are:

1. Virtualization
2. Business Intelligence
3. Cloud Computing
4. Green IT
5. Unified Communications
6. Social Software and Social Networking
7. Web Oriented Architecture
8. Enterprise Mashups
9. Specialized Systems
10. Servers – Beyond Blades

My primary interests in SOA and CEP are represented in items 7 and 2 respectively (with some applicability to 8).

Business Intelligence. Business Intelligence (BI), the top technology priority in Gartner’s 2008 CIO survey, can have a direct positive impact on a company’s business performance, dramatically improving its ability to accomplish its mission by making smarter decisions at every level of the business from corporate strategy to operational processes. BI is particularly strategic because it is directed toward business managers and knowledge workers who make up the pool of thinkers and decision makers that are tasked with running, growing and transforming the business. Tools that let these users make faster, better and more-informed decisions are particularly valuable in a difficult business environment

(emphasis added).  Complex Event Processing is a key enabling technology for real-time business intelligence. The ability to abstract “rules” out of data using analytics and then have those rules executed in real-time to match against “events” running around your enterprise bus is what will make BI achieve the needs of business to make informed decisions “faster”. Traditional BI where you feed data into a mammoth DWH and then analyse the results at some point later suffer from high latencies and reduced information currency. CEP can reduce these latencies from weeks or months down to seconds.

Web-Oriented Architectures. The Internet is arguably the best example of an agile, interoperable and scalable service-oriented environment in existence. This level of flexibility is achieved because of key design principles inherent in the Internet/Web approach, as well as the emergence of Web-centric technologies and standards that promote these principles. The use of Web-centric models to build global-class solutions cannot address the full breadth of enterprise computing needs. However, Gartner expects that continued evolution of the Web-centric approach will enable its use in an ever-broadening set of enterprise solutions during the next five years.

Recognition that SOA is more than just “Web Services”and that at least one other architectural paradigm is available to SOA implementers.

Interesting that BPM has dropped from the list after featuring in the previous two years. And SOA itself has not been mentioned since the list for 2006. It looks like SOA has become more business as usual – part of the IT “furniture” – as I predicted some time ago.

Push versus Pull

From OSCON via O’Reilly Radar here’s a good case study of an architectural decision driven by the system requirements rather than the usual religious considerations that pollute the bloggosphere.

FriendFeed needed update info from Flikr but a REST-based “pull” approach is highly inefficient in this case. Instead the solution architects opted for a “push” approach using xmpp as the message transport. This is a really good presentation because it goes into the architectural choices and implications of “push” versus “pull”.

I characterize this as “pull vs push” rather than “REST vs xmpp” (or “REST vs *” or “why REST is crap”) because fundamentally it comes down to the best choice of how to synchronize changes between systems. You make this choice based on the usage characteristics of the different systems, the likely traffic volumes this will result in and the consequential resource impacts. Having made the choice between push or pull you then choose the appropriate message transport.

The web doesn’t do a lot of “push” and consequently there is not a lot of discussion about push and REST. Dare Obasanjo characterises it nicely:

Polling is a good idea for RSS/Atom for a few reasons

  • there are a thousands to hundreds of thousands clients that might be interested in a resource so the server keeping track of subscriptions is prohibitively expensive
  • a lot of these end points aren’t persistently connected (i.e. your desktop RSS reader isn’t always running)
  • RSS/Atom publishing is as simple as plopping a file in the right directory and letting IIS or Apache work its magic

The situation between FriendFeed and Flickr is almost the exact opposite. Instead of thousands of clients interested in document, we have one subscriber interested in thousands of documents. Both end points are always on or are at least expected to be. The cost of developing a publish-subscribe model is one that both sides can afford.

Inside the firewall, the situation is often more akin to that between FriendFeed and Flikr. This is why messaging is more common inside the firewall than outside – not because of any universal superiority between REST versus messaging, but because the system requirements are different and often favour a push approach rather than pull.

While your over at Dare’s excellent Blog, be sure to also check out his discussion of push versus pull in the context of scaling Twitter and MS Exchange.  These are important considerations for designers of federated systems such as federated databases or federated messaging systems. The example of FriendFeed to Flikr could be considered as the first incremental step toward a federation.

Waiting for the great leap forward

One of the original and fundamental tenets of the SOAP standard was that the SOAP message is independent of the underlying transport. Ostensibly you could use SOAP over HTTP, JMS, email, FTP etc. but the reality is that a standard binding has only ever existed for SOAP over HTTP. To paraphrase Henry Ford – “you can have any SOAP transport you like, as long as its HTTP”.

While HTTP is undoubtedly a good choice for SOAP – given its ubiquity – there is at least one other transport which demands attention. This is the JMS transport which is widely used inside the firewall of many organizations. Of all the companies that I work with, their SOA infrastructure heavily relies on JMS transports inside the firewall, with HTTP transports outside the firewall or to selected service end-points such as web pages. Of course my experience has significant selection effects, but nevertheless JMS is an important transport in many SOAs. Testament to this is that every major web-services product vendor (save Microsoft) supports SOAP over JMS (and even Microsoft now has SOAP over MSMQ as an important part of WCF).

The fly in the ointment is that there has never been a standardized binding for SOAP over JMS and as a result there is little interoperability between SOAP/JMS solutions provided by different platform vendors. If you happen to have any combination of different web-service platforms in your organization, then they cannot easily communicate with each other using SOAP over JMS without performing some unnatural acts.

Some of issues that need to be considered with a SOAP binding to JMS are:

  • How do you represent the message content – text or binary? Most vendors have chosen a text message representation, but that has problems with multi-byte encodings, so other vendors have gone with a byte message representation.
  • What headers do you define and what should their names be? How do you use the standard JMS headers? different vendors have different naming conventions and semantics.
  • In the WSDL description, how do you represent the connection details to the JMS provider?
  • How do your service endpoints manage the different message exchange patterns that are available with message-oriented transports?

Each of the vendors went their own way on many of these issues and as far as interoperability was concerned they basically ceded the field to HTTP. They made life difficult for large organizations with heterogeneous platforms and in my opinion didn’t do themselves any favours on the way. (Actually SOAP-encoding interoperability was so broken for a while that noone noticed the JMS issues…so maybe it wasn’t so bad).

Subsequently it was great to see some of the vendors get together a couple of years ago to agree on a standard SOAP binding for JMS that addresses most of the important considerations. The result was a Member Submission to W3C in September last year. My understanding is that this submission was previously circulated through most of the vendor community so hopefully it has general agreement on the technical details.

This has now taken its first steps to standardization with the initiation of a SOAP-JMS Binding Working Group who aim to publish a recommendation by April next year. Hopefully vendor support of the binding will be hot on its trail.

Note that the standard binding won’t address the fact that different JMS implementations do not interoperate. For example, a TIBCO JMS client will not be able to talk to a Websphere JMS provider because JMS is an API standard, not a wire-protocol standard. What the SOAP/JMS binding standard does mean is that once you have settled on a standard JMS provider for your services, you could define your service description in standard WSDL and your service provider (say Websphere or TIBCO or WSO2) and your service consumer (say TIBCO or WebLogic or Axis) would be able to communicate directly using SOAP over JMS “out of the box”.

Its been eight years (almost to the day) since SOAP 1.1 came out with the HTTP binding. Wouldn’t it be great if a standard JMS binding could be achieved within the decade! It’s been a very long wait. The JMS binding should have happened a lot sooner and I can’t say the “wait has been worth it” but it does fill an important hole in the Web-Services standards.

So what do we do in the meantime? You can eschew JMS altogether and stick with HTTP, but that requires another lot of hard work. You can stick with one and only one service platform, but that is difficult in large heterogeneous organizations – which is where SOA is supposed to provide maximum benefit. Or you can continue to do what many SOA implementers have done and deal with SOAP directly at the JMS layer – effectively using SOAP as plain-old-XML over JMS. I wrote more about this approach recently.

Another thing you can do in the meantime is ask your vendor when will they support the new SOAP/JMS binding?